Causal-augmented Source-free Domain Adaptation with Scale-free Transformer for Schizophrenia Classification

Yixin Ji¹, Vince D. Calhoun², Qi Zhu¹, Zhengwang Xia³, Jin Zhang⁴, Shengrong Li¹, Theo G. M. Van Erp⁵, Daniel H. Mathalon⁶, Si Yong Yeo⁷, Shile Qi¹ ✉, Daoqiang Zhang¹

¹Nanjing University of Aeronautics and Astronautics, Nanjing, China
²Tri-Institutional Center for Translational Research in Neuroimaging and Data Science (TReNDS), Georgia Institute of Technology, Atlanta, GA, USA
³Changzhou Institute of Technology, Changzhou, China
⁴Northwestern Polytechnical University, Xi'an, China
⁵University of California, Irvine, CA, USA
⁶San Francisco VA Medical Center and University of California San Francisco, San Francisco, CA, USA
⁷Lee Kong Chian School of Medicine, Nanyang Technological University, Singapore

Abstract

Multi-site schizophrenia classification from resting-state fMRI faces a practical adaptation challenge: source-domain data are often unavailable during deployment due to privacy and data-sharing constraints. Source-free domain adaptation (SFDA) addresses this setting by adapting a pretrained source model to an unlabeled target domain without revisiting source data, but its success depends on robust feature extraction and effective target-domain regularization.

We propose a two-stage framework for schizophrenia classification based on brain functional networks (BFNs). In Stage 1, a scale-free transformer is pretrained on labeled source-domain BFNs. Its attention mechanism is guided by a scale-free prior that biases attention toward high-degree hub nodes, reflecting the topology commonly observed in functional brain networks and improving discriminative feature learning. In Stage 2, the pretrained encoder and classifier are transferred to the unlabeled target domain for source-free adaptation.

To improve robustness under target-domain shift, we introduce a causal-augmented adaptation strategy. Latent representations are used to construct causal graphs, which are then perturbed through random permutation and counterfactual interventions to produce diverse augmented views. Together with entropy minimization, these augmentations encourage more stable target-domain decision boundaries.

Experiments on FBIRN and BSNIP show that the proposed method achieves strong target-domain performance, reaching 87.18%±0.91% and 88.39%±0.13% accuracy, respectively.

Key Contributions

A source-free domain adaptation framework for multi-site schizophrenia classification that adapts to unlabeled target domains without revisiting source-domain data.
A scale-free transformer encoder that incorporates a scale-free prior into self-attention, emphasizing high-degree hub nodes in brain functional networks.
A causal-augmented adaptation strategy that constructs causal graphs from target-domain latent features and generates augmented views through random permutation and counterfactual intervention.
A joint optimization scheme combining causal augmentation and entropy minimization to improve robustness and target-domain generalization.

Method Overview

The framework operates in two stages. In Stage 1, BFNs constructed from resting-state fMRI across multiple source sites are used to train a transformer encoder and classifier. A scale-free prior is injected into the attention mechanism so that the encoder focuses more strongly on hub-like regions, which are important for global information integration in functional brain networks.
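The idea of steering attention toward hub nodes can be illustrated with a minimal NumPy sketch. Everything in it is an assumption made for illustration rather than the paper's formulation: the BFN is built with Pearson correlation, node degree is counted after thresholding at a hypothetical `threshold`, and the prior is applied as an additive log-degree bias on the attention logits, scaled by a hypothetical weight `gamma`.

```python
import numpy as np

def build_bfn(timeseries):
    """Assumed BFN construction: Pearson correlation between ROI time series.

    timeseries has shape (n_timepoints, n_rois).
    """
    bfn = np.corrcoef(timeseries, rowvar=False)
    np.fill_diagonal(bfn, 0.0)  # drop self-connections
    return bfn

def node_degree(bfn, threshold=0.3):
    """Binary degree: number of edges whose |correlation| exceeds threshold."""
    return (np.abs(bfn) > threshold).sum(axis=1)

def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)  # numerical stability
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def degree_biased_attention(q, k, v, degree, gamma=1.0):
    """Scaled dot-product attention with an additive log-degree bias on keys.

    The bias form is a stand-in for the scale-free prior, not the paper's.
    """
    d_k = q.shape[-1]
    logits = q @ k.T / np.sqrt(d_k)
    bias = gamma * np.log1p(degree)  # larger for high-degree (hub) nodes
    weights = softmax(logits + bias[None, :], axis=-1)
    return weights @ v, weights

# Toy usage: 120 time points, 8 ROIs, ROIs 1-3 coupled to ROI 0.
rng = np.random.default_rng(0)
ts = rng.standard_normal((120, 8))
ts[:, 1:4] += ts[:, [0]]
bfn = build_bfn(ts)
deg = node_degree(bfn)
# Each ROI's connectivity profile serves as its token embedding here.
features, attn = degree_biased_attention(bfn, bfn, bfn, deg)
```

With this additive form, the attention weight received by the highest-degree node can only stay the same or grow relative to unbiased attention, which is the qualitative behavior the scale-free prior is described as encouraging.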

In Stage 2, the pretrained encoder and classifier are transferred to an unlabeled target site under the source-free domain adaptation setting. The encoder extracts latent features that are used to construct causal graphs. These graphs are perturbed by random permutation and counterfactual intervention to generate diverse target-domain augmentations. Entropy minimization is further applied to encourage confident target predictions. Together, these components improve robustness to target-domain distribution shift without requiring access to source data.
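The Stage 2 ingredients can be sketched in NumPy as follows. This is a hedged illustration under stated assumptions, not the authors' implementation: the causal graph is taken as an already-estimated adjacency matrix `adj` with the assumed convention that `adj[i, j]` is the edge from node i to node j, random permutation is modeled as relabeling a random subset of nodes, and the counterfactual intervention is modeled as do-style graph surgery that cuts a node's incoming edges. The function names and the `frac` parameter are hypothetical. Only the entropy term follows the standard definition of prediction entropy.

```python
import numpy as np

def random_permutation_view(adj, rng, frac=0.2):
    """Relabel a random subset of nodes (assumed form of random permutation).

    Requires at least two nodes.
    """
    n = adj.shape[0]
    k = max(2, int(frac * n))
    idx = rng.choice(n, size=k, replace=False)
    perm = np.arange(n)
    perm[idx] = rng.permutation(idx)  # shuffle only the chosen nodes
    return adj[np.ix_(perm, perm)]

def counterfactual_intervention_view(adj, node, value=0.0):
    """do()-style surgery: overwrite the incoming edges of `node`.

    Assumes adj[i, j] is the edge i -> j, so incoming edges are column `node`.
    """
    out = adj.copy()
    out[:, node] = value
    return out

def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)  # numerical stability
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def mean_prediction_entropy(logits):
    """Average Shannon entropy of the softmax predictions (in nats)."""
    p = softmax(logits, axis=1)
    return float(-(p * np.log(p + 1e-12)).sum(axis=1).mean())

# Toy usage on a random 6-node graph and random two-class logits.
rng = np.random.default_rng(0)
adj = rng.random((6, 6))
views = [
    random_permutation_view(adj, rng, frac=0.5),
    counterfactual_intervention_view(adj, node=2),
]
logits = rng.standard_normal((4, 2))  # stand-in for classifier outputs
loss_entropy = mean_prediction_entropy(logits)
```

Minimizing `mean_prediction_entropy` on target-domain predictions pushes the classifier toward confident outputs. How this term is weighted against the augmentation-based objective is not specified in this overview, so the sketch leaves the two parts uncombined.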

Figure 1. Flowchart of the proposed source-free DA framework. The framework consists of two stages. 1) Source model pretraining: the transformer encoder is trained on labeled source domains to extract discriminative BFN features, with a scale-free prior that biases the attention distribution toward high-degree nodes. This stage includes (a) BFN construction, (b) feature extraction, and (c) classification; the encoder and classifier parameters are preserved and used to initialize the target-domain model. 2) Causal-augmented source-free DA: on the unlabeled target domain, the encoder learns latent representations that capture inter-regional interactions for causal graph construction. These graphs are perturbed through random permutation and counterfactual interventions to enhance robustness. The resulting augmentations, together with entropy minimization, jointly optimize the encoder and classifier and promote stable, site-invariant decision boundaries.

Results & Performance

The proposed framework was evaluated under a multi-site adaptation setting on two independent schizophrenia datasets, FBIRN and BSNIP. The method was compared against multiple transformer variants and domain adaptation baselines, with classification accuracy measured on the target site.

Figure 2. Classification performance in comparison with different transformer variants on FBIRN and BSNIP. The proposed scale-free transformer with causal-augmented source-free adaptation achieves the strongest overall performance.

Figure 3 presents the ablation study across target domains. The results show that the scale-free prior, causal graph construction, random permutation, counterfactual intervention, and entropy minimization each contribute positively to the final performance.

Figure 3. Ablation study across different target domains on FBIRN and BSNIP. Removing the scale-free prior, causal graph construction, random permutation, counterfactual intervention, or entropy minimization leads to consistent performance degradation.

Figure 4 shows parameter sensitivity across target sites. The method remains stable over a broad range of hyperparameter settings, indicating good robustness in practical adaptation scenarios.

Figure 4. Sensitivity analysis of key hyperparameters across target domains on FBIRN and BSNIP. The proposed method maintains stable accuracy over a broad parameter range.

On the two benchmark datasets, the proposed framework achieved 87.18%±0.91% accuracy on FBIRN and 88.39%±0.13% accuracy on BSNIP.

Citation

@article{ji2026causal,
  title={Causal-augmented Source-free Domain Adaptation with Scale-free Transformer for Schizophrenia Classification},
  author={Ji, Yixin and Calhoun, Vince D and Zhu, Qi and Xia, Zhengwang and Zhang, Jin and Li, Shengrong and Van Erp, Theo G M and Mathalon, Daniel H and Yeo, Si Yong and Qi, Shile and Zhang, Daoqiang},
  journal={IEEE Transactions on Neural Systems and Rehabilitation Engineering},
  year={2026},
  doi={10.1109/TNSRE.2026.3676767},
  publisher={IEEE}
}

Published in IEEE Transactions on Neural Systems and Rehabilitation Engineering, 2026

DOI: 10.1109/TNSRE.2026.3676767