Understanding the intrinsic patterns of human brain is important to make inferences about the mind and brain-behavior association. Electrophysiological methods (i.e. MEG/EEG) provide direct measures of neural activity without the effect of vascular confounds. The blood oxygenated level-dependent (BOLD) signal of functional MRI (fMRI) reveals the spatial and temporal brain activity across different brain regions. However, it is unclear how to associate the high temporal resolution Electrophysiological measures with high spatial resolution fMRI signals. Here, we present a novel interpretable model for coupling the structure and function activity of brain based on heterogeneous contrastive graph representation. The proposed method is able to link manifest variables of the brain (i.e. MEG, MRI, fMRI and behavior performance) and quantify the intrinsic coupling strength of different modal signals. The proposed method learns the heterogeneous node and graph representations by contrasting the structural and temporal views through the mind to multimodal brain data. The first experiment with 1200 subjects from Human connectome Project (HCP) shows that the proposed method outperforms the existing approaches in predicting individual gender and enabling the location of the importance of brain regions with sex difference. The second experiment associates the structure and temporal views between the low-level sensory regions and high-level cognitive ones. The experimental results demonstrate that the dependence of structural and temporal views varied spatially through different modal variants. The proposed method enables the heterogeneous biomarkers explanation for different brain measurements.