Abstract:
Self-supervised graph pre-training frameworks have proven effective for in-domain knowledge
transfer, where a model is typically pre-trained on massive unlabeled graph data to learn
general transferable knowledge before being fine-tuned for specific downstream tasks. However,
their capability to learn domain-invariant knowledge for cross-domain transfer remains
unknown. Moreover, how individual nodes are positioned within the entire
graph is largely overlooked. To bridge this gap, we propose the Graph Efficient (GrapE)
pre-training framework, which seamlessly integrates augmented graphs with complementary
positioning information to enhance domain-invariant knowledge learning in a self-supervised
manner for cross-domain graph transfer. First, to obtain a global positioning perspective,
we augment the original graph with a component graph, which reveals dual topological
structures at the node level and the component level. Global proximity estimates between
sets of nodes and components enrich the self-supervised signals and offer a rich
perspective on positioning individual nodes within the entire graph. Second, to alleviate
the tremendous computational burden of pre-training on massive graphs, GrapE adopts
a sequential training paradigm that continually accumulates transferable knowledge from a
limited number of sampled graph instances, improving data efficiency. In extensive experiments on four
benchmarks, GrapE achieves better data efficiency, generalization performance,
and transferability by a considerable margin, in both in-domain and cross-domain transfer
settings, across two fine-tuning tasks.