Few-Shot Graph Out-of-Distribution Detection with LLMs

Abstract

Existing methods for graph out-of-distribution (OOD) detection typically depend on training graph neural network (GNN) classifiers using a substantial amount of labeled in-distribution (ID) data. However, acquiring high-quality labeled nodes in text-attributed graphs (TAGs) is challenging and costly due to their complex textual and structural characteristics. Large language models (LLMs), known for their powerful zero-shot capabilities in textual tasks, show promise but struggle to naturally capture the critical structural information inherent to TAGs, limiting their direct effectiveness. To address these challenges, we propose LLM-GOOD, a general framework that effectively combines the strengths of LLMs and GNNs to enhance data efficiency in graph OOD detection. Specifically, we first leverage LLMs' strong zero-shot capabilities to filter out likely OOD nodes, significantly reducing the human annotation burden. To minimize the usage and cost of the LLM, we employ it only to annotate a small subset of unlabeled nodes. We then train a lightweight GNN filter using these noisy labels, enabling efficient predictions of ID status for all other unlabeled nodes by leveraging both textual and structural information. After obtaining node embeddings from the GNN filter, we can apply informativeness-based methods to select the most valuable nodes for precise human annotation. Finally, we train the target ID classifier using these accurately annotated ID nodes. Extensive experiments on four real-world TAG datasets demonstrate that LLM-GOOD significantly reduces human annotation costs and outperforms state-of-the-art baselines in terms of both ID classification accuracy and OOD detection performance.
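
The filtering and selection stages described above can be illustrated with a small sketch. This is not the LLM-GOOD implementation; it is a toy stand-in under stated assumptions. The LLM is replaced by a simulated annotator that is wrong 10% of the time, the "GNN filter" is a single mean-aggregation step followed by logistic regression, and the informativeness criterion is assumed to be greedy farthest-point (diversity) sampling over the embeddings. All function names, the synthetic data, and the 0.5 threshold are hypothetical. The final step, training the target ID classifier on the human-annotated nodes, is omitted.

```python
import numpy as np

def propagate(features, adj):
    """One mean-aggregation step over A + I (a stand-in for a GNN layer)."""
    a_hat = adj + np.eye(adj.shape[0])
    return (a_hat / a_hat.sum(axis=1, keepdims=True)) @ features

def train_id_filter(emb, labeled_idx, noisy_is_id, epochs=300, lr=0.5):
    """Logistic regression on propagated features, fit to noisy ID/OOD labels."""
    x, y = emb[labeled_idx], np.asarray(noisy_is_id, dtype=float)
    w, b = np.zeros(emb.shape[1]), 0.0
    for _ in range(epochs):
        p = 1.0 / (1.0 + np.exp(-(x @ w + b)))
        w -= lr * (x.T @ (p - y)) / len(y)
        b -= lr * float(np.mean(p - y))
    return 1.0 / (1.0 + np.exp(-(emb @ w + b)))  # P(node is ID), for all nodes

def farthest_point_selection(emb, candidates, budget):
    """Greedy diversity sampling (assumed informativeness criterion):
    repeatedly pick the candidate farthest from everything chosen so far."""
    candidates = list(candidates)
    if not candidates or budget <= 0:
        return []
    pts = emb[candidates]
    chosen_pos = [0]
    dist = np.linalg.norm(pts - pts[0], axis=1)
    dist[0] = -np.inf  # never pick the same candidate twice
    while len(chosen_pos) < min(budget, len(candidates)):
        nxt = int(np.argmax(dist))
        chosen_pos.append(nxt)
        dist = np.minimum(dist, np.linalg.norm(pts - pts[nxt], axis=1))
        dist[nxt] = -np.inf
    return [candidates[i] for i in chosen_pos]

def demo(seed=0, n=200, llm_budget=40, human_budget=10):
    rng = np.random.default_rng(seed)
    is_id = rng.random(n) < 0.7  # hidden ground truth (synthetic)
    features = rng.normal(np.where(is_id, 1.0, -1.0)[:, None], 1.0, (n, 8))
    # Homophilous random graph: edges mostly join nodes of the same kind.
    same = is_id[:, None] == is_id[None, :]
    upper = np.triu(rng.random((n, n)) < np.where(same, 0.05, 0.005), 1)
    adj = (upper | upper.T).astype(float)
    # Simulated LLM: labels a small subset only, with 10% label noise.
    llm_idx = rng.choice(n, llm_budget, replace=False)
    flip = rng.random(llm_budget) < 0.1
    noisy = np.where(flip, ~is_id[llm_idx], is_id[llm_idx])
    emb = propagate(features, adj)
    p_id = train_id_filter(emb, llm_idx, noisy)
    likely_id = [i for i in range(n) if p_id[i] >= 0.5]
    # Nodes returned here would be sent to human annotators.
    return p_id, farthest_point_selection(emb, likely_id, human_budget)
```

Calling `demo()` returns the filter's per-node ID probabilities and the indices of the nodes that would be queued for human labeling. Diversity sampling is only one possible reading of "informativeness-based"; an uncertainty-based criterion could be substituted in `farthest_point_selection` without changing the rest of the sketch.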