GraphHSCN: Heterogenized Spectral Cluster Network for Long Range Graph Data

1University of California San Diego

Abstract

Graph Neural Networks (GNNs) have gained tremendous popularity for their ability to learn effectively from graph-structured data, which is common in real-world applications. However, most of these models are based on the message-passing paradigm (interaction within a neighborhood of a few nodes) and can only handle local interactions within a graph. When such models are forced to use information from far-away nodes, they run into two major issues: over-smoothing and over-squashing. Architectures such as transformers and diffusion models have been introduced to solve this, yet these models have not been tested on large graph datasets containing graphs with large diameters. Although transformers are powerful, they require significant computational resources for both training and inference, which limits their scalability, particularly for graphs with long-range dependencies. Hence, this paper proposes GraphHSCN, a Heterogenized Spectral Cluster Network: a message-passing-based approach specifically designed to capture long-range interactions, i.e., cases where a prediction depends on representations of distant nodes interacting with each other.

Architecture

Our model is composed of the following components:

  • SignNet Positional Encoder
    • In this architecture we use SignNet to compute the positional encoding for each graph. We pre-compute the top 50 eigenvalues and corresponding eigenvectors of the Laplacian matrix L of each graph, then use the computed eigenvectors (an N×50 matrix) as the input to SignNet.
  • Spectral clustering
    • We trained a GNN model using PyTorch Geometric, with the loss defined as the sum of the minCUT loss and the orthogonality loss, to compute the cluster assignment for each graph. We trained on each graph for about 50 iterations, with an early-stopping threshold of 0.001 on the loss decrease. With the trained model we obtain the cluster assignments, which we use in the next step to create virtual nodes and construct the heterogeneous graph dataset.
  • Heterogeneous graph
    • After training and computing the clusters, we add one new virtual node to the original graph for each cluster: for K clusters, where K is an arbitrary number, we add K new virtual nodes. For each virtual node, we add M new edges between the virtual node and all the nodes in its cluster, where M is the number of nodes in that cluster. Once the virtual nodes are connected to the local nodes, we fully connect all the virtual nodes to each other. Because the edges from local nodes to local nodes, from local nodes to virtual nodes, and from virtual nodes to virtual nodes have different properties, we apply the idea of Heterogeneous Graph Neural Networks and use a different message-passing convolution layer for each edge type.
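The positional-encoding step above can be sketched as follows. This is an illustrative NumPy version (the function name and the dense eigendecomposition are our own choices; we take the eigenvectors with the smallest eigenvalues, the usual convention for Laplacian positional encodings):

```python
import numpy as np

def laplacian_pe(edge_index, num_nodes, k=50):
    """Hypothetical helper: k Laplacian eigenpairs used as SignNet input."""
    # Dense adjacency from an edge list of shape (2, E), symmetrized.
    A = np.zeros((num_nodes, num_nodes))
    A[edge_index[0], edge_index[1]] = 1.0
    A = np.maximum(A, A.T)
    deg = A.sum(axis=1)
    d = np.where(deg > 0, deg ** -0.5, 0.0)
    # Symmetric normalized Laplacian: L = I - D^{-1/2} A D^{-1/2}
    L = np.eye(num_nodes) - (d[:, None] * A) * d[None, :]
    eigvals, eigvecs = np.linalg.eigh(L)  # eigenvalues in ascending order
    return eigvals[:k], eigvecs[:, :k]    # (k,) and the (N x k) PE matrix
```

SignNet then consumes each eigenvector v together with its negation -v, which resolves the inherent sign ambiguity of eigenvectors.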
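The clustering objective above (sum of a minCUT term and an orthogonality term, as in MinCutPool) can be restated compactly. Below is a NumPy sketch of the two loss terms, assuming a soft assignment matrix S produced by the clustering GNN; in practice a library routine such as PyTorch Geometric's `dense_mincut_pool` computes these during training:

```python
import numpy as np

def mincut_losses(S, A):
    """S: (N, K) soft cluster assignments (rows sum to 1); A: (N, N) adjacency."""
    D = np.diag(A.sum(axis=1))
    # minCUT term: reward intra-cluster edge weight relative to degree volume
    cut = -np.trace(S.T @ A @ S) / np.trace(S.T @ D @ S)
    # orthogonality term: push assignments toward balanced, near-orthogonal clusters
    SS = S.T @ S
    K = S.shape[1]
    ortho = np.linalg.norm(SS / np.linalg.norm(SS) - np.eye(K) / np.sqrt(K))
    return cut, ortho
```

For a graph made of two disconnected cliques and a perfect two-cluster assignment, the cut term reaches its minimum of -1 and the orthogonality term is 0.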
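The virtual-node construction in the last step can be sketched as follows. The helper name and the three returned edge lists are our own illustrative layout: one list per edge type, so that each type can be routed to its own convolution layer (e.g. via a heterogeneous wrapper such as PyG's `HeteroConv`):

```python
import numpy as np

def build_hetero_edges(edge_index, cluster, num_nodes):
    """Given hard cluster labels (N,), add one virtual node per cluster and
    return one edge list per edge type:
      local->local   : the original edges
      local<->virtual: every node connected to its cluster's virtual node
      virtual<->virtual: the K virtual nodes fully connected (no self-loops)
    """
    K = int(cluster.max()) + 1
    virtual_ids = np.arange(num_nodes, num_nodes + K)  # indices of new nodes
    local_local = edge_index
    # node i <-> virtual node of its cluster, in both directions
    src = np.arange(num_nodes)
    dst = virtual_ids[cluster]
    local_virtual = np.stack([np.concatenate([src, dst]),
                              np.concatenate([dst, src])])
    # fully connect the K virtual nodes
    vi, vj = np.meshgrid(virtual_ids, virtual_ids)
    mask = vi != vj
    virtual_virtual = np.stack([vi[mask], vj[mask]])
    return local_local, local_virtual, virtual_virtual
```

Because every local node reaches every other local node through at most two virtual-node hops, long-range information needs only a constant number of message-passing steps.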



Results

Because of limited time and computational resources, we could not run the experiment on all the datasets. We therefore chose the Peptides-func and Peptides-struct datasets and a re-sampled citation dataset derived from Cora. We used Average Precision, Accuracy, and Mean Absolute Error, matching the metric suggested by each dataset's authors. Our baseline models are GCN, GAT, GIN, and SAN.

The models are tested on the following datasets:

  • Citation Dataset
  • Peptides Dataset

They are benchmarked against the following models:

  • GCN
  • GAT
  • GIN
  • SAN (not used on the citation dataset)

The results of our model are shown below. As the figures indicate, we achieve performance similar to SAN on the peptides datasets, which is better than the other common message-passing GNNs, while being far more efficient than SAN.

Citation Dataset Results.

Peptides Dataset Results.

Discussion

Since our model does not perform as well on the re-sampled citation dataset, we propose the following potential causes:

  • The Cora dataset may not originally contain very important long-range interactions; even after re-sampling, it is unclear whether the long-range interactions now present in the graphs are strong enough to substantially influence the predictions.
  • Cora consists of a single graph used for node-level tasks. While our model works well for graph-level tasks, the clustering method with fully connected virtual nodes might push the graph toward the other extreme of over-smoothing. Although we use a heterogeneous GNN to address this possible issue, it may still affect node-level prediction tasks.

BibTeX

@article{xxx,
  author    = {Tao, Sirui and Luo, Zhishang and Dunning, Camille},
  title     = {GraphHSCN: Heterogenized Spectral Cluster Network for Long Range Graph Data},
  journal   = {arxiv},
  year      = {2023},
}