Random Latent Clique Lifting (Graph to Simplicial)
TL;DR: We propose a lifting that ensures both 1) the small-world property and 2) edge/cell sparsity. Combining these two properties is very attractive for Topological Deep Learning (TDL) because it ensures computational efficiency: the number of higher-order connections stays small, yet only a few message-passing layers are needed to connect any two nodes.
Background. A graph is sparse if its number of edges grows proportionally to the number of nodes. Many real-world graphs are sparse, yet they contain many densely connected subgraphs and exhibit high clustering coefficients. Moreover, such graphs frequently exhibit the small-world property: any two nodes are connected by a short path whose length is proportional to the logarithm of the number of nodes. These are well-known properties of social networks, biological networks, and the Internet, for instance.
Contributions. In this notebook, we present a novel random lifting procedure from graphs to simplicial complexes. The procedure is based on a relatively recently proposed Bayesian nonparametric random graph model for random clique covers (Williamson & Tec, 2020) [1]. Specifically, the model can learn latent clique complexes that are consistent with the input graph. It can capture power-law degree distributions, global sparsity, and a non-vanishing local clustering coefficient. The small-world property is also guaranteed, which is very attractive for Topological Deep Learning (TDL).
In the original work [1], this distribution was used as a prior on an observed input graph. In the Bayesian setting, the model yields a distribution over latent clique complexes, i.e. a specific class of simplicial complexes, whose 1-skeleton structural properties are consistent with those of the input graph used to compute the likelihood. Indeed, one feature of the posterior distribution from which the latent complex is sampled is that the set of latent 1-simplices (edges) is a superset of the set of edges of the input graph.
In the context of Topological Deep Learning [2][3] and the recently emerged paradigm of Latent Topology Inference (LTI) [4], it is natural to view the model in [1] as a novel LTI method able to infer a random latent simplicial complex from an input graph; in other words, to use [1] as a novel random lifting procedure from graphs to simplicial complexes.
Next, we provide a quick introduction to the model in [1]. For a more in-depth exposition, please refer to the paper. To the best of our knowledge, this is the first random lifting procedure relying on Bayesian arguments.
To summarize, this lifting is:
- non-deterministic,
- not present in the literature as a lifting procedure,
- based on connectivity,
- modifying the initial connectivity of the graph by adding edges (thus it can also be considered a graph rewiring method); see the sketch after this list.
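As a reference point, here is a minimal sketch of the clique-cover-to-complex step, assuming a cover is already in hand. In the actual lifting the cover would be sampled from the RCC posterior of [1] and may contain latent edges absent from the input graph, so the deterministic maximal-clique cover below is purely a stand-in, and `cover_to_simplices` is an illustrative helper rather than the repository's API.

```python
from itertools import combinations

import networkx as nx


def cover_to_simplices(cliques, max_dim=2):
    """Simplicial complex induced by a clique cover: every subset of a
    clique is fully connected, so each subset of size <= max_dim + 1
    becomes a simplex."""
    simplices = set()
    for clique in cliques:
        nodes = sorted(clique)
        for size in range(1, min(len(nodes), max_dim + 1) + 1):
            simplices.update(combinations(nodes, size))
    return simplices


# Stand-in usage: deterministic maximal cliques instead of an RCC posterior sample.
G = nx.karate_club_graph()
simplices = cover_to_simplices(nx.find_cliques(G), max_dim=2)
```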
Let $Z_1, Z_2, \dots$ denote the cliques of a clique cover, encoded as binary indicators with $Z_{k,i} = 1$ if node $i$ belongs to clique $k$, and let $N$ denote the number of nodes observed so far. First, recall that a clique is a fully connected subset of vertices. Therefore, a clique cover of a graph is a collection of cliques such that every edge is contained in at least one of them. Conditional on the first $K-1$ cliques, the Random Clique Cover (RCC) model generates the next clique $Z_K$ as follows:
- $Z_K$ will contain new, previously unobserved nodes, whose number follows a Poisson distribution:

$$Z_K \mid Z_1, Z_2, \dots, Z_{K-1} \sim \text{Poisson}\left(\alpha \, \frac{\Gamma(1 + c)\,\Gamma(N + c + \sigma - 1)}{\Gamma(N + \sigma)\,\Gamma(c + \sigma)}\right)$$

- The probability that a previously observed node $i$ will belong to $Z_K$ is proportional to how many cliques it is already in. Specifically, letting $m_i = \sum_{k=1}^{K-1} Z_{k,i}$, then

$$P(Z_{K,i} = 1 \mid Z_1, Z_2, \dots, Z_{K-1}) = \frac{m_i + \sigma}{K + c - 1}.$$
The last expression is highly intuitive: the number of cliques a node will appear in is proportional to the number of cliques it is already in, a rich-get-richer dynamic akin to the Indian Buffet Process with power-law behavior [5].
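To make the generative process concrete, here is a minimal forward-sampling sketch of the two rules above. It is illustrative only: the function name, the log-space evaluation of the Poisson rate, and the reading of $N$ as the number of nodes revealed so far are our assumptions, not the inference code of [1].

```python
import numpy as np
from scipy.special import gammaln


def sample_clique_memberships(K, alpha, sigma, c, rng=None):
    """Forward-sample K cliques from the RCC generative rules above."""
    rng = rng if rng is not None else np.random.default_rng()
    m = []        # m[i] = number of cliques node i already belongs to
    cliques = []
    for k in range(1, K + 1):
        clique = set()
        # Rule 2: a seen node i joins with probability (m_i + sigma) / (k + c - 1).
        for i, m_i in enumerate(m):
            if rng.random() < (m_i + sigma) / (k + c - 1):
                clique.add(i)
        # Rule 1: the number of brand-new nodes is Poisson with the rate above,
        # computed in log-space for numerical stability (N = nodes seen so far).
        N = len(m)
        log_rate = (np.log(alpha) + gammaln(1 + c) + gammaln(N + c + sigma - 1)
                    - gammaln(N + sigma) - gammaln(c + sigma))
        n_new = rng.poisson(np.exp(log_rate))
        clique.update(range(N, N + n_new))
        m.extend([0] * n_new)
        for i in clique:
            m[i] += 1
        cliques.append(sorted(clique))
    return cliques


# Toy draw; hyperparameters are arbitrary, chosen so all Gamma arguments are positive.
cliques = sample_clique_memberships(K=20, alpha=3.0, sigma=0.5, c=1.0)
```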
The RCC model depends on four parameters: the number of cliques $K$ and the hyperparameters $\alpha$, $\sigma$, and $c$ appearing in the expressions above, which can be inferred from the input graph.
Importantly, by allowing latent inferred edges (edges present in the sampled clique cover but absent from the input graph), the lifting superimposes the small-world property on the input graph.
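As a quick sanity check of the small-world claim, one can compare the average shortest-path length of the lifted complex's 1-skeleton against $\log N$. The sketch below assumes the 1-skeleton is available as a `networkx` graph; `looks_small_world` and its threshold `tol` are illustrative choices, not part of the method.

```python
import math

import networkx as nx


def looks_small_world(skeleton: nx.Graph, tol: float = 2.0) -> bool:
    """Heuristic: the average shortest-path length of the largest connected
    component should be on the order of log(number of nodes)."""
    giant = skeleton.subgraph(max(nx.connected_components(skeleton), key=len))
    return nx.average_shortest_path_length(giant) <= tol * math.log(giant.number_of_nodes())
```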
[1] Williamson, Sinead A., and Mauricio Tec. "Random clique covers for graphs with local density and global sparsity." Uncertainty in Artificial Intelligence (UAI). PMLR, 2020.
[2] Papamarkou, Theodore, et al. "Position paper: Challenges and opportunities in topological deep learning." arXiv preprint arXiv:2402.08871 (2024).
[3] Hajij, Mustafa, et al. "Topological deep learning: Going beyond graph data." arXiv preprint arXiv:2206.00606 (2022).
[4] Battiloro, Claudio, et al. "From latent graph to latent topology inference: Differentiable cell complex module." The Twelfth International Conference on Learning Representations (ICLR), 2024.
[5] Teh, Yee Whye, and Dilan Görür. "Indian Buffet Processes with Power-law Behavior." Advances in Neural Information Processing Systems, 2009.
From https://github.com/pyt-team/challenge-icml-2024/pull/63