Commit ccbb921

doc: update README.md for GED modules.
1 parent e505c68 commit ccbb921

1 file changed

+79
-1
lines changed

README.md

Lines changed: 79 additions & 1 deletion
@@ -94,13 +94,91 @@ A demo of computing graph kernels can be found on [Google Colab](https://colab.r

### 2 Graph Edit Distances

We currently provide a `GEDModel` class that is compatible with the `scikit-learn` transformer interface
and computes the graph edit distance (GED) between attributed graphs.
The `GEDModel` class is built on our extended version of the [`GEDLIB`](https://github.com/dbblumenthal/gedlib) library. See Section
[GEDLIB](#4-interface-to-gedlib) for more details.
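
To illustrate what "compatible with the `scikit-learn` transformer interface" means in practice, here is a minimal, dependency-free sketch of the `fit` / `transform` / `fit_transform` convention applied to pairwise graph distances. Everything in it is an assumption made for illustration: `ToyDistanceTransformer`, `label_count_distance`, and the dictionary-based graph format are hypothetical and are not the `GEDModel` API, and the distance is a crude label-counting measure rather than any of the GED methods listed in this section.

```python
from collections import Counter


def label_count_distance(g1, g2):
    """Toy distance: number of unmatched node labels plus unmatched edge labels."""
    total = 0
    for key in ("nodes", "edges"):
        a = Counter(g1[key].values())
        b = Counter(g2[key].values())
        common = sum((a & b).values())  # size of the multiset intersection
        total += max(sum(a.values()), sum(b.values())) - common
    return total


class ToyDistanceTransformer:
    """Hypothetical stand-in that follows the fit/transform convention."""

    def fit(self, graphs, y=None):
        # Remember the reference graphs that later inputs are compared against.
        self.graphs_ = list(graphs)
        return self

    def transform(self, graphs):
        # One row per input graph, one column per reference graph.
        return [
            [label_count_distance(g, ref) for ref in self.graphs_]
            for g in graphs
        ]

    def fit_transform(self, graphs, y=None):
        return self.fit(graphs).transform(graphs)


# Hypothetical graph format: node id -> label, (u, v) -> edge label.
g1 = {"nodes": {0: "C", 1: "O"}, "edges": {(0, 1): "single"}}
g2 = {
    "nodes": {0: "C", 1: "N", 2: "O"},
    "edges": {(0, 1): "single", (1, 2): "double"},
}

distances = ToyDistanceTransformer().fit_transform([g1, g2])
print(distances)
```

The point of the convention is that the resulting distance matrix can be passed to downstream estimators that accept precomputed distances; consult the `GEDModel` source for its actual constructor arguments and return types.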

#### The following GED methods are supported:

- BRANCH
- BRANCH_FAST
- BRANCH_TIGHT
- BRANCH_UNIFORM
- BRANCH_COMPACT
- PARTITION
- HYBRID
- RING
- ANCHOR_AWARE_GED
- WALKS
- IPFP
- BIPARTITE
- SUBGRAPH
- NODE
- RING_ML
- BIPARTITE_ML
- REFINE
- BP_BEAM
- SIMULATED_ANNEALING
- HED
- STAR

With `GUROBI` installed, the following methods are also available:

- F1
- F2
- COMPACT_MIP
- BLP_NO_EDGE_LABELS

#### The following GED cost functions are supported:

- CHEM_1
- CHEM_2
- CMU
- GREC_1
- GREC_2
- PROTEIN
- FINGERPRINT
- LETTER
- LETTER2
  - Similar to `LETTER`, but uses 6 cost constants instead of 3. See details [here](https://github.com/jajupmochi/gedlib/blob/master/src/edit_costs/letter_2.hpp).
- NON_SYMBOLIC
  - Edit costs for graphs containing only non-symbolic (numeric) node and edge
    labels. Relabeling (substitution) costs are computed from these labels,
    e.g., using the Euclidean distance. See details [here](https://github.com/jajupmochi/gedlib/blob/master/src/edit_costs/non_symbolic.hpp#L35).
- GEOMETRIC
  - Edit costs for graphs with mixed node and edge attributes, e.g., symbolic (string) and non-symbolic (numeric) ones.
    Users can choose the (dis-)similarity measure for each label type, e.g.,
    `cosine_distance` for numeric vectors. See details [here](https://github.com/jajupmochi/gedlib/blob/master/src/edit_costs/geometric.hpp#L42).
- CONSTANT

Detailed documentation of `GEDLIB` can be found [here](https://dbblumenthal.github.io/gedlib/index.html).
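
As a rough illustration of how a relabeling (substitution) cost can be derived from numeric labels, as described for `NON_SYMBOLIC` and `GEOMETRIC` above, the following sketch computes Euclidean and cosine distances between two label vectors. The functions here are plain-Python stand-ins written for this example; `relabel_cost` and its `weight` parameter are hypothetical and do not reproduce the cost constants or code of the linked `GEDLIB` headers.

```python
import math


def euclidean_distance(u, v):
    """Euclidean distance between two equal-length numeric vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def cosine_distance(u, v):
    """One minus the cosine similarity; undefined for zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return 1.0 - dot / (norm_u * norm_v)


def relabel_cost(label_1, label_2, distance=euclidean_distance, weight=1.0):
    """Hypothetical substitution cost: a weighted distance between labels."""
    return weight * distance(label_1, label_2)


node_a = [1.0, 0.0]
node_b = [0.0, 1.0]

euclidean_cost = relabel_cost(node_a, node_b)
cosine_cost = relabel_cost(node_a, node_b, distance=cosine_distance)
print(euclidean_cost, cosine_cost)
```

Swapping the `distance` argument mirrors the idea of choosing a (dis-)similarity measure per label type; identical labels yield a cost of zero under both measures.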

### 3 Graph preimage methods

A demo of generating graph preimages can be found on [Google Colab](https://colab.research.google.com/drive/1PIDvHOcmiLEQ5Np3bgBDdu0kLOquOMQK?usp=sharing) and in the [`examples`](https://github.com/jajupmochi/graphkit-learn/blob/master/gklearn/examples/median_preimege_generator.py) folder.

### 4 Interface to `GEDLIB`

[`GEDLIB`](https://github.com/dbblumenthal/gedlib) is an easily extensible C++ library for (suboptimally) computing the
graph edit distance between attributed graphs. [A Python interface](https://github.com/jajupmochi/graphkit-learn/tree/master/gklearn/gedlib) for `GEDLIB` is
integrated in this library, based on the [`gedlibpy`](https://github.com/Ryurin/gedlibpy) library. We also extended
`GEDLIB`, adding the following features:

- Support for attributed graphs with the following node and edge label types:
  - strings, integers, floats, and lists / `numpy` arrays of floats and integers. An arbitrary
    number of features can be added.

- Fast vectorized computation between labels using `Eigen` (e.g., cosine or
  Euclidean distances).
  - To benefit from this, we recommend merging numeric labels into
    a single label stored as a `numpy` array.

- Support for the following GED cost functions:
  - `LETTER2`, `NON_SYMBOLIC`, `GEOMETRIC`.
  - See Section [GED](#2-graph-edit-distances) for more details.

- Use of modern C++17 features, such as `std::optional`, `std::variant`, and `std::any`.
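
The recommendation above to merge numeric labels into a single label can be sketched as follows. This is only an illustration under stated assumptions: the attribute names (`symbol`, `x`, `y`, `charge`), the merged key `features`, and the helper `merge_numeric_labels` are hypothetical and not part of the library. The merged vector is built as a plain list to keep the sketch dependency-free; wrapping it with `numpy.asarray` would give the array form recommended above.

```python
def merge_numeric_labels(attrs, numeric_keys, merged_key="features"):
    """Return a copy of attrs with the numeric entries packed into one vector."""
    # Keep every non-numeric (symbolic) attribute unchanged.
    merged = {k: v for k, v in attrs.items() if k not in numeric_keys}
    # Pack the numeric attributes, in the given order, into a single label.
    merged[merged_key] = [float(attrs[k]) for k in numeric_keys]
    return merged


# Hypothetical node attributes: one symbolic label and three numeric ones.
node = {"symbol": "C", "x": 0.5, "y": 1.25, "charge": -1}

merged_node = merge_numeric_labels(node, ["x", "y", "charge"])
print(merged_node)
```

With one vector per node or edge, a distance such as the Euclidean or cosine distance is evaluated once over the whole vector instead of once per scalar label, which is what makes a vectorized backend worthwhile.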

### 5 Computation optimization methods
