OrbitSI is a NetworkX-based Python package that efficiently solves the subgraph isomorphism enumeration problem, i.e., finding all subgraphs in a large data graph Gd that are isomorphic to a pattern/query graph Gp. It also supports fast vertex orbit counting, which summarises the local structural roles of vertices within a network.
OrbitSI combines orbit-based filtering and ordering to prune and order candidate node mappings, significantly reducing the search space during subgraph matching. It integrates two state-of-the-art orbit counting engines — EVOKE and ORCA — via optimised C++ bindings using pybind11
, and builds the search pipeline using networkx
.
The package includes a command-line interface (CLI), usable library, and testing utilities. It is installable via pip
, tested on synthetic and real-world benchmark datasets, and fully open source under the Apache 2.0 license.
pip install orbitsi
- Clone the repository:
git clone https://hpdc-gitlab.eeecs.qub.ac.uk/sitauhidi/orbitsi-nx.git
cd orbitsi-nx
- Build and install:
pip install .
Requires Python 3.8+, NumPy, NetworkX, and a C++17 compiler (e.g., g++ or clang++).
Once installed, the orbitsi
CLI tool allows you to perform:
- Subgraph Isomorphism Search
- Node Orbit Counting
To confirm installation:
orbitsi --help
Find all subgraphs in a large graph Gd that are isomorphic to a smaller query/pattern graph Gp.
orbitsi search --data <path_to_data_graph> --pattern <path_to_pattern_graph> [--orbit-counter evoke|orca] [--graphlet-size 4|5]
--data
: Path to the data graph file (required)--pattern
: Path to the pattern/query graph file (required)--orbit-counter
: Orbit counting backend, eitherevoke
(default) ororca
--graphlet-size
: Graphlet size to use for orbit counting (4
or5
, default:4
)
orbitsi search --data datasets/data_graph/HPRD.graph --pattern datasets/query_graph/query1.graph --orbit-counter evoke --graphlet-size 4
Compute node orbit vectors for a given graph using either the EVOKE or ORCA backend.
orbitsi count-orbits --graph <path_to_graph> [--orbit-counter evoke|orca] [--graphlet-size 4|5] [--induced]
--graph
: Path to the graph file (required)--orbit-counter
: Orbit counting backend, eitherevoke
(default) ororca
--graphlet-size
: Size of graphlets to consider (4
or5
)--induced
: Flag to compute induced orbit counts (optional; default is non-induced)
orbitsi count-orbits --graph datasets/data_graph/HPRD.graph --orbit-counter orca --graphlet-size 5 --induced
All graphs should be in .graph
format:
t N M
v 0 1 2
v 1 2 1
...
e 0 1
e 1 2
...
t N M
: number of nodesN
and edgesM
v ID Label Degree
: vertex ID, label (label is required), degree (optional)e u v
: edge between vertexu
andv
A complete suite of unit and integration tests is provided in the tests/
directory. This includes:
- Benchmark-based validation for subgraph isomorphism
- Cross-validation of orbit counting with external tools
To get started, see tests/README.md
for detailed setup and usage instructions.
If you use OrbitSI for subgraph isomorphism search, please cite:
Tauhidi, Syed Ibtisam, Arindam Karmakar, Thai Son Mai, and Hans Vandierendonck. "OrbitSI: An Orbit-based Algorithm for the Subgraph Isomorphism Search Problem." In 2024 IEEE International Conference on Knowledge Graph (ICKG), pp. 360-369. IEEE, 2024. https://doi.org/10.1109/ICKG63256.2024.00052
If you only use the orbit counting modules (ORCAOrbitCounter
, EVOKEOrbitCounter
), cite:
-
EVOKE
: Noujan Pashanasangi and C. Seshadhri. "Efficiently counting vertex orbits of all 5-vertex subgraphs, by EVOKE.", WSDM 2020: Proceedings of the 13th International Conference on Web Search and Data Mining, pp. 447–455. https://doi.org/10.1145/3336191.3371773 -
ORCA
: Tomaž Hočevar and Janez Demšar. "A combinatorial approach to graphlet counting." Bioinformatics, 30(4): 559–565, 2014. https://doi.org/10.1093/bioinformatics/btt717
This work was supported by:
- Kelvin Living Lab [grant number EP/Z531054/1]
- Ministry of Education, Government of India through the collaboration between Queen's University Belfast and Tezpur University.
This project is licensed under the Apache License 2.0.
See the LICENSE file for details.