My problem
I will give more context here.
I have a graph with relatively few edges (let's say that the number of edges E is about 10 x N, where N is the number of nodes).
Note: I say "relatively few" because a more densely connected graph can have upwards of N x N edges.
From that graph, I want to know:
If it has any loops,
What the smallest loop length is (if it has any), and
Which nodes loop back to themselves at that loop length.
To do this, the fastest solution I've been able to think of is to use an adjacency matrix A, where vectors are bitfields and their dot product looks like this:
Sidenote: Dot Product for when Adjacency Matrices are implemented as bitfields
```
// Bitfield dot product v x u is defined as:
w  = v & u;                       // w is the bitwise AND of the two vectors
w' = 0 if w == (0....0); else 1;  // w' is 0 if the resulting vector is 0, and 1 otherwise
```
Basically it's a normal dot product, only that scalar results greater than 1 get clamped to 1.
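As a minimal Rust sketch of this clamped dot product, assuming rows are stored as vectors of u64 blocks (the `Row` type, block layout, and `bit_dot` name are my choices for illustration, not fixed above):

```rust
/// One row of the adjacency matrix, stored as a bitfield over u64 blocks.
type Row = Vec<u64>;

/// Clamped "dot product" of two bitfield rows: bitwise AND the rows,
/// then reduce to 1 if any bit survives and 0 otherwise.
fn bit_dot(v: &Row, u: &Row) -> u8 {
    if v.iter().zip(u).any(|(a, b)| a & b != 0) { 1 } else { 0 }
}

fn main() {
    let v: Row = vec![0b1010];
    let u: Row = vec![0b0010];
    let w: Row = vec![0b0101];
    assert_eq!(bit_dot(&v, &u), 1); // rows share bit 1
    assert_eq!(bit_dot(&v, &w), 0); // no common bits: clamps to 0
}
```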
My solution draft
First requirement: finding out if the graph has loops
Using the adjacency matrix, I can quickly check whether the graph contains loops. We know from algebraic graph theory:
The powers of an adjacency matrix A reach zero if and only if the graph is acyclic
And the largest power that matters for such an adjacency matrix is N, where N is the number of nodes in the graph. That is, if A is an NxN matrix, then:
A represents an acyclic graph if and only if A^N = 0
We can reach the Nth power in log N operations, since we can square our result repeatedly:
Squaring log N times gives ((...(A^2)^2...)^2 = A^(2^ceil(log N)), whose exponent is at least N, so we reach the Nth power (or beyond) in log N matrix multiplications. Overshooting N is harmless: if the graph is acyclic, then A^m = 0 for every m >= N.
If the result is 0 at any step, we exit early, since we know there are no loops.
If the result is not 0 by the end, we know there are loops.
This step then has a complexity of O(N^3) for each matrix multiplication, for a total time cost of O(N^3 log N).
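For concreteness, here is a sketch of that squaring loop over a dense 0/1 matrix rather than the bitfield rows, to keep the logic visible; `Mat`, `bool_matmul`, and `is_acyclic` are names I made up for this illustration:

```rust
// Sketch only: dense 0/1 matrix representation of the adjacency matrix.
type Mat = Vec<Vec<u8>>;

// Boolean matrix product: c[i][j] = 1 iff a[i][k] and b[k][j] for some k.
fn bool_matmul(a: &Mat, b: &Mat) -> Mat {
    let n = a.len();
    let mut c = vec![vec![0u8; n]; n];
    for i in 0..n {
        for k in 0..n {
            if a[i][k] != 0 {
                for j in 0..n {
                    c[i][j] |= b[k][j];
                }
            }
        }
    }
    c
}

fn is_zero(a: &Mat) -> bool {
    a.iter().all(|row| row.iter().all(|&x| x == 0))
}

// Square A repeatedly; the graph is acyclic iff some power of two up to
// at least N gives the zero matrix (acyclic implies A^m = 0 for m >= N).
fn is_acyclic(a: &Mat) -> bool {
    let n = a.len();
    let mut p = a.clone();
    let mut step = 0u32;
    while step <= n.next_power_of_two().trailing_zeros() {
        if is_zero(&p) {
            return true; // early exit: no walks of this length exist
        }
        p = bool_matmul(&p, &p);
        step += 1;
    }
    false
}

fn main() {
    let cycle: Mat = vec![vec![0,1,0], vec![0,0,1], vec![1,0,0]]; // 0->1->2->0
    let path:  Mat = vec![vec![0,1,0], vec![0,0,1], vec![0,0,0]]; // 0->1->2
    assert!(!is_acyclic(&cycle));
    assert!(is_acyclic(&path));
}
```

A production version would of course use the bitfield rows so each inner loop becomes a word-wide AND, but the control flow is the same.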
Second requirement: finding out the smallest loop length
Optimizing the first requirement
A shortcut we can take for when there are loops in the graph, is the following fact:
If A^k has a 1 in a diagonal element at index [i, i], then the graph has a closed walk of exactly k steps starting and ending at node i, and therefore a loop of at most k steps through node i.
So in our prior requirement, if we check at every step for the presence of a non-zero diagonal element, we will know there are loops in the graph and can continue on.
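The per-step diagonal check itself is cheap. A sketch, assuming the current power of A is held as a dense 0/1 matrix (the name `has_diagonal_hit` is mine):

```rust
// Assumes the current power of A is held as a dense 0/1 matrix.
type Mat = Vec<Vec<u8>>;

// True if some diagonal entry of this power is set, i.e. some node reaches
// itself in exactly that many steps -- a witness that the graph has a loop.
fn has_diagonal_hit(a: &Mat) -> bool {
    (0..a.len()).any(|i| a[i][i] != 0)
}

fn main() {
    // For the 2-cycle 0<->1, A^2 is the identity: both diagonal entries hit.
    let a2: Mat = vec![vec![1, 0], vec![0, 1]];
    assert!(has_diagonal_hit(&a2));
    // A itself (no self-loops) has an all-zero diagonal.
    let a: Mat = vec![vec![0, 1], vec![1, 0]];
    assert!(!has_diagonal_hit(&a));
}
```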
Actually computing the second requirement
Knowing that some power k of A gives rise to loops in the graph, we can find out the smallest k by doing binary search over the exponents.
Finding out this special k, from now on k*, will take O(log N) matrix multiplications, each one of them costing O(N^3), for a total cost of O(N^3 log N).
Third requirement
After calculating A^k*, its non-zero diagonal elements mark the nodes that loop back to themselves in k* edges.
Since this step reuses the result of the second step, we only incur a cost of O(N) for reading the diagonal elements.
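Reading the answer off is then a single pass over the diagonal of A^k*. A sketch, assuming that power is held as a dense 0/1 matrix (`looping_nodes` is my name for it):

```rust
// Assumes A^k* is held as a dense 0/1 matrix.
type Mat = Vec<Vec<u8>>;

// The indices i with (A^k*)[i][i] != 0: nodes that lie on a closed walk
// of length exactly k*. A single O(N) scan of the diagonal.
fn looping_nodes(a_pow_kstar: &Mat) -> Vec<usize> {
    (0..a_pow_kstar.len())
        .filter(|&i| a_pow_kstar[i][i] != 0)
        .collect()
}

fn main() {
    // For the 3-cycle 0->1->2->0, A^3 is the identity: all nodes loop in 3 steps.
    let a3: Mat = vec![vec![1,0,0], vec![0,1,0], vec![0,0,1]];
    assert_eq!(looping_nodes(&a3), vec![0, 1, 2]);
}
```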
Total cost and better alternatives?
Sparse adjacency matrices
There is something to be said about using sparse matrices. However, sparse adjacency matrices quickly become dense when we start computing their powers. So sparse matrices are out of the question.
Floyd-Warshall
One could argue that the Floyd-Warshall algorithm fares better, since it only costs O(N^3). However, my algorithm lets me exit early if there are no loops or if there are only short loops, which is the most common scenario I have in mind. And I'd like that scenario to be as lightweight as possible.
Conclusion
What do you think?
Is my algorithm insane? X'D
Should I be using nalgebra for this? Or do you recommend I use something else?