-
Notifications
You must be signed in to change notification settings - Fork 2
/
Copy pathdesign-decisions.tex
198 lines (154 loc) · 9.41 KB
/
design-decisions.tex
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
\section{Design Decisions}
\label{sec:decisions}
LAGraph is for users who want to use graph algorithms that run on top of the GraphBLAS.
Our overarching design goal is ease of use with flexibility to handle advanced use-cases.
We do not wish to compromise performance, but when the tradeoff between convenience and performance
is unavoidable, we offer both and let the user choose.
LAGraph includes a set of data structures and utility functions
that make it convenient for developers to write algorithms on top of GraphBLAS with
an approachable API and consistent user experience.
\begin{listing}
\begin{cplus}
typedef struct LAGraph_Graph_struct
{
GrB_Matrix A; // adjacency matrix of the graph
LAGraph_Kind kind; // kind of graph: directed, etc.
// cached properties
GrB_Matrix AT; // transpose of A
GrB_Vector row_degree;
GrB_Vector col_degree;
LAGraph_BooleanProperty A_pattern_is_symmetric;
int64_t ndiag; // -1 if unknown
} *LAGraph_Graph;
typedef struct LAGraph_Graph_struct *LAGraph_Graph; |$\label{line:LAGraph_Graph}$|
// creating a graph
GrB_Matrix M;
// ...construction of M omitted
LAGraph_Graph G;
LAGraph_New(&G, &M, LAGRAPH_DIRECTED_ADJACENCY); |$\label{line:CreateGraphObject}$|
// operating on properties
LAGraph_Property_AT(G, msg); |$\label{line:ComputeTranspose}$|// compute/cache
\end{cplus}
\caption{{\tt LAGraph\_Graph} data structure and methods.}
\label{Lst.graph}
\end{listing}
\subsection{Core data structure}
The main data structure in LAGraph is the \verb'LAGraph_Graph' which consists of primary components
and cached properties. The data structure is not opaque, providing the user with full ability to access and
modify all internal components. This contrasts with the opaque objects in the GraphBLAS. This data structure is
shown at the top of Listing~\ref{Lst.graph} and defined ultimately on Line~\ref{line:LAGraph_Graph}.
The primary components of this struct are a GraphBLAS matrix named \verb'A' and an enumeration \verb'kind'.
The kind indicates how the matrix should be interpreted.
Currently, the only kinds defined are \verb'LAGRAPH_ADJACENCY_UNDIRECTED' and
\verb'LAGRAPH_ADJACENCY_DIRECTED', but more options will be added in the future. Creating the
Graph object is performed on Line~\ref{line:CreateGraphObject} of Listing~\ref{Lst.graph}.
Following this call, \verb'M' will be \verb'NULL'. The matrix previously pointed to by \verb'M' now
lives at \verb'G->A'. This ``move'' constructor helps avoid memory-freeing errors.
Cached properties include the transpose of \verb'A', the row degrees, column degrees, etc.
They can be computed from the primary components, but doing so repeatedly for each algorithm
utilizing \verb'A' would be wasteful. Having them live inside the Graph object simplifies
algorithm call signatures. Utility functions exist to compute each cached property. For example,
Line~\ref{line:ComputeTranspose} of Listing~\ref{Lst.graph} will compute the transpose of \verb'G->A' and store it as \verb'G->AT'.
Following this call, any algorithm which is given \verb'G' will have access to both \verb'A' and its transpose.
Because the Graph object is not opaque, any piece of code may set the transpose as well. For instance, if an algorithm
computes the transpose as part of its normal logic, it could directly set \verb'G->AT'.
The expectation is that the Graph object will always remain consistent.
If \verb'G->A' is modified, all cached properties must be either be set as unknown or modified to reflect the change.
Properties which are not known are set to \verb'NULL' or \verb'LAGRAPH_BOOLEAN_UNKNOWN' in the case of boolean properties.
This expectation is a convention that all LAGraph algorithm implementers are expected to follow.
\subsection{User modes}
Algorithms in LAGraph target two user modes: Basic and Advanced. The Basic user mode is for those who want
things to ``just work'', are less concerned about performance, and may be less experienced with graph libraries.
The Advanced user mode is for those whose primary concern is performance and are willing to conform to stricter
requirements to achieve that goal.
Algorithms targeting the Basic mode typically have limited options. Often, there will only be one function for
a given algorithm. Under the hood, that single algorithm might take different paths depending on the shape or
size of the input graph. The idea is that a basic user wants to compute PageRank or Betweenness Centrality,
but does not want to have to understand the five different ways to compute them. They simply want the correct answer.
Algorithms targeting the Advanced mode are often highly specialized implementations of an algorithm. The Advanced
mode user is expected to understand details such as push-pull~\cite{DBLP:conf/icpp/YangBO18} and batch mode and why different techniques are
better for each graph. Advanced mode algorithms are very strict in their input. If the input does not match the
expected kind, an error will be raised.
Advanced mode algorithms will also raise an error if a cached property is needed by an algorithm, but is not
currently available on the Graph object. While Basic mode algorithms are free to compute and cache properties
on the Graph object, Advanced mode algorithms never will. The idea is to never surprise the user with unexpected
additional computation. An Advanced mode user must opt-in to all computations.
Often, Basic mode algorithms will inspect the input, possibly compute properties or transform the data,
and finally call one of the Advanced mode algorithms to do the actual work on the graph. Having these two user
modes allows LAGraph to target a wider range of users who vary in their experience with graph algorithms.
\subsection{Algorithm calling conventions}
Algorithms in LAGraph follow a general calling convention.
\begin{cplus}
int algorithm
(
// outputs
TYPE *out1,
TYPE *out2,
...
// input/output
TYPE inout,
...
// inputs
TYPE input1,
TYPE input2,
...
// error message holder
char *msg
)
\end{cplus}
The return value is always an int with the following meaning:
\begin{itemize}
\item \verb'=0 ->' success
\item \verb'<0 ->' error
\item \verb'>0 ->' warning
\end{itemize}
The meaning of a given error or warning value is algorithm-specific
and should be listed in the documentation for the algorithm.
We distinguish three types of arguments:
\begin{itemize}
\item
Outputs appear first and are passed by reference. A pointer should be created by the caller, but
memory will be allocated by the algorithm. If the output is not needed, a \verb'NULL' is passed and the
algorithm will not return that output.
\item Input/Output arguments are passed by value. The expectation is that the object will be modified.
This supports features such as batch mode in which a frontier is updated and returned to the caller.
It also supports Basic mode algorithms which may modify a Graph object by adding cached properties.
\item Inputs are passed by value and should never be changed by the algorithm.
\end{itemize}
The final argument of any LAGraph algorithm holds the error message. This must be \verb'char[]' of
size \verb'LAGRAPH_MSG_LEN'. When the algorithm returns an error or a warning, a message may be placed in
this array as additional information. Because the caller creates this array, the caller must free
the memory or reuse it as appropriate. If the algorithm is successful, it should fill the message array
with an empty string to clear any previous message.
\subsection{Error handling}
Because every algorithm in LAGraph can return an error, the return value of every call should be
checked before proceeding. To make this less burdensome for a C-based library, LAGraph provides a
convenience macro which works similar to try/catch in other languages.
\begin{cplus}
#define LAGraph_TRY(LAGraph_method)
{
int LAGraph_status = LAGraph_method;
if (LAGraph_status < 0)
{
LAGraph_CATCH (LAGraph_status);
}
}
\end{cplus}
\verb'LAGraph_CATCH' can be defined before an algorithm and will be called in the event of an error.
This allows for proper freeing of memory and other necessary tasks.
A similar macro, \verb'GrB_TRY', will call \verb'GrB_CATCH' when making GraphBLAS calls which return
a \verb'GrB_Info' value other than \verb'GrB_SUCCESS' or \verb'GrB_NO_VALUE'.
\verb'LAGraph_TRY' and \verb'GrB_TRY' provide an easy to use and easy to read method for dealing with
error checking while writing graph algorithms.
\subsection{Contributing algorithms}
The LAGraph project welcomes contributions from graph practitioners who understand the GraphBLAS vision
of using the language of linear algebra to express graph computations. However, as a matter of
practical concern, many users want a stable experience when using LAGraph for doing real work. To balance
these, the LAGraph repository will have both a stable and an experimental folder.
New algorithms or modifications of existing algorithms will first be added to the experimental folder.
The release schedule of experimental algorithms will generally be much faster than the stable release,
and there is no expectation of a bug-free experience.
The goal is to generate lots of ideas and allow uninhibited contributions to
push the boundary of what is possible with the GraphBLAS. The stable release will be fully tested and
will move much slower, targeting the needs of those who want to use LAGraph as a complete, production-grade
library rather than as a research project.