This regular expression applies {intersection} to two large languages
and makes use of the complement. Its goal is to measure the efficiency
of set operations.
\end{itemize}
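To make the set-operation benchmark concrete, segment-wise set operations can be sketched as follows. This is an illustration under an assumption, not the paper's code: it takes a language to be an infinite list of \lstinline{Data.Set} segments, where segment $n$ holds exactly the words of length $n$.

```haskell
import qualified Data.Set as Set

-- Assumed representation (illustration only): segment n contains
-- all words of the language that have length n.
type Segments = [Set.Set String]

-- Set operations act index-wise on the two segment streams.
intersectL, unionL :: Segments -> Segments -> Segments
intersectL = zipWith Set.intersection
unionL     = zipWith Set.union

-- Complement is taken relative to Sigma*, itself given as segments.
complementL :: Segments -> Segments -> Segments
complementL sigmaStar l = zipWith Set.difference sigmaStar l
```

Because every word of length $n$ lives in segment $n$ of both operands, each operation only ever touches finitely many words at a time, which is what makes the complement of an infinite language computable in this setting.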
\subsection{Comparing Algorithms in the \haskell Implementation}

\begin{figure}[!t]
In the evaluation, we consider five variants of the Haskell implementation.
\begin{itemize}
\item The \textbf{naive} implementation corresponds to the code developed by
  the end of Section~\ref{sec:motivation}. It transforms to and from
  segments on the fly and uses plain list indexing.
\item \textbf{McIlroy} is our implementation of
  the algorithm by \citet{DBLP:journals/jfp/McIlroy99}.
\item The \textbf{seg} implementation uses the infinite list-based segmented
  representation throughout (\cref{sec:segm-repr}).
\item The \textbf{segConv} implementation additionally
  applies the convolution approach (\cref{sec:convolution,sec:faster-closure}).
%\item The \textbf{ref} implementation uses symbolic segments
% from \cref{sec:more-finite-repr} combined with
% maps and sparse indexing.
\item The \textbf{refConv} implementation combines
  symbolic segments (\cref{sec:segm-repr,sec:more-finite-repr}) with the convolution approach.
\end{itemize}
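The convolution idea behind \textbf{segConv} and \textbf{refConv} can be sketched as follows, again using illustrative set-of-words segments rather than the paper's actual representations: a word of length $n$ in $L\cdot M$ splits into a prefix of length $i$ drawn from $L$ and a suffix of length $n-i$ drawn from $M$.

```haskell
import qualified Data.Set as Set

-- Illustrative representation: segment n holds all words of length n.
type Segments = [Set.Set String]

-- Concatenation as a convolution of segment streams: segment n of
-- L.M is the union over all splits n = i + (n - i).
concatL :: Segments -> Segments -> Segments
concatL ls ms =
  [ Set.unions
      [ Set.fromList [ x ++ y | x <- Set.toList (ls !! i)
                              , y <- Set.toList (ms !! (n - i)) ]
      | i <- [0 .. n] ]
  | n <- [0 ..] ]
```

Each output segment only inspects segments $0$ through $n$ of the two inputs, so the stream stays productive even though both operands are infinite.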
Performance is evaluated by iterating through the stream of words
produced by the generator, forcing their evaluation\footnote{In
  Haskell, forcing is done using \lstinline{Control.DeepSeq}.}
and recording the elapsed time every 20 words.
We stop the iteration after 5 seconds.
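The measurement loop just described might look as follows. This is a sketch under stated assumptions: \lstinline{measure} and its exact sampling logic are ours for illustration, and the harness in the actual repository may differ. It forces each word with \lstinline{Control.DeepSeq} and records a timestamp every 20 words, stopping after 5 seconds:

```haskell
import Control.DeepSeq (NFData, force)
import Control.Exception (evaluate)
import GHC.Clock (getMonotonicTime)

-- Iterate through the generated words, forcing each one completely,
-- and record (words produced, elapsed seconds) every 20 words.
measure :: NFData a => [a] -> IO [(Int, Double)]
measure ws = do
  start <- getMonotonicTime
  go start 0 [] ws
  where
    go _ _ acc [] = pure (reverse acc)
    go start n acc (w : rest) = do
      _ <- evaluate (force w)        -- force full evaluation of the word
      now <- getMonotonicTime
      let elapsed = now - start
          n' = n + 1
          acc' | n' `mod` 20 == 0 = (n', elapsed) : acc
               | otherwise        = acc
      if elapsed > 5.0
        then pure (reverse acc')     -- stop after 5 seconds
        else go start n' acc' rest
```

Plotting the resulting pairs with words on one axis and time on the other yields exactly the kind of graph analyzed below.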
The resulting graph plots the time (x-axis) against the number of words (y-axis) produced so far. The slope of the graph indicates the generation speed of the plotted algorithm: a steeper slope corresponds to faster generation. \cref{bench:haskell:all} contains the results for the Haskell implementations.

Most algorithms generate between $1.3\cdot10^3$ and $1.4\cdot10^6$ words in the first
second, which seems more than sufficient for testing purposes.
The \textbf{refConv} implementation,
which uses symbolic segments and convolutions, is consistently in the

This observation validates that the
changes proposed in \cref{sec:improvements} actually lead to
improvements.
%
Looking at each graph in detail, we can make the following
remarks:
\begin{itemize}[leftmargin=*]
\item All implementations are equally fast on $\Rstar a$ except
  \textbf{McIlroy}, which relies on list lookups without
  sparse indexing.
\item The graph of some implementations
  has the shape of ``skewed stairs''. We believe this phenomenon is due to
  insufficient laziness: when arriving at a new segment, part of the
  work is done eagerly, which causes a plateau. When that part is done,
  the enumeration proceeds lazily. As laziness and GHC
  optimizations are hard to control, we did not attempt to correct this.
\item $\Rstar{(\Rconcat{a}{\Rstar{b}})}$ demonstrates that
  the convolution technique presented in \cref{sec:convolution}
  leads to significant improvements when applying \code{star} to non-sparse languages.
\item The \textbf{refConv} algorithm is
  significantly faster on $\Rconcat{\Rcomplement{(\Rstar{a})}}{b}$
  compared to \textbf{seg} and \textbf{segConv}. We have no good
  explanation for this behavior as the code is identical up to the
  is $\Lang{b}$, which is also represented finitely by
  \textbf{segConv} and should thus benefit from the convolution
  improvement in the same way as \textbf{refConv}.
\item $\Rstar{(\Rconcat{a}{\Rstar{b}})}$ shows that all our algorithms have similar
  performance profiles on set operations. They are also significantly
  faster than \textbf{McIlroy}.
\end{itemize}
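For comparison, the convolution-based closure discussed in the remarks above can be sketched like this, once more over illustrative set-of-words segments. The sketch assumes the argument language does not contain the empty word, so every factor consumes at least one letter:

```haskell
import qualified Data.Set as Set

-- Illustrative representation: segment n holds all words of length n.
type Segments = [Set.Set String]

-- Closure by convolution: segment n of L* combines a first factor of
-- length i >= 1 from L with a remainder of length n - i from L* itself.
-- The self-reference is well-founded precisely because i >= 1.
starL :: Segments -> Segments
starL ls = result
  where
    result = Set.singleton "" : [ segment n | n <- [1 ..] ]
    segment n = Set.unions
      [ Set.fromList [ x ++ y | x <- Set.toList (ls !! i)
                              , y <- Set.toList (result !! (n - i)) ]
      | i <- [1 .. n] ]
```

The lazily tied knot on \lstinline{result} is what lets the closure produce its segments one at a time; mis-placed strictness in such a definition is one plausible source of the ``skewed stairs'' observed above.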
\subsection{Comparing Data Structures in the \ocaml Implementation}