-
Notifications
You must be signed in to change notification settings - Fork 839
[Outlining] Remove overlapping sequences #7146
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Changes from 17 commits
ab65ba2
4636a3f
aff6cd7
76d31b0
6a5aa0f
b05f948
15485fa
c68c921
956dbd0
6408a25
fd7c6c8
061211e
f828e45
a6bc337
26d4ab7
9ece5b6
34cd8d1
54e1903
b83bdb3
116ee63
d22fc3a
79c0800
167dbf6
381b2ce
515d550
a044f24
7d30725
4f011f9
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
| Original file line number | Diff line number | Diff line change | ||||||
|---|---|---|---|---|---|---|---|---|
|
|
@@ -15,6 +15,7 @@ | |||||||
| */ | ||||||||
|
|
||||||||
| #include "stringify-walker.h" | ||||||||
| #include "support/intervals.h" | ||||||||
|
|
||||||||
| namespace wasm { | ||||||||
|
|
||||||||
|
|
@@ -147,6 +148,48 @@ std::vector<SuffixTree::RepeatedSubstring> StringifyProcessor::dedupe( | |||||||
| return result; | ||||||||
| } | ||||||||
|
|
||||||||
| std::vector<SuffixTree::RepeatedSubstring> StringifyProcessor::filterOverlaps( | ||||||||
| const std::vector<SuffixTree::RepeatedSubstring>& substrings) { | ||||||||
| // A substring represents a contiguous set of instructions that appear more | ||||||||
| // than once in a Wasm binary. For each appearance of the substring, an | ||||||||
| // Interval is created that lacks a connection back to its originating | ||||||||
| // substring. To fix, upon Interval creation, a second vector is populated with | ||||||||
| // the index of the corresponding substring. | ||||||||
| std::vector<Interval> intervals; | ||||||||
| std::vector<int> substringIdxs; | ||||||||
|
|
||||||||
| // Construct intervals | ||||||||
| for (Index i = 0; i < substrings.size(); i++) { | ||||||||
| auto substring = substrings[i]; | ||||||||
|
||||||||
| auto substring = substrings[i]; | |
| auto& substring = substrings[i]; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
batched suggestion
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This could be slightly shorter using emplace_back.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can this be an emplace_back to really be sure we're not copying the vector of start indices?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| result.emplace_back(SuffixTree::RepeatedSubstring( | |
| {substrings[i].Length, std::move(startIndices[i])})); | |
| result.emplace_back(substrings[i].Length, std::move(startIndices[i])); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
When I remove SuffixTree::RepeatedSubstring, I get the following error:
FAILED: src/passes/CMakeFiles/passes.dir/hash-stringify-walker.cpp.o /usr/bin/c++ -I/src -I/third_party/FP16/include -I/third_party/llvm-project/include -I -DBUILD_LLVM_DWARF -Wall -Werror -Wextra -Wno-unused-parameter -Wno-dangling-pointer -fno-omit-frame-pointer -fno-rtti -Wno-implicit-int-float-conversion -Wno-unknown-warning-option -Wswitch -Wimplicit-fallthrough -Wnon-virtual-dtor -fPIC -fdiagnostics-color=always -g -g3 -std=c++17 -MD -MT src/passes/CMakeFiles/passes.dir/hash-stringify-walker.cpp.o -MF src/passes/CMakeFiles/passes.dir/hash-stringify-walker.cpp.o.d -o src/passes/CMakeFiles/passes.dir/hash-stringify-walker.cpp.o -c /src/passes/hash-stringify-walker.cpp /src/passes/hash-stringify-walker.cpp: In static member function ‘static std::vector<wasm::SuffixTree::RepeatedSubstring> wasm::StringifyProcessor::filterOverlaps(const std::vector<wasm::SuffixTree::RepeatedSubstring>&)’: /src/passes/hash-stringify-walker.cpp:184:26: error: no matching function for call to ‘std::vector<wasm::SuffixTree::RepeatedSubstring>::emplace_back(<brace-enclosed initializer list>)’ 184 | result.emplace_back( | ~~~~~~~~~~~~~~~~~~~^ 185 | {substrings[i].Length, std::move(startIndices[i])}); | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ In file included from /usr/include/c++/14/vector:72, from /src/wasm.h:34, from /src/ir/boolean.h:20, from /src/ir/bits.h:20, from /src/ir/properties.h:20, from /src/ir/iteration.h:20, from /src/passes/stringify-walker.h:20, from /src/passes/hash-stringify-walker.cpp:17: /usr/include/c++/14/bits/vector.tcc:111:7: note: candidate: ‘std::vector<_Tp, _Alloc>::reference std::vector<_Tp, _Alloc>::emplace_back(_Args&& ...) [with _Args = {}; _Tp = wasm::SuffixTree::RepeatedSubstring; _Alloc = std::allocator<wasm::SuffixTree::RepeatedSubstring>; reference = wasm::SuffixTree::RepeatedSubstring&]’ 111 | vector<_Tp, _Alloc>:: | ^~~~~~~~~~~~~~~~~~~ /usr/include/c++/14/bits/vector.tcc:111:7: note: candidate expects 0 arguments, 1 provided At global scope: cc1plus: note: unrecognized command-line option ‘-Wno-unknown-warning-option’ may have been intended to silence earlier diagnostics cc1plus: note: unrecognized command-line option ‘-Wno-implicit-int-float-conversion’ may have been intended to silence earlier diagnostics [3/18] Building CXX object src/passes/CMakeFiles/passes.dir/Outlining.cpp.o ninja: build stopped: subcommand failed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
no constructor provided on SuffixTree, so we'll keep as-is
| Original file line number | Diff line number | Diff line change | ||||||
|---|---|---|---|---|---|---|---|---|
| @@ -0,0 +1,66 @@ | ||||||||
| /* | ||||||||
| * Copyright 2024 WebAssembly Community Group participants | ||||||||
| * | ||||||||
| * Licensed under the Apache License, Version 2.0 (the "License"); | ||||||||
| * you may not use this file except in compliance with the License. | ||||||||
| * You may obtain a copy of the License at | ||||||||
| * | ||||||||
| * http://www.apache.org/licenses/LICENSE-2.0 | ||||||||
| * | ||||||||
| * Unless required by applicable law or agreed to in writing, software | ||||||||
| * distributed under the License is distributed on an "AS IS" BASIS, | ||||||||
| * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. | ||||||||
| * See the License for the specific language governing permissions and | ||||||||
| * limitations under the License. | ||||||||
| */ | ||||||||
|
|
||||||||
| #include <assert.h> | ||||||||
|
|
||||||||
| #include "intervals.h" | ||||||||
| #include "support/index.h" | ||||||||
| #include <algorithm> | ||||||||
| #include <iostream> | ||||||||
| #include <optional> | ||||||||
|
||||||||
| #include <iostream> | |
| #include <optional> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
batched suggestion
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| // former overlaps with candidate | |
| // replace former if the weights are the same but the candidate ends earlier | |
| // Former overlaps with candidate. Replace former if the weights are the same but the candidate ends earlier. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
batched suggestion
tlively marked this conversation as resolved.
Show resolved
Hide resolved
| Original file line number | Diff line number | Diff line change | ||||
|---|---|---|---|---|---|---|
| @@ -0,0 +1,62 @@ | ||||||
| /* | ||||||
| * Copyright 2024 WebAssembly Community Group participants | ||||||
| * | ||||||
| * Licensed under the Apache License, Version 2.0 (the "License"); | ||||||
| * you may not use this file except in compliance with the License. | ||||||
| * You may obtain a copy of the License at | ||||||
| * | ||||||
| * http://www.apache.org/licenses/LICENSE-2.0 | ||||||
| * | ||||||
| * Unless required by applicable law or agreed to in writing, software | ||||||
| * distributed under the License is distributed on an "AS IS" BASIS, | ||||||
| * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. | ||||||
| * See the License for the specific language governing permissions and | ||||||
| * limitations under the License. | ||||||
| */ | ||||||
|
|
||||||
| // Helpers for handling a generic range of values | ||||||
|
|
||||||
| #ifndef wasm_support_intervals_h | ||||||
| #define wasm_support_intervals_h | ||||||
|
|
||||||
| #include <set> | ||||||
| #include <vector> | ||||||
|
|
||||||
| namespace wasm { | ||||||
|
|
||||||
| struct Interval { | ||||||
| unsigned start; | ||||||
| unsigned end; | ||||||
| // The weight is used to determine which interval to keep when two overlap, | ||||||
| // higher is better | ||||||
| unsigned weight; | ||||||
| Interval(unsigned start, unsigned end, unsigned weight) | ||||||
| : start(start), end(end), weight(weight) {} | ||||||
|
|
||||||
| bool operator<(const Interval& other) const { | ||||||
| /*return weight != other.weight ? weight < other.weight | ||||||
| : end != other.end ? end < other.end | ||||||
| : start < other.start;*/ | ||||||
| /*return end != other.end ? end < other.end | ||||||
| : weight != other.weight ? weight < other.weight | ||||||
| : start < other.start;*/ | ||||||
| return start != other.start ? start < other.start | ||||||
| : weight != other.weight ? weight < other.weight | ||||||
| : end < other.end; | ||||||
| } | ||||||
|
Comment on lines
+36
to
+40
Member
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. This should take
Collaborator
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. done |
||||||
|
|
||||||
| bool operator==(const Interval& other) const { | ||||||
| return start == other.start && end == other.end && weight == other.weight; | ||||||
| } | ||||||
| }; | ||||||
|
|
||||||
| struct IntervalProcessor { | ||||||
| // Given a vector of Interval, returns a vector of the indices that, mapping | ||||||
| // back to the original input vector, do not overlap with each other, ie: the | ||||||
|
||||||
| // back to the original input vector, do not overlap with each other, ie: the | |
| // back to the original input vector, do not overlap with each other, i.e. the |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Doesn't look like it!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
batched suggestion
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
adjust punctuation around ie to , i.e.,
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
| Original file line number | Diff line number | Diff line change | ||||
|---|---|---|---|---|---|---|
| @@ -0,0 +1,121 @@ | ||||||
| #include "support/intervals.h" | ||||||
| #include "ir/utils.h" | ||||||
| #include "gtest/gtest.h" | ||||||
|
|
||||||
| using namespace wasm; | ||||||
|
|
||||||
| TEST(IntervalsTest, TestEmpty) { | ||||||
| std::vector<Interval> intervals; | ||||||
| std::vector<int> expected; | ||||||
| ASSERT_EQ(IntervalProcessor::filterOverlaps(intervals), expected); | ||||||
| } | ||||||
|
|
||||||
| TEST(IntervalsTest, TestOne) { | ||||||
| std::vector<Interval> intervals; | ||||||
| intervals.emplace_back(Interval{0, 2, 5}); | ||||||
|
||||||
| intervals.emplace_back(Interval{0, 2, 5}); | |
| intervals.emplace_back(0, 2, 5); |
tlively marked this conversation as resolved.
Show resolved
Hide resolved
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It would be best to give these more descriptive names like TestOverlapInnerNested, TestOverlapOuterNested, TestOverlapAscendingSequence, TestOverlapDescendingSequence, etc. Ideally the role each interval plays in a test should be clear from the name of the test.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It would be good to have a comment saying how these two vectors relate to each other.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done