-
Notifications
You must be signed in to change notification settings - Fork 212
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[VeriblePreProcessor][1]: multiple-cu and generate-variants modes #1372
[VeriblePreProcessor][1]: multiple-cu and generate-variants modes #1372
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
First round of comments and suggestions.
Stylistically, I think you like to put comments at the end of a code line, but then the formatting makes that unusally awkward as it breaks it into multiple lines. In many cases it makes sense to put the comment in front for eaiser readabiilyt.
Also a suggestion for the API of flow tree which might be nicer with a callback funciton.
Also question that needs to be considered and tested. Consider the following example: `ifdef A
// foo
`else
// bar
`endif
`ifdef A
// baz
`else
// quux
`endif
`ifndef A
// foobar
`else
// bazfoo
`endif will that create (2^3=) 8 variants of (number of different conditions for (In the real world, you see files with many ifdefs but they only look at a handful of different conditions. So it will be a practical difference if we get 16 or 2^3122 variants out of it) |
Hello @hzeller, Thanks a lot for your review, and sorry for my silly mistakes! :) Regarding the question you discussed in the previous comment, I think this would require that we know (while traversing the tree) which macros are assumed to be defined or not, maybe in the form of a bit mask. So, if we are generating a variant in which Suggestion:
Would this idea be a good one? a bit mask can only be 32 bit long, or 64 in case of long int, would that be enough in practical cases? Thanks, |
Codecov Report
@@ Coverage Diff @@
## master #1372 +/- ##
==========================================
+ Coverage 92.92% 92.94% +0.01%
==========================================
Files 340 342 +2
Lines 23855 24103 +248
==========================================
+ Hits 22168 22402 +234
- Misses 1687 1701 +14
Help us with your feedback. Take ten seconds to tell us how you rate us. Have a feature suggestion? Share it here. |
I like the idea to represent the different conditions as a bit in an int, because then we also have a way to uniquely identify a variant by its number: 0x42 represents the variant in which all conditions are considered 'false' except the second (0x02) and the 7th (0x40). And numbering can happen from the top of the file. Also we can exactly query how many different variants we actually have. We do need to be aware that there might be many more than 64 conditions, so I'd be wary to store it in a unsigned int, but rather some sort of bitset. |
Every class needs a test: Please also add a Since we're working on the raw token stream, where syntax is not necessarily checked yet, we should also consider what happens in situations that don't match up; at least we should handle that gracefully without crashing.
|
I checked the LRM, and here is what I found: Section: 22.6
Section: 22.5.1
Thus, I think it shouldn't be supported, do you agree? |
ah, so you mean the only 'expression' in an ifdef really can be the existence of a single identifier. That is good, that makes things simple. |
I implemented the `ifdef A
A_TRUE
`else
A_FALSE
`endif
`ifdef A
A_TRUE
`else
A_FALSE
`endif
`ifndef A
A_FALSE
`else
A_TRUE
`endif Result:
I assumed the maximum number of macros to be 128 just for now, It seams to be working. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Alright, starting to look good!
Now very important is to write a unit test for flow_tree. It is actually always a good idea to start the test right at the time when you begin implement something, because it makes life so much simpler - you always know what is working, and it is much easier to change things. And since the unit test is the first 'user' of the API, you also get a feel if it provides everything one would need, long before it is wired up in some other library or binary somewhere.
Ah well, we're a bit late now, but better late than never :)
So start a flow_tree_test.cc
, make all the boilerplate to set up a particular input stream and test that you get all the expected responses.
FYI, the preprocessor changed slightly in the meantime, the pull request needs to be rebase. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks like the subcommand stuff had been removed, I suspect that was in the early implementation stages to get things going ? That should of course be kept.
For the test: looks pretty good, some small suggestions there.
verilog/analysis/flow_tree_test.cc
Outdated
auto status = tree_test.GenerateVariants(GetVariant); | ||
EXPECT_TRUE(status.ok()); | ||
EXPECT_EQ(g_variants.size(), 2); | ||
for (const auto& variant : g_variants[0].sequence) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This might be more concise by using ElementsAre()
.
// somewhere above
using ::testing::ElementsAre;
//...
EXPECT_THAT(variants[0], ElementsAre("A_TRUE", "A_TRUE", "A_TRUE"));
You could then also easily use different words in the input text to have a bit of fun and better see that really all places have been captures and not only accidentally the first one
EXPECT_THAT(variants[0], ElementsAre("FOO", "BAR", "BAZ"));
EXPECT_THAT(variants[1], ElementsAre("TOO", "MANY", "MACROS"));
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Couldn't use ElementsAre
with TokenInfo
, as comparing tokens fails due to comparing the bounds of the string_view
of the text.
Used this instead:
EXPECT_THAT(variants[0].sequence[0].text(), "A_TRUE_1");
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Did you git push your changes yet ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not yet, I am working on the subcommand part.
verilog/analysis/flow_tree_test.cc
Outdated
auto status = tree_test.GenerateVariants(GetVariant); | ||
EXPECT_TRUE(status.ok()); | ||
EXPECT_EQ(g_variants.size(), 2); | ||
for (const auto& variant : g_variants[0].sequence) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Did you git push your changes yet ?
Did you re-base your branch so that we don't have the conflicts ? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice. A few comments w.r.t. which output stream to use so that it is easier for users to redirect the contents to the right output.
And the usual: we need a test for ach new subcommand
{"strip-comments", // | ||
{&StripComments, // | ||
R"(strip-comments file | ||
static absl::Status PreprocessSingleFile(const char* source_file, std::istream&, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could the source_file
be of type std::string_view
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it couldn't be, because source_file
comes from GetSubcommandEntry
which returns std::vector<char*>
.
I mean it can be changed of course, but I will have to change SubcommandFunction
to be a template, and maybe change something in common/util/init_command_line.h
too.
verible/common/util/subcommand.h
Lines 31 to 42 in 4faba50
// This should be the same type returned by InitCommandLine in | |
// common/util/init_command_line.h | |
using SubcommandArgs = std::vector<char*>; | |
using SubcommandArgsIterator = SubcommandArgs::const_iterator; | |
using SubcommandArgsRange = | |
verible::container_iterator_range<SubcommandArgsIterator>; | |
// Currently this function type is hard-coded, but this could eventually | |
// become a template parameter. | |
using SubcommandFunction = | |
std::function<absl::Status(const SubcommandArgsRange&, std::istream& ins, | |
std::ostream& outs, std::ostream& errs)>; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
const char*
converts naturally into a absl::string_view
, so in the loop in MultipleCU
it will just work
Would re-basing using a similar approach to the one you showed me before in PR #1318 solve the conflict? or I should solve the conflict manually here, then pull the changes before doing the re-base? |
Yes, just like in #1318 (comment) (what I do whenver I do some large thing like that, I create a copy of my full git-tree just in case :) ). |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think the superfluous function parameters are not the ones that we need due to subcommands (as these functions are actually not hooked as subcommands. Or am I missing somehting ?)
Other than that, this looks like we're nearing completion of this change!
To see if you have covered all the relevant code parts, take a look how to run coverage on your machine https://github.com/chipsalliance/verible#test (you might need to install lcov
if not already on your machine)
To run the CI and get ready to merge, please also do the rebase.
{"strip-comments", // | ||
{&StripComments, // | ||
R"(strip-comments file | ||
static absl::Status PreprocessSingleFile(const char* source_file, std::istream&, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
const char*
converts naturally into a absl::string_view
, so in the loop in MultipleCU
it will just work
I have done the rebase, I just have a problem using
I am trying to figure out the problem, I tried to build the master branch and it gives me the same error. Note: I use local flex and bison |
You need bazel version 4 now |
did it work ? Can you |
Yes, it worked. |
…o generate all variants with the new mode generate-variants
- Adding comments in a separate line to avoid wrapping. - Providing an API to the user of FlowTree class to generate variants. - Using const_iterators instead of index in FlowTree class. - Using a struct to represents the conditional block as a unit inside FlowTree.
- Listed the macros as they appear in conditionals. - Gave each of them a unique ID and stored them in a map. - Used a more detailed if_blocks struct to help tracking edges. - Used bitset in DFS to know for each variant which macros are defined.
- Removed variants counter from the class members. - Added a new struct called Variant that contains all the vairant's data: - Its TokenSequence. - bitset that shows if the i-th macro is visited/assumed or not. - bitset that shows if the i-th macro is defined or not. - Modified the VariantReceiver to accepts the new struct Variant, instead of just TokenSequence. - GenerateControlFlowTree() is now a private function, such that GenerateVariants() is the only API the user needs. - Added a using declaration for std::bitset<128>.
- Added API GetUsedMacros that returns all the used macros in conditionals. - Changed "assumed" bitset name to "visited". - Added description to "ConditionalBlock". - Added more testing using closures. Preprocessor tool: - Kept the subcommand part. - Available subcommands for now "strip-comments", "multiple-cu", and "generate-variants".
- Moved ConditionalBlock to private section in FlowTree class. - Removed unwanted debug code from the preprocessor tool. - Fixed output streams issues in the preprocessor tool.
It is okay, everything looks fine, but there is some minor issue in Just a new line missing in an expected result of a test, should I commit it to fix this issue? |
Yes, and don't forget to push, so that we can run the CI also. |
5614fe4
to
aa29595
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks very good; only missing before merge
- a few places that are not exercised by tests yet in the flow tree
- PreprocessSingleFile() the superfluous parameters (which are not needed there as it is not directly registered as subcommand)
Everything else, we then can see in the subsequent reviews.
Sorry that it took so many rounds, but that is just happening the first time when contributing to a code-base as there are many assumptions already how things work. Good work!
Yes, I added tests for those, but I am curious to know how did you point them out? :) by looking into the coverage report?
Thanks for your effort, Henner! |
Yes, I ran
Which then generates some HTML output that we can look at. |
: source_sequence_(std::move(source_sequence)){}; | ||
|
||
// Generates all possible variants. | ||
absl::Status GenerateVariants(const VariantReceiver &receiver); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not for now, but one TODO for future work would be to make this a const function
absl::Status GenerateVariants(const VariantReceiver &receiver) const;
For that to work, we would have to call GenerateControlFlowTree()
outside as it modifies the state. You previously had that, but that was at a state when it was not quite clear yet what the final API looks like, so we can revisit.
We discussed before that it might be slightly confusing in the API to have such call, when it is called GenerateControlFlowTree()
as it might not be quite clear to a reader of the API if it is required.
So we can consider these options:
- Do it in the constructor. But it would somewhat violate the don't do work in the constructor guidance, in particular since there can be error conditions that we can't return in the constructor.
- Require the user to call
GenerateControlFlowTree()
, but since that is unnecessarily obscure to the potential usr of the API, we could just call itInitialize()
? That way, for someone reading the API, they know what to do. - If the user has not called
Initialize()
haveGenerateVariants()
return a status error, maybeabsl::FailedPreconditionError()
. If Initialize() was called, but had some non-ok status, we can return that. So in general, we can have anabsl::Status initialize_status_
in theFlowTree
class, that we initialize in the constructor with the failed precondition status, and then again when callingInitialize()
. Then the only thingGenerateVariants()
has to do is toif (!initialize_status_.ok()) return initialize_status_;
Also, once the initialization question is figured out, we need to have it carry its own state, like current_variant_
and wants_more_
so that it does ont modify anything.
But all of that, let's wait until we have the other pull requests done; we always can go back and make things 'prettier'.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just little comments with idea for future improvements.
empty_block.if_location = source_sequence_.end(); | ||
empty_block.else_location = source_sequence_.end(); | ||
empty_block.endif_location = source_sequence_.end(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe we should have a constructor in the ConditionalBlock
struct that gets a TokenSequenceConstIterator end_pos
parameter to initialize all the fields ? The ConditionalBlock
is the best to know what are all the fields it has to initialize.
Also, it can set the positive_condition
to some start value.
So something like that:
struct ConditionalBlock {
ConditionalBlock(TokenSequenceConstIterator if_location,
bool is_positive,
TokenSequenceConstIterator invalid_location)
: if_location(if_location), positive_condition(is_positive),
else_location(invalid_location), endif_location(invalid_location) {}
// ...
};
Then, below, we then can initialize things in one go using emplace_back()
(the parameters to which are passed to the contructor of the ConditionalBlock
.
So pseudo-code:
absl::Status FlowTree::GenerateControlFlowTree() {
// ...
const TokenSequenceConstIterator invalid_location = source_sequence_.end();
// for loop, if, switch later ...
case PP_ifdef: {
if_blocks_.emplace_back(iter, true, invalid_location);
auto status = AddMacroOfConditional(iter);
// ...
}
case PP_ifndef: {
if_blocks_.emplace_back(iter, false, invalid_location);
auto status = AddMacroOfConditional(iter);
// ...
}
Anyway, not needed now, but this can be a quick follow-up pull request.
case PP_ifdef: { | ||
if_blocks_.push_back(empty_block); | ||
if_blocks_.back().if_location = iter; | ||
if_blocks_.back().positive_condition = 1; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If we have boolean values it is better to use true
and false
to initialize.
I found a bug in |
Yay. A word of caution though: It is good to first start out not using the coverage tool but try to think of all possible ways the code should work. And thinking of ways how to break it - is there a corner cases we're unsure of ? Make an attempt to write a test to break things. If yes, it is good and we can improve the code; if not it is also good, because, well not a bug, but also because we made sure that it won't happen in the future. Then, once we think we have covered everything, the coverage tool will show what is left to do and getting things as close to 100% as possible. Starting with the coverage tool is not as effective: because then one focuses on just making sure that everything is somehow brushed past, but thinking about the code first and how to break it is helping to get a good understanding of potential issues here and for future code - and improve it accordingly. The coverage tool mostly provides insight into path coverage while thinking about corner cases is better in providing condition coverage (path coverage and condition coverage are the technical terms for that). |
Interesting, because I thought it would be enough to use the tool, but now that you explained the difference between path and condition coverage, I get why we still need the condition coverage, because we may hit a line of code, and still not break it. |
Ready to mege, or are there things you'd like to add to this PR ? (the remaining comments I made can be addressed in a separate change) |
Ready to merge, we will need to revisit some parts while reviewing the next PRs anyway. |
Thanks, merged! |
To link this up: the corresponding issue is #1358 |
NOTE: This is a part of the sequentially splitted PRs from PR #1360.
Description:
--generate_variants
mode generates all the possible variants for conditionals directives.In case of
--multiple-cu
is true, and multiple files are passed to the tool, each one of them will be compiled in a separate compilation unit, thus declarations scopes will end by the end of each file (this is the default behavior).