
fix(gen): support subtype-based encodings in C header generator #904


Open
wants to merge 4 commits into
base: main


@blazethunderstorm blazethunderstorm commented Jul 15, 2025

Fixes #893

Updated generator.py to correctly detect instruction encodings under the new type/subtype schema. Previously it looked only for the old 'encoding' field, so some instructions were skipped. It now checks the subtype fields and extracts the encoding information from there.


@blazethunderstorm blazethunderstorm changed the title fix(gen): support subtype-based encodings in generator fix(gen): support subtype-based encodings in C header generator Jul 15, 2025
Collaborator

@ThinkOpenly ThinkOpenly left a comment


There are WAY too many unrelated changes to easily review this. Could you create a single commit with just the necessary changes to address the issue?

@blazethunderstorm
Author

@ThinkOpenly pls see now

@ThinkOpenly
Collaborator

I don't think these changes have any impact. Are you seeing some problems being resolved?

I would expect that instead of finding the needed "match" string directly, you'd need to compute it by going into the "format" attribute and its children "opcodes" and "variables".

Note that these changes would likely belong in load_instructions() in backends/generators/generator.py. Indeed, when you run ./do gen:c_header, you get error messages at the points where these issues occur:

ERROR:: Missing 'encoding' field in instruction add.uw in /workspace/riscv-unified-db/gen/resolved_spec/_/inst/Zba/add.uw.yaml
ERROR:: Missing 'encoding' field in instruction rolw in /workspace/riscv-unified-db/gen/resolved_spec/_/inst/B/rolw.yaml
ERROR:: Missing 'encoding' field in instruction rol in /workspace/riscv-unified-db/gen/resolved_spec/_/inst/B/rol.yaml
ERROR:: Missing 'encoding' field in instruction xnor in /workspace/riscv-unified-db/gen/resolved_spec/_/inst/B/xnor.yaml
ERROR:: Missing 'encoding' field in instruction clmul in /workspace/riscv-unified-db/gen/resolved_spec/_/inst/B/clmul.yaml
ERROR:: Missing 'encoding' field in instruction orn in /workspace/riscv-unified-db/gen/resolved_spec/_/inst/B/orn.yaml
ERROR:: Missing 'encoding' field in instruction clmulh in /workspace/riscv-unified-db/gen/resolved_spec/_/inst/B/clmulh.yaml
ERROR:: Missing 'encoding' field in instruction andn in /workspace/riscv-unified-db/gen/resolved_spec/_/inst/B/andn.yaml
ERROR:: Missing 'encoding' field in instruction rorw in /workspace/riscv-unified-db/gen/resolved_spec/_/inst/B/rorw.yaml
ERROR:: Missing 'encoding' field in instruction ror in /workspace/riscv-unified-db/gen/resolved_spec/_/inst/B/ror.yaml

Part of your mission is to make those error messages go away. The current code just extracts the value of the "match" attribute, but you'll need to compute it.

@blazethunderstorm
Author

@ThinkOpenly will do that

@dhower-qc
Collaborator

Thanks for the contribution! Like @ThinkOpenly said, you'll need to build up what used to be explicit in 'match'. Basically, start with an instruction-length string of '-'s and then replace any position that is occupied by an opcode value.

So if you have:

# andn.yaml
format:
  funct7:
    display_name: ANDN
    location: 31-25
    value: 0b0100000
  funct3:
    display_name: ANDN
    location: 14-12
    value: 0b111
  opcode:
    display_name: OP
    location: 6-0
    value: 0b0110011

You want to wind up with the string:

0100000----------111-----0110011
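
The construction described above can be sketched in Python as follows. This is only an illustration under assumptions, not the code from this PR: the function name build_match_string is hypothetical, the fields are assumed to sit in a flat mapping as in the andn.yaml excerpt, every location is assumed to have the "hi-lo" form, and every value is assumed to have been loaded as an integer already.

```python
def build_match_string(fields, length=32):
    """Build a match string from opcode fields (hypothetical helper).

    `fields` is assumed to map a field name to a dict holding a "location"
    such as "31-25" and an integer "value" such as 0b0100000.
    """
    # Start with an instruction-length string of '-'; index 0 is the MSB.
    bits = ["-"] * length
    for field in fields.values():
        hi, lo = (int(part) for part in field["location"].split("-"))
        width = hi - lo + 1
        value_bits = format(field["value"], f"0{width}b")
        if len(value_bits) != width:
            raise ValueError(f"value does not fit in bits {hi}-{lo}")
        # Bit `hi` lives at string index (length - 1 - hi).
        start = length - 1 - hi
        bits[start:start + width] = value_bits
    return "".join(bits)


# Field layout copied from the andn.yaml excerpt above.
andn = {
    "funct7": {"location": "31-25", "value": 0b0100000},
    "funct3": {"location": "14-12", "value": 0b111},
    "opcode": {"location": "6-0", "value": 0b0110011},
}
print(build_match_string(andn))
```

Raising on a value that does not fit its field is a design choice of this sketch; the real generator may prefer a different policy.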

@blazethunderstorm
Author

@dhower-qc thanks for the help, I'll make the changes as requested

@blazethunderstorm
Author

@ThinkOpenly @dhower-qc pls review

Collaborator

@ThinkOpenly ThinkOpenly left a comment


Thanks again for your efforts. Good code. Comments/questions inline.

if value is None or location is None:
    continue

if isinstance(location, str) and "-" in location:
Collaborator


You need to handle the case where location does not contain a "-". Every location has one today, but this is not guaranteed. Without a "-", the field is a single bit.
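
One way to cover the single-bit case is a small parsing helper, sketched below. The name parse_location is hypothetical and not part of generator.py. The sketch also accepts integers, on the assumption that an unquoted single-bit location such as `location: 5` would be loaded from YAML as an int rather than a str, in which case an `isinstance(location, str)` check alone would skip it.

```python
def parse_location(location):
    """Return (hi, lo) for a location such as "31-25", "5", or 5.

    Hypothetical helper: a location without "-" is treated as a single bit,
    so hi and lo are equal.
    """
    if isinstance(location, int):
        return location, location
    text = str(location).strip()
    if "-" in text:
        hi, lo = (int(part) for part in text.split("-", 1))
    else:
        hi = lo = int(text)
    if hi < lo:
        raise ValueError(f"invalid location {location!r}: high bit below low bit")
    return hi, lo
```

With this helper, the field width is always hi - lo + 1, which is 1 for a single-bit field.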

Comment on lines +208 to +211
except Exception as e:
    logging.error(f"Failed to construct match string for {name} in {path}: {e}")
    encoding_filtered += 1
    continue
Collaborator

@ThinkOpenly ThinkOpenly Jul 16, 2025


Do we need this? The effect is that errors are logged but otherwise ignored. We may actually want a crash here.
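
If a crash is preferred, one option is to re-raise with the file path attached instead of logging and continuing. The sketch below is only illustrative: load_all and build are hypothetical names standing in for the loop in load_instructions() and whatever function constructs the match string, and the choice of exception types is an assumption.

```python
def load_all(paths, build):
    """Call build(path) for every path and fail fast on the first error.

    Hypothetical stand-in for the instruction-loading loop: rather than
    logging and skipping, the original exception is chained to a new one
    that names the offending file.
    """
    results = {}
    for path in paths:
        try:
            results[path] = build(path)
        except (KeyError, ValueError) as exc:
            raise RuntimeError(
                f"Failed to construct match string in {path}"
            ) from exc
    return results
```

Chaining with `from exc` keeps the original traceback, so the crash still shows where the malformed field was encountered.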

encoding_filtered += 1
continue

continue
Collaborator


Does this have any effect?

@@ -255,14 +283,30 @@ def load_instructions(
encoding_filtered += 1
continue

match_str = encoding_to_use.get("match")
match_str = encoding_to_use.get("match") if encoding_to_use else None
mask_str = encoding_to_use.get("mask") if encoding_to_use else None
Collaborator


Do we need to do any work with "mask" here? I thought it was calculated elsewhere based on the "match" string.
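
For reference, a mask can be derived from a match string alone: every position that holds a fixed bit contributes a 1, and every '-' contributes a 0. The sketch below shows that derivation. The function name is hypothetical, and whether the generator already does this elsewhere is not confirmed in this thread.

```python
def match_to_mask_and_value(match):
    """Derive (mask, value) integers from a match string (hypothetical helper).

    Fixed positions ('0' or '1') set a 1 in the mask; '-' positions are
    don't-care bits and become 0 in both the mask and the value.
    """
    mask = int("".join("0" if c == "-" else "1" for c in match), 2)
    value = int(match.replace("-", "0"), 2)
    return mask, value


# Match string for andn, assembled from the fields in andn.yaml:
# funct7 in bits 31-25, funct3 in bits 14-12, opcode in bits 6-0.
andn_match = "0100000" + "-" * 10 + "111" + "-" * 5 + "0110011"
mask, value = match_to_mask_and_value(andn_match)
print(hex(mask), hex(value))  # 0xfe00707f 0x40007033
```

An instruction word w then matches when (w & mask) == value.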

@blazethunderstorm
Author

@ThinkOpenly I'll think about this, let you know, and make the changes

Successfully merging this pull request may close these issues.

Generated C header missing instructions with new subtype schema
3 participants