fix(gen): support subtype-based encodings in C header generator #904

blazethunderstorm · 2025-07-15T23:12:06Z

Fixes #893

Updated generator.py to detect instruction encodings correctly under the new type/subtype schema. Previously it looked only for the old encoding field, so instructions without that field were skipped. It now checks the subtype fields and extracts the encoding information from them.

ThinkOpenly

There are WAY too many unrelated changes to easily review this. Could you create a single commit with just the necessary changes to address the issue?

blazethunderstorm · 2025-07-15T23:19:27Z

@ThinkOpenly please take another look now.

ThinkOpenly · 2025-07-16T02:00:02Z

I don't think these changes have any impact. Are you seeing some problems being resolved?

I would expect that instead of finding the needed "match" string directly, you'd need to compute it by going into the "format" attribute and its children "opcodes" and "variables".

Note that these changes would likely appear in backends/generators/generator.py in load_instructions(). Indeed when you run ./do gen:c_header, you get error messages where these issues occur:

ERROR:: Missing 'encoding' field in instruction add.uw in /workspace/riscv-unified-db/gen/resolved_spec/_/inst/Zba/add.uw.yaml
ERROR:: Missing 'encoding' field in instruction rolw in /workspace/riscv-unified-db/gen/resolved_spec/_/inst/B/rolw.yaml
ERROR:: Missing 'encoding' field in instruction rol in /workspace/riscv-unified-db/gen/resolved_spec/_/inst/B/rol.yaml
ERROR:: Missing 'encoding' field in instruction xnor in /workspace/riscv-unified-db/gen/resolved_spec/_/inst/B/xnor.yaml
ERROR:: Missing 'encoding' field in instruction clmul in /workspace/riscv-unified-db/gen/resolved_spec/_/inst/B/clmul.yaml
ERROR:: Missing 'encoding' field in instruction orn in /workspace/riscv-unified-db/gen/resolved_spec/_/inst/B/orn.yaml
ERROR:: Missing 'encoding' field in instruction clmulh in /workspace/riscv-unified-db/gen/resolved_spec/_/inst/B/clmulh.yaml
ERROR:: Missing 'encoding' field in instruction andn in /workspace/riscv-unified-db/gen/resolved_spec/_/inst/B/andn.yaml
ERROR:: Missing 'encoding' field in instruction rorw in /workspace/riscv-unified-db/gen/resolved_spec/_/inst/B/rorw.yaml
ERROR:: Missing 'encoding' field in instruction ror in /workspace/riscv-unified-db/gen/resolved_spec/_/inst/B/ror.yaml

Part of your mission is to make those error messages go away. The current code just extracts the value of the "match" attribute, but you'll need to compute it.

blazethunderstorm · 2025-07-16T13:27:42Z

@ThinkOpenly will do that.

dhower-qc · 2025-07-16T15:00:31Z

Thanks for the contribution! Like @ThinkOpenly said, you'll need to build up what used to be explicit in 'match'. Basically, start with an instruction-length string of '-'s and then replace any position that is occupied by an opcode value.

So if you have:

# andn.yaml
format:
  funct7:
    display_name: ANDN
    location: 31-25
    value: 0b0100000
  funct3:
    display_name: ANDN
    location: 14-12
    value: 0b111
  opcode:
    display_name: OP
    location: 6-0
    value: 0b0110011

You want to wind up with the string:

0100000----------111-----0110011
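
The construction described above can be sketched roughly as follows. This is an illustration only, not the code from this PR: the function name build_match_string is hypothetical, the fields are passed as a plain dict mirroring the andn.yaml excerpt, and every location is assumed to be a "high-low" range.

```python
def build_match_string(fields, length=32):
    """Build a match string from opcode fields; '-' marks a bit with no fixed value.

    Hypothetical helper: assumes each field has a "high-low" location
    and an integer (or integer-literal string) value.
    """
    bits = ["-"] * length  # index 0 holds the most significant bit
    for name, field in fields.items():
        hi, lo = (int(part) for part in str(field["location"]).split("-"))
        width = hi - lo + 1
        value = field["value"]
        if isinstance(value, str):
            value = int(value, 0)  # accepts literals such as "0b0100000"
        if value >> width:
            raise ValueError(f"{name}: value does not fit in {width} bits")
        start = length - 1 - hi  # string position of bit `hi`
        bits[start:start + width] = format(value, f"0{width}b")
    return "".join(bits)


andn = {
    "funct7": {"location": "31-25", "value": 0b0100000},
    "funct3": {"location": "14-12", "value": 0b111},
    "opcode": {"location": "6-0", "value": 0b0110011},
}
print(build_match_string(andn))  # 0100000----------111-----0110011
```

Starting from a list of '-' characters keeps the logic independent of the order in which the fields appear in the YAML.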

blazethunderstorm · 2025-07-16T15:08:52Z

@dhower-qc thanks for the help, I will make the changes as requested.

blazethunderstorm · 2025-07-16T15:29:55Z

@ThinkOpenly @dhower-qc please review.

ThinkOpenly

Thanks again for your efforts. Good code. Comments/questions inline.

ThinkOpenly · 2025-07-16T16:06:55Z

backends/generators/generator.py

+                        if value is None or location is None:
+                            continue
+
+                        if isinstance(location, str) and "-" in location:


You need to handle the case where location does not contain "-". Every location does today, but that is not guaranteed. Without a "-", the field is a single bit.
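
A minimal sketch of a location parser that covers both forms. The helper name parse_location is hypothetical and not part of generator.py. It also converts the location to a string first, because a YAML loader returns a bare number such as 7 as an int, which an isinstance(location, str) check would skip.

```python
def parse_location(location):
    """Return (high_bit, low_bit) for a location such as "31-25", "7", or 7."""
    text = str(location).strip()  # a bare YAML 7 arrives as an int, not a str
    if "-" in text:
        hi, lo = (int(part) for part in text.split("-", 1))
    else:
        hi = lo = int(text)  # single-bit field
    if hi < lo:
        raise ValueError(f"location {location!r} has its high bit below its low bit")
    return hi, lo


print(parse_location("31-25"))  # (31, 25)
print(parse_location(7))        # (7, 7)
```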

ThinkOpenly · 2025-07-16T16:16:39Z

backends/generators/generator.py

+                except Exception as e:
+                    logging.error(f"Failed to construct match string for {name} in {path}: {e}")
+                    encoding_filtered += 1
+                    continue


Do we need this? It would cause errors to be logged but otherwise ignored. We may actually want a crash here.
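
Simply dropping the try/except would already give a crash. If the instruction name and path are still wanted in the traceback, one option is to re-raise with that context instead of logging and continuing. The sketch below assumes a hypothetical wrapper, construct_or_raise, and a hypothetical build callable; neither exists in generator.py.

```python
def construct_or_raise(name, path, build):
    """Run build() and, on failure, re-raise with the instruction name and path."""
    try:
        return build()
    except (KeyError, ValueError) as exc:
        # Chain the original exception so the root cause stays in the traceback.
        raise RuntimeError(
            f"Failed to construct match string for {name} in {path}"
        ) from exc
```

Catching only KeyError and ValueError, rather than Exception, keeps unrelated bugs from being relabeled as encoding problems.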

ThinkOpenly · 2025-07-16T16:17:35Z

backends/generators/generator.py

+                    encoding_filtered += 1
+                    continue
+
+                continue 


Does this have any effect?

ThinkOpenly · 2025-07-16T16:23:59Z

backends/generators/generator.py

@@ -255,14 +283,30 @@ def load_instructions(
                encoding_filtered += 1
                continue

-            match_str = encoding_to_use.get("match")
+            match_str = encoding_to_use.get("match") if encoding_to_use else None
+            mask_str = encoding_to_use.get("mask") if encoding_to_use else None


Do we need to do any work with "mask" here? I thought it was calculated elsewhere based on the "match" string.
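
If the mask is indeed derived from the match string, the derivation would look roughly like the sketch below. The function name match_to_mask_and_value is hypothetical, and this is not a copy of whatever the generator does elsewhere: every position that is not '-' contributes a 1 to the mask, and the match value is the same string with '-' read as 0.

```python
def match_to_mask_and_value(match_str):
    """Derive (mask, match_value) integers from a match string of '0', '1', and '-'."""
    mask = int("".join("0" if ch == "-" else "1" for ch in match_str), 2)
    value = int(match_str.replace("-", "0"), 2)
    return mask, value


mask, value = match_to_mask_and_value("0100000----------111-----0110011")
print(hex(mask), hex(value))  # 0xfe00707f 0x40007033
```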

blazethunderstorm · 2025-07-16T16:26:25Z

@ThinkOpenly I will think about this, get back to you, and make the changes.

fix(gen): support subtype-based encodings in generator

6ee1689

Fixes riscv-software-src#893

blazethunderstorm requested review from dhower-qc and ThinkOpenly as code owners July 15, 2025 23:12

blazethunderstorm changed the title from "fix(gen): support subtype-based encodings in generator" to "fix(gen): support subtype-based encodings in C header generator" Jul 15, 2025

ThinkOpenly requested changes Jul 15, 2025

fix(gen): support subtype-based encodings in C header generator

a57ffa7

Fixes riscv-software-src#893

blazethunderstorm requested a review from ThinkOpenly July 15, 2025 23:19

blazethunderstorm and others added 2 commits July 16, 2025 20:57

fixed the error

15172fd

Merge branch 'main' into fix-gen-subtype-encoding-893

a228ba6

ThinkOpenly requested changes Jul 16, 2025
