debug.superh.aud: Add SuperH AUD-II read support #985

pd0wm · 2025-08-01T08:56:35Z

AUD-II is an interesting protocol as it seems to be present on all SH-2A SuperH chips. Unlike JTAG (H-UDI) it looks like it cannot be turned off or password protected. The protocol is meant for real time tracing of data and branches, however it also has a second mode: "RAM Monitoring Mode". This is a simple mode where you can peek/poke at memory in 1, 2 or 4 byte chunks.

This applet only implements the "Read" command of the "RAM Monitoring Mode", as I was using it dump the flash of one of these SuperH chips from an automotive part.

Dumping 512kB takes about 8 minutes. I think most of the time is wasted due to USB round-trips waiting for the data to be ready and reading out each nibble individually. This seems to take about 250us per nibble. I could move this logic to the FPGA side, and return 4 bytes at once. However, not sure if that complexity is worth it.

This is my first time writing a glasgow applet, so feel free to let me know if there are things I could have done better or more efficient. Also happy to remove some comments if you think they are superfluous. Let me know if it's in the right place in the debugmodule, it's a debug/tracing protocol but I'm currently only using it to read memory.

PR is in Draft until I confirm that the docs still build properly.

Logic Analyzer trace of reading 4 bytes:

Reference from datasheet:

pd0wm · 2025-08-04T10:03:19Z

Confirmed the docs build locally and the new applet shows up. Let me know what you think!

whitequark · 2025-08-04T10:20:39Z

I'll try to do a review today; was busy with other things, sorry!

whitequark

Although I wrote over 2100 words of review comments, please don't take this to mean that I dislike your PR. On the contrary, I'm incredibly impressed: this is one of the nicest PRs from a first-time contributor I've seen for Glasgow. You've clearly taken the time to examine at existing code, understand its intent, and replicate it in your applet, even though there's basically no documentation, and without being explicitly directed by a reviewer. I would be very happy to have you as a repeat contributor, which is why I spent a few hours making sure we're on the same page going forward.

Dumping 512kB takes about 8 minutes. I think most of the time is wasted due to USB round-trips waiting for the data to be ready and reading out each nibble individually. This seems to take about 250us per nibble. I could move this logic to the FPGA side, and return 4 bytes at once. However, not sure if that complexity is worth it.

I'd say that the complexity is worth not having to dump half a megabyte over a slower-than-9600-baud interface alone (or maybe I just lack patience). Impatience aside, being able to do things like wait state polling on the FPGA is how Glasgow ends up being so much nicer than other tools, and is what the FPGA is there for. (Our probe-rs applet is one of the fastest, possibly the fastest, supported SWD debug probe, usually with a good margin too. This kind of thing is what I want the project to be known for in general.)

It is true that there is some increase in complexity due to having to replicate the nibble read state machine and its timeout handling, but you could manage that by refactoring the code a little, which I explain further in the comments below.

whitequark · 2025-08-04T22:50:40Z

docs/manual/src/applets/debug/index.rst

@@ -9,3 +9,4 @@ MCU debugging
    :maxdepth: 3

    arm7
+    superh


I think the appropriate taxon is program even though it is technically using an interface with "Debug" in its name:

The program taxon groups applets implementing interfaces to memory technology devices (volatile and non-volatile) that are directly connected to programmable logic, such as a microcontroller or a gate array.

I tried to figure out how you would make a debugger using AUD-II and it's not clear to me that you can. The built-in modes are tracing and arbitrary read/write. You have a breakpoint module (UBC) that seems programmable via monitoring mode which could support single-stepping and breakpoints. There is however no way to get an arbitrary CPU register via AUD-II (or execute an arbitrary instruction, or even just get the value of R15), so you'd need help from the firmware: an ISR that dumps the CPU context into some predefined RAM area that is used for communicating with the debugger. Even then, UBC has no way to tell which breakpoint has fired on any particular cycle, so even though it supports data breakpoints you can't use it for watchpoints: you wouldn't know whether a breakpoint or a watchpoint has fired, which breaks single-stepping. You could work around that by adding an SH-2A emulator and an architecture model and evaluating every configured breakpoint/watchpoint against the post-breakpoint architectural state, if you wanted to create an abomination and had a vast excess of free time on your hands.

I think it's safe to say nobody will ever contribute a debugger for this chip.

whitequark · 2025-08-04T22:55:35Z

software/glasgow/applet/debug/superh/aud/__init__.py

+from amaranth.lib.wiring import In, Out
+
+from glasgow.abstract import AbstractAssembly, GlasgowPin
+from glasgow.applet import GlasgowAppletV2


We leave two blank lines between top-level entities (like class or class and import block).

I really should set up automation for this; I'll actually do this later via https://pre-commit.ci.

whitequark · 2025-08-04T22:57:55Z

software/glasgow/applet/debug/superh/aud/__init__.py

+        data_pins = []
+        for i, pin in enumerate(self._ports.audata):
+            m.submodules[f"audata{i}_buffer"] = pin = io.Buffer("io", pin)
+            data_pins.append(pin)


Unless you need individual OE control, you don't need this: all Amaranth buffers are multi-pin capable, with shared OE.

whitequark · 2025-08-04T23:12:32Z

software/glasgow/applet/debug/superh/aud/__init__.py

+        audata_o = Signal(4)
+        audata_oe = Signal(4)
+        audata_i  = Signal(4)
+        audsync = Signal()
+        audrst = Signal()
+        audck = Signal()
+
+        # Connect the IO buffers to the signals
+        m.d.comb += m.submodules.audmd_buffer.o.eq(1) # Always in RAM monitoring mode
+        m.d.comb += m.submodules.audrst_buffer.o.eq(audrst)
+        m.d.comb += m.submodules.audck_buffer.o.eq(audck)
+        m.d.comb += m.submodules.audsync_buffer.o.eq(audsync)
+        m.d.comb += [
+            Cat(pin.oe for pin in data_pins).eq(audata_oe),
+            Cat(pin.o for pin in data_pins).eq(audata_o),
+        ]
+        m.submodules += cdc.FFSynchronizer(Cat(pin.i for pin in data_pins), audata_i)


Although this isn't a wrong way to implement this part of the applet, I would suggest the following as more compact and more idiomatic:

m.submodules.audsync_buffer = audsync_buffer = \ io.Buffer("o", self._ports.audsync) # ... with m.FSM(): # ... m.d.sync += audsync_buffer.o.eq(1)

There are two general styles in use in the codebase:

With a *Bus class: in this style, an additional component called (e.g. in this case) AudBus would have audata_o: In(4); audata_i: Out(4); audsync: In(1); ... as its signature members. The *Bus class contains little to no logic, and is used immediately in some other component, which generates strobes, sequences, and so on.

Without a *Bus class: in this style:
a. Typically you would either use IOStreamer with a Enframer+Deframer+Controller architecture, or
b. just an FSM for very simple protocols without high performance demands.

This applet looks like a mixture of (1) and (2b). I think we may actually have some applets like this in the codebase because it's sometimes impractical to fully refactor a complete but obscure applet when migrating to a new API model, but ideally we shouldn't. I think (2b) is probably the most appropriate here, but the code you have is also not wrong, just slightly unidiomatic.

(But also, see the other suggestions, which may supersede the ones in this comment.)

whitequark · 2025-08-04T23:12:58Z

software/glasgow/applet/debug/superh/aud/__init__.py

+                with m.If(self.i_stream.valid):
+                    m.d.comb += self.i_stream.ready.eq(1)
+                    m.d.sync += data.eq(self.i_stream.payload >> 4)
+                    with m.If((self.i_stream.payload & 0xF) == AUDCommand.Reset):


You could reduce repetition here by using

with m.Switch(self.i_stream.payload & 0xF): with m.Case(AUDCommand.Reset):

whitequark · 2025-08-05T00:18:34Z

software/glasgow/applet/debug/superh/aud/__init__.py

+
+        self.logger.info("Reading data")
+        bs = 4
+        with open(args.output, 'wb') as f:


You should use type=argparse.FileType('wb') for add_argument(), which causes argparse itself to ensure that the filename can be opened. It also saves you having to manage this resource yourself, which can get annoying with more complex applets.

whitequark · 2025-08-05T00:24:14Z

software/glasgow/applet/debug/superh/aud/__init__.py

+    def add_build_arguments(cls, parser, access):
+        access.add_voltage_argument(parser)
+
+        access.add_pins_argument(parser, "audata", width=4, required=True)


In applets like this one, I suggest using required=True, default=True to avoid having to specify the pinout each time. This way, if you connect one of the harnesses to your DUT, you don't need to guess which pinout it is: it is the default one.

The order of the arguments that applets use is:

reset(s)

strap(s)

clock(s)

controls

data

This ordering is done per functional block, e.g. for RGMII you would have PHY reset, then RX control/data, then TX control/data (inputs before outputs).

For this applet, I suggest:

AUDRST#

AUDMD

AUDCK

AUDSYNC

AUDATA

whitequark · 2025-08-05T00:25:40Z

software/glasgow/applet/debug/superh/aud/test.py

+
+    @synthesis_test
+    def test_build(self):
+        self.assertBuilds(self.hardware_args)


Please add a replay-hardware-in-loop test using @applet_v2_hardware_test(mocks=["aud_iface"]. I don't have the (fairly esoteric) hardware you're using, so if I'm doing a codebase-wide refactoring, I would have little ability to check if I broke it.

The way this decorator works is that the first time you run the test, it talks to your hardware and records the function calls done on the mocked class. The second time you run the test, it ensures the same arguments are passed, and provides the same return values. Provided no actual algorithms change, this gives a quick and easy way to validate that the applet is still functional, and permit a (limited) degree of refactorings, avoiding fossilization of the codebase.

whitequark · 2025-08-05T00:41:11Z

software/glasgow/applet/debug/superh/aud/__init__.py

+                with m.If(self.o_stream.ready):
+                    m.next = "RECV-COMMAND"
+
+        return m


Some general comments on the AUDComponent:

You're using all of the pins synchronously; also, the bus itself is fully synchronous except for AUDRST# (and AUDMD). I think you can use io.FFBuffer and drop FFSynchronizer (since AUDATA isn't asynchronous to your sampling clock).

This protocol includes wait states ("Not Ready" output). Polling for the end of a wait state in software is one of the worst inefficiencies a tool like Glasgow could include. I think you should, at the very least, have a command that toggles AUDCK and polls AUDATA until it becomes non-zero and then returns that value to the host. The timeout would become an input to the component, which is tied to a register in the AUDInterface class. If you implement polling in gateware, you will be able to submit a sequence of reads as a single bulk operation, without spending USB roundtrips on bus synchronization.

I took a careful look at the entire §21 and it appears that both the tracing mode and the monitor mode were designed to allow a byte-oriented implementation: for tracing, last nibble of CMD1 and first nibble of CMD2 form a header byte, and for monitoring, command/direction nibble and zeroes form a header byte. This means that an implementation using octet-based communication with the host is feasible, provided that wait state polling is implemented in hardware. Being able to use fast primitives like bytearray concatenation, int.{to,from}_bytes, etc without having to pack nibbles in Python is very good for performance; this applet deals with enough data that it starts to really matter. If you implement byte-packing in hardware, you will be able to retrieve the result of a sequence of reads as a single await self._pipe.read(size * count) operation, which is dramatically faster for large quantities of data.

I suggest at least making a small stream-based component that takes two commands: "read nibble with given sync" and "write nibble with given sync", and encapsulates toggling the clock. The AUDComponent will then only take care of command handling, byte-packing, as well as wait state polling.

Since it is so similar to reads and quite easy to implement and test, I think it would be nice if you added write support for completeness. Also, writes allow you to set up a predictable state in RAM for hardware-in-the-loop testing (so you don't have to modify your assertions if you have a different firmware image).

Of these suggestions, (1) and (2) are I'd say required for a high-quality implementation, while (3), (4), and (5) would be nice but I would not reject the applet if you choose not to implement them.

whitequark · 2025-08-05T01:34:47Z

software/glasgow/applet/debug/superh/aud/__init__.py

+        await self.aud_iface.init()
+
+        self.logger.info("Reading data")
+        bs = 4


Since you do not ensure that the address provided by the user is longword-aligned, this would have to be bs = 1, or you'll end up with bus errors. It's also fine to enforce that the address is aligned in def address(x):.

I assume there are some registers that can only be accessed in certain modes (byte/word/longword). This doesn't need to be surfaced in the CLI since the vast majority of people won't use it and those who do can use the AUDInterface class API.

pd0wm requested a review from whitequark as a code owner August 1, 2025 08:56

pd0wm mentioned this pull request Aug 1, 2025

Add G00106: SuperH AUD-II protocol GlasgowEmbedded/archive#17

Merged

pd0wm marked this pull request as draft August 1, 2025 08:57

pd0wm force-pushed the superh-aud branch from 96e295f to 9069581 Compare August 1, 2025 10:03

debug.superh.aud: Add SuperH AUD-II read support

5f658b5

pd0wm force-pushed the superh-aud branch from 9069581 to 5f658b5 Compare August 4, 2025 10:02

pd0wm marked this pull request as ready for review August 4, 2025 10:02

pd0wm added 2 commits August 4, 2025 13:30

debug.superh.aud: add assertBuilds test case

c9477ca

debug.superh.aud: split arguments into build/run

c3a1f37

whitequark requested changes Aug 5, 2025

View reviewed changes

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

debug.superh.aud: Add SuperH AUD-II read support #985

debug.superh.aud: Add SuperH AUD-II read support #985

pd0wm commented Aug 1, 2025 •

edited

Loading

Uh oh!

pd0wm commented Aug 4, 2025

Uh oh!

whitequark commented Aug 4, 2025

Uh oh!

whitequark left a comment •

edited

Loading

Uh oh!

whitequark Aug 4, 2025 •

edited

Loading

Uh oh!

whitequark Aug 4, 2025

Uh oh!

whitequark Aug 4, 2025

Uh oh!

whitequark Aug 4, 2025

Uh oh!

whitequark Aug 4, 2025

Uh oh!

whitequark Aug 5, 2025

Uh oh!

whitequark Aug 5, 2025

Uh oh!

whitequark Aug 5, 2025

Uh oh!

whitequark Aug 5, 2025 •

edited

Loading

Uh oh!

whitequark Aug 5, 2025

Uh oh!

Uh oh!

debug.superh.aud: Add SuperH AUD-II read support #985

Are you sure you want to change the base?

debug.superh.aud: Add SuperH AUD-II read support #985

Conversation

pd0wm commented Aug 1, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

pd0wm commented Aug 4, 2025

Uh oh!

whitequark commented Aug 4, 2025

Uh oh!

whitequark left a comment • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

whitequark Aug 4, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

whitequark Aug 5, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Uh oh!

pd0wm commented Aug 1, 2025 •

edited

Loading

whitequark left a comment •

edited

Loading

whitequark Aug 4, 2025 •

edited

Loading

whitequark Aug 5, 2025 •

edited

Loading