@AlexJones0

This PR contains two commits aimed at addressing flaky/inconsistent JTAG behaviour seen in various OpenTitan tests which appeared to originate from resetting and then halting the hart using the RV_DM over JTAG.

The first commit fixes a bug in the pulp_rv_dm where a hart could still be cached as idle/halted after a reset, even though the reset had taken it out of the halted state. Because of this stale cached state, the next halt acknowledgement was never transmitted. The second commit adds a workaround (self-contained within dm.c, with no changes to the RISC-V CPU) for the fact that halt requests sent to unresponsive harts are currently dropped, whereas the RISC-V Debug Specification states that they should take effect as soon as the hart becomes responsive. See the commit messages for more detailed explanations.

Also, as a related question (not planning to address it in this PR): I have noticed in dm.c that an NDM reset resets the entire system, which from tracing appears to include the debug module itself, even though the DM is not supposed to be reset. Is there some logic I'm missing that makes this reset harmless, or could it cause other issues down the line?

Reset the `idle_bm` on a reset of the Pulp RV_DM, as a system / NDM
reset while a hart is halted can cause the hart to resume, whilst the
Pulp RV_DM still thinks the hart is halted (thus it never acknowledges
the halted state, and it never appears in the DM's `dmstatus`).
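
As a rough illustration of the caching problem described above, the
sketch below models it in isolation. It is not the code in dm.c: the
`IdleSketch` type, its two fields and both function names are
hypothetical, and the `clear_idle_bm` flag exists only so that the
behaviour before and after the fix can be compared.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, simplified DM state; the real structure in dm.c differs. */
typedef struct {
    uint32_t idle_bm;   /* bit n set: hart n is cached as idle/halted */
    uint32_t halted_bm; /* bit n set: hart n is reported as halted */
} IdleSketch;

/*
 * Hart `hart` reports that it has halted. Returns true if the halt was
 * newly acknowledged, false if it was skipped because the hart was
 * already cached as idle.
 */
static bool idle_sketch_hart_parked(IdleSketch *dm, unsigned hart)
{
    assert(hart < 32);
    uint32_t bit = UINT32_C(1) << hart;

    if (dm->idle_bm & bit) {
        return false; /* stale cache hit: no acknowledgement */
    }
    dm->idle_bm |= bit;
    dm->halted_bm |= bit;
    return true;
}

/* Reset of the DM state. With clear_idle_bm == false the cache goes stale. */
static void idle_sketch_reset(IdleSketch *dm, bool clear_idle_bm)
{
    dm->halted_bm = 0;
    if (clear_idle_bm) {
        dm->idle_bm = 0; /* the fix: forget the cached idle state */
    }
}
```

In this model, a halt that follows a reset without the clear is
silently skipped, which matches the symptom of the hart never showing
up as halted.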

Signed-off-by: Alex Jones <[email protected]>

This patch works around the Debug Module's current non-conformance with
the RISC-V Debug Specification in how it handles halt requests
(haltreqs) to unresponsive harts (e.g. harts currently in reset).
Several parts of the specification describe the expected behaviour
(see sections 3.2, 3.4, 3.5), and section C.1.3 documents a bug fix
that addresses exactly this case.

The current DM implementation simply ignores incoming halt requests if
the hart is unresponsive. This commit instead latches these requests in
a bitmask, so that when the hart comes out of reset (i.e. starts
executing instructions) the bitmask can be checked and used to halt the
hart immediately.
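
A minimal sketch of the latching step, assuming one bit per hart. The
`HaltreqSketch` type, its field names and `haltreq_sketch_request` are
hypothetical and do not reflect the actual identifiers in dm.c.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical, simplified DM state; names are illustrative only. */
typedef struct {
    uint32_t unavail_bm;         /* bit n set: hart n is unresponsive */
    uint32_t halted_bm;          /* bit n set: hart n is halted */
    uint32_t pending_haltreq_bm; /* bit n set: haltreq latched for hart n */
} HaltreqSketch;

/* Handle a halt request for a single hart. */
static void haltreq_sketch_request(HaltreqSketch *dm, unsigned hart)
{
    assert(hart < 32);
    uint32_t bit = UINT32_C(1) << hart;

    if (dm->unavail_bm & bit) {
        /* Hart is unresponsive: latch the request instead of dropping it. */
        dm->pending_haltreq_bm |= bit;
        return;
    }
    /* Hart is responsive: in this model it halts straight away. */
    dm->halted_bm |= bit;
}
```

The latched bit stays set until something observes that the hart has
become responsive again and applies the halt.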

Unfortunately, a fully correct implementation of this behaviour would
likely require a direct link between the RISC-V CPU and the DM to allow
the hart to be halted at this point. Such links are not currently
supported by QEMU. This commit introduces a reasonable workaround where
we poll the availability of cores on a `dmstatus` read and halt newly
responsive harts with latched haltreqs there, relying on the fact that
most debuggers interacting with the DM will write the haltreq and then
repeatedly poll `dmstatus` to watch the harts halt (see e.g. section
B.3 of the RISC-V Debug specification).
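
The polling step can be sketched roughly as follows. Everything here
is an assumption made for illustration: the `PollSketch` type, the
`in_reset_bm` argument standing in for however hart availability is
actually sampled, and the return value, which is a plain halted
bitmask rather than the real `dmstatus` register layout.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical, simplified DM state; names are illustrative only. */
typedef struct {
    uint32_t unavail_bm;         /* bit n set: hart n was unresponsive */
    uint32_t pending_haltreq_bm; /* bit n set: haltreq latched for hart n */
    uint32_t halted_bm;          /* bit n set: hart n is halted */
} PollSketch;

/*
 * Called on each (modelled) dmstatus read. `in_reset_bm` is the current
 * set of harts still in reset. Harts that are responsive and have a
 * latched haltreq are halted now and their latch is cleared.
 */
static uint32_t poll_sketch_dmstatus_read(PollSketch *dm, uint32_t in_reset_bm)
{
    assert(dm);
    uint32_t to_halt = dm->pending_haltreq_bm & ~in_reset_bm;

    dm->unavail_bm = in_reset_bm;       /* refresh availability */
    dm->halted_bm |= to_halt;           /* apply the latched requests */
    dm->pending_haltreq_bm &= ~to_halt; /* each latch is consumed once */
    return dm->halted_bm;
}
```

A debugger that writes the haltreq and then polls would see the hart
reported as halted on the first read after it leaves reset.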

This is not 100% conformant: if `dmstatus` is not polled for a while,
the hart will execute some instructions, whereas real HW would halt it
immediately, before any guest code runs. It is nonetheless a small
workaround / hack that supports the majority of practical use cases
reasonably well.

Signed-off-by: Alex Jones <[email protected]>