Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Error code framework #17719

Open
wants to merge 1 commit into
base: master
Choose a base branch
from
Open

Error code framework #17719

wants to merge 1 commit into from

Conversation

zoltanvb
Copy link
Contributor

@zoltanvb zoltanvb commented Mar 21, 2025

Description

As one of the feedbacks from the RetroArch UI Improvements series put together by Nic Watt, the ways RetroArch (or any other frontend) could indicate errors about the cores, is quite limited. The Libretro API functions provide little feedback opportunity for the cores to indicate problems.

By introducing a new environment call, the possibility will be available across the board.

Error codes and messages

Error code is a 32-bit uint. Upper 16 bits are for the frontend, lower 16 bits are reserved for the core. Core codes are not planned to be standardized.

The error code enum list is definitely not complete, and not even mandatory, but by defining a few error categories, the way is open for a textual feedback to be given, even localized. In the future, frontend could also use error codes for purposes not related to cores.

Specific error codes have associated messages. Displayed message priority in the current implementation:

  • if core sends a specific message, it will be displayed
  • if message from core is empty, message belonging to the specific error code is displayed (masking core bits)
  • if no such message, message belonging to the code category is displayed
  • if not even category is recognized, a generic error message is displayed

Environment call

The environment call is not special in any way. It stores one (and only one) error code and a message, that is zeroed before each Libretro API call going to the core. If the error code is set during any such call (run, serialize, reset...), it will be logged and displayed as a widget, according to the severity:

  • error, when the action is clearly unsuccessful based on return value from the core (such as failure of retro_load_game)
  • warning, currently not used
  • log, default
  • debug, currently only for the environment call itself, and it is not shown as message

If there are multiple codes set during one API call, the last one will be reported.

Core side

Example is added to Remote Retropad, can be triggered by trying to set input device type.

Possible future improvements

  • show errors as sort of modal dialog, if it is fatal (i.e. core is terminated) - probably reusing the help text display, but it needs more investigation
  • add customizing options to suppress messages for certain error levels
  • more error codes

@schellingb
Copy link
Contributor

schellingb commented Mar 21, 2025

I like the approach, but I think it should be better defined. The defined error codes should be tied to specific functions. Can retro_serialize really return RETROE_UNSUPPORTED_CONTENT_FORMAT? What would that mean?

So if the error codes were tied to specific functions, the frontend wouldn't even have to do the whole setting to 0 and checking of the error code for calls that don't have error codes defined for them. And the usage would be more clear.

I like that thought is being put into passing an error code the other way (from the frontend to the core) but it seems a bit vague. I assume there would need to be a RETRO_ENVIRONMENT_GET_ERROR_CODE? I think it's fine to use the same enum for error codes in both directions. But is the strict split at 0xFFFF needed?

In my mind it should just be a simple list of error codes, each strictly tied to one or more functions that either the core or the frontend does into the other side. That means it can be one of the retro_ functions, an environment call, or any of the interface callbacks where an interface struct is given to the frontend by the core (such callbacks go both ways, too).

The error code enum list is definitely not complete, and not even mandatory, but by defining a few error categories, the way is open for a textual feedback to be given, even localized. Frontend could also use error codes for purposes not related to cores.

With that being said, I don't think a frontend or core should be allowed to add its own error codes like proposed here. What if the next libretro.h defines a new error code that just happens to match one some frontend or core ended up using on its own? It would be much easier to not allow that. A frontend can very well keep its own custom error states in its own data. And if we think this should be an option for custom cores interacting with specific custom frontends, then maybe that is where a fixed range could be used instead, i.e. a promise that libretro.h will never define codes in that range.

@zoltanvb
Copy link
Contributor Author

Thanks for the thoughts!

I like the approach, but I think it should be better defined. The defined error codes should be tied to specific functions. Can retro_serialize really return RETROE_UNSUPPORTED_CONTENT_FORMAT? What would that mean?

So if the error codes were tied to specific functions, the frontend wouldn't even have to do the whole setting to 0 and checking of the error code for calls that don't have error codes defined for them. And the usage would be more clear.

This is reasonable, but would introduce more complexity. I was considering reserving part of the error code to indicate the function that was being invoked, but then considered it unnecessary, as frontend knows already what was invoked.

But, there is a catch: the core is not required to call the environment callback from the same thread that was invoked. So the code could set this environment variable any time, if some threads were spawned. (Which also means that zeroing the error code before calling is potentially losing data, hmm.)

The clean solution would be a return value in all cases, but that is an API breakage.

I like that thought is being put into passing an error code the other way (from the frontend to the core) but it seems a bit vague. I assume there would need to be a RETRO_ENVIRONMENT_GET_ERROR_CODE? I think it's fine to use the same enum for error codes in both directions. But is the strict split at 0xFFFF needed?

Uhh, I think I was unclear on that. The starting point was: https://youtu.be/yhlncKDNBeg?list=PLE-F1WiiFWN_l4JqehPtr7FG3sl2Sk4cY&t=1543 (I did not comment the slide deck in detail, some technical aspects of RA / libretro are not quite like as in the presentation.)
So, the aim is just to have an error code presented for the user, where first (hex) digits are frontend-specific, last (hex) digits are core specific. Frontend can show both, and if provided, it can help the user identify the root cause, or at least help debugging efforts. Technically, error code can be just substituted with logging. Restricting errors for the codes just makes it better presentable.
As for passing error codes from the frontend to the core: I think the libretro API is clear enough for unsupported or error situations in that direction, so I did not plan that, but it could be added.

In my mind it should just be a simple list of error codes, each strictly tied to one or more functions that either the core or the frontend does into the other side. That means it can be one of the retro_ functions, an environment call, or any of the interface callbacks where an interface struct is given to the frontend by the core (such callbacks go both ways, too).

It makes the code range a bit more fragmented, but maybe it can be formulated like that. If I had to guess, >50% of error situations happens in retro_init (which I may have omitted from this PR...) or load_content, a good deal more in retro_run, and <10% for all others (serialize, unserialize, cheats...)

With that being said, I don't think a frontend or core should be allowed to add its own error codes like proposed here. What if the next libretro.h defines a new error code that just happens to match one some frontend or core ended up using on its own? It would be much easier to not allow that. A frontend can very well keep its own custom error states in its own data. And if we think this should be an option for custom cores interacting with specific custom frontends, then maybe that is where a fixed range could be used instead, i.e. a promise that libretro.h will never define codes in that range.

Core specific parts are the lower 32 bits, and it is entirely up to the core, not coordinated. Good point about different frontends, probably frontend specific ranges could also be reserved in addition to common ones.

@schellingb
Copy link
Contributor

Gotcha, sorry about my misunderstanding where I thought that this is also supposed to be for a core to learn about errors in the frontend. Just core to frontend makes more sense.

But, there is a catch: the core is not required to call the environment callback from the same thread that was invoked.

As far as I'm aware the libretro API gives near zero warranties towards using any API anywhere but on the main thread, so I don't think this needs much consideration for this API now.

So, the aim is just to have an error code presented for the user, where first (hex) digits are frontend-specific, last (hex) digits are core specific. Frontend can show both, and if provided, it can help the user identify the root cause, or at least help debugging efforts. Technically, error code can be just substituted with logging. Restricting errors for the codes just makes it better presentable.

This proposal is starting to loose my support 😅

Who the heck likes to get error codes thrown at their face? If you're a user, anything is better than an error code. I'd take a crude abbreviation or text in a language I can't even read over a numbered code any day! I thought the plan for this was to present nice error messages in a case where a core doesn't want to go the extra mile to use RETRO_ENVIRONMENT_SET_MESSAGE_EXT to show a custom warning or error. As a maintainer of a core myself, I'd love a big list of things my code could pass with RETRO_ENVIRONMENT_SET_ERROR_CODE which is then translated to a clear and readable (localized!) error message to the user. A code is worse for everyone, worse for users, worse for the few people out there spending their free time to help users in need, and worse for core maintainers.

Technically, error code can be just substituted with logging.

What if there was a way for the core to pass both a code (which should be from a pre-defined list) and a text message, then the frontend could show the localized generic error message for that code and a "[ More Info ]" button which would show the text message. Or the frontend could FIFO buffer the last few log messages and present them to the user.

I really think the focus should be on providing information where a user ends up in the best position to help themselves. This isn't some big corporation software with a well staffed support team or some consumer product too scared of showing any technical details.

@zoltanvb
Copy link
Contributor Author

Good thoughts, exactly why I wanted to open a discussion. I think having a list of code ranges and associated frontend explanations is possible, it does not need to be restricted to a pure numerical value. But it will not cover each case, and a code is searchable, giving a better opportunity for self-support (via a favorite search engine) and that information could be more detailed and more up-to-date than what we can hardcode. (It can also be completely wrong, ofc.)

Providing extra text along with the error code: hmm, probably doable (not localized, though). But the error presentation that is achievable with little effort, would be similar to the menu help screen, which is currently limited to 512 characters or thereabouts, so it would not be that much more information.

@zoltanvb zoltanvb force-pushed the error_codes branch 2 times, most recently from 6e1a0b0 to 02a7574 Compare March 29, 2025 18:40
@zoltanvb
Copy link
Contributor Author

Updated a bit, message can be added now, and all core calls should have the code check, luckily it is all through runloop.c. Still log-only, I plan to add at least widget message before proposing merge.

An example is implemented in Remote Retropad which can be triggered by setting input device type.

[DEBUG] [Error code]: Code received in retro_environment_set: 0004-0805 "Device type change is not supported!".
[INFO] [Error code]: Code received in core_set_controller_port_device: 0004-0805 "Device type change is not supported!".

@zoltanvb
Copy link
Contributor Author

zoltanvb commented Apr 1, 2025

Alright, so in the current format it can actually do something meaningful. A few more parts added to Remote Retropad, more codes can be tested by either enabling the code behind if (false), or setting the serialize_size to 1 and trying to load-save states.

Leaving it in draft for 1-2 days, if no further comments, I plan to set it in ready and from my point of view it can go forward.

As one of the feedbacks from the RetroArch UI Improvements
series put together by Nic Watt, the ways RetroArch (or any
other frontend) could indicate errors about the cores, is quite
limited. The Libretro API functions provide little feedback
opportunity for the cores to indicate problems.

By introducing a new environment call, the possibility will be
available for the cores to indicate errors at any point, using
a code that contains a defined part and a core specific part,
as well as a short textual message.

RetroArch will display such errors as on-screen messages (widgets),
with the appropriate message class. Localized explanation of the
error codes can be maintained.

Example for the core side added to Remote Retropad, can be seen
by trying to change input device type.
@zoltanvb
Copy link
Contributor Author

zoltanvb commented Apr 2, 2025

Rebased and updated first comment, from my side it is ready.

@zoltanvb zoltanvb marked this pull request as ready for review April 2, 2025 05:13
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants