ESP32S3 and ESP32S2 (and potentially others) ROM bug in Cache_Flash_To_SPIRAM_Copy
(IDFGH-14494)
#15263
Open · Label: Status: In Progress (work is in progress)
github-actions bot changed the title from "ESP32S3 and ESP32S2 (and potentially others) ROM bug in Cache_Flash_To_SPIRAM_Copy" to "ESP32S3 and ESP32S2 (and potentially others) ROM bug in Cache_Flash_To_SPIRAM_Copy (IDFGH-14494)" on Jan 22, 2025
EliteTK added a commit to EliteTK/esp-hal that referenced this issue on Jan 23, 2025

This implementation mirrors how the ESP-IDF implementation of this feature (which is based on the `Cache_Flash_To_SPIRAM_Copy` ROM function) works, except it differs in a few key ways:

The ESP-IDF seems to map `.text` and `.rodata` into the first and second 128 cache pages respectively (although looking at the linker scripts, I'm not sure how, but a runtime check confirmed this seemed to be the case). This is reflected in how the `Cache_Count_Flash_Pages` and `Cache_Flash_To_SPIRAM_Copy` ROM functions and the ESP-IDF code executing them work. The count function can only be made to count flash pages within the first 256 pages (of which there are 512 on the ESP32-S3). Likewise, the copy function will only copy flash pages which are mapped within the first 256 entries (across two calls).

As the esp-hal handles mapping `.text` and `.rodata` differently, these ROM functions are technically not appropriate if more than 256 pages of flash (`.text` and `.rodata` combined) are in use by the application.

Additionally, the functions both contain bugs, one of which the IDF attempts to work around incorrectly, and the other which the IDF does not appear to be aware of. Details of these bugs can be found on the IDF issue/PR tracker[0][1].

As a result, this commit contains a heavily modified/adjusted Rust rewrite of the reverse engineered ROM code combined with a vague port of the ESP-IDF code.

There are three additional noteworthy differences from the ESP-IDF version of the code:

1. The ESP-IDF allows the `.text` and `.rodata` segments to be mapped independently and separately, allowing only one to be mapped. The current version of the code does not allow this flexibility. This can be implemented by checking the address of each page entry against the segment locations to determine which segment each address belongs to.

2. The ESP-IDF calls `cache_ll_l1_enable_bus(..., cache_ll_l1_get_bus(..., SOC_EXTRAM_DATA_HIGH, 0));` (functions from the ESP-IDF) in order to "Enable the most high bus, which is used for copying FLASH `.text` to PSRAM", but on the ESP32-S3, after careful inspection, these calls result in a no-op as the address passed to `cache_ll_l1_get_bus` will result in an empty cache bus mask. It's currently unclear to me if this is a bug in the ESP-IDF code, or if this code (which from cursory investigation is probably not a no-op on the -S2) is solely targeting the ESP32-S3.

3. The ESP-IDF calls `Cache_Flash_To_SPIRAM_Copy` with an icache address when copying `.text` and a dcache address when copying `.rodata`. This affects which cache the reads will occur through, but the writes always go through a "spare page" (a name I came up with during reverse engineering) via the dcache. This code performs all reads through the dcache. I don't know if there's a proper reason to read through the correct cache when doing the copy, and this doesn't appear to have any negative impact.

[0]: espressif/esp-idf#15262
[1]: espressif/esp-idf#15263
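The per-segment check suggested in difference 1 could look roughly like the following. This is only a sketch under stated assumptions, not the esp-hal code: `Span`, `Segment`, `classify` and the 64 KiB `PAGE_SIZE` are hypothetical names and values chosen for illustration, segment locations are given as byte offsets from the start of the cache window, and a page that overlaps both segments is arbitrarily attributed to `.text`.

```rust
/// Assumed MMU page size (64 KiB). Hypothetical constant for this sketch.
const PAGE_SIZE: u32 = 64 * 1024;

/// Which segment a cache page belongs to.
#[derive(Debug, PartialEq, Clone, Copy)]
enum Segment {
    Text,
    Rodata,
    Other,
}

/// Half-open range `[start, end)` of byte offsets from the start of the
/// cache window. Hypothetical stand-in for linker-provided segment bounds.
#[derive(Clone, Copy)]
struct Span {
    start: u32,
    end: u32,
}

impl Span {
    /// True if any byte of this (non-empty) span falls inside page `index`.
    fn overlaps_page(&self, index: u32) -> bool {
        let page_start = index * PAGE_SIZE;
        let page_end = page_start + PAGE_SIZE;
        self.start < self.end && self.start < page_end && page_start < self.end
    }
}

/// Decide which segment the page entry at `index` belongs to.
/// `.text` wins if a page happens to overlap both spans.
fn classify(index: u32, text: Span, rodata: Span) -> Segment {
    if text.overlaps_page(index) {
        Segment::Text
    } else if rodata.overlaps_page(index) {
        Segment::Rodata
    } else {
        Segment::Other
    }
}
```

A copy loop could then skip any entry whose classification does not match the segment currently being moved to PSRAM, which would restore the ability to map only one of the two segments.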
espressif-bot added the "Status: In Progress" (work is in progress) label and removed the "Status: Opened" (issue is new) label on Jan 24, 2025
EliteTK added a commit to EliteTK/esp-hal that referenced this issue on Jan 25, 2025
Answers checklist.
General issue report
While working on implementing XiP from PSRAM for the Rust esp-hal project, I reverse engineered the `Cache_Flash_To_SPIRAM_Copy` function from the ESP32S3 ROM (revision 0). I have reproduced my cleaned up version of the function (reverse engineered with the help of Ghidra) below:

As can be seen from the comment and code above, the `start_page` variable is incremented before the `page0_page` in-out parameter is updated. This means that the `start_page` variable will be referring to the PSRAM page immediately after the location of the page which contains flash page 0 data.

This means that only the first page which is mapped to page 0 in flash will be handled correctly and all subsequent pages will end up pointing at a copy of a different flash page or at an uninitialised PSRAM page.
In cases where the page 0 mapping points at an uninitialised PSRAM page, the unused page may even be allocated for other purposes.
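To make the ordering problem concrete, here is a small host-side model of the behaviour described above. It is a sketch based only on this description, not the ROM code: `Entry`, `NO_PAGE0`, `copy_to_psram` and `simulate` are hypothetical names, the MMU table and PSRAM are plain vectors, and the `buggy` flag switches between the increment-then-store order and the store-then-increment order.

```rust
/// One entry of a modelled MMU table (hypothetical type for this sketch).
#[derive(Clone, Copy, Debug, PartialEq)]
enum Entry {
    Invalid,
    Flash(u32), // maps the given flash page
    Psram(u32), // maps the given PSRAM page
}

/// Marker meaning "flash page 0 has not been copied yet" (assumed value).
const NO_PAGE0: u32 = u32::MAX;

/// Model of the copy loop. `psram[i]` records which flash page was copied
/// into PSRAM page `i`. Returns the next unused PSRAM page.
fn copy_to_psram(
    mmu: &mut [Entry],
    psram: &mut Vec<Option<u32>>,
    mut start_page: u32,
    page0_page: &mut u32,
    buggy: bool,
) -> u32 {
    for entry in mmu.iter_mut() {
        let flash_page = match *entry {
            Entry::Flash(p) => p,
            _ => continue,
        };
        // Flash page 0 already copied: reuse the remembered PSRAM page.
        if flash_page == 0 && *page0_page != NO_PAGE0 {
            *entry = Entry::Psram(*page0_page);
            continue;
        }
        // "Copy" the flash page into PSRAM and remap the entry.
        let idx = start_page as usize;
        if psram.len() <= idx {
            psram.resize(idx + 1, None);
        }
        psram[idx] = Some(flash_page);
        *entry = Entry::Psram(start_page);
        if buggy {
            // Order described in this issue: increment, then remember.
            start_page += 1;
            if flash_page == 0 {
                *page0_page = start_page;
            }
        } else {
            // Corrected order: remember, then increment.
            if flash_page == 0 {
                *page0_page = start_page;
            }
            start_page += 1;
        }
    }
    start_page
}

/// Run the model over a table whose entries map the given flash pages.
fn simulate(flash_pages: &[u32], buggy: bool) -> (Vec<Entry>, Vec<Option<u32>>, u32) {
    let mut mmu: Vec<Entry> = flash_pages.iter().map(|&p| Entry::Flash(p)).collect();
    let mut psram = Vec::new();
    let mut page0_page = NO_PAGE0;
    let next = copy_to_psram(&mut mmu, &mut psram, 0, &mut page0_page, buggy);
    (mmu, psram, next)
}
```

In this model, `simulate(&[0, 1, 0], true)` leaves the third entry pointing at the PSRAM page that holds the copy of flash page 1, while `simulate(&[1, 0, 0], true)` leaves it pointing at a PSRAM page past the returned next-free page, which was never written. With `buggy` set to `false`, both cases map the repeated entry to the page that actually holds flash page 0.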
I believe a fix for this may be to re-implement the functionality of this function entirely in the IDF or to ensure that this function is never called if there is more than 1 flash page 0 mapping.
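The second option, refusing to call the ROM function when flash page 0 is mapped more than once, could be guarded by a check along these lines. This is a minimal sketch, assuming the caller can enumerate the MMU entries as values of a hypothetical `Entry` type; `count_page0_mappings` and `rom_copy_is_safe` are illustrative names, not ESP-IDF or esp-hal APIs.

```rust
/// One entry of a modelled MMU table (hypothetical type for this sketch).
#[derive(Clone, Copy, Debug, PartialEq)]
enum Entry {
    Invalid,
    Flash(u32), // maps the given flash page
}

/// Count how many entries map flash page 0.
fn count_page0_mappings(mmu: &[Entry]) -> usize {
    mmu.iter()
        .filter(|e| matches!(**e, Entry::Flash(0)))
        .count()
}

/// The ordering bug only matters once a second page 0 mapping is seen,
/// so a table with at most one such mapping avoids it.
fn rom_copy_is_safe(mmu: &[Entry]) -> bool {
    count_page0_mappings(mmu) <= 1
}
```

A caller would fall back to its own copy routine, or fail early, whenever `rom_copy_is_safe` returns `false`.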
I confirmed this bug in a bare-metal environment by mapping flash page 0 multiple times and calling this function.