-
Notifications
You must be signed in to change notification settings - Fork 516
[XRT-SMI] open archive update #9517
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Conversation
Signed-off-by: Akshay Tondak <[email protected]>
|
clang-tidy review says "All clean, LGTM! 👍" |
| } catch (const std::exception& /*e*/) { | ||
| // Continue without archive - this is not a fatal error | ||
| } catch (const std::exception& e) { | ||
| std::cerr << "Error opening archive: " << e.what() << std::endl; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We need to expand on this error and direct the user what to do. The code (platform_path) already is checks a few locations, one being custom to the current XRT build
$HOME/.local/share/xrt/<major>.<minor>.<patch>/amdxdna/bins
I suggest, but I am not sure what is best, that we make the catch Linux specific and direct the user to run an install script from VTD to copy the proper archive to to this version specific location. Let discuss.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For reference, based on our offline discussion, we will ship a Linux specific bash script with the npu deb package which this error should point to. The script will run a "wget" to download correct version of archive from the VTD repository based on the installed version of XRT
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes. Let's do what you suggested. Can you add a script under core/tools/xbutil2 that can install the proper archive under $HOME/.local/share/xrt/<major>.<minor>.<patch>/amdxdna/bins? I will sort out where we install this script and then we can amend the message in the catch statement accordingly. I think you need some code in XBUtilities to compute the proper arguments to the script, like XRT major.minor.patch from a compile time define from CMake and the base archive name?
… in xrt.ini for shim tile (Xilinx#9515)
* add preemption and gemm test for aie4 Signed-off-by: AShivangi <[email protected]> * fix lint errors Signed-off-by: AShivangi <[email protected]> --------- Signed-off-by: AShivangi <[email protected]>
Signed-off-by: Takasi, Manoj <[email protected]>
* Event trace updates Signed-off-by: Akshay Tondak <[email protected]> * build fix and clang updates Signed-off-by: Akshay Tondak <[email protected]> * Re add sudo requirement Signed-off-by: Akshay Tondak <[email protected]> * update error handling Signed-off-by: Akshay Tondak <[email protected]> * Removing sudo requirememt and remove redundant try catch Signed-off-by: Akshay Tondak <[email protected]> --------- Signed-off-by: Akshay Tondak <[email protected]> Co-authored-by: Akshay Tondak <[email protected]>
Signed-off-by: Takasi, Manoj <[email protected]>
When executing a vector of runs or a runlist, it is the responsibility of the caller to make sure the runlist is idle. The profile's main execution routine for iterations inserts the necessary wait in between iterations of the underlying runlist. Signed-off-by: Soren Soe <[email protected]>
…ng client emulation (Xilinx#9522)
Added --queue-limit (-l) to control max number of pending jobs in xrt-runner queue when running in script mode. ``` xrt-runner --help usage: xrt-runner.exe [options] [--recipe <recipe.json>] recipe file to run [--profile <profile.json>] execution profile [--iterations <number>] override all profile iterations [--script <script>] runner script, enables multi-threaded execution [--threads <number>] number of threads to use when running script (default: #jobs) [--queue-limit <number>] max jobs in job queue running script (default: threads or jobs) [--dir <path>] directory containing artifacts (default: current dir) [--mode <latency|throughput|validate>] execute only specified mode (default: all) [--verbose <val>] set XRT verbosity level to specified value (default: 0) [--progress] show progress (same as --verbose 6) [--report [<file>]] output runner metrics to <file> or use stdout for no <file> or '-' % xrt-runner.exe --recipe recipe.json --profile profile.json [--iterations <num>] [--dir <path>] % xrt-runner.exe --script runner.json [--threads <num>] [--iterations <num>] [--dir <path>] Note, [--threads <number>] overrides the default number, where default is the number of jobs in the runner script. Note, [--queue-limit <num>] sets the maximum number of pending jobs when running in script mode. If not set, the max number is set to 2 * number of threads used. Note, [--iterations <num>] overrides iterations in profile.json, but not in runner script. If the runner script specifies iterations for a recipe/profile pair, then this value is sticky for that recipe/profile pair. Note, [--mode <latency|throughput|validate>] filters execution sections in profile.json such only specified modes are executed. If the runner script specifies a mode for a recipe/profile pair, then this value is sticky for that recipe/profile pair. ``` Signed-off-by: Soren Soe <[email protected]>
* xbtracer: Link against abseil for protobuf 22+ Protobuf 22+ uses abseil for logging internally. On systems with newer protobuf (e.g., Arch Linux with protobuf 33.0), the xbtracer tools fail to link with undefined references to absl::log_internal symbols. This fix: - Detects protobuf >= 4.0 (corresponds to protobuf 22+ with new versioning) - Finds abseil and links xbtracer targets against absl::log_internal_check_op and absl::log_internal_message when needed Signed-off-by: Kashif Rasul <[email protected]> * Remove seemingly redundant explicit link with ${XBTRACER_ABSL_LIBS} Signed-off-by: Soren Soe <[email protected]> --------- Signed-off-by: Kashif Rasul <[email protected]> Signed-off-by: Soren Soe <[email protected]> Co-authored-by: Soren Soe <[email protected]>
Signed-off-by: Akshay Tondak <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
clang-tidy made some suggestions
Signed-off-by: Akshay Tondak <[email protected]>
|
clang-tidy review says "All clean, LGTM! 👍" |
Signed-off-by: Akshay Tondak <[email protected]>
|
clang-tidy review says "All clean, LGTM! 👍" |
stsoe
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good. One suggestion related to inclusion of version.h.
| #include "common/sysinfo.h" | ||
| #include "common/smi.h" | ||
| #include "common/module_loader.h" | ||
| #include "xrt/detail/version.h" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I like how you are using xrt_build_version, but the problem with version.h is that it regenerates with different timestamp every time we run cmake, this breaks ccache. We have probably want to move xrt_build_version from version.h.in to version-slim.h.in and then have your code include version-slim.h. Can you try that?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've moved it to vesion-slim.h.ln. I see the incremental builds and not showing cmake prints anymore. Should've noticed that earlier
Signed-off-by: Akshay Tondak <[email protected]>
|
clang-tidy review says "All clean, LGTM! 👍" |
Signed-off-by: Akshay Tondak <[email protected]>
|
clang-tidy review says "All clean, LGTM! 👍" |
|
clang-tidy review says "All clean, LGTM! 👍" |
Problem solved by the commit
This PR adds correct error handling when archive for xrt-smi validate is not found. There are following cases :
Bug / issue (if any) fixed, which PR introduced the bug, how it was discovered
None. Required for correct error handling when upstreaming the XRT package
How problem was solved, alternative solutions (if any) and why they were rejected
Solved via correct error handling
Risks (if any) associated the changes in the commit
None
What has been tested and how, request additional testing if necessary
Tested on linux after manually deleting the installed xrt_smi_strix.a archive. Following is the error seen when the archive is not found :
Documentation impact (if any)
None