Commit
wzx committed Jun 12, 2024
0 parents, commit 55ed583
Showing 290 changed files with 71,175 additions and 0 deletions.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,56 @@ | ||
name: LIBKINETOCI | ||
|
||
on: | ||
push: | ||
branches: | ||
- main | ||
pull_request: | ||
branches: | ||
- main | ||
|
||
jobs: | ||
build: | ||
runs-on: ${{ matrix.os }} | ||
strategy: | ||
matrix: | ||
os: [ubuntu-latest] | ||
|
||
steps: | ||
- uses: actions/checkout@v2 | ||
- name: Checkout submodules | ||
shell: bash | ||
run: | | ||
auth_header="$(git config --local --get http.https://github.com/.extraheader)" | ||
git submodule sync --recursive | ||
git -c "http.extraheader=$auth_header" -c protocol.version=2 submodule update --init --force --recursive --depth=1 | ||
- name: Get env vars | ||
run: | | ||
echo GITHUB_WORKFLOW = $GITHUB_WORKFLOW | ||
echo HOME = $HOME | ||
echo GITHUB_ACTION = $GITHUB_ACTION | ||
echo GITHUB_ACTIONS = $GITHUB_ACTIONS | ||
echo GITHUB_REPOSITORY = $GITHUB_REPOSITORY | ||
echo GITHUB_EVENT_NAME = $GITHUB_EVENT_NAME | ||
echo GITHUB_EVENT_PATH = $GITHUB_EVENT_PATH | ||
echo GITHUB_WORKSPACE = $GITHUB_WORKSPACE | ||
echo GITHUB_SHA = $GITHUB_SHA | ||
echo GITHUB_REF = $GITHUB_REF | ||
c++ --verbose | ||
# TODO: Figure out how to install mupti headers T84637671 | ||
- name: Build static lib | ||
run: | | ||
set -e | ||
mkdir build_static | ||
cd build_static | ||
cmake -DKINETO_LIBRARY_TYPE=static ../libkineto/ | ||
make -j | ||
- name: Build shared lib | ||
run: | | ||
set -e | ||
mkdir build_shared | ||
cd build_shared | ||
cmake -DKINETO_LIBRARY_TYPE=shared ../libkineto/ | ||
make -j |
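The two build steps above differ only in the value passed to `KINETO_LIBRARY_TYPE`. As a minimal sketch for reproducing them locally, assuming `cmake` and `make` are on the `PATH` and the script runs from the repository root, the commands can be generated from one helper. The names `cmake_build_commands` and `run_builds` are hypothetical and not part of the repository.

```python
import subprocess
from pathlib import Path


def cmake_build_commands(library_type, source_dir="libkineto", jobs=None):
    """Return (build_dir, commands) mirroring one CI build step.

    Hypothetical helper: only the directory names and the
    KINETO_LIBRARY_TYPE flag are taken from the workflow above.
    """
    if library_type not in ("static", "shared"):
        raise ValueError(f"unsupported library type: {library_type!r}")
    build_dir = f"build_{library_type}"
    configure = ["cmake", f"-DKINETO_LIBRARY_TYPE={library_type}", f"../{source_dir}/"]
    # The workflow uses a bare `make -j`; pass `jobs` to cap parallelism.
    make = ["make", "-j"] if jobs is None else ["make", f"-j{jobs}"]
    return build_dir, [configure, make]


def run_builds(repo_root="."):
    """Run the static and shared builds one after the other."""
    for library_type in ("static", "shared"):
        build_dir, commands = cmake_build_commands(library_type)
        build_path = Path(repo_root) / build_dir
        build_path.mkdir()  # like `mkdir`, fails if the directory already exists
        for command in commands:
            # check=True stops at the first failing command, like `set -e`.
            subprocess.run(command, cwd=build_path, check=True)
```

Calling `run_builds()` performs both builds; `cmake_build_commands("shared", jobs=4)` only returns the command lists, which is useful for inspecting them without building anything.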
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,19 @@ | ||
name: Build torch-tb-profiler Pip Package | ||
|
||
on: | ||
# TODO: Add an on_release trigger to build on tags | ||
workflow_dispatch: | ||
|
||
jobs: | ||
build-package: | ||
runs-on: ubuntu-latest | ||
steps: | ||
- uses: actions/checkout@v2 | ||
- name: build pip package | ||
run: | | ||
set -e | ||
cd tb_plugin | ||
python setup.py sdist bdist_wheel | ||
cd dist/ | ||
pip install *.whl | ||
python -c "import torch_tb_profiler;print(torch_tb_profiler.__version__)" |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,57 @@ | ||
name: TB_Plugin_CI | ||
|
||
on: | ||
push: | ||
branches: | ||
- main | ||
- release/** | ||
- plugin/** | ||
|
||
pull_request: | ||
branches: | ||
- main | ||
- release/** | ||
- plugin/** | ||
|
||
jobs: | ||
generate-matrix: | ||
runs-on: ubuntu-latest | ||
outputs: | ||
matrix: ${{ steps.set-matrix.outputs.matrix }} | ||
steps: | ||
- id: set-matrix | ||
run: | | ||
echo "$GITHUB_BASE_REF" | |
if [ "$GITHUB_BASE_REF" = "plugin/vnext" ] | |
then | |
echo "matrix={\"python-version\":[3.8], \"cuda-version\":[\"cpu\"], \"pytorch-version\":[\"nightly\"]}" >> "$GITHUB_OUTPUT" | |
else | |
echo "matrix={\"python-version\":[3.8], \"cuda-version\":[\"cpu\"], \"pytorch-version\":[\"nightly\", \"2.0\", \"stable\"]}" >> "$GITHUB_OUTPUT" | |
fi | |
build: | ||
needs: generate-matrix | ||
runs-on: ubuntu-latest | ||
strategy: | ||
matrix: ${{fromJSON(needs.generate-matrix.outputs.matrix)}} | ||
steps: | ||
- uses: actions/checkout@v2 | ||
- name: Set up Python ${{ matrix.python-version }} | ||
uses: actions/setup-python@v2 | ||
with: | ||
python-version: ${{ matrix.python-version }} | ||
architecture: 'x64' | ||
- name: Test | ||
env: | ||
CUDA_VERSION: ${{ matrix.cuda-version }} | ||
PYTORCH_VERSION: ${{ matrix.pytorch-version }} | ||
TORCH_PROFILER_LOG_LEVEL: DEBUG | ||
GRPC_VERBOSITY: DEBUG | ||
GRPC_ENABLE_FORK_SUPPORT: 'False' | ||
run: | | ||
set -e | ||
cd tb_plugin | ||
sh ./ci_scripts/install_env.sh | ||
pip install .[gs] | ||
cd test | ||
pytest |
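The `generate-matrix` job above emits a JSON object that the `build` job expands into one run per combination of values. The following is a minimal sketch of that logic, not the workflow's actual implementation: `generate_matrix` and `expand_matrix` are hypothetical names, and the expansion is assumed to be a plain Cartesian product, which is how a matrix behaves when it has no `include` or `exclude` entries.

```python
import itertools
import json


def generate_matrix(base_ref):
    """Build the matrix the way the shell step does.

    `base_ref` stands in for GITHUB_BASE_REF, which is empty for
    events that are not pull requests.
    """
    if base_ref == "plugin/vnext":
        pytorch_versions = ["nightly"]
    else:
        pytorch_versions = ["nightly", "2.0", "stable"]
    return {
        "python-version": [3.8],
        "cuda-version": ["cpu"],
        "pytorch-version": pytorch_versions,
    }


def expand_matrix(matrix):
    """Expand a matrix into one dict per job (Cartesian product of all axes)."""
    keys = list(matrix)
    combos = itertools.product(*(matrix[key] for key in keys))
    return [dict(zip(keys, combo)) for combo in combos]


if __name__ == "__main__":
    matrix = generate_matrix("main")
    print(json.dumps(matrix))  # the string the step writes as its output
    for job in expand_matrix(matrix):
        print(job)
```

One caveat when extending the matrix: Python versions written as JSON numbers lose trailing zeros, so `3.10` would be read as `3.1`. Quoting the versions as strings avoids this.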
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,3 @@ | ||
# ignore common items | ||
.idea | ||
.vscode |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,9 @@ | ||
[submodule "libkineto/third_party/googletest"] | ||
path = libkineto/third_party/googletest | ||
url = https://github.com/google/googletest.git | ||
[submodule "libkineto/third_party/fmt"] | ||
path = libkineto/third_party/fmt | ||
url = https://github.com/fmtlib/fmt.git | ||
[submodule "libkineto/third_party/dynolog"] | ||
path = libkineto/third_party/dynolog | ||
url = https://github.com/facebookincubator/dynolog.git |
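The submodule entries above use an INI-like layout, so they can be inspected with a short script. This is a sketch under one assumption: Python's `configparser` is only an approximation of git's config syntax, although it is sufficient for the three entries shown here. The helper name `list_submodules` is hypothetical.

```python
import configparser

# The three entries from the diff above, copied verbatim.
GITMODULES_TEXT = """\
[submodule "libkineto/third_party/googletest"]
path = libkineto/third_party/googletest
url = https://github.com/google/googletest.git
[submodule "libkineto/third_party/fmt"]
path = libkineto/third_party/fmt
url = https://github.com/fmtlib/fmt.git
[submodule "libkineto/third_party/dynolog"]
path = libkineto/third_party/dynolog
url = https://github.com/facebookincubator/dynolog.git
"""


def list_submodules(text):
    """Map each submodule name to its (path, url) pair."""
    # interpolation=None keeps any '%' in a URL from being treated specially.
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(text)
    submodules = {}
    for section in parser.sections():
        # Section names look like: submodule "<name>"
        name = section.split('"')[1]
        submodules[name] = (parser[section]["path"], parser[section]["url"])
    return submodules


if __name__ == "__main__":
    for name, (path, url) in list_submodules(GITMODULES_TEXT).items():
        print(f"{name}: {path} <- {url}")
```

For anything beyond inspection, `git submodule status` or `git config --file .gitmodules --list` is the more reliable source, since git parses its own format.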
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,77 @@ | ||
# Code of Conduct | ||
|
||
## Our Pledge | ||
|
||
In the interest of fostering an open and welcoming environment, we as | ||
contributors and maintainers pledge to make participation in our project and | ||
our community a harassment-free experience for everyone, regardless of age, body | ||
size, disability, ethnicity, sex characteristics, gender identity and expression, | ||
level of experience, education, socio-economic status, nationality, personal | ||
appearance, race, religion, or sexual identity and orientation. | ||
|
||
## Our Standards | ||
|
||
Examples of behavior that contributes to creating a positive environment | ||
include: | ||
|
||
* Using welcoming and inclusive language | ||
* Being respectful of differing viewpoints and experiences | ||
* Gracefully accepting constructive criticism | ||
* Focusing on what is best for the community | ||
* Showing empathy towards other community members | ||
|
||
Examples of unacceptable behavior by participants include: | ||
|
||
* The use of sexualized language or imagery and unwelcome sexual attention or | ||
advances | ||
* Trolling, insulting/derogatory comments, and personal or political attacks | ||
* Public or private harassment | ||
* Publishing others' private information, such as a physical or electronic | ||
address, without explicit permission | ||
* Other conduct which could reasonably be considered inappropriate in a | ||
professional setting | ||
|
||
## Our Responsibilities | ||
|
||
Project maintainers are responsible for clarifying the standards of acceptable | ||
behavior and are expected to take appropriate and fair corrective action in | ||
response to any instances of unacceptable behavior. | ||
|
||
Project maintainers have the right and responsibility to remove, edit, or | ||
reject comments, commits, code, wiki edits, issues, and other contributions | ||
that are not aligned to this Code of Conduct, or to ban temporarily or | ||
permanently any contributor for other behaviors that they deem inappropriate, | ||
threatening, offensive, or harmful. | ||
|
||
## Scope | ||
|
||
This Code of Conduct applies within all project spaces, and it also applies when | ||
an individual is representing the project or its community in public spaces. | ||
Examples of representing a project or community include using an official | ||
project e-mail address, posting via an official social media account, or acting | ||
as an appointed representative at an online or offline event. Representation of | ||
a project may be further defined and clarified by project maintainers. | ||
|
||
## Enforcement | ||
|
||
Instances of abusive, harassing, or otherwise unacceptable behavior may be | ||
reported by contacting the project team at <[email protected]>. All | ||
complaints will be reviewed and investigated and will result in a response that | ||
is deemed necessary and appropriate to the circumstances. The project team is | ||
obligated to maintain confidentiality with regard to the reporter of an incident. | ||
Further details of specific enforcement policies may be posted separately. | ||
|
||
Project maintainers who do not follow or enforce the Code of Conduct in good | ||
faith may face temporary or permanent repercussions as determined by other | ||
members of the project's leadership. | ||
|
||
## Attribution | ||
|
||
This Code of Conduct is adapted from the [Contributor Covenant][homepage], version 1.4, | ||
available at https://www.contributor-covenant.org/version/1/4/code-of-conduct.html | ||
|
||
[homepage]: https://www.contributor-covenant.org | ||
|
||
For answers to common questions about this code of conduct, see | ||
https://www.contributor-covenant.org/faq | ||
|
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,34 @@ | ||
# Contributing to Kineto | ||
We want to make contributing to this project as easy and transparent as | ||
possible. | ||
|
||
## Code of Conduct | ||
The code of conduct is described in [`CODE_OF_CONDUCT.md`](CODE_OF_CONDUCT.md). | ||
|
||
## Pull Requests | ||
We actively welcome your pull requests. | ||
|
||
1. Fork the repo and create your branch from `main`. | ||
2. If you've added code that should be tested, add tests. | ||
3. If you've changed APIs, update the documentation. | ||
4. Ensure the test suite passes. | ||
5. Make sure your code lints. | ||
6. If you haven't already, complete the Contributor License Agreement ("CLA"). | ||
|
||
## Contributor License Agreement ("CLA") | ||
In order to accept your pull request, we need you to submit a CLA. You only need | ||
to do this once to work on any of Facebook's open source projects. | ||
|
||
Complete your CLA here: <https://code.facebook.com/cla> | ||
|
||
## Issues | ||
We use GitHub issues to track public bugs. Please ensure your description is | ||
clear and has sufficient instructions to be able to reproduce the issue. | ||
|
||
Facebook has a [bounty program](https://www.facebook.com/whitehat/) for the safe | ||
disclosure of security bugs. In those cases, please go through the process | ||
outlined on that page and do not file a public issue. | ||
|
||
## License | ||
By contributing to Kineto, you agree that your contributions will be licensed | ||
under the LICENSE file in the root directory of this source tree. |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,33 @@ | ||
BSD License | ||
|
||
For Kineto software | ||
|
||
Copyright (c) Meta Platforms, Inc. and affiliates. | ||
|
||
All contributions by Microsoft: | ||
Copyright (c) Microsoft Corporation. (The Azure AI Platform team) | ||
|
||
Redistribution and use in source and binary forms, with or without modification, | ||
are permitted provided that the following conditions are met: | ||
|
||
* Redistributions of source code must retain the above copyright notice, this | ||
list of conditions and the following disclaimer. | ||
|
||
* Redistributions in binary form must reproduce the above copyright notice, | ||
this list of conditions and the following disclaimer in the documentation | ||
and/or other materials provided with the distribution. | ||
|
||
* Neither the name Meta nor the names of its contributors may be used to | ||
endorse or promote products derived from this software without specific | ||
prior written permission. | ||
|
||
THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" AND | ||
ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED | ||
WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE | ||
DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE FOR | ||
ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES | ||
(INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; | ||
LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON | ||
ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT | ||
(INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS | ||
SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,51 @@ | ||
# Kineto | ||
|
||
Kineto is part of the PyTorch Profiler. | ||
|
||
The Kineto project enables: | ||
- **performance observability and diagnostics** across common ML bottleneck components | ||
- **actionable recommendations** for common issues | ||
- integration of external system-level profiling tools | ||
- integration with popular visualization platforms and analysis pipelines | ||
|
||
A central component is Libkineto, a profiling library with special focus on low-overhead GPU timeline tracing. | ||
|
||
## Libkineto | ||
|
||
Libkineto is an in-process profiling library integrated with the PyTorch Profiler. Please refer to the [README](libkineto/README.md) file in the `libkineto` folder as well as documentation on the [new PyTorch Profiler API](https://pytorch.org/docs/master/profiler.html). | ||
|
||
## Holistic Trace Analysis | ||
|
||
Holistic Trace Analysis (HTA) is an open source performance debugging library aimed at | |
distributed workloads. HTA takes PyTorch Profiler traces as input and surfaces performance | |
bottlenecks to enable faster debugging. Here is a partial list of features in HTA: | |
|
||
1. [Temporal Breakdown](https://hta.readthedocs.io/en/latest/source/features/temporal_breakdown.html): Breakdown of GPU time in terms of time spent in computation, communication, memory events, and idle time on a single node and across all ranks. | ||
1. [Idle Time Breakdown](https://hta.readthedocs.io/en/latest/source/features/idle_time_breakdown.html): Breakdown of GPU idle time into time spent waiting for the host, time spent waiting for another kernel, and time attributed to an unknown cause. | |
1. [Kernel Breakdown](https://hta.readthedocs.io/en/latest/source/features/kernel_breakdown.html): Find kernels with the longest duration on each rank. | ||
1. [Kernel Duration Distribution](https://hta.readthedocs.io/en/latest/source/features/kernel_breakdown.html#kernel-duration-distribution): Distribution of average time taken by longest kernels across different ranks. | ||
1. [Communication Computation Overlap](https://hta.readthedocs.io/en/latest/source/features/comm_comp_overlap.html): Calculate the percentage of time when communication overlaps computation. | ||
|
||
For a complete list see [here](http://hta.readthedocs.io). | ||
|
||
## PyTorch TensorBoard Profiler (Deprecated) | ||
The goal of the PyTorch TensorBoard Profiler is to provide a seamless and intuitive end-to-end profiling experience, including straightforward collection from PyTorch and insightful visualizations and recommendations in the TensorBoard UI. | ||
Please refer to the [README](tb_plugin/README.md) file in the `tb_plugin` folder. | ||
|
||
## Future Development Direction | |
Some areas we're currently working on: | ||
- Support for tracing distributed workloads | ||
- Trace processing, analysis and recommendation engine | ||
- System-level activities, multiple tracing sources | ||
- Profiling and monitoring daemon for larger scale deployments | ||
|
||
## Releases and Contributing | ||
We follow the PyTorch release schedule, with a release roughly every three months. | |
|
||
We appreciate all contributions. If you are planning to contribute bug fixes, please go ahead and send a PR without further discussion. | |
|
||
If you plan to contribute new features, please first open an issue and discuss the feature with us. A PR sent without prior discussion may be rejected, because we may be taking the infrastructure in a direction you are not aware of. We expect the architecture to keep evolving. | |
|
||
## License | ||
Kineto has a BSD-style license, as found in the [LICENSE](LICENSE) file. | ||
|
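The README's HTA feature list describes Communication Computation Overlap as the percentage of time when communication overlaps computation. The sketch below shows one way such a number could be computed from kernel time intervals. It assumes the ratio is (time spent computing while communicating) / (total communication time); HTA's own implementation may differ. All function names are hypothetical and the interval values in the usage example are made up for illustration.

```python
def merge_intervals(intervals):
    """Merge overlapping (start, end) intervals into a sorted, disjoint list."""
    merged = []
    for start, end in sorted(intervals):
        if merged and start <= merged[-1][1]:
            merged[-1][1] = max(merged[-1][1], end)
        else:
            merged.append([start, end])
    return merged


def overlap_duration(first, second):
    """Total time during which both interval sets are active."""
    first, second = merge_intervals(first), merge_intervals(second)
    i = j = 0
    total = 0
    while i < len(first) and j < len(second):
        start = max(first[i][0], second[j][0])
        end = min(first[i][1], second[j][1])
        if start < end:
            total += end - start
        # Advance whichever interval finishes first.
        if first[i][1] < second[j][1]:
            i += 1
        else:
            j += 1
    return total


def comm_comp_overlap_pct(compute, communication):
    """Percentage of communication time that is hidden behind computation."""
    comm_total = sum(end - start for start, end in merge_intervals(communication))
    if comm_total == 0:
        return 0.0
    return 100.0 * overlap_duration(compute, communication) / comm_total


if __name__ == "__main__":
    # Illustrative timestamps in microseconds, not taken from a real trace.
    compute = [(0, 10), (20, 30)]
    communication = [(5, 25)]
    print(comm_comp_overlap_pct(compute, communication))
```

In this example the communication kernel runs for 20 µs, of which 10 µs coincide with computation, so the function returns 50.0. A higher percentage means more communication latency is hidden behind useful work.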