# Modify Repository Proposal: Training

## Summary

This document proposes transforming the existing [instructlab/training](https://github.com/instructlab/training) repo into a library that will be consumed by other clients, such as the CLI and what is currently the `main_ds.py` entrypoint that launches a full fine-tune.

## Background

Today, training for different use-cases is implemented across different repos. LoRA/QLoRA training currently lives in the [`instructlab`](https://github.com/instructlab/instructlab) CLI repo and contains logic for both Linux and macOS training. Each platform then brings additional logic for platform-specific accelerators: for example, NVIDIA CUDA and Intel Habana on Linux, and MLX and MPS on macOS.

On the other hand, logic for full fine-tuning exists today in the [instructlab/training](https://github.com/instructlab/training) repo. This implementation is currently CUDA-specific, as it relies on NCCL to handle communication across NVIDIA GPUs. We want to bring support for alternative accelerators such as Intel Gaudi to this repo as well, without rewriting much of the existing logic.

We need a library which provides a common interface for training, allowing all of the hard logic to be concentrated in a single place.

The [instructlab/training](https://github.com/instructlab/training) repo should then become the home for the overall training library.

This library will then expose a "simple" interface that all other clients can pull in and use, with little needing to change on their side other than how they invoke it.

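To make the idea concrete, here is a minimal sketch of what such a common interface could look like. The names `TrainingArgs` and `run_training`, and all of the fields shown, are purely illustrative assumptions for this proposal, not an agreed-upon API:

```python
# Hypothetical sketch of a common training interface; the names
# TrainingArgs and run_training are illustrative only.
from dataclasses import dataclass


@dataclass
class TrainingArgs:
    """Options shared by all clients, regardless of accelerator."""

    model_path: str
    data_path: str
    output_dir: str
    num_epochs: int = 1
    learning_rate: float = 2e-5
    device: str = "cuda"  # e.g. "cuda", "hpu" (Gaudi), "mps"


def run_training(args: TrainingArgs) -> None:
    """Single entrypoint a client (the CLI, or what is now main_ds.py)
    would call. Accelerator-specific details stay inside the library
    rather than in each client."""
    # Dispatch to the appropriate backend based on args.device;
    # the actual training logic is elided in this sketch.
    print(f"training {args.model_path} on {args.device}")


if __name__ == "__main__":
    run_training(
        TrainingArgs(
            model_path="models/base",
            data_path="data/train.jsonl",
            output_dir="out",
        )
    )
```

Under a design like this, adding Gaudi support would mean adding a new backend behind `run_training`, with no changes required in any client.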
## Maintainers

The maintainers should be the folks who currently work on training, namely:

- [Aldo Pareja](https://github.com/aldopareja)
- [James Kunstle](https://github.com/orgs/instructlab/people/JamesKunstle)
- [Oleg Silkin](https://github.com/orgs/instructlab/people/RobotSail)
- [Mustafa Eyceoz](https://github.com/orgs/instructlab/people/Maxusmusti)

We can also grant blanket access by assigning the `Backend Maintainers` GitHub team.

## Alternatives Considered

### Keep everything separate

Rather than consolidating the logic, we could keep the existing logic as-is.

This would mean that the [CLI repo](https://github.com/instructlab/instructlab) and the [training repo](https://github.com/instructlab/training) would both maintain their own implementations of training.

If extra logic must be added for Intel Gaudi or NVIDIA, it would need to be added and tested in two different places.

Since we need to move the full training to live inside of the [CLI repo](https://github.com/instructlab/instructlab), we would then have two duplicate implementations of the full fine-tune training, adding further points of complication.

### Move everything into one repo but don't design an interface

We could move the existing training logic into the training repo and simply have the existing clients consume it this way.

The challenge here is that a lot of the logic is very specific to the client application, so development would constantly span both repos.

If someone wants to open a PR against the CLI repo for a change they're making to the LoRA training, they'd need to open another PR against the training repo and maintain two PRs at once. When a maintainer requests a change in one PR, both would need to be updated to accommodate each other.

The challenges this presents to both developers and maintainers are clear. A natural conclusion, therefore, is that we need a separate library providing an extensible interface for other clients to consume.