Releases: instructlab/sdg
v0.1.3
What's Changed
- Add a YAML based file format for pipelines by @markmc in #86
- llmblock: Set a more reasonable default for num_tokens by @russellb in #125
- pipeline: Fail explicitly on an empty dataset by @russellb in #127
- Automate validation of pipeline configs by @russellb in #132
- Update grounded_skills.yaml to add seed value by @aakankshaduggal in #137
- ci: Run lint job if pipeline configs change by @russellb in #140
- Set gen_kwargs['n'] dynamically in the simple pipelines by @russellb in #144
- Add model_prompt config param for LLMBlock by @russellb in #141
- filterblock: add default_value for use with convert_dtype by @markmc in #143
- Export public APIs in top-level package by @tiran in #73
- Indent simple pipeline "principle" content by @derekhiggins in #150
- Move gen_kwargs down to LLMBlock by @markmc in #146
- ci: run e2e on pipeline config related changes by @russellb in #151
- importblock: resolve circular import issue by @markmc in #153
- Remove unused requirements by @russellb in #152
- Drop __index_level_0__ columns by @aakankshaduggal in #142
- Fix SamplePopulatorBlock by @markmc in #156
- Block Name In Errors by @gabe-l-hart in #155
- Load custom pipelines from shared data dir by @derekhiggins in #166
- LLMBlock concurrency by @gabe-l-hart in #157
New Contributors
- @tiran made their first contribution in #73
- @derekhiggins made their first contribution in #150
- @gabe-l-hart made their first contribution in #155
Full Changelog: v0.1.2...v0.1.3
v0.1.2
What's Changed
- Update messages datafile extension to be .jsonl by @Maxusmusti in #115
New Contributors
- @Maxusmusti made their first contribution in #115
Full Changelog: v0.1.1...v0.1.2
v0.1.1
What's Changed
- Update generate_data.py to capture context key by @aakankshaduggal in #98
- Add CI workflow that runs the full SDG pipeline by @russellb in #93
- Remove two files that are now unused by @russellb in #104
- Batch support with vllm by @aakankshaduggal in #105
- converts dataset format messages required for training by @oindrillac in #94
Full Changelog: v0.1.0...v0.1.1
v0.1.0
This version is effectively a rewrite of the library. It includes a simple pipeline that maintains compatibility with the small environments supported by the ilab CLI. It also adds a new full pipeline that is much more extensive and can produce higher-quality results in environments capable of running it along with the required teacher model, Mixtral-8x7b-instruct.
What's Changed
- Update e2e config to optimize pip caching by @nathan-weinberg in #44
- github: Automate some labels with mergify by @russellb in #40
- Add SDG library code by @shivchander, @aakankshaduggal, @oindrillac, et al. in #42
- 📚 Adding Knowledge llm blocks by @abhi1092 in #50
- e2e: Fix permissions error by @russellb in #51
- Initial CLI integration with new SDG interfaces by @russellb in #46
- Fix dataset formatting for pipeline differences by @russellb in #57
- updates to grounded flow by @oindrillac, @shivchanderm in #53
- e2e: Only run one job at a time for a given PR by @russellb in #68
- Fix prompt file paths for an installed library by @russellb in #67
- Resolve some trivial TODOs in generate_data() by @markmc in #74
- Fix mismatch in full pipeline outputs by @russellb in #75
- Updated chunking_document. by @PalmPalm7 in #65
- Handle type conversion errors in FilterByValueBlock by @russellb in #78
- Make SynthSkillsFlow honor the num_iters parameter by @russellb in #82
- Bump actions/download-artifact from 4.1.7 to 4.1.8 by @dependabot in #91
- Drop remaining import from main instructlab package by @russellb in #89
- generate_data: Fix check for output in results by @russellb in #71
- generate_data: fix support for multiple leaf nodes by @russellb in #85
- Allow FilterByValueBlock to handle one or many values by @russellb in #81
- Bump pypa/gh-action-pypi-publish from 1.8.14 to 1.9.0 by @dependabot in #24
- iterblock: remove duplicate line of code by @russellb in #83
New Contributors
- @shivchander made their first contribution in #42
- @aakankshaduggal made their first contribution in #42
- @abhi1092 made their first contribution in #50
- @oindrillac made their first contribution in #53
- @markmc made their first contribution in #74
- @PalmPalm7 made their first contribution in #65
Full Changelog: v0.0.4...v0.1.0
v0.0.4.1
Full Changelog: v0.0.4...v0.0.4.1
v0.0.4
v0.0.3
What's Changed
- Add e2e test workflow by @russellb in #33
- offshoot gen_test_data() from very long generate_data() by @makelinux in #15
- Add py.typed marker file by @russellb in #32
- Move some code from instructlab.utils by @russellb in #35
New Contributors
- @makelinux made their first contribution in #15
Full Changelog: v0.0.2...v0.0.3
v0.0.2
What's Changed
- Add action badges to README by @nathan-weinberg in #27
- requirements.txt: Move instructlab to requirements-dev.txt by @russellb in #28
Full Changelog: v0.0.1...v0.0.2
v0.0.1
What's Changed
- Bump rojopolis/spellcheck-github-actions from 0.37.0 to 0.38.0 by @dependabot in #23
- Bump step-security/harden-runner from 2.8.0 to 2.8.1 by @dependabot in #19
- Rename instructlab.config to instructlab.configuration by @russellb in #25
Full Changelog: v0.0.1.rc3...v0.0.1
v0.0.1.rc4
What's Changed
- Bump rojopolis/spellcheck-github-actions from 0.37.0 to 0.38.0 by @dependabot in #23
- Bump step-security/harden-runner from 2.8.0 to 2.8.1 by @dependabot in #19
- Rename instructlab.config to instructlab.configuration by @russellb in #25
Full Changelog: v0.0.1.rc3...v0.0.1.rc4