Commit d27652e

feat: Add ModelReadmeFetcher and integrate README data into BOM generation
- Implemented ModelReadmeFetcher to fetch and parse model README files.
- Enhanced generator to fetch model README data and include it in the BOM.
- Updated field specifications to incorporate model card properties from README.
- Added tests for ModelReadmeFetcher and updated generator tests to validate README integration.
- Modified metadata handling to support additional model card fields such as use cases, limitations, and environmental considerations.
- Updated example model loading script to reflect changes in model identifiers.
1 parent b39f216 commit d27652e
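
The commit message describes fetching and parsing a model README, but the fetcher's source is not part of this excerpt. The following is only a minimal sketch of what such parsing could look like, not the commit's actual implementation. The names `readmeCard` and `parseReadme` are hypothetical, and the sketch assumes a scalar `base_model:` key in the YAML front matter and a Markdown heading titled "Direct Use"; it handles neither list-valued keys nor `#` characters inside fenced code.

```go
package main

import (
	"fmt"
	"strings"
)

// readmeCard is a hypothetical, reduced stand-in for fetcher.ModelReadmeCard.
type readmeCard struct {
	BaseModel string
	DirectUse string
}

// sample is an illustrative model card, not taken from a real repository.
const sample = `---
base_model: bert-base-uncased
---
# Model Card

## Direct Use

Use for classification.

## Out-of-Scope Use

Do not use for medical.
`

// parseReadme reads a scalar base_model value from the front matter and the
// body text that follows a "Direct Use" heading.
func parseReadme(md string) readmeCard {
	var card readmeCard
	lines := strings.Split(md, "\n")
	i := 0

	// Front matter is only recognized when the very first line is "---".
	if strings.TrimSpace(lines[0]) == "---" {
		for i = 1; i < len(lines); i++ {
			line := strings.TrimSpace(lines[i])
			if line == "---" {
				i++ // step past the closing delimiter
				break
			}
			if strings.HasPrefix(line, "base_model:") {
				v := strings.TrimSpace(strings.TrimPrefix(line, "base_model:"))
				card.BaseModel = strings.Trim(v, `"'`)
			}
		}
	}

	var section string // lowercased title of the current heading
	var body []string  // lines collected under the current heading
	flush := func() {
		if section == "direct use" {
			card.DirectUse = strings.TrimSpace(strings.Join(body, "\n"))
		}
		body = body[:0]
	}
	for ; i < len(lines); i++ {
		line := lines[i]
		if strings.HasPrefix(line, "#") {
			flush()
			section = strings.ToLower(strings.TrimSpace(strings.TrimLeft(line, "#")))
			continue
		}
		body = append(body, line)
	}
	flush() // the last section has no following heading to trigger a flush
	return card
}

func main() {
	card := parseReadme(sample)
	fmt.Printf("base_model=%q direct_use=%q\n", card.BaseModel, card.DirectUse)
}
```

A line-based scan keeps the sketch dependency-free; a real fetcher would more likely decode the front matter with a YAML library.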

12 files changed, +1042 -45 lines changed

README.md

Lines changed: 2 additions & 1 deletion
@@ -11,7 +11,8 @@ What works today:
 
 - Basic scanning for Hugging Face model IDs in Python-like sources via `from_pretrained("...")`.
 - AIBOM generation per detected model in JSON or XML.
-- Optional Hugging Face Hub API fetch to populate some metadata fields.
+- Hugging Face Hub API fetch to populate metadata fields.
+- Hugging Face Repo README fetch to populate more metadata fields.
 - Completeness scoring and validation of existing AIBOM files.
 
 What is explicitly future work:

internal/builder/bom_builder.go

Lines changed: 1 addition & 0 deletions
@@ -29,6 +29,7 @@ func (b BOMBuilder) Build(ctx BuildContext) (*cdx.BOM, error) {
 		ModelID: strings.TrimSpace(ctx.ModelID),
 		Scan: ctx.Scan,
 		HF: ctx.HF,
+		Readme: ctx.Readme,
 	}
 	tgt := metadata.Target{
 		BOM: bom,

internal/builder/context.go

Lines changed: 1 addition & 0 deletions
@@ -9,6 +9,7 @@ type BuildContext struct {
 	ModelID string
 	Scan scanner.Discovery
 	HF *fetcher.ModelAPIResponse
+	Readme *fetcher.ModelReadmeCard
 }
 
 type Options struct {
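
Because `Readme` is added to `BuildContext` as a pointer, it may presumably be nil when no README was fetched. The sketch below shows one nil-safe way a consumer could read from it. The types here are reduced, hypothetical mirrors of the ones in the diff, and `baseModel` is an illustrative helper that does not appear in the commit.

```go
package main

import (
	"fmt"
	"strings"
)

// ModelReadmeCard is a reduced, hypothetical mirror of fetcher.ModelReadmeCard.
type ModelReadmeCard struct {
	BaseModel string
}

// BuildContext is a reduced, hypothetical mirror of builder.BuildContext.
type BuildContext struct {
	ModelID string
	Readme  *ModelReadmeCard // assumed nil when no README data is available
}

// baseModel returns the trimmed base model from the README card and reports
// whether a non-empty value was present.
func baseModel(ctx BuildContext) (string, bool) {
	if ctx.Readme == nil {
		return "", false
	}
	v := strings.TrimSpace(ctx.Readme.BaseModel)
	return v, v != ""
}

func main() {
	with := BuildContext{ModelID: "org/model", Readme: &ModelReadmeCard{BaseModel: " bert-base-uncased "}}
	without := BuildContext{ModelID: "org/model"}

	if v, ok := baseModel(with); ok {
		fmt.Println("base model:", v)
	}
	if _, ok := baseModel(without); !ok {
		fmt.Println("no README data; field left unset")
	}
}
```

Returning a `(value, ok)` pair lets the caller skip a BOM field entirely instead of writing an empty string.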

internal/completeness/completeness_test.go

Lines changed: 14 additions & 0 deletions
@@ -67,6 +67,20 @@ func buildFullyPopulatedBOMForRegistry(t *testing.T) *cdx.BOM {
 			Evidence: "from_pretrained() pattern at line 1",
 		},
 		HF: hf,
+		Readme: &fetcher.ModelReadmeCard{
+			BaseModel: "bert-base-uncased",
+			ModelCardContact: "[email protected]",
+			DirectUse: "Use for classification.",
+			OutOfScopeUse: "Do not use for medical.",
+			BiasRisksLimitations: "May be biased.",
+			BiasRecommendations: "Use with care.",
+			EnvironmentalHardwareType: "NVIDIA A100",
+			EnvironmentalHoursUsed: "10",
+			EnvironmentalCloudProvider: "AWS",
+			EnvironmentalComputeRegion: "us-east-1",
+			EnvironmentalCarbonEmitted: "123g",
+			ModelIndexMetrics: []fetcher.ModelIndexMetric{{Type: "accuracy", Value: "0.91"}},
+		},
 	}
 	tgt := metadata.Target{
 		BOM: bom,

internal/fetcher/model_api_fetcher.go

Lines changed: 1 addition & 1 deletion
@@ -119,6 +119,6 @@ func (f *ModelAPIFetcher) Fetch(ctx context.Context, modelID string) (*ModelAPIR
 		logf(modelID, "decode error (%v)", err)
 		return nil, err
 	}
-	logf(modelID, "ok (library=%q pipeline=%q)", strings.TrimSpace(parsed.LibraryName), strings.TrimSpace(parsed.PipelineTag))
+	logf(modelID, "ok")
 	return &parsed, nil
 }
