Skip to content

Conversation

@effigies
Copy link
Collaborator

@effigies effigies commented Aug 4, 2025

This is a minimal change to column JSON definitions to permit them to be as expressive as JSON schema for defining TSV columns. This adds:

  1. Format, which covers both type and format, given that all TSV values are strings until cast by a parser.
  2. Minimum and Maximum, which are used for some values. Here I permit them also to be strings, on the grounds that someone might want a minimum/maximum datetime.

This also deprecates the string "89+" for ages above 88, instead setting a Maximum of 89. It is expected that a validator implementing this will carve out an exception for age for 89+. The alternative would be something along the lines of:

anyOf:
  - type: number
  - type: string
    enum:
      - "89+"

Adding the equivalent of anyOf to JSON sidecar column descriptions would be excessive complexity. Looking at the existing uses of anyOf, two are degenerate (anyOf: {type: string}) and one allowed strings or numbers, which is equivalent to type: string for TSV values.

At this point, this PR does not attempt to adopt a single convention for schema-defined and sidecar-defined columns. I think we probably should, and posted some options in #2048 (comment).

Closes #1791
Closes #2045
Closes #2048

@CodyCBakerPhD
Copy link
Contributor

@effigies Mostly LGTM as for what we had discussed - I will take a second look at src/schema/objects/metadata.yaml later to ensure I understand

Were you planning to submit a separate pre-PR to fix the mypy issues on master?

@effigies
Copy link
Collaborator Author

effigies commented Aug 4, 2025

Were you planning to submit a separate pre-PR to fix the mypy issues on master?

Done in #2163.

@effigies effigies force-pushed the feat/column-unification branch 2 times, most recently from 205c898 to 095cfdf Compare August 4, 2025 19:27
@codecov
Copy link

codecov bot commented Aug 4, 2025

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 82.71%. Comparing base (7ba7e67) to head (1162a5b).
⚠️ Report is 158 commits behind head on master.

Additional details and impacted files
@@           Coverage Diff           @@
##           master    #2162   +/-   ##
=======================================
  Coverage   82.71%   82.71%           
=======================================
  Files          20       20           
  Lines        1608     1608           
=======================================
  Hits         1330     1330           
  Misses        278      278           

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.
  • 📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

Copy link
Collaborator

@yarikoptic yarikoptic left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A few more discussion points/ideas

"Description": "Subject age in postnatal years",
"Format": "number",
"Units": "year",
"Maximum": 89,
Copy link
Collaborator

@yarikoptic yarikoptic Aug 5, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is a wonderful motivation for me to extract the bst migrate-datasets from https://github.com/bids-standard/bids-specification/pull/1775/files#diff-24b8a97543a1dc43190b8d5b070df26e1b88023f2b468b4483255af049788788R39 , likely to call upgrade-datasets instead. And add migration for this so while going to newer version we could get participants.tsv fixed up automagically?

FTR -- there is a one bids-examples with 89+

❯ git grep 89\+ **/*.tsv
genetics_ukbb/participants.tsv:sub-05       89+     M       control 623589  yes     b002    SNV/CNV hg19/hg38       WES/SNP_intensity
genetics_ukbb/participants.tsv:sub-06       89+     F       patient 458712  no      b003    SNV     hg19    WES
genetics_ukbb/participants.tsv:sub-07       89+     M       patient 985632  yes     b004    SNV     hg19    WES
genetics_ukbb/participants.tsv:sub-13       89+     M       patient 225587  no      b001    SNV/CNV hg19/hg38       WES/SNP_intensity

Comment on lines +185 to +187
def test_format_consistency(schema_obj):
"""Test that the "Format" field is consistent with objects.formats."""
assert set(schema_obj.objects.metadata.Format.enum) == schema_obj.objects.formats.keys()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you think any additional testing would be worth it (e.g., example files asserting their column values to be of a particular type, and ensuring validation errors if not that particular non-string type, etc.) or is the assumption that the current testing suite would cover that?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it's out-of-scope for bidsschematools, but will definitely be in-scope for the validator.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Did you want to start a sister PR over on the validator to ensure this works as expected or simply proceed and patch?

Copy link
Collaborator

@yarikoptic yarikoptic left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is great to see for specifications to converge! You should come over to the North more often!

@effigies effigies force-pushed the feat/column-unification branch from 7060607 to 1162a5b Compare August 14, 2025 18:31
@rwblair rwblair merged commit 5a357dd into bids-standard:master Aug 14, 2025
27 checks passed
@effigies effigies deleted the feat/column-unification branch August 16, 2025 12:12
effigies added a commit that referenced this pull request Aug 26, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

4 participants