@AugustijnVrolijk
Contributor

@AugustijnVrolijk AugustijnVrolijk commented Aug 20, 2025

For metadata entries with "type: object", this PR ensures that all property keys are themselves defined as metadata objects. This enables a more elegant and streamlined design for creating relevant namespaces from metadata, because object keys can be recursively defined as metadata objects themselves.

Making this consistent makes the schema more machine readable and lets property keys be treated as "strict" objects with an easily findable definition (e.g. schema.objects.metadata.get(name)).

This is sometimes already done, e.g. for StimulusPresentation:

StimulusPresentation:
  name: StimulusPresentation
  display_name: Stimulus Presentation
  description: |
    Object containing key-value pairs related to the software used to present
    the stimuli during the experiment.
  type: object
  recommended:
    - OperatingSystem
    - ScreenDistance
    - ScreenRefreshRate
    - ScreenResolution
    - ScreenSize
    - SoftwareName
    - SoftwareRRID
    - SoftwareVersion
    - Code
    - HeadStabilization
  properties:
    OperatingSystem:
      $ref: objects.metadata.OperatingSystem
    ScreenDistance:
      $ref: objects.metadata.ScreenDistance
    ScreenRefreshRate:
      $ref: objects.metadata.ScreenRefreshRate
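
To illustrate why consistent references are convenient for scripts, here is a minimal Python sketch of resolving `$ref` entries by dotted path. The `SCHEMA` dictionary is a hand-written stand-in for a tiny slice of the schema, and `lookup` and `dereference` are hypothetical helper names, not functions from any BIDS tool. Letting sibling keys override the referenced definition is an assumption made for this sketch.

```python
# Hypothetical stand-in for a small slice of the schema (not the real files).
SCHEMA = {
    "objects": {
        "metadata": {
            "OperatingSystem": {"name": "OperatingSystem", "type": "string"},
            "ScreenDistance": {"name": "ScreenDistance", "type": "number"},
            "StimulusPresentation": {
                "name": "StimulusPresentation",
                "type": "object",
                "properties": {
                    "OperatingSystem": {"$ref": "objects.metadata.OperatingSystem"},
                    "ScreenDistance": {"$ref": "objects.metadata.ScreenDistance"},
                },
            },
        }
    }
}


def lookup(schema, dotted_path):
    """Follow a dotted path such as 'objects.metadata.ScreenDistance'."""
    node = schema
    for part in dotted_path.split("."):
        node = node[part]
    return node


def dereference(schema, node):
    """Return a copy of `node` with every {'$ref': ...} replaced by its target."""
    if isinstance(node, dict):
        if "$ref" in node:
            target = dereference(schema, lookup(schema, node["$ref"]))
            if not isinstance(target, dict):
                return target  # e.g. a reference to a plain string value
            # Assumption: keys written next to $ref override the target's keys.
            extra = {k: dereference(schema, v) for k, v in node.items() if k != "$ref"}
            return {**target, **extra}
        return {k: dereference(schema, v) for k, v in node.items()}
    if isinstance(node, list):
        return [dereference(schema, item) for item in node]
    return node


resolved = dereference(SCHEMA, lookup(SCHEMA, "objects.metadata.StimulusPresentation"))
print(resolved["properties"]["ScreenDistance"])
```

The sketch does not guard against circular references, which a real implementation would need to consider.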

But at other times it is not, e.g. for DeidentificationMethodCodeSequence:

DeidentificationMethodCodeSequence:
  name: DeidentificationMethodCodeSequence
  display_name: Deidentification Method Code Sequence
  description: |
    A sequence of code objects describing the mechanism or method use to remove the Patient's identity.
    Corresponds to [DICOM Tag 0012, 0064](https://dicomlookup.com/dicomtags/(0012,0064))
    `De-identification Method Code Sequence`.
  type: array
  items:
    type: object
    recommended_fields:
      - CodeValue
      - CodeMeaning
      - CodingSchemeDesignator
      - CodingSchemeVersion
    properties:
      CodeValue:
        name: CodeValue
        type: string
        description: |
          An identifier that is unambiguous within the Coding Scheme
          denoted by Coding Scheme Designator and Coding Scheme Version.
          Corresponds to [DICOM Tag 0008, 0100](https://dicomlookup.com/dicomtags/(0008,0100)) `Code Value`.
      CodeMeaning:
        name: CodeMeaning
        type: string
        description: |
          Text that has meaning to a human and conveys the meaning of the term
          Corresponds to [DICOM Tag 0008, 0104](https://dicomlookup.com/dicomtags/(0008,0104)) `Code Meaning`.
      CodingSchemeDesignator:
        name: CodingSchemeDesignator
        type: string
        description: |
          The identifier of the coding scheme in which the coded entry is defined.
          Corresponds to [DICOM Tag 0008, 0102](https://dicomlookup.com/dicomtags/(0008,0102))
          `Coding Scheme Designator`.
      CodingSchemeVersion:
        name: CodingSchemeVersion
        type: string
        description: |
          An identifier of the version of the coding scheme if necessary to resolve ambiguity.
          Corresponds to [DICOM Tag 0008, 0103](https://dicomlookup.com/dicomtags/(0008,0103)) `Coding Scheme Version`.
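
The refactoring applied to entries like this one can be sketched in Python as follows. This is only an illustration of the idea, not the script used for this PR: `factor_out_properties` is a hypothetical helper, the `metadata` dictionary is a shortened stand-in for objects/metadata.yaml, and the conflict check is an assumption about how to avoid silently changing an existing definition.

```python
import copy


def factor_out_properties(metadata, name):
    """Move inline property definitions of `name` to the top level of
    `metadata`, leaving `$ref` stubs behind. Returns a new dictionary."""
    result = copy.deepcopy(metadata)
    entry = result[name]
    # Array-typed entries keep their properties under `items`.
    holder = entry["items"] if entry.get("type") == "array" else entry
    for key, definition in list(holder.get("properties", {}).items()):
        if "$ref" in definition:
            continue  # already a reference, nothing to do
        existing = result.get(key)
        if existing is not None and existing != definition:
            # Refuse to overwrite a different top-level definition.
            raise ValueError(f"conflicting definitions for {key!r}")
        result[key] = definition
        holder["properties"][key] = {"$ref": f"objects.metadata.{key}"}
    return result


# Shortened, hand-written stand-in for the entry quoted above.
metadata = {
    "DeidentificationMethodCodeSequence": {
        "name": "DeidentificationMethodCodeSequence",
        "type": "array",
        "items": {
            "type": "object",
            "recommended_fields": ["CodeValue", "CodeMeaning"],
            "properties": {
                "CodeValue": {"name": "CodeValue", "type": "string"},
                "CodeMeaning": {"name": "CodeMeaning", "type": "string"},
            },
        },
    }
}

factored = factor_out_properties(metadata, "DeidentificationMethodCodeSequence")
print(sorted(factored))
```

Names and definitions are copied unchanged, so only the location of each definition moves.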

use references for properties objects. Done for EditPulse and Code (something)
finished updating properties for GeneratedBy
Final <properties:> update
@AugustijnVrolijk
Contributor Author

AugustijnVrolijk commented Aug 20, 2025

I'm not sure about the intricacies of formatting for objects, but would people consider removing the recommended: and required: tags and instead placing the level next to the key, or in a level: field?

This would then have the same format as in rule files, e.g.:

MRIAnatomyCommonMetadataFields:
  selectors:
    - datatype == "anat"
    - match(extension, "^\.nii(\.gz)?$")
  fields:
    ContrastBolusIngredient: optional
    RepetitionTimeExcitation: optional
    RepetitionTimePreparation: optional

Similarly, if all object keys are now defined in objects.metadata,

  recommended:
    - OperatingSystem
    - ScreenDistance
    - ScreenRefreshRate
    - ScreenResolution
    - ScreenSize
    - SoftwareName
    - SoftwareRRID
    - SoftwareVersion
    - Code
    - HeadStabilization
  properties:
    OperatingSystem:
      $ref: objects.metadata.OperatingSystem
    ScreenDistance:
      $ref: objects.metadata.ScreenDistance
    ScreenRefreshRate:
      $ref: objects.metadata.ScreenRefreshRate

could be changed to:

  properties:
    OperatingSystem: recommended
    ScreenDistance: recommended
    ScreenRefreshRate: recommended

etc.

This would again encourage machine readability, since scripts that interpret rule .yaml files could be reused for object .yaml files.
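
A rough Python sketch of the proposed conversion, from separate required:/recommended: lists to a rule-file style key-to-level mapping. `to_level_mapping` is a hypothetical name, and treating properties that appear in neither list as "optional" is an assumption made for this sketch.

```python
def to_level_mapping(entry):
    """Collapse `required`/`recommended` lists and `properties` into a
    single {key: level} mapping, as used by `fields:` in rule files."""
    levels = {}
    for level in ("required", "recommended"):
        for key in entry.get(level, []):
            levels[key] = level
    # Assumption: properties listed in neither list are optional.
    for key in entry.get("properties", {}):
        levels.setdefault(key, "optional")
    return levels


stimulus_presentation = {
    "recommended": ["OperatingSystem", "ScreenDistance", "ScreenRefreshRate"],
    "properties": {
        "OperatingSystem": {"$ref": "objects.metadata.OperatingSystem"},
        "ScreenDistance": {"$ref": "objects.metadata.ScreenDistance"},
        "ScreenRefreshRate": {"$ref": "objects.metadata.ScreenRefreshRate"},
    },
}

print(to_level_mapping(stimulus_presentation))
```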

@codecov

codecov bot commented Aug 20, 2025

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 82.71%. Comparing base (db1c087) to head (c700085).
⚠️ Report is 13 commits behind head on master.

Additional details and impacted files
@@           Coverage Diff           @@
##           master    #2175   +/-   ##
=======================================
  Coverage   82.71%   82.71%           
=======================================
  Files          20       20           
  Lines        1608     1608           
=======================================
  Hits         1330     1330           
  Misses        278      278           

@effigies
Collaborator

The reason for objects.metadata having the form it does is to allow us to use JSON schema to validate JSON fields, including deeply-nested objects. Any attempt to move away from this will need to come with patches to the BIDS Validator to reconstruct JSON schema from the new format, or to otherwise implement validation that is currently done using JSON schema.
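
As a sketch of the reconstruction step mentioned here, the following Python rebuilds the current object form from a compact key-to-level mapping. It is an illustration only and not code from the BIDS Validator: `to_object_form` is a hypothetical name, and it assumes every key has a definition under objects.metadata. Note that `required` is a standard JSON Schema keyword, while `recommended` is not and is carried along here as a schema-specific annotation.

```python
def to_object_form(name, levels):
    """Rebuild an object entry with `required`/`recommended` lists and
    `$ref` properties from a compact {key: level} mapping."""
    entry = {"name": name, "type": "object"}
    for level in ("required", "recommended"):
        keys = [key for key, value in levels.items() if value == level]
        if keys:
            entry[level] = keys
    # Assumption: each key is defined at objects.metadata.<key>.
    entry["properties"] = {
        key: {"$ref": f"objects.metadata.{key}"} for key in levels
    }
    return entry


compact = {"PipelineName": "required", "PipelineVersion": "recommended"}
print(to_object_form("ExamplePipeline", compact))
```

The entry name `ExamplePipeline` is made up for the example; only the two key names come from the lines under review.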

Comment on lines 1379 to 1381
required: [PipelineName]
recommended: [PipelineVersion]
properties:
Collaborator

This is a breaking change. Factoring out subfields should not have any impact on names or semantics.

Contributor Author

Changed. But won't this cause letter-case clashes, since attributes like description: and name: may now collide with the metadata definitions Description: and Name:?

revert breaking changes.
@AugustijnVrolijk
Contributor Author

> The reason for objects.metadata having the form it does is to allow us to use JSON schema to validate JSON fields, including deeply-nested objects. Any attempt to move away from this will need to come with patches to the BIDS Validator to reconstruct JSON schema from the new format, or to otherwise implement validation that is currently done using JSON schema.

Ah, this makes sense. Is this done to validate actual BIDS dataset JSON files? Or in a meta-manner to validate the schema itself?
If it is the former, are there additional tools used to validate .tsv files, NIfTI, etc., or can this all be handled by JSON schema?

@effigies
Collaborator

> Is this done to validate actual BIDS dataset JSON files? Or in a meta-manner to validate the schema itself?

We use the ajv validator to validate sidecar values, although this has to be done one at a time, since the requirement levels can change, e.g.:

MRIASLCommonMetadataFieldsM0TypeRec:
  selectors:
    - datatype == "perf"
    - suffix == "asl"
    - sidecar.M0Type != "Estimate"
  fields:
    M0Estimate:
      level: optional
      level_addendum: required if `M0Type` is `Estimate`
MRIASLCommonMetadataFieldsM0TypeReq:
  selectors:
    - datatype == "perf"
    - suffix == "asl"
    - sidecar.M0Type == "Estimate"
  fields:
    M0Estimate:
      level: required
      issue:
        code: M0ESTIMATE_NOT_DEFINED
        message: |
          You must define `M0Estimate` for this file, because `M0Type` is set to
          'Estimate'. `M0Estimate` is a single numerical whole-brain M0 value
          (referring to the M0 of blood), only if obtained externally (for example
          retrieved from CSF in a separate measurement).
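
To make the "one at a time" point concrete, here is a small Python sketch of how the requirement level of `M0Estimate` depends on which rule's selectors match a given file. It is not the validator's implementation: in the schema the selectors are expression strings, whereas this sketch models them as plain Python predicates, and `field_levels` and `missing_required` are hypothetical helper names.

```python
# Selectors modelled as Python predicates over a context dictionary.
# Assumption: this stands in for evaluating the schema's expression strings.
RULES = {
    "MRIASLCommonMetadataFieldsM0TypeRec": {
        "selectors": [
            lambda ctx: ctx["datatype"] == "perf",
            lambda ctx: ctx["suffix"] == "asl",
            lambda ctx: ctx["sidecar"].get("M0Type") != "Estimate",
        ],
        "fields": {"M0Estimate": "optional"},
    },
    "MRIASLCommonMetadataFieldsM0TypeReq": {
        "selectors": [
            lambda ctx: ctx["datatype"] == "perf",
            lambda ctx: ctx["suffix"] == "asl",
            lambda ctx: ctx["sidecar"].get("M0Type") == "Estimate",
        ],
        "fields": {"M0Estimate": "required"},
    },
}


def field_levels(rules, ctx):
    """Collect field levels from every rule whose selectors all match."""
    levels = {}
    for rule in rules.values():
        if all(selector(ctx) for selector in rule["selectors"]):
            levels.update(rule["fields"])
    return levels


def missing_required(rules, ctx):
    """List required fields that are absent from the sidecar."""
    levels = field_levels(rules, ctx)
    return sorted(
        key for key, level in levels.items()
        if level == "required" and key not in ctx["sidecar"]
    )


ctx = {"datatype": "perf", "suffix": "asl", "sidecar": {"M0Type": "Estimate"}}
print(missing_required(RULES, ctx))
```

Because the two rules have mutually exclusive selectors, at most one of them contributes a level for a given sidecar.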

> If it is the former, are there additional tools used to validate .tsv files, NIfTI, etc., or can this all be handled by JSON schema?

TSVs use columns:

Behavioral:
  selectors:
    - suffix == "beh"
  columns:
    trial_type: optional
    response_time: optional
    HED: optional
    stim_file: optional
  additional_columns: allowed

These reference column terms like:

HED:
  name: HED
  display_name: HED
  description: |
    Hierarchical Event Descriptor (HED) tags.
    See the [HED Appendix](SPEC_ROOT/appendices/hed.md) for details.
  type: string

For this, custom validation is required, since TSVs are nothing but strings. I'm in the process of rewriting that check:

https://github.com/bids-standard/bids-validator/pull/234/files#diff-a31f03915a52c1f5e72aa583a05d39f82965ede644ddc714210beeed9e1f62e0R177-R189
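
For a rough idea of what such a custom check involves, here is a minimal Python sketch that compares a TSV header against a columns rule. It is unrelated to the code in the linked pull request. `check_header` is a hypothetical name, the `BEHAVIORAL` dictionary mirrors the rule quoted above, and treating any `additional_columns` value other than "allowed" as forbidding extra columns is a simplification.

```python
import csv
import io

BEHAVIORAL = {
    "columns": {
        "trial_type": "optional",
        "response_time": "optional",
        "HED": "optional",
        "stim_file": "optional",
    },
    "additional_columns": "allowed",
}


def check_header(tsv_text, rule):
    """Return a list of problems found in the header row of a TSV file."""
    reader = csv.reader(io.StringIO(tsv_text), delimiter="\t")
    header = next(reader)
    known = rule["columns"]
    problems = [
        f"missing required column: {name}"
        for name, level in known.items()
        if level == "required" and name not in header
    ]
    # Simplification: only the value "allowed" permits extra columns.
    if rule.get("additional_columns") != "allowed":
        problems += [
            f"unexpected column: {name}" for name in header if name not in known
        ]
    return problems


tsv = "trial_type\tresponse_time\tmy_custom_column\ngo\t0.43\tx\n"
print(check_header(tsv, BEHAVIORAL))
```

Checking the cell values themselves would additionally require parsing each string according to the column's declared type.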

For NIfTI checks, we rely on a library to parse the NIfTI header, and populate the NIfTI context:

nifti_header:
  name: 'NIfTI Header'
  description: 'Parsed contents of NIfTI header referenced elsewhere in schema.'
  type: object
  required:
    - dim_info
    - dim
    - pixdim
    - shape
    - voxel_sizes
    - xyzt_units
    - qform_code
    - sform_code
    - axis_codes
  additionalProperties: false
  properties:
    dim_info:
      name: 'Dimension Information'
      description: 'Metadata about dimensions data.'
      type: object
      required: [freq, phase, slice]
      additionalProperties: false
      properties:
        freq:
          name: 'Frequency'
          description: 'These fields encode which spatial dimension (1, 2, or 3).'
          type: integer
        phase:
          name: 'Phase'
          description: 'Corresponds to which acquisition dimension for MRI data.'
          type: integer
        slice:
          name: 'Slice'
          description: 'Slice dimensions.'
          type: integer
    dim:
      name: 'Data Dimensions'
      description: 'Data seq dimensions.'
      type: array
      minItems: 8
      maxItems: 8
      items:
        type: integer
    pixdim:
      name: 'Pixel Dimension'
      description: 'Grid spacings (unit per dimension).'
      type: array
      minItems: 8
      maxItems: 8
      items:
        type: number
    shape:
      name: 'Data shape'
      description: 'Data array shape, equal to dim[1:dim[0] + 1]'
      type: array
      minItems: 0
      maxItems: 7
      items:
        type: integer
    voxel_sizes:
      name: 'Voxel sizes'
      description: 'Voxel sizes, equal to pixdim[1:dim[0] + 1]'
      type: array
      minItems: 0
      maxItems: 7
      items:
        type: number
    xyzt_units:
      name: 'XYZT Units'
      description: 'Units of pixdim[1..4]'
      type: object
      required: [xyz, t]
      additionalProperties: false
      properties:
        xyz:
          name: 'XYZ Units'
          description: 'String representing the unit of voxel spacing.'
          type: string
          enum:
            - $ref: objects.enums.unknown.value
            # TODO: Add definitions for these values. (perhaps don't specify)
            - 'meter'
            - 'mm'
            - 'um'
        t:
          name: 'Time Unit'
          description: 'String representing the unit of inter-volume intervals.'
          type: string
          enum:
            - $ref: objects.enums.unknown.value
            # TODO: Add definitions for these values. (perhaps don't specify)
            - 'sec'
            - 'msec'
            - 'usec'
    qform_code:
      name: 'qform code'
      description: 'Use of the quaternion fields.'
      type: integer
    sform_code:
      name: 'sform code'
      description: 'Use of the affine fields.'
      type: integer
    axis_codes:
      name: 'axis codes'
      description: >
        Orientation labels indicating primary direction of data axes defined
        with respect to the object of interest.
      type: array
      minItems: 3
      maxItems: 3
      items:
        type: string
        enum: ['R', 'L', 'A', 'P', 'S', 'I']
    mrs:
      name: 'NIfTI-MRS extension'
      description: 'NIfTI-MRS JSON fields'
      type: object
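
The two derived fields in this context follow directly from the slicing given in their descriptions. A minimal Python sketch, assuming `dim` and `pixdim` have already been read from the header as 8-element lists (`derived_fields` is a hypothetical name, not a function of any NIfTI library):

```python
def derived_fields(dim, pixdim):
    """Compute `shape` and `voxel_sizes` as dim[1:dim[0] + 1] and
    pixdim[1:dim[0] + 1], matching the descriptions in the context."""
    if len(dim) != 8 or len(pixdim) != 8:
        raise ValueError("dim and pixdim must each have exactly 8 entries")
    ndim = dim[0]
    if not 0 <= ndim <= 7:
        raise ValueError(f"dim[0] must be between 0 and 7, got {ndim}")
    return {
        "shape": dim[1:ndim + 1],
        "voxel_sizes": pixdim[1:ndim + 1],
    }


# Made-up header values for a 4D image.
fields = derived_fields(
    dim=[4, 64, 64, 32, 200, 1, 1, 1],
    pixdim=[1.0, 3.0, 3.0, 3.5, 2.0, 0.0, 0.0, 0.0],
)
print(fields)
```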

Then we can use it in checks:

BoldNot4d:
  issue:
    code: BOLD_NOT_4D
    message: |
      BOLD scans must be 4 dimensional.
    level: error
  selectors:
    - suffix == "bold"
    - type(nifti_header) != "null"
  checks:
    - nifti_header.dim[0] == 4
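
The logic of this rule, written out as a Python sketch. The validator evaluates the selector and check expressions from the schema itself; this hand-translated function (`bold_not_4d`, a hypothetical name) only illustrates the order of evaluation: selectors decide whether the rule applies, and a failed check produces the issue.

```python
def bold_not_4d(ctx):
    """Return the BOLD_NOT_4D issue for a file context, or None."""
    header = ctx.get("nifti_header")
    # Selectors: the rule only applies to BOLD files with a parsed header.
    if ctx.get("suffix") != "bold" or header is None:
        return None
    # Check: the first entry of `dim` holds the number of dimensions.
    if header["dim"][0] == 4:
        return None
    return {
        "code": "BOLD_NOT_4D",
        "message": "BOLD scans must be 4 dimensional.",
        "level": "error",
    }


# Made-up context for a 3D file with a `bold` suffix.
issue = bold_not_4d({"suffix": "bold", "nifti_header": {"dim": [3, 64, 64, 32, 1, 1, 1, 1]}})
print(issue)
```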

effigies added this to the 1.10.1 milestone Aug 27, 2025
effigies added the labels schema (Issues related to the YAML schema representation of the specification. Patch version release.) and exclude-from-changelog (This item will not feature in the automatically generated changelog) Aug 27, 2025
effigies merged commit 135cc18 into bids-standard:master Aug 27, 2025
24 of 25 checks passed