
[ENH] BEP045: "Peripheral" physiology (raw data) #2267

Draft
smoia wants to merge 32 commits into bids-standard:master from physiopy:bep045


@smoia smoia commented Nov 22, 2025

This BEP proposes an expansion of the raw physiological data section of the standard.

The main development is happening in https://github.com/physiopy/bids-specification-physio

To see the rules of engagement, see the discussion board.

To leave comments, engage with the physiopy repository (specifically, with its PR #32) rather than with this PR.
However, we're not really ready for reviews or comments yet; we are opening a draft just to have the HTML and schema JSON preview. We will communicate ASAP when we are ready for public engagement.

m-miedema and others added 14 commits July 17, 2025 09:19
I'm not sure if a corresponding change should be made to modalities.yaml, I'm having trouble finding where the latter table is used by the compiler macros.
Incorporate physio BEP section 4 to physiological-recordings.md
Some checks on formatting (particularly for how I entered the allowable values) might be necessary here, I also added some comments on the BEP document to flag potential issues or changes we could make going forward.
Adding metadata from BEP document to specification glossary, closes #4 and closes #21.
@smoia smoia closed this Nov 22, 2025
@smoia smoia changed the title BEP045 draft [ENH] BEP045: "Peripheral" physiology (raw data) Nov 22, 2025
@smoia smoia reopened this Nov 22, 2025
Collaborator

@effigies effigies left a comment

Should at least partially address the validation failures.

(I'm assuming that technical schema comments are more appropriate here than the regular discussion channels you mentioned. LMK if you'd rather they go somewhere else.)

description: |
Amplifier settings during data acquisition (e.g. gain, sampling filters, bandwidth, etc.)
Strings MAY be represented in the format Setting:Value (e.g. Gain:10)
type: array of strings

Suggested change
type: array of strings
type: array
items:
  type: string

from participant to recording device. A pseudonym can also be used to
prevent the equipment from being identifiable, so long as each
pseudonym is unique within the dataset.
type: string or array of strings

Suggested change
type: string or array of strings
anyOf:
  - type: string
  - type: array
    items:
      type: string

the sensor, connecting cables, and amplifiers. If more than one manufacturer
applies, list the manufacturer for each piece of hardware, in order
of attachment from participant to recording device.
type: string or array of strings

Suggested change
type: string or array of strings
anyOf:
  - type: string
  - type: array
    items:
      type: string

produced the measurements upstream of the recording device, including
the sensor, connecting cables, and amplifiers. Equipment should be listed
in order of attachment, from participant to recording device.
type: string or array of strings

Suggested change
type: string or array of strings
anyOf:
  - type: string
  - type: array
    items:
      type: string
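
For readers less familiar with the schema syntax, the `anyOf` form suggested above accepts either a bare string or an array whose items are all strings. The following is a minimal Python sketch of that same check, hand-rolled rather than run through a JSON Schema library. The function name and the manufacturer values are hypothetical and are not part of the BIDS schema tooling.

```python
def is_string_or_string_array(value):
    """Mirror the suggested anyOf: a string, or an array whose items are all strings."""
    if isinstance(value, str):
        return True
    if isinstance(value, list):
        # An empty list passes, as it would for an array schema with string items.
        return all(isinstance(item, str) for item in value)
    return False


# Illustrative values only; the manufacturer names are made up.
print(is_string_or_string_array("AcmeAmp"))                   # single string
print(is_string_or_string_array(["AcmeSensor", "AcmeAmp"]))   # array of strings
print(is_string_or_string_array(["AcmeSensor", 10]))          # mixed array
```

The free-text form `string or array of strings` carries the same intent, but only the structured form can be checked mechanically like this.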

@neuromechanist neuromechanist added Proposed BEP see https://bids.neuroimaging.io/collaboration/governance.html#proposed-bep physio labels Dec 15, 2025
Removed example file structure for physiological recordings.
Add MeasureType descriptions section to documentation
Removed unnecessary code blocks and clarified recommendations for recording physiological data.
Added descriptions for various MeasureTypes related to physiological recordings.
Updates to src/modality-specific-files/physiological-recordings.md
smoia and others added 5 commits February 18, 2026 15:46
Rebasing after Eyetracking BEP got merged
Some checks on formatting (particularly for how I entered the allowable values) might be necessary here, I also added some comments on the BEP document to flag potential issues or changes we could make going forward.
For any other data to be specified in columns, the column names can be chosen
as deemed appropriate by the researcher.
- The most important one is `MeasureType`, a **RECOMMENDED** metadata that indicates the actual nature of the data in the column.
- This metadata value is a string that **MUST** come from a set of keywords (see table 2.2).

Suggested change
- This metadata value is a string that **MUST** come from a set of keywords (see table 2.2).
- This metadata value is a string that **MUST** come from a set of keywords (see table 2.3).

In the rendered PDF, the MeasureType table appears as Table 2.3.

@scott-huberty
Contributor

Hi all 👋 I am popping over from #2342. Long story short, we are working on a reader for BIDS-compliant eyetracking data over at mne-tools/mne-bids#1512, and we hit an issue that I think this PR may address with the addition of the MeasureType field in <matches>_physio.json files.

The core issue is that eyetracking-specific <matches>_physio.tsv files may contain multiple columns, corresponding to e.g. x_coordinate, y_coordinate, pupil_size, and/or other derived signals, but the order of these columns can be somewhat arbitrary, and the column names are free-text.

So if I understand this PR correctly, the proposed MeasureType field would provide metadata that indicates the actual nature of the data in each column of a <matches>_physio.tsv file (i.e. is it x-coordinate data, is it pupil size data, etc).

@oesteban following up from our previous thread, I think the MeasureType field could help with my issue, and here is what I would propose:

  • When PhysioType = "eyetrack", the MeasureType field SHOULD (or possibly MUST?) be defined for each non-time column.
  • Extend the allowed keywords for MeasureType to include 'EYEGAZE' and 'PUPIL' (harmonizing terminology with existing usage in EEG/MEG channels.tsv, which already defines these as valid keywords from a restricted list corresponding to the type field of that file).

Then, borrowing from my example in #2342, an example eyetracking TSV/JSON pair could look like this:

# sub-01_ses-01_task-foo_run-01_recording-eye1_physio.tsv
0.000	988.300	534.700	3879.000
0.002	987.000	536.300	3879.000
0.004	987.400	533.300	3868.000
0.006	988.200	531.200	3855.000
0.008	988.300	532.600	3858.000
# sub-01_ses-01_task-foo_run-01_recording-eye1_physio.json
{
    "SamplingFrequency": 500.0,
    "StartTime": 0.0,
    "Columns": [
        "time",
        "xpos_left",
        "ypos_left",
        "pa_left"
    ],
    "PhysioType": "eyetrack",
    "RecordedEye": "left",
    "SampleCoordinateSystem": "gaze-on-screen",
    "time": {
        "Description": "The timestamp of the data, in seconds.",
        "Units": "s"
    },
    "xpos_left": {
        "Description": "The x-coordinate of the gaze on the screen in pixels.",
        "MeasureType": "EYEGAZE",
        "Units": "pixel"
    },
    "ypos_left": {
        "Description": "The y-coordinate of the gaze on the screen in pixels.",
        "MeasureType": "EYEGAZE",
        "Units": "pixel"
    },
    "pa_left": {
        "Description": "Pupil area of the recorded eye as calculated by the eye-tracker in arbitrary units",
        "MeasureType": "PUPIL",
        "Units": "arbitrary"
    }
}

CC @julia-pfarr @mszinte
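
To make the proposal above concrete, here is a rough Python sketch of how a reader might use a per-column `MeasureType` to find gaze and pupil columns regardless of their free-text names. It assumes the layout shown in the example: a headerless TSV whose column names come from `"Columns"`, and a sidecar entry per column. `MeasureType` is the field proposed in this PR, not part of the released specification, and the helper names `columns_by_measure_type` and `read_columns` are hypothetical.

```python
import csv
import io
import json

# Sidecar mirroring the example above, trimmed to the fields this sketch needs.
# "MeasureType" is the field proposed in this PR, not part of the released spec.
SIDECAR = json.loads("""
{
    "Columns": ["time", "xpos_left", "ypos_left", "pa_left"],
    "PhysioType": "eyetrack",
    "xpos_left": {"MeasureType": "EYEGAZE", "Units": "pixel"},
    "ypos_left": {"MeasureType": "EYEGAZE", "Units": "pixel"},
    "pa_left": {"MeasureType": "PUPIL", "Units": "arbitrary"}
}
""")

# First two rows of the headerless TSV from the example above.
TSV = "0.000\t988.300\t534.700\t3879.000\n0.002\t987.000\t536.300\t3879.000\n"


def columns_by_measure_type(sidecar):
    """Group column names by MeasureType; columns without one go under None."""
    groups = {}
    for name in sidecar["Columns"]:
        measure_type = sidecar.get(name, {}).get("MeasureType")
        groups.setdefault(measure_type, []).append(name)
    return groups


def read_columns(tsv_text, sidecar):
    """Return {column name: list of floats}, using "Columns" as the header."""
    names = sidecar["Columns"]
    data = {name: [] for name in names}
    for row in csv.reader(io.StringIO(tsv_text), delimiter="\t"):
        for name, cell in zip(names, row):
            data[name].append(float(cell))
    return data


groups = columns_by_measure_type(SIDECAR)
data = read_columns(TSV, SIDECAR)
print(groups)
print({name: data[name] for name in groups["EYEGAZE"]})
```

With this approach the reader never has to interpret names like `xpos_left` or `pa_left`; it only looks up the keyword attached to each column.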

@oesteban
Collaborator

oesteban commented Feb 20, 2026

@scott-huberty please note your example is not BIDS-compliant after the BEP020 merge. This would be compliant:

{
    "SamplingFrequency": 500.0,
    "StartTime": 0.0,
    "PhysioType": "eyetrack",
    "RecordedEye": "left",
    "SampleCoordinateSystem": "gaze-on-screen",
    "timestamp": {
        "Description": "The timestamp of the data, in seconds.",
        "Units": "s"
    },
    "x_coordinate": {
        "Description": "The x-coordinate of the gaze on the screen in pixels.",
        "MeasureType": "EYEGAZE",
        "Units": "pixel"
    },
    "y_coordinate": {
        "Description": "The y-coordinate of the gaze on the screen in pixels.",
        "MeasureType": "EYEGAZE",
        "Units": "pixel"
    },
    "pupil_size": {
        "Description": "Pupil area of the recorded eye as calculated by the eye-tracker in arbitrary units",
        "MeasureType": "PUPIL",
        "Units": "arbitrary"
    }
}
  • I believe Columns is not necessary, since all are pre-specified columns
  • Column names are enforced for the gaze, the pupil_size is recommended, and in your example, I would just use it.

Considering those, what's the difference between expecting the "type" metadata be encoded as an explicit MeasureType or as part of the column name?

In fact, to me, "coordinate" (for x/y) is way more expressive as a "type" than EYEGAZE, which opens many questions as to the dimensionality of the quantity. Parsing pupil_size seems as ambiguous/unambiguous as parsing a PUPIL metadata value.

@oesteban
Collaborator

Considering those, what's the difference between expecting the "type" metadata be encoded as an explicit MeasureType or as part of the column name?

Sorry for the rebound. Just to add: it feels the column name is more authoritative than a metadata field that would need to be REQUIRED to be effective and need further description. Column names are already required, and can encode the same information just as well.

@scott-huberty
Contributor

scott-huberty commented Feb 20, 2026

Please note your example is not BIDS-compliant after the BEP020 merge. .... I believe Columns is not necessary, since all are pre-specified columns

I believe that I am following The Eyetracking specific physiological data spec as it is written:

The following table specifies metadata fields for the _recording-_physio.json file:

Columns REQUIRED array of strings Names of columns in file.

So as I understand the Eyetracking specific physiological data specification, in the JSON sidecar there must be a "Columns" key, which contains the dataset's arbitrarily assigned channel names (e.g. could be ["x_coordinate", "y_coordinate", "pupil_size"], but just as well could be ["x", "y", "ps"]). Then there is a corresponding key for each of those names, which maps the name to a "Description", "Units", etc.

Am I mistaken in that interpretation?

What's the difference between expecting the "type" metadata be encoded as an explicit MeasureType or as part of the column name? ... it feels the column name is more authoritative than a metadata field that would need to be REQUIRED to be effective and need further description. Column names are already required, and can encode the same information just as well.

That would be true if the schema enforces a controlled vocabulary for the Column names. Is it the case that if pupil size data are in the TSV file, its column name must be "pupil_size"? What about the "Additional Columns" that are optional but permitted according to the spec?

@oesteban
Collaborator

Columns REQUIRED array of strings Names of columns in file.

Yes, it looks like Columns would be necessary. The intent was to do what's default in BIDS, so maybe we should revise this part of BEP020. @yarikoptic is more aware of this topic on other areas.

which contains the dataset's arbitrarily assigned channel names (e.g. could be ["x_coordinate", "y_coordinate", "pupil_size"], but just as well could be ["x", "y", "ps"]).

This is definitely invalid: timestamp, x_coordinate, y_coordinate are REQUIRED, and therefore, must always be there for any BEP020-compliant dataset. You can count on _coordinate to be there telling you the type. pupil_size is OPTIONAL, so you may or may not have it. In case you want to encode pupillometry, then it would be pretty nuts not to use the pupil_size field name, but it'd still be valid.

That would be true if the schema enforces a controlled vocabulary for the Column names.

As said before, it does. pupil_size is controlled (though optional). As a matter of fact, @mszinte and @julia-pfarr will be able to confirm that one interesting point in the development of BEP020 was how far in controlling that vocabulary (for the columns) we wanted to go. Each eye-tracker produces different channels beyond gaze position, and we agreed that only pupil_size would be present for most of the vendors and models. The problem of controlling the vocabulary is that, to stay vendor-neutral, you need to map ALL possible cases (which potentially will be a moving target). We applied the 80/20 rule here and narrowed it down to those three + timestamp columns.
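
As a rough illustration of the column rules described in this comment, the sketch below keys a small rule table on `PhysioType` and reports which required columns are missing and which column names fall outside the controlled vocabulary. The rule table is transcribed from this thread, not copied from the BIDS schema. `COLUMN_RULES`, `check_columns`, and the fallback to `"generic"` when `PhysioType` is absent are assumptions made for the example.

```python
# Column rules as described in this thread; not copied from the BIDS schema.
COLUMN_RULES = {
    "generic": {"required": [], "optional": []},
    "eyetrack": {
        "required": ["timestamp", "x_coordinate", "y_coordinate"],
        "optional": ["pupil_size"],
    },
}


def check_columns(sidecar):
    """Return (missing required columns, columns with free-text names)."""
    # Assumption: a sidecar without PhysioType is treated as "generic".
    rules = COLUMN_RULES[sidecar.get("PhysioType", "generic")]
    columns = sidecar.get("Columns", [])
    missing = [name for name in rules["required"] if name not in columns]
    controlled = set(rules["required"]) | set(rules["optional"])
    free_text = [name for name in columns if name not in controlled]
    return missing, free_text


controlled_names = {
    "PhysioType": "eyetrack",
    "Columns": ["timestamp", "x_coordinate", "y_coordinate", "pupil_size"],
}
arbitrary_names = {"PhysioType": "eyetrack", "Columns": ["x", "y", "ps"]}

print(check_columns(controlled_names))
print(check_columns(arbitrary_names))
```

Supporting another data type under this scheme would mean adding one more entry to the table rather than introducing a new code path.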

@scott-huberty
Contributor

Ah ok so the column names are controlled..! sorry not sure why it took so long to get on the same page there..

Is it just me that finds it very confusing that eye-tracking is under the physiology modality -> generic physiology TSV files do not constrain column names (and hence require the Columns key) -> "Eye-tracking data MUST be stored following the general specifications for "generic" physiological recordings." -> except actually eye-tracking TSV files must use (mostly) constrained column names.. But you still need to provide the Columns key that lists all column names because some "additional columns" can have unconstrained names 😵‍💫

@oesteban
Collaborator

sorry not sure why it took so long to get on the same page there..

I think BIDS' specifications could be clearer, no need to say sorry ;)

Is it just me that finds it very confusing that eye-tracking is under the physiology modality -> generic physiology TSV files do not constrain column names (and hence require the Columns key)

That's exactly what BEP020 contributed to BIDS. It provides a pathway to control for column names and metadata by providing a PhysioType, which is a controlled vocabulary currently having only generic and eyetrack. Adding to this vocabulary allows having other data types define their columns' control as well, without the need for spawning arbitrary <datatype>/ folders (like func/).
