drammock
Contributor

@drammock drammock commented Dec 6, 2024

UPDATE: 09/29/2025

Important

The final review of BEP042 is open (9/29/2025 to 10/10/2025). Please consider voting and providing your feedback: #2219


This adds support for Electromyography (EMG) datasets. CIs are not expected to pass yet.

cc @neuromechanist @jwelzel @larsoner @arnodelorme @robertoostenveld feel free to push directly to this branch, I'll add you as repo collaborators on my fork.

We meet regularly to discuss this BEP
Next meeting: 18 Dec 2024 on https://ucsd.zoom.us/j/96433382377
Communication channel on github repo / matrix / slack / discord : #1371

Note

See the HTML version of the proposed specifications at this readthedocs page

closes #1371

@drammock
Contributor Author

cc @agramfort

@Remi-Gau Remi-Gau added the BEP label Dec 19, 2024
@yarikoptic yarikoptic changed the title from "[ENH] extension for electromyography (EMG) - BEP42" to "[ENH] extension for electromyography (EMG) - BEP042" Jan 16, 2025
@drammock drammock force-pushed the emg branch 2 times, most recently from 0d01783 to 8902705 Compare February 25, 2025 17:34
@sjeung
Collaborator

sjeung commented Feb 26, 2025

Hi, @neuromechanist pointed me to this PR and I would like to share some thoughts. This seems to be pretty advanced in terms of sensor placement description which was not very well defined in the motion BEP :)

  • .json EMGPlacementScheme field : could be more restrictive with keywords?
    For instance in case of absence of a common process one MUST write "channel-specific".
    Keywords "visual reference", "palpation", "functional localization" ... can be explicitly recommended rather than having people use different keywords for describing the same thing (e.g., "visual inspection", "pressing on the skin"... ). They may even use multiple of those methods at the same time and in that case they can separate them with some designated delimiter (that can be prescribed too) for easy parsing. This depends of course on how well-categorized these processes are but since you are allowing unprescribed keywords for names of external schemes anyway (like SENIAM) it would be okay to not be comprehensive.

  • In the example on the website draft I read "EMGPlacementScheme": "midpoint between cubital fossa and radial styloid process", : this seems to contradict the description that says NOT to give the target muscle description

  • .json EMGReference : similarly to EMGPlacementScheme field, you may simply have them choose between 1) a specific name, 2) keyword "channel-specific", or 3) "bipolar". Mix of bipolar and other references would then be a case of "channel-specific".

  • .json SkinPreparation : might this be channel-specific as well? For instance in EEG we would use the abrasive gel only for EOG and not for other electrodes. Then having this as a column in channels.tsv with description of keywords in channels.json can be helpful

@drammock
Contributor Author

Hi @sjeung, thanks for the feedback / ideas.

  • .json EMGPlacementScheme field : could be more restrictive with keywords?

done in e84cadc

In the example on the website draft I read "EMGPlacementScheme": "midpoint between cubital fossa and radial styloid process", : this seems to contradict the description that says NOT to give the target muscle description

Those are skeletal landmarks, not muscles. But we've reworked EMGPlacementScheme to be an enum now, so that example will need to change anyway.

.json EMGReference : similarly to EMGPlacementScheme field, you may simply have them choose between 1) a specific name, 2) keyword "channel-specific", or 3) "bipolar". Mix of bipolar and other references would then be a case of "channel-specific".

This was the intent, perhaps it's just not worded clearly enough? Suggestions for clarification are welcome.

.json SkinPreparation : might this be channel-specific as well?

For EEG, I think abrasive gel isn't used because of possible damage to hair. According to @neuromechanist it would be odd to use a different skin prep for different EMG sites in the same session, so we'll probably leave this as-is.

@drammock
Contributor Author

I think this BEP is ready for a thorough review by the team: @robertoostenveld @larsoner @neuromechanist @arnodelorme @JuliusWelzel @tjeerdboonstra

cc @agramfort

@arnodelorme

arnodelorme commented Mar 10, 2025

The document looks comprehensive https://bids-specification--1998.org.readthedocs.build/en/1998/modality-specific-files/electromyography.html. A few comments:

  • Sampling Frequency Specification: The sampling frequency is expected to be the same for all electrodes right?

  • "EMGPlacementScheme" is set to "midpoint between cubital fossa and radial styloid process" in the example, but the specification says it should be "Measured", "Other", or "ChannelSpecific"

  • For channels.tsv, maybe "reference" should be "reference_electrode" to mirror the column "signal_electrode" and make it clearer to users

  • EMGCoordinateSystem must be one of "Others" (or maybe the other keywords are missing). This is not accurate since the other coordinate systems seem allowed https://bids-specification--1998.org.readthedocs.build/en/1998/appendices/coordinate-systems.html

@drammock
Contributor Author

  • Sampling Frequency Specification: The sampling frequency is expected to be the same for all electrodes right?

In the majority of cases yes. But not necessarily, if there are e.g. some grid devices and some bipolar devices at different spots on the body, recording into separate amplifiers / data files but acquired simultaneously. These would get different values for the acq- entity.

  • "EMGPlacementScheme" is set to "midpoint between cubital fossa and radial styloid process" in the example, but the specification says it should be "Measured", "Other", or "ChannelSpecific"

good catch. That should be Other and the text should be in EMGPlacementSchemeDescription --- which is missing from the list of *_emg.json fields.

  • For channels.tsv, maybe "reference" should be "reference_electrode" to mirror the column "signal_electrode" and make it clearer to users

I went back and forth on that question. reference already exists as a defined column for EEG datasets, so it was easier / more consistent to re-use it... but I agree that it would be good if the two column names were more parallel. I think @neuromechanist and I agreed that calling the other column just "signal" was too ambiguous, so maybe calling it reference_electrode (and thus breaking the similarity with other modalities) is the best way forward.

Regarding EMGCoordinateSystem: I think this is actually correct as-is. Other coordinate systems are allowed for other modalities, but we're making the assumption that things like CTF, NeuromagElektaMEGIN, CapTrak, etc. are not relevant for the vast majority of EMG datasets, and that coordinate systems for EMG datasets will almost always be "custom" (AKA, will define their own origin and XYZ directions, based on anatomical landmarks, not on the skull).
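
For reference, the EMGPlacementScheme correction discussed above can be sketched in Python. This is only an illustration: the three allowed values are the ones quoted in this thread, the rule that "Other" needs a non-empty EMGPlacementSchemeDescription is an assumption rather than confirmed schema behaviour, and `check_placement` is a hypothetical helper, not part of any validator.

```python
# Allowed values as quoted in this thread; the merged schema may differ.
ALLOWED_SCHEMES = {"Measured", "Other", "ChannelSpecific"}


def check_placement(sidecar: dict) -> list:
    """Return a list of human-readable problems found in an *_emg.json dict."""
    problems = []
    scheme = sidecar.get("EMGPlacementScheme")
    if scheme not in ALLOWED_SCHEMES:
        problems.append(
            f"EMGPlacementScheme must be one of {sorted(ALLOWED_SCHEMES)}, got {scheme!r}"
        )
    # Assumption: free text goes in EMGPlacementSchemeDescription when scheme is "Other".
    if scheme == "Other" and not sidecar.get("EMGPlacementSchemeDescription"):
        problems.append("EMGPlacementSchemeDescription is expected when the scheme is 'Other'")
    return problems


# The example as it appeared in the draft (free text in the enum field).
old_example = {
    "EMGPlacementScheme": "midpoint between cubital fossa and radial styloid process"
}
# The example after moving the free text into the description field.
fixed_example = {
    "EMGPlacementScheme": "Other",
    "EMGPlacementSchemeDescription": "midpoint between cubital fossa and radial styloid process",
}

print(check_placement(old_example))
print(check_placement(fixed_example))
```

The old example yields one problem (the value is not in the enum), while the corrected example yields none.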

@JuliusWelzel
Collaborator

Hello,

thanks very much for the great progress! I have some remarks listed below:

  1. The BIDS definition for the acq-label reads as follows:

"Definition: The acq- entity corresponds to a custom label the user MAY use to distinguish a different set of parameters used for acquiring the same modality."

In EMG BIDS the acq-label is used to differentiate between recording systems, not parameters. While I do understand it is simple to use, maybe BIDS in general could introduce a sys label for filenames for this purpose. For motion data we introduced a new tracksys label for the same use case. Probably should have named that sys :D

Also the acq-label is explained again with the coordinate_system.json. I would move the explanation further up.

  2. Provide some more information on how the acq_time is to be formatted (e.g. similar to MOTION-BIDS):

In the scans.tsv file, date-time information MUST be expressed as indicated in Units, which allows the use of sub-millisecond precision.

  3. In the description of the sensor locations, are the example structures a MUST? E.g. if landmarks are digitized with a Polhemus ... coordinates of an electrode MUST be given with x,y,z coordinates?

  4. The table of Hardware information has a nan row at the 5th position

  5. The example *_emg.json is not a valid .json file due to the last comma (after "jumping").

  6. For the channel description it is stated: "Channels SHOULD appear in the table in the same order they do in the EMG data file". Are headers a MUST in the EMG data file? If not, how can the channels be matched, if not by order? I would propose to make this a MUST.

  7. The restricted keyword list for the channels.json seems a little counterintuitive. EMG is, at least to my limited knowledge, not often sampled through the same amplifier as Eye-Tracking. I would exclude EYEGAZE and PUPIL from this list. Probably include POS channels from motion data, as e.g. Vicon offers EMG integration with optical motion capture.
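
To make the acq_time point concrete, here is a minimal Python sketch of a timestamp with sub-millisecond precision. It assumes the general BIDS date-time form YYYY-MM-DDThh:mm:ss[.000000][Z]; `format_acq_time` is a hypothetical helper name and the date is made up.

```python
from datetime import datetime


def format_acq_time(moment: datetime) -> str:
    """Format a datetime with six fractional digits (microsecond resolution)."""
    # timespec="microseconds" always emits the fractional part, even when it is zero.
    return moment.isoformat(timespec="microseconds")


# Hypothetical recording onset: 250 microseconds past 09:30:15.
onset = datetime(2024, 12, 18, 9, 30, 15, 250)
print(format_acq_time(onset))  # 2024-12-18T09:30:15.000250
```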

@drammock
Contributor Author

  1. The BIDS definition for the acq-label reads as follows [...]
    In EMG BIDS the acq-label is used to differentiate between recording systems, not parameters. While I do understand it is simple to use, maybe BIDS in general could introduce a sys label for filenames for this purpose.

That is an attractive idea. You are right that we chose acq mostly for convenience. It would also be possible to expand the definition of acq (e.g., "different set of parameters or devices used for acquiring the same modality").

Also the acq-label is explained again with the coordinate_system.json. I would move the explanation further up.

it is explained at the end of the initial "EMG Data" section (just before the "Terminology: electrodes vs channels" subsection). It comes up again when discussing coordsystems, and then again when discussing photos. I couldn't see a good way to avoid talking about it in multiple places. In light of that, do you still think it needs to move / change?

  1. Provide some more information on how the acq_time is to be formatted (e.g. similar to MOTION-BIDS)

Are you specifically asking to add the "sub-millisecond precision" bit? (if so, no objection). If not, can you clarify what you think is lacking here?

  1. In the description of the sensor locations, are the example structures a MUST? E.g. if landmarks are digitized with a Polhemus ... coordinates of an electrode MUST be given with x,y,z coordinates?

what is MUST, SHOULD, or MAY is open to discussion. There are also likely some more rules to be added, e.g., to make some optional fields required depending on the values in other fields. Regarding specifically the Polhemus case, I would agree that digitized locations MUST include x,y,z based on my experience using Polhemus for digitizing EEG electrode locations. Is there a case where one would use a Polhemus (or similar spatial digitizer) and not provide coordinates in 3D?

  1. The table of Hardware information has a nan row at the 5th position

thanks, fixed. It was asking for AmplifierType, which wasn't defined elsewhere. IIRC we decided that wasn't needed, but I can add it back in if folks disagree.

  1. The example *_emg.json is not a valid .json file due to the last comma (after "jumping").

thanks, fixed.

  1. For the channel description it is stated: "Channels SHOULD appear in the table in the same order they do in the EMG data file". Are headers a MUST in the EMG data file? If not, how can the channels be matched, if not by order? I would propose to make this a MUST.

EDF/BDF necessarily have channel names in the file (which I think is what you mean by "headers" right?). There are also guidelines on what the format of such channel names should look like (modality-space-identifier, i.e., EEG Cz or MEG 1441 or EMG 002). I suppose it would be conceivable to have an EDF/BDF file where the channel names were non-unique (which IMO would be a degenerate case), but I don't think they can be missing.

  1. The restricted keyword list for the channels.json seems a little counterintuitive. EMG is, at least to my limited knowledge, not often sampled through the same amplifier as Eye-Tracking. I would exclude EYEGAZE and PUPIL from this list. Probably include POS channels from motion data, as e.g. Vicon offers EMG integration with optical motion capture.

This was originally copy-pasted from EEG, then pruned. I agree it needs refinement... I had a code comment in there for a while saying as much, until I realized almost nobody was reading the source :) For now I'll remove PUPIL, EYEGAZE, ADC, DAC, and OTHER, and add POS.
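
As an aside on the trailing-comma issue in the example *_emg.json mentioned above: any strict JSON parser will catch it. A small Python sketch follows; the "TaskDescription" key is a placeholder, since the thread only says that "jumping" was the last value, and `is_valid_json` is a hypothetical helper.

```python
import json


def is_valid_json(text: str) -> bool:
    """Return True if text parses as strict JSON."""
    try:
        json.loads(text)
    except json.JSONDecodeError:
        return False
    return True


# Placeholder key; only the trailing comma after "jumping" comes from the thread.
with_trailing_comma = '{"TaskDescription": "jumping",}'
without_trailing_comma = '{"TaskDescription": "jumping"}'

print(is_valid_json(with_trailing_comma))
print(is_valid_json(without_trailing_comma))
```

Running a check like this over every example block in the rendered page would be one way to keep such errors from slipping back in.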

@neuromechanist
Member

neuromechanist commented Mar 13, 2025

Thanks, @JuliusWelzel, very insightful comments,

  1. Re acq-<label>, I am in favor of expanding the definition of acquisition mostly to avoid introducing yet another entity to BIDS.
    Also, it might be good to consider recording-<label>, which is a defined entity, although its definition would need to be expanded:

This entity is commonly applied when continuous recordings have different sampling frequencies or start times. For example, physiological recordings with different sampling frequencies may be distinguished using labels like recording-100Hz and recording-500Hz.

IMHO, acq-<label> is more meaningful as it would indicate separate acquisitions.

  1. acq_time in the scans.tsv is quite clear IMO. We briefly discussed accommodating a LATENCY channel, if the data has multiple recordings. Probably, we should add it to the list of reserved channel types?
    Here is the description of the LATENCY channel:

LATENCY | Latency of samples in seconds from recording onset (see acq_time column of the respective *_scans.tsv file). MUST be in form of s[.000000], where s reflects whole seconds, and .000000 reflects OPTIONAL fractional seconds.

And the description of how to use it:

In case a tracking system provides time information with every recorded sample, these times information MAY be stored in form of latencies to recording onset (first sample) in the *_motion.tsv file. If a system has uneven sampling rate behavior, the LATENCY channel can be used to share this information.

  1. +1 that the relation between the data and channels should be a MUST, this is also problematic for the relation between channels and electrodes (see the detailed discussion here: [ENH] channels.tsv and electrodes.tsv should have a clear relationship #2041).
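
A quick Python sketch of how the quoted LATENCY format s[.000000] could be checked. It assumes "OPTIONAL fractional seconds" means one to six digits after the decimal point, which is an interpretation and not something the quoted text states; `LATENCY_PATTERN` and `is_valid_latency` are hypothetical names.

```python
import re

# Whole seconds, optionally followed by a dot and 1-6 fractional digits (assumed range).
LATENCY_PATTERN = re.compile(r"\d+(\.\d{1,6})?")


def is_valid_latency(value: str) -> bool:
    """Return True if value looks like s[.000000]."""
    return LATENCY_PATTERN.fullmatch(value) is not None


for candidate in ["0", "12.000345", "12.", "1.2345678", "-1.0"]:
    print(candidate, is_valid_latency(candidate))
```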

@JuliusWelzel
Collaborator

it is explained at the end of the initial "EMG Data" section (just before the "Terminology: electrodes vs channels" subsection). It comes up again when discussing coordsystems, and then again when discussing photos. I couldn't see a good way to avoid talking about it in multiple places. In light of that, do you still think it needs to move / change?

Good point, I think it can stay as it is. Maybe it is worth adding a detailed explanation for the reasoning in the paper.

Are you specifically asking to add the "sub-millisecond precision" bit? (if so, no objection). If not, can you clarify what you think is lacking here?

Yes, sub-millisecond precision is IMO worth mentioning as EMG usually has a high sampling rate. This time resolution is important for good synchronization with other modalities.

what is MUST, SHOULD, or MAY is open to discussion. There are also likely some more rules to be added, e.g., to make some optional fields required depending on the values in other fields. Regarding specifically the Polhemus case, I would agree that digitized locations MUST include x,y,z based on my experience using Polhemus for digitizing EEG electrode locations. Is there a case where one would use a Polhemus (or similar spatial digitizer) and not provide coordinates in 3D?

I am not aware of any case where it is not provided in x,y,z.

EDF/BDF necessarily have channel names in the file (which I think is what you mean by "headers" right?). There are also guidelines on what the format of such channel names should look like (modality-space-identifier, i.e., EEG Cz or MEG 1441 or EMG 002). I suppose it would be conceivable to have an EDF/BDF file where the channel names were non-unique (which IMO would be a degenerate case), but I don't think they can be missing.

True, sorry. But maybe it can be pointed out that the names in the 'channels.tsv' MUST match the names in the BDF/EDF file?
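
Something like the following Python sketch is what I have in mind for such a check. It compares the `name` column of a channels.tsv with the labels found in the data file; the labels are passed in as a plain list here, because reading an actual EDF/BDF header would need a third-party reader. The function names and the sample content are hypothetical.

```python
import csv
import io


def channel_names_from_tsv(tsv_text: str) -> list:
    """Read the 'name' column of a channels.tsv given as text."""
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    return [row["name"] for row in reader]


def compare_channels(tsv_names: list, file_names: list) -> dict:
    """Report names present on only one side and whether the order agrees."""
    return {
        "missing_from_tsv": [n for n in file_names if n not in tsv_names],
        "not_in_data_file": [n for n in tsv_names if n not in file_names],
        "same_order": list(tsv_names) == list(file_names),
    }


# Hypothetical content: same names on both sides, but in a different order.
channels_tsv = "name\ttype\tunits\nEMG 001\tEMG\tuV\nEMG 002\tEMG\tuV\n"
data_file_labels = ["EMG 002", "EMG 001"]

report = compare_channels(channel_names_from_tsv(channels_tsv), data_file_labels)
print(report)
```

With a name-based MUST, the example above would pass on names and only flag the ordering, which is what the SHOULD about order is meant to cover.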

@JuliusWelzel
Collaborator

IMHO, acq-<label> is more meaningful as it would indicate separate acquisitions.

Agreed, maybe a PR can be opened to extend the definition for the acq label as @drammock suggested?

  1. acq_time in the scans.tsv is quite clear IMO. We briefly discussed accommodating a LATENCY channel, if the data has multiple recordings. Probably, we should add it to the list of reserved channel types?

Good idea, I would be in favor of adding LATENCY to the channel types. The scans.tsv file will also be replaced with a recordings.tsv file in BIDS 2.0.

Member

@neuromechanist neuromechanist left a comment

I made it through once and made suggestions. The overall specifications look great. I look forward to the community comments.

@drammock
Contributor Author

ping @robertoostenveld and @tjeerdboonstra. I think we're about ready to open this up to public comment; do you want a chance to go through it again first?

@drammock
Contributor Author

maybe a PR can be opened to extend the definition for the acq label as @drammock suggested?

done in #2090

@yarikoptic
Collaborator

sys entity feels analogous (so can replace or be replaced with) to

idea. So if to parallel exactly, should get systems.{json,tsv}? But then I would prefer devices.{json,tsv} as better descriptive since systems could be abstract ("coordinate system" etc).

@JuliusWelzel
Collaborator

sys entity feels analogous (so can replace or be replaced with) to

idea. So if to parallel exactly, should get systems.{json,tsv}?

Yes! As far as I understand how devices should be used, this is what we wanted to achieve with the tracksys entity for MOTION-BIDS. In the Paper we define a "tracking-system" as:

We define a tracking system as a group of channels that synchronously sample motion data from one or multiple tracked points. To be grouped as a single tracking system, channels MUST share the core parameters of sampling (namely the sampling rate and the duration) as well as hardware and software properties, resulting in the same number of samples and, if available, a single latency channel associated with the rest of the channels.

This resulted in a REQUIRED tracksys-<label> per motion.tsv file.
I think it is important to specify if users MUST define the sys/acq/dev label or if this is optional. We made it required, even though the majority of motion datasets record data using only a single device. Should BIDS 2.0 remove the tracksys label for the motion data and streamline with whatever is decided in this and similar BEPs?

But then I would prefer devices.{json,tsv} as better descriptive since systems could be abstract ("coordinate system" etc).

As for the terminology, adopting devices.{json,tsv} is preferable over systems.{json,tsv} to avoid confusion with other abstract concepts like coordinate systems. The term "devices" more accurately reflects the physical equipment used in data acquisition, leading to clearer documentation and understanding.

@drammock
Contributor Author

The term "devices" more accurately reflects the physical equipment used in data acquisition, leading to clearer documentation and understanding.

agreed, dev / device is semantically a better entity name than acq (or sys or recording) for what we're grappling with in EMG.

@drammock
Contributor Author

@effigies re: our conversation earlier about needing to back out earlier changes: I think we want to revert c6d4388 and 8b0393a in this PR, right? IIUC we won't need it; most of what we need will be in the new associations.coordsystems (plural), and we can check for e.g. the presence of keys (ParentCoordinateSystem, AnchorElectrode etc) via src/schema/rules/json/emg.yaml I think.

@effigies
Collaborator

I think we want to revert c6d4388 and 8b0393a in this PR, right? IIUC we won't need it;

Correct.

@drammock
Contributor Author

drammock commented Sep 19, 2025

@effigies and @neuromechanist I think now it's time to squash all commits here, per review policy? For me locally, all the datasets at neuromechanist/bids-examples#3 pass validation when run against bids-standard/bids-validator#268 and the CIs here are happy too

Co-authored-by: Seyed (Yahya) Shirazi <[email protected]>
Co-authored-by: Jörn M. Horschig <[email protected]>
Co-authored-by: Chris Markiewicz <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
@drammock drammock marked this pull request as ready for review September 23, 2025 15:40
Collaborator

@agramfort agramfort left a comment

I have not had time to dig into lots of details here unfortunately but reading the preview at https://bids-specification--1998.org.readthedocs.build/en/1998/modality-specific-files/electromyography.html you have my +1. It covers the use cases I have top of mind. thanks @drammock @neuromechanist for pushing this and all reviewers who helped bring this proposal to this level 👏

@VisLab
Member

VisLab commented Sep 25, 2025

Nice job! Here are some minor comments/suggestions:

  1. Consider making _task-<label> optional for EMG data and channels. It is optional for electrodes.tsv. I know you said that task could be conditions rather than behavioral, but it seems non-intuitive and unnecessary. There are experiments for which the notion of "task" is not relevant. We are having a discussion about making it optional for EEG for the same reason.
  2. In Electrodes vs. Channels the = signs look very odd. Maybe bold with :, e.g. Electrodes:
  3. Do a global search of .tsv and .json -- the use of code typeface is inconsistent. Also sometimes it's n/a and sometimes 'n/a'.

@arnodelorme

+1 on my side as well.

@neuromechanist
Member

neuromechanist commented Sep 25, 2025

Please review the EMG-BIDS proposal, an initiative aimed at integrating EMG data sharing within the BIDS ecosystem.

Your review will contribute to the development of specifications that balance thoroughness, clarity, and ease of use.

To conduct a code review, you can examine the files modified from the tabs above.

Alternatively, you can review the HTML preview of the proposed specifications and/or the examples.

In the end, please provide your comments on #2219, or upvote and approve the proposed specifications.

@neuromechanist
Member

Important

The final review of BEP042 is open. Please consider voting and providing your feedback: #2219

@agramfort, @VisLab, @arnodelorme, @larsoner, thanks much for your feedback and review here. Please consider voting on the first post on the Discussion Thread 🙌

@neuromechanist neuromechanist added the Proposed BEP and EMG labels Oct 8, 2025
@drammock
Contributor Author

drammock commented Oct 21, 2025

@VisLab (re: #1998 (comment))

Consider making _task- optional for EMG data and channels

Thank you for the suggestion, we're still discussing this.

In Electrodes vs. Channels the = signs look very odd.

fixed in 26346de

Do a global search of .tsv and .json -- the use of code typeface is inconsistent. Also sometimes it's n/a and sometimes 'n/a'

I actually cannot find cases where this is inconsistent and also under our control. non-code typeface occurs in right-sidebar table-of-contents (which is stripped of formatting) and in the titles of code blocks for example files (a context which doesn't allow formatting). For n/a I see it quoted in two places (in example JSON blocks, where it MUST be quoted to be valid JSON) and unquoted in one place (where it's an example for the content of the description column in a TSV file --- where it SHOULD NOT have quotation marks around it). For the latter occurrence I've added a bit of explanatory text in 88eea21.
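
To illustrate the quoting difference, here is a short Python sketch using only the standard library. The field and column names are placeholders, not taken from the spec examples. A JSON serializer has to emit n/a as a quoted string, while a TSV writer leaves the same value bare.

```python
import csv
import io
import json

# In JSON, the string value n/a is necessarily wrapped in double quotes.
json_fragment = json.dumps({"Description": "n/a"})


def write_tsv(rows: list) -> str:
    """Write rows as tab-separated text with minimal quoting (the csv default)."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, delimiter="\t", lineterminator="\n")
    writer.writerows(rows)
    return buffer.getvalue()


# In a TSV cell, the same value is written without quotation marks.
tsv_text = write_tsv([["name", "description"], ["EMG 001", "n/a"]])

print(json_fragment)
print(tsv_text)
```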

@drammock
Contributor Author

drammock commented Oct 22, 2025

@klotz-t (re: #2219 (comment))

Two items recommended for reporting in the CEDE guidelines but missing within the BIDS specification are (i) pre-amplification and (ii) gain

added Preamplification and Gain as RECOMMENDED fields in fafb3f1. LMK if you're satisfied with how they're described, or want to suggest a rewording.

where should the details on multiple, potentially heterogeneous electrodes be reported?

added additional explanation and cross-references in e009b96. LMK if this clears up your questions, or needs additional tweaks.

@drammock
Contributor Author

@JuliusWelzel (re: #2219 (comment))

the latency description differs between this EMG extension and the motion specification

I've harmonized these in 4e9adcd

I suggest adding a paragraph similar to the motion spec regarding synchronization

Thanks for this suggestion. I've added a modified version of your suggested text in afaa9e1, LMK if you think it needs any further edits.

@neuromechanist
Member

@JuliusWelzel (re: #2219 (comment)):

Consider adding examples showing EMG data alongside other modalities (especially eeg or motion). I have to admit this is currently missing for motion :D

We added a sample dataset based on ds0003739, by separating the motion and EMG data from the EEG data files. It is available as of bids-standard/bids-examples@1758df9

Labels

BEP EMG Proposed BEP see https://bids.neuroimaging.io/collaboration/governance.html#proposed-bep

Development

Successfully merging this pull request may close these issues.

BEP042 surface electromyography (HDsEMG/EMG)