Skip to content

Velcolfix branch (for velocity columns memory uninitialization? in linux version 4.2.1197) #45

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 4 commits into
base: master
Choose a base branch
from

Conversation

flyingfalling
Copy link
Contributor

This does several things:

  1. clarifies the meaning of the various sample columns while they are being read in edf_read.py from the edf_get_float_data() by reading them into a separate Nx1 array linked to the column name key in a dictionary rather than an NxC numpy ndarray. dtypes are the native dtypes returned by the FSAMPLE struct, rather than the all-float version previously returned. One can afterwards run "convert_dtypes" to convert to more robust datatypes if necessary.
  2. fixes the number_elements problem (already fixed in main, but necessary to fix again if reading method is changed) by resizing each of the Nx1 arrays to the number of samples read.
  3. Adds an "exclude_vel_cols" option in parse() to read_edf, default True, which sets all velocity-related sample columns to NaN. Does not effect events. This is to temporarily address the seeming memory uninitialization/struct offset error (cause uncertain) observed in linux version 4.2.1197 at least. I am not sure how to add "tests" for this in pytest since it requries running multiple processes and is nondeterministic... I added several directories to the tests/ directories which contain python scripts to check this (in R using eyelinkReader, and python using pyedfread)
  4. Adds an "ftime" option to read_edf, which converts the "Time" column to floating point (from default uint32) and adds 0.5 seconds for appropriate rows based on flag. I suppose I need to add a test for this, which requires adding an EDF file recorded at 2000 Hz and a way of identifying what is correct (via e.g. ASC from edf2asc). I will prepare this later...

@flyingfalling
Copy link
Contributor Author

Probably better to wait on what Sam and SR research say, this is a temporary fix at best (although it includes many other updates to the code).

@flyingfalling
Copy link
Contributor Author

Sam has said that it is indeed on their end, and they will release a new version. I will get a testing version and start testing. But, we should add a "version" check to our code, and for any version equal to or earlier than the current (buggy) version, we should provide NaNs for the affected columns.

Do you have windows environment to test with? I have never used the windows version of EDFAPI, but checking may be recommended...

@behinger
Copy link
Member

Great! Yes I have a windows environment to test (and Ubuntu, but you got that covered)

@flyingfalling
Copy link
Contributor Author

OK great.
Are you able to run and recreate the behavior I observed on windows (i.e. the output does not match in the VEL columns over mutliple runs?

Is the version number of the windows version the same as the Ubuntu version? (that 4.2.11... number?)

@behinger
Copy link
Member

behinger commented Apr 25, 2025

I think the criticial result I found is that the velocity columns are broken anyway

I found a version 4.2.762.0 for the edfapi.dll - not entirely sure how to find out the version

@flyingfalling
Copy link
Contributor Author

OK so on windows you found some file named 4.2.762.0?

Sorry, please clarify -- You found the velocity columns are "broken anyway"? What do you mean?

You mean they contain weird values? (this is issue #1).

The other issue is that the weird values are not stable across different repetitions for the same EDF file (issue #2).

Yea, I think we can wrap the "version" function and see if that gives useful information in our library. And if that is properly updated we can use that. Because there is no way to diagnose if the data is safe based on behavior since it is effectively random values in memory.

Honestly probably the safest thing to do is simply blank out affected columns with NANs. That way we guarantee that no one gets "incorrect" data, which to me is worse than no data. In fact it is better to simply not include the columns at all.

@behinger
Copy link
Member

behinger commented Apr 25, 2025

On windows I found the edfapi.dll and in the properties it says that version number 🤷

For me these two issues (broken values and non-repeatable) are the same underlying issue, e.g. the second follows from the first. Sorry for not being clear. I didnt check the repeatedness yet, bit busy unfortunately right this moment.

Detecting the version + filling with NaNs + Warning seems best! Not sure I'd remove the columns though, that would be breaking

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants