Parsing EDF dataset to .MAT (matlab file structure)

I'm trying to parse an EDF dataset to the .MAT file structure described below (originally from https://www.bbci.de/competition/iv/desc_1.html): dict_keys(['header', 'version', 'globals', 'mrk', 'cnt', 'nfo'])

The new .MAT file (i.e. Matlab format *.mat) must contain a set of variables. So far, I have been able to convert the following:

    sample_rate = raw.info["sfreq"]
    nchannels = len(raw.ch_names)
    nsamples = raw.n_times
    raw.rename_channels({ch: ch.replace('.', '') for ch in raw.ch_names})
    channel_names = raw.ch_names

    labels = np.zeros((1, nsamples), int)
    labels[0, event_onsets] = event_codes

    cl_lab = ['left', 'right']
    nclasses = len(cl_lab)
    nevents = len(event_onsets)

Now I must parse the continuous EEG signals, size [time x channels]:

    EEG = m['cnt'].T   # NumPy array with the continuous EEG signals, size [time x channels]

    EEG.shape
    (59, 190473)

which looks like the array of data below:

array([[  -36,  -203,  -384, ...,  -256,  -287,  -289],
       [ -138,  -774, -1463, ...,  -262,  -243,  -158],
       [  -50,  -280,  -517, ...,    13,    -8,    -2],
       ...,
       [  -58,  -308,  -532, ...,  -136,  -148,  -123],
       [  -81,  -438,  -787, ...,  -185,  -204,  -195],
       [ -270, -1479, -2709, ...,   106,   193,   322]], dtype=int16)

Plus, I need to parse two more event-related variables: event_onsets and event_codes.

They are:

   event_onsets = m['mrk'][0][0][0]
   array([[  2095,   2895,   3695,   4495,   5295,   6095,   6895,   7695,
          8495,   9295,  10095,  10895,  11695,  12495,  13295,  16294,
         17094,  17894,  18694,  19494,  20294,  21094,  21894,  22694,
         23494,  24295,  25095,  25895,  26695,  27495,  30494,  31294,
         32094,  32894,  33694,  34494,  35294,  36094,  36894,  37694,
         38494,  39294,  40094,  40894,  41694,  44693,  45493,  46293,
         47093,  47893,  48693,  49493,  50293,  51093,  51893,  52693,
         53493,  54293,  55093,  55893,  58892,  59692,  60492,  61292,
         62092,  62892,  63692,  64492,  65292,  66093,  66893,  67693,
         68493,  69293,  70093,  73092,  73892,  74692,  75492,  76292,
         77092,  77892,  78692,  79492,  80292,  81092,  81892,  82692,
         83492,  84292,  87291,  88091,  88891,  89691,  90491,  91291,
         92091,  92891,  93691,  94491,  97292,  98092,  98892,  99692,
        100492, 101292, 102092, 102892, 103692, 104492, 105292, 106092,
        106892, 107692, 108492, 111491, 112291, 113091, 113891, 114691,
        115491, 116291, 117091, 117891, 118691, 119492, 120292, 121091,
        121891, 122692, 125691, 126491, 127291, 128091, 128891, 129691,
        130491, 131291, 132091, 132891, 133691, 134491, 135291, 136091,
        136891, 139890, 140690, 141490, 142290, 143090, 143890, 144690,
        145490, 146290, 147090, 147890, 148690, 149490, 150290, 151090,
        154089, 154889, 155689, 156489, 157289, 158090, 158890, 159690,
        160490, 161290, 162090, 162890, 163690, 164490, 165290, 168289,
        169089, 169889, 170689, 171489, 172289, 173089, 173889, 174689,
        175489, 176289, 177089, 177890, 178689, 179489, 182488, 183288,
        184088, 184888, 185688, 186488, 187288, 188088, 188888, 189688]],
      dtype=int32)

    event_onsets.shape
   (1, 200)

and:

    event_codes = m['mrk'][0][0][1]
array([[ 1,  1,  1, -1,  1,  1,  1, -1,  1,  1,  1, -1,  1, -1, -1, -1,
         1, -1, -1, -1,  1,  1,  1, -1,  1, -1, -1, -1,  1, -1,  1,  1,
         1, -1,  1,  1, -1, -1, -1, -1,  1, -1,  1,  1,  1,  1,  1, -1,
        -1, -1,  1, -1, -1, -1, -1,  1,  1, -1, -1,  1, -1, -1, -1,  1,
         1,  1,  1, -1, -1,  1, -1, -1,  1,  1, -1,  1, -1,  1,  1, -1,
         1,  1, -1, -1,  1,  1, -1,  1, -1,  1,  1, -1,  1, -1, -1,  1,
        -1, -1, -1, -1, -1, -1, -1, -1,  1, -1, -1,  1,  1, -1, -1, -1,
         1,  1,  1,  1, -1, -1,  1, -1,  1, -1, -1,  1,  1,  1,  1,  1,
        -1,  1,  1, -1,  1, -1, -1,  1, -1,  1, -1, -1, -1, -1,  1,  1,
        -1,  1, -1,  1, -1, -1, -1,  1,  1, -1,  1, -1,  1, -1, -1, -1,
         1,  1,  1,  1, -1,  1, -1, -1,  1, -1, -1,  1, -1,  1,  1,  1,
        -1,  1,  1, -1, -1,  1,  1, -1,  1,  1, -1, -1, -1,  1,  1,  1,
         1,  1, -1, -1,  1,  1, -1, -1]], dtype=int16)

    event_codes.shape
    (1, 200)

So where in the EDF dataset are the data corresponding to the ['cnt'] and ['mrk'] structures of the .MAT file?

['cnt'].T: NumPy array with the continuous EEG signals, size [time x channels]

['mrk']: structure of target cue information with fields

Best wishes,

CodePudding user response：

The EEG appears to be [channels x time], not [time x channels], so index by sample along the second dimension of the array.

The event onsets m['mrk'][0][0][0] correspond to sample numbers (time in samples) in the EEG data. The event codes are the codes that occur at those times, usually corresponding to an experimental stimulus or response code. For example, if you want the 10 samples starting at the first event onset, you would do something like: d = EEG[:, event_onsets[0][0]:event_onsets[0][0] + 10], or d = EEG[0, event_onsets[0][0]:event_onsets[0][0] + 10] if you only want the data for channel 0.

You will have to parse event_onsets for the onset times corresponding to the 1 vs -1 event codes so that you can compare the averaged ERPs (or whatever the analysis is). For example, if you want the indices of all the events coded as 1 you can do this:

    ones_events = event_codes[0] == 1   # 1-D boolean mask over the 200 events
    d_at_ones_events = EEG[0, event_onsets[0, ones_events]]

which will give you a one-sample slice of the EEG data (channel 0) at each sample where a 1 event occurred.

To get longer lengths of time you will probably want to loop through and construct an array of 2d arrays corresponding to each ERP for both codes. Or you can store them some other way. There are a million ways to parse and organize the data. Hopefully this gives you some ideas.
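The loop described above could be sketched roughly as follows. This is only an illustration on synthetic data: the helper name extract_epochs, the 400-sample window, and the small made-up onset and code arrays are assumptions, not part of the original dataset. EEG is taken to be [channels x time], as noted earlier.

```python
import numpy as np

# Synthetic stand-ins for the arrays in the question (values are made up).
rng = np.random.default_rng(0)
n_channels, n_samples = 59, 5000
EEG = rng.integers(-500, 500, size=(n_channels, n_samples)).astype(np.int16)  # [channels x time]
event_onsets = np.array([[200, 1000, 1800, 2600, 3400]], dtype=np.int32)
event_codes = np.array([[1, -1, 1, 1, -1]], dtype=np.int16)


def extract_epochs(eeg, onsets, codes, code, n_epoch_samples):
    """Stack fixed-length windows starting at every onset whose code matches.

    Returns an array of shape [n_events x channels x n_epoch_samples].
    """
    onsets = np.asarray(onsets).ravel()
    codes = np.asarray(codes).ravel()
    selected = onsets[codes == code]
    # Skip events whose window would run past the end of the recording.
    selected = selected[selected + n_epoch_samples <= eeg.shape[1]]
    if selected.size == 0:
        return np.empty((0, eeg.shape[0], n_epoch_samples), dtype=eeg.dtype)
    return np.stack([eeg[:, s:s + n_epoch_samples] for s in selected])


epochs_pos = extract_epochs(EEG, event_onsets, event_codes, 1, 400)
epochs_neg = extract_epochs(EEG, event_onsets, event_codes, -1, 400)

# Averaging over the event axis gives one [channels x time] ERP per code.
erp_pos = epochs_pos.mean(axis=0)
erp_neg = epochs_neg.mean(axis=0)
```

Keeping each code's epochs in a single 3-D array makes averaging or feeding them to a classifier a one-liner, but a dict of lists would work just as well.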

Use the sampling rate to convert milliseconds to samples and vice versa.
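As a small sketch of that conversion: the helper names below are hypothetical, and the 100 Hz rate is only an assumed value for illustration; in practice you would take it from raw.info["sfreq"].

```python
def ms_to_samples(ms, sample_rate):
    """Convert a duration in milliseconds to a whole number of samples."""
    return int(round(ms * sample_rate / 1000.0))


def samples_to_ms(samples, sample_rate):
    """Convert a sample count (or sample index) to milliseconds."""
    return samples * 1000.0 / sample_rate


sample_rate = 100.0  # Hz; assumed here, use raw.info["sfreq"] for real data
window = ms_to_samples(4000, sample_rate)        # length of a 4 s window in samples
first_onset_ms = samples_to_ms(2095, sample_rate)  # time of sample 2095 in ms
```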

CodePudding user response：

In fact, the file extension doesn't matter at all. The important part was parsing the data (from the EDF file) into the correct variables: event_codes and event_onsets.

Then it would be possible to run a few statistical and machine-learning algorithms (such as PSD and LDA) or a CNN.

The key documentation is https://mne.tools/stable/glossary.html#term-events

So it was possible to create the event_onsets variable from the 1st column of events, and the event_codes variable from the 3rd column of events.
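A minimal sketch of that column split, using a small made-up events array in MNE's [sample, previous value, event id] layout. With a real recording the array would typically come from something like mne.events_from_annotations(raw) or mne.find_events(raw), depending on how the EDF file stores its markers. The id_to_code mapping to +1 / -1 is an assumption for illustration; it is only needed if the event ids in your file differ from the codes you want.

```python
import numpy as np

# Hypothetical events array in MNE's layout: one row per event,
# columns = [sample index, previous value, event id].
events = np.array([
    [2095, 0, 1],
    [2895, 0, 2],
    [3695, 0, 1],
    [4495, 0, 2],
])

# Assumed mapping from event ids to the +1 / -1 codes of the target layout.
id_to_code = {1: 1, 2: -1}

# 1st column -> onsets (in samples); 3rd column -> codes. Both are shaped (1, n_events)
# to match the arrays shown in the question.
event_onsets = events[:, 0].reshape(1, -1).astype(np.int32)
event_codes = np.array([[id_to_code[int(i)] for i in events[:, 2]]], dtype=np.int16)
```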