Convert nested list into data frame wide format

Time:08-24

I would like to convert a (nested?) list into a wide-format data frame. First I show what the list currently looks like; at the bottom of the post I show what I want the data frame to look like.

The structure of the list, as shown by str(zF_agg2), looks like this:

List of 4
 $ 252: Named num [1:48] 0.1491 0.1209 -0.0786 -0.1141 -0.0642 ...
  ..- attr(*, "names")= chr [1:48] "VA01_F0finEnv_sma" "VA01_F0final_sma" "VA01_jitterLocal_sma" "VA01_shimmerLocal_sma" ...
 $ 306: Named num [1:24] -0.265 -0.217 0.151 0.21 0.114 ...
  ..- attr(*, "names")= chr [1:24] "VA01_F0finEnv_sma" "VA01_F0final_sma" "VA01_jitterLocal_sma" "VA01_shimmerLocal_sma" ...
 $ 371: Named num [1:48] 0.1491 0.1209 -0.0786 -0.1141 -0.0642 ...
  ..- attr(*, "names")= chr [1:48] "VA01_F0finEnv_sma" "VA01_F0final_sma" "VA01_jitterLocal_sma" "VA01_shimmerLocal_sma" ...
 $ 389: Named num [1:24] 1.59e-18 -9.46e-17 -2.37e-17 2.85e-17 -9.63e-17 ...
  ..- attr(*, "names")= chr [1:24] "VA03_F0finEnv_sma" "VA03_F0final_sma" "VA03_jitterLocal_sma" "VA03_shimmerLocal_sma" ...

The list looks like this

$`252`
     VA01_F0finEnv_sma       VA01_F0final_sma   VA01_jitterLocal_sma  VA01_shimmerLocal_sma 
          1.490953e-01           1.209293e-01          -7.857455e-02          -1.141023e-01 
      VA01_mfcc_sma.0.       VA01_mfcc_sma.1.       VA01_mfcc_sma.2.       VA01_mfcc_sma.3. 
         -6.422870e-02          -3.020678e-02          -2.085494e-01          -1.920209e-01 
      VA01_mfcc_sma.4.       VA01_mfcc_sma.5.       VA01_mfcc_sma.6.       VA01_mfcc_sma.7. 
         -3.705184e-02          -9.375184e-02           7.214606e-02          -1.432561e-01 
      VA01_mfcc_sma.8.       VA01_mfcc_sma.9.      VA01_mfcc_sma.10.      VA01_mfcc_sma.11. 
         -7.943067e-02          -2.449599e-01          -7.746337e-02           7.970790e-03 
     VA01_mfcc_sma.12.      VA01_mfcc_sma.13.      VA01_mfcc_sma.14.   VA01_F0finEnv_sma SD 
         -2.187090e-01          -2.304651e-01          -1.662806e-01           9.294909e-01 
   VA01_F0final_sma SD   VA01_F0finEnv_sma RG    VA01_F0final_sma RG VA01_pcm_intensity_sma 
          8.839473e-01           5.717988e+00           5.333469e+00           1.317796e-01 
     VB01_F0finEnv_sma       VB01_F0final_sma   VB01_jitterLocal_sma  VB01_shimmerLocal_sma 
          1.851272e-16          -5.009943e-17          -2.348591e-17          -2.604552e-17 
      VB01_mfcc_sma.0.       VB01_mfcc_sma.1.       VB01_mfcc_sma.2.       VB01_mfcc_sma.3. 
         -2.391990e-17          -3.890029e-17          -2.785709e-17           2.683726e-17 
      VB01_mfcc_sma.4.       VB01_mfcc_sma.5.       VB01_mfcc_sma.6.       VB01_mfcc_sma.7. 
         -2.083230e-17           5.737901e-18          -2.277669e-17           3.112273e-17 
      VB01_mfcc_sma.8.       VB01_mfcc_sma.9.      VB01_mfcc_sma.10.      VB01_mfcc_sma.11. 
         -2.453523e-18          -5.662469e-17           2.266112e-17           1.361854e-18 
     VB01_mfcc_sma.12.      VB01_mfcc_sma.13.      VB01_mfcc_sma.14.      VB01_F0finEnv_sma 
          1.131963e-17          -4.091183e-17           8.561561e-18           1.000000e+00 
      VB01_F0final_sma      VB01_F0finEnv_sma       VB01_F0final_sma VB01_pcm_intensity_sma 
          1.000000e+00           1.124750e+01           1.369369e+01          -3.446556e-17 

$`306`
     VA01_F0finEnv_sma       VA01_F0final_sma   VA01_jitterLocal_sma  VA01_shimmerLocal_sma 
           -0.26540630            -0.21720761             0.15086035             0.21046540 
      VA01_mfcc_sma.0.       VA01_mfcc_sma.1.       VA01_mfcc_sma.2.       VA01_mfcc_sma.3. 
            0.11434077             0.05377450             0.37126233             0.34183814 
      VA01_mfcc_sma.4.       VA01_mfcc_sma.5.       VA01_mfcc_sma.6.       VA01_mfcc_sma.7. 
            0.06596016             0.16689824            -0.12843535             0.25502638 
      VA01_mfcc_sma.8.       VA01_mfcc_sma.9.      VA01_mfcc_sma.10.      VA01_mfcc_sma.11. 
            0.14140350             0.43608087             0.13790130            -0.01418970 
     VA01_mfcc_sma.12.      VA01_mfcc_sma.13.      VA01_mfcc_sma.14.      VA01_F0finEnv_sma 
            0.38934865             0.41027690             0.29601484             1.06416828 
      VA01_F0final_sma      VA01_F0finEnv_sma       VA01_F0final_sma VA01_pcm_intensity_sma 
            1.14875290             5.53095033             5.59523999             0.13177964 

$`371`
     VA01_F0finEnv_sma       VA01_F0final_sma   VA01_jitterLocal_sma  VA01_shimmerLocal_sma 
          1.490953e-01           1.209293e-01          -7.857455e-02          -1.141023e-01 
      VA01_mfcc_sma.0.       VA01_mfcc_sma.1.       VA01_mfcc_sma.2.       VA01_mfcc_sma.3. 
         -6.422870e-02          -3.020678e-02          -2.085494e-01          -1.920209e-01 
      VA01_mfcc_sma.4.       VA01_mfcc_sma.5.       VA01_mfcc_sma.6.       VA01_mfcc_sma.7. 
         -3.705184e-02          -9.375184e-02           7.214606e-02          -1.432561e-01 
      VA01_mfcc_sma.8.       VA01_mfcc_sma.9.      VA01_mfcc_sma.10.      VA01_mfcc_sma.11. 
         -7.943067e-02          -2.449599e-01          -7.746337e-02           7.970790e-03 
     VA01_mfcc_sma.12.      VA01_mfcc_sma.13.      VA01_mfcc_sma.14.      VA01_F0finEnv_sma 
         -2.187090e-01          -2.304651e-01          -1.662806e-01           9.294909e-01 
      VA01_F0final_sma      VA01_F0finEnv_sma       VA01_F0final_sma VA01_pcm_intensity_sma 
          8.839473e-01           5.717988e+00           5.333469e+00          -2.961325e-01 
     VA02_F0finEnv_sma       VA02_F0final_sma   VA02_jitterLocal_sma  VA02_shimmerLocal_sma 
          1.851272e-16          -5.009943e-17          -2.348591e-17          -2.604552e-17 
      VA02_mfcc_sma.0.       VA02_mfcc_sma.1.       VA02_mfcc_sma.2.       VA02_mfcc_sma.3. 
         -2.391990e-17          -3.890029e-17          -2.785709e-17           2.683726e-17 
      VA02_mfcc_sma.4.       VA02_mfcc_sma.5.       VA02_mfcc_sma.6.       VA02_mfcc_sma.7. 
         -2.083230e-17           5.737901e-18          -2.277669e-17           3.112273e-17 
      VA02_mfcc_sma.8.       VA02_mfcc_sma.9.      VA02_mfcc_sma.10.      VA02_mfcc_sma.11. 
         -2.453523e-18          -5.662469e-17           2.266112e-17           1.361854e-18 
     VA02_mfcc_sma.12.      VA02_mfcc_sma.13.      VA02_mfcc_sma.14.      VA02_F0finEnv_sma 
          1.131963e-17          -4.091183e-17           8.561561e-18           1.000000e+00 
      VA02_F0final_sma      VA02_F0finEnv_sma       VA02_F0final_sma VA02_pcm_intensity_sma 
          1.000000e+00           1.124750e+01           1.369369e+01          -6.365529e-16 

$`389`
     VA03_F0finEnv_sma       VA03_F0final_sma   VA03_jitterLocal_sma  VA03_shimmerLocal_sma 
          1.586292e-18          -9.464618e-17          -2.369378e-17           2.853813e-17 
      VA03_mfcc_sma.0.       VA03_mfcc_sma.1.       VA03_mfcc_sma.2.       VA03_mfcc_sma.3. 
         -9.629405e-17          -5.495508e-17          -2.202477e-17          -4.454892e-17 
      VA03_mfcc_sma.4.       VA03_mfcc_sma.5.       VA03_mfcc_sma.6.       VA03_mfcc_sma.7. 
         -7.952470e-17          -1.056807e-17          -6.211858e-17           4.154178e-18 
      VA03_mfcc_sma.8.       VA03_mfcc_sma.9.      VA03_mfcc_sma.10.      VA03_mfcc_sma.11. 
         -8.151347e-18           1.995314e-18           3.121848e-17           2.181543e-17 
     VA03_mfcc_sma.12.      VA03_mfcc_sma.13.      VA03_mfcc_sma.14.      VA03_F0finEnv_sma 
          8.159633e-17           6.164483e-19           2.416510e-17           1.000000e+00 
      VA03_F0final_sma      VA03_F0finEnv_sma       VA03_F0final_sma VA03_pcm_intensity_sma 
          1.000000e+00           3.918357e+00           7.132235e+00          -3.446556e-17 

This is what the list looks like displayed with dput(zF_agg2):

list(`252` = c(VA01_F0finEnv_sma = 0.149095349677244, VA01_F0final_sma = 0.120929343088889, 
VA01_jitterLocal_sma = -0.0785745451433892, VA01_shimmerLocal_sma = -0.114102345203172, 
VA01_mfcc_sma.0. = -0.0642286999362642, VA01_mfcc_sma.1. = -0.030206778340382, 
VA01_mfcc_sma.2. = -0.208549388306997, VA01_mfcc_sma.3. = -0.192020923835602, 
VA01_mfcc_sma.4. = -0.0370518353007777, VA01_mfcc_sma.5. = -0.093751840248999, 
VA01_mfcc_sma.6. = 0.0721460591859715, VA01_mfcc_sma.7. = -0.143256107040908, 
VA01_mfcc_sma.8. = -0.0794306655354017, VA01_mfcc_sma.9. = -0.244959943019604, 
VA01_mfcc_sma.10. = -0.0774633729052873, VA01_mfcc_sma.11. = 0.00797079006761165, 
VA01_mfcc_sma.12. = -0.218709025578709, VA01_mfcc_sma.13. = -0.230465062187873, 
VA01_mfcc_sma.14. = -0.166280574763084, `VA01_F0finEnv_sma SD` = 0.929490898547702, 
`VA01_F0final_sma SD` = 0.88394728516363, `VA01_F0finEnv_sma RG` = 5.71798783504483, 
`VA01_F0final_sma RG` = 5.33346937206177, VA01_pcm_intensity_sma = 0.131779638620608, 
VB01_F0finEnv_sma = 1.85127192814681e-16, VB01_F0final_sma = -5.00994289546507e-17, 
VB01_jitterLocal_sma = -2.34859079707077e-17, VB01_shimmerLocal_sma = -2.60455164544764e-17, 
VB01_mfcc_sma.0. = -2.39198962485355e-17, VB01_mfcc_sma.1. = -3.89002856985193e-17, 
VB01_mfcc_sma.2. = -2.78570899311261e-17, VB01_mfcc_sma.3. = 2.68372615000544e-17, 
VB01_mfcc_sma.4. = -2.08322992987213e-17, VB01_mfcc_sma.5. = 5.73790146364059e-18, 
VB01_mfcc_sma.6. = -2.27766888801672e-17, VB01_mfcc_sma.7. = 3.11227273727354e-17, 
VB01_mfcc_sma.8. = -2.45352291763184e-18, VB01_mfcc_sma.9. = -5.66246851843269e-17, 
VB01_mfcc_sma.10. = 2.26611187363704e-17, VB01_mfcc_sma.11. = 1.36185358636217e-18, 
VB01_mfcc_sma.12. = 1.13196270572855e-17, VB01_mfcc_sma.13. = -4.09118309040634e-17, 
VB01_mfcc_sma.14. = 8.56156119316215e-18, VB01_F0finEnv_sma = 1, 
VB01_F0final_sma = 1, VB01_F0finEnv_sma = 11.2475000713838, VB01_F0final_sma = 13.6936948927086, 
VB01_pcm_intensity_sma = -3.4465563242102e-17), `306` = c(VA01_F0finEnv_sma = -0.265406298807794, 
VA01_F0final_sma = -0.217207612472567, VA01_jitterLocal_sma = 0.150860347120316, 
VA01_shimmerLocal_sma = 0.210465396713397, VA01_mfcc_sma.0. = 0.114340766331976, 
VA01_mfcc_sma.1. = 0.0537744993637855, VA01_mfcc_sma.2. = 0.371262331337075, 
VA01_mfcc_sma.3. = 0.341838144083938, VA01_mfcc_sma.4. = 0.065960158721897, 
VA01_mfcc_sma.5. = 0.166898244394498, VA01_mfcc_sma.6. = -0.128435352160981, 
VA01_mfcc_sma.7. = 0.255026383486623, VA01_mfcc_sma.8. = 0.141403503053762, 
VA01_mfcc_sma.9. = 0.436080874021934, VA01_mfcc_sma.10. = 0.137901303147026, 
VA01_mfcc_sma.11. = -0.0141897040654165, VA01_mfcc_sma.12. = 0.389348649641122, 
VA01_mfcc_sma.13. = 0.41027689879224, VA01_mfcc_sma.14. = 0.296014840147772, 
VA01_F0finEnv_sma = 1.06416827571567, VA01_F0final_sma = 1.14875289621226, 
VA01_F0finEnv_sma = 5.53095033394156, VA01_F0final_sma = 5.59523999359499, 
VA01_pcm_intensity_sma = 0.131779638620608), `371` = c(VA01_F0finEnv_sma = 0.149095349677244, 
VA01_F0final_sma = 0.120929343088889, VA01_jitterLocal_sma = -0.0785745451433892, 
VA01_shimmerLocal_sma = -0.114102345203172, VA01_mfcc_sma.0. = -0.0642286999362642, 
VA01_mfcc_sma.1. = -0.030206778340382, VA01_mfcc_sma.2. = -0.208549388306997, 
VA01_mfcc_sma.3. = -0.192020923835602, VA01_mfcc_sma.4. = -0.0370518353007777, 
VA01_mfcc_sma.5. = -0.093751840248999, VA01_mfcc_sma.6. = 0.0721460591859715, 
VA01_mfcc_sma.7. = -0.143256107040908, VA01_mfcc_sma.8. = -0.0794306655354017, 
VA01_mfcc_sma.9. = -0.244959943019604, VA01_mfcc_sma.10. = -0.0774633729052873, 
VA01_mfcc_sma.11. = 0.00797079006761165, VA01_mfcc_sma.12. = -0.218709025578709, 
VA01_mfcc_sma.13. = -0.230465062187873, VA01_mfcc_sma.14. = -0.166280574763084, 
VA01_F0finEnv_sma = 0.929490898547702, VA01_F0final_sma = 0.88394728516363, 
VA01_F0finEnv_sma = 5.71798783504483, VA01_F0final_sma = 5.33346937206177, 
VA01_pcm_intensity_sma = -0.296132477275194, VA02_F0finEnv_sma = 1.85127192814681e-16, 
VA02_F0final_sma = -5.00994289546507e-17, VA02_jitterLocal_sma = -2.34859079707077e-17, 
VA02_shimmerLocal_sma = -2.60455164544764e-17, VA02_mfcc_sma.0. = -2.39198962485355e-17, 
VA02_mfcc_sma.1. = -3.89002856985193e-17, VA02_mfcc_sma.2. = -2.78570899311261e-17, 
VA02_mfcc_sma.3. = 2.68372615000544e-17, VA02_mfcc_sma.4. = -2.08322992987213e-17, 
VA02_mfcc_sma.5. = 5.73790146364059e-18, VA02_mfcc_sma.6. = -2.27766888801672e-17, 
VA02_mfcc_sma.7. = 3.11227273727354e-17, VA02_mfcc_sma.8. = -2.45352291763184e-18, 
VA02_mfcc_sma.9. = -5.66246851843269e-17, VA02_mfcc_sma.10. = 2.26611187363704e-17, 
VA02_mfcc_sma.11. = 1.36185358636217e-18, VA02_mfcc_sma.12. = 1.13196270572855e-17, 
VA02_mfcc_sma.13. = -4.09118309040634e-17, VA02_mfcc_sma.14. = 8.56156119316215e-18, 
VA02_F0finEnv_sma = 1, VA02_F0final_sma = 1, VA02_F0finEnv_sma = 11.2475000713838, 
VA02_F0final_sma = 13.6936948927086, VA02_pcm_intensity_sma = -6.36552851373548e-16
), `389` = c(VA03_F0finEnv_sma = 1.58629187875084e-18, VA03_F0final_sma = -9.46461808448016e-17, 
VA03_jitterLocal_sma = -2.36937840621222e-17, VA03_shimmerLocal_sma = 2.85381254827705e-17, 
VA03_mfcc_sma.0. = -9.6294053113996e-17, VA03_mfcc_sma.1. = -5.49550802238737e-17, 
VA03_mfcc_sma.2. = -2.20247732336464e-17, VA03_mfcc_sma.3. = -4.454892003863e-17, 
VA03_mfcc_sma.4. = -7.9524696067784e-17, VA03_mfcc_sma.5. = -1.05680749785702e-17, 
VA03_mfcc_sma.6. = -6.21185832013518e-17, VA03_mfcc_sma.7. = 4.15417752575213e-18, 
VA03_mfcc_sma.8. = -8.15134676706638e-18, VA03_mfcc_sma.9. = 1.99531361473134e-18, 
VA03_mfcc_sma.10. = 3.1218480555731e-17, VA03_mfcc_sma.11. = 2.1815428854396e-17, 
VA03_mfcc_sma.12. = 8.15963331541171e-17, VA03_mfcc_sma.13. = 6.16448325097666e-19, 
VA03_mfcc_sma.14. = 2.41651014444211e-17, VA03_F0finEnv_sma = 1, 
VA03_F0final_sma = 1, VA03_F0finEnv_sma = 3.91835747651944, VA03_F0final_sma = 7.13223541696321, 
VA03_pcm_intensity_sma = -3.4465563242102e-17))

I would like a data frame with an ID column that holds the names of the sub-lists (252, 306, 371, ...) and one column for each named element inside the sub-lists. Those columns should keep their names (VA01_F0finEnv_sma, ...). Columns shared between sub-lists should be combined: for example, 252 and 306 both contain an element named VA01_F0finEnv_sma, so they should share one column. 389, however, has elements starting with VA03 that the others don't have.

This is what the data is supposed to LOOK like:

ID   VA01_F0finEnv  VA01_F0final  VA01_jitterLocal_sma  ...  VA03_pcm_intens.
252  1.490953e-01   1.209293e-01  -7.857455e-02         ...  NA
...  ...            ...           ...                   ...  ...
389  NA             NA            NA                    ...  -3.446556e-17

CodePudding user response:

Your list currently contains a set of named numeric vectors. You first need to convert each vector to a list and then to a data frame (I assigned the sample data you supplied to df).

This uses the plyr package.

library(plyr)
df_list <- lapply(df, function(x) as.data.frame(as.list(x)))

This should convert each element of your list to a (sub)list and then to a one-row data frame.
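One detail worth checking before the conversion: in the dput() output above, some names occur more than once within a vector (for example VB01_F0finEnv_sma in element 252), and some contain a space. As far as I know, as.data.frame() repairs such names by default, so the resulting column names may not match the originals exactly. The sketch below uses a made-up vector v (hypothetical names and values, not the asker's data) to show how to detect duplicates and, as one possible alternative, make the names unique yourself with make.unique() before converting.

```r
# Hypothetical vector that mimics the repeated names seen in the question
v <- c(F0_sma = 0.1, F0_sma = 1, F0_sma = 11.2, `F0_sma SD` = 0.9)

# TRUE if any name is used more than once
has_dups <- any(duplicated(names(v)))
print(has_dups)

# Default conversion: names are checked and repaired by as.data.frame()
print(names(as.data.frame(as.list(v))))

# Alternative: control the suffix yourself before converting
names(v) <- make.unique(names(v), sep = "_")
print(names(v))
```

Whether the repeated names should become separate columns or be renamed to something meaningful (such as the SD and RG suffixes visible in element 252) depends on how the list was built, which the question does not show.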

Then you can bind each of these data frames with a new column, id, that holds the name of its list element. After that, rbind.fill will bind the rows of the data frames in your list, filling columns that a data frame lacks with NA.

do.call(rbind.fill, unname(Map(cbind, id = names(df_list), df_list)))
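Put together on a small toy list, the two steps above might look like the following sketch. The list toy, its element names, and its values are invented for illustration; the only non-base function used is rbind.fill() from plyr.

```r
library(plyr)  # provides rbind.fill()

# Toy stand-in for the real list (hypothetical names and values)
toy <- list(
  `252` = c(VA01_a = 0.1, VA01_b = 0.2),
  `389` = c(VA03_a = 0.3)
)

# Step 1: named vector -> list -> one-row data frame
toy_list <- lapply(toy, function(x) as.data.frame(as.list(x)))

# Step 2: prepend the element name as an id column, then stack the rows;
# columns missing from a data frame are filled with NA
wide <- do.call(rbind.fill, unname(Map(cbind, id = names(toy_list), toy_list)))
print(wide)
```

The result has one row per list element and the union of all column names, which matches the layout requested in the question.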

Hope this helps!

CodePudding user response:

You can transpose each element of the list, convert it to a data.table, and combine the results with rbindlist(), using the fill = TRUE argument to allow column names to differ:

library(data.table)
rbindlist(lapply(zF_agg2, \(i) as.data.table(t(i))), fill = TRUE)
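Since the question also asks for an ID column, one possible extension is the idcol argument of rbindlist(), which for a named input list fills the new column with the list names. The sketch below is only an illustration on invented data: toy and wide_dt are hypothetical names, and the values are made up.

```r
library(data.table)

# Toy stand-in for the real list (hypothetical names and values)
toy <- list(
  `252` = c(VA01_a = 0.1, VA01_b = 0.2),
  `389` = c(VA03_a = 0.3)
)

# t() turns each named vector into a one-row matrix whose column names
# are the vector names; as.data.table() keeps those column names
wide_dt <- rbindlist(
  lapply(toy, function(v) as.data.table(t(v))),
  fill  = TRUE,   # allow differing column sets, pad with NA
  idcol = "ID"    # new first column taken from names(toy)
)
print(wide_dt)
```

The \(i) shorthand in the answer requires R 4.1 or later; function(v) is used here so the sketch also runs on older versions.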