How to create a nested array from csv rows using pandas?

Time:11-02

I am trying to figure out how to load the rows of my CSV file into a nested array.

For example, my CSV file:

id a1 a2 b1 b2 c1 c2 d1 d2
1 1 2 3 4 5 6 7 8
2 9 10 11 12 13 14 15 16
3 17 18 19 20 21 22 23 24
...

How do I turn it into a nested array like the one below?

For each row, I want to group every two columns into a pair:

[ 
  [[1, 2], [3, 4], [5, 6], [7, 8]],          #row 1
  [[9, 10], [11, 12], [13, 14], [15, 16]],   #row 2
  [[17, 18], [19, 20], [21, 22], [23, 24]],  #row 3
  ...
]

CodePudding user response:

To get a nested list that pairs up all columns except the id column, use:

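import numpy as np

# Drop the id column so that only the value columns remain
df = df.drop('id', axis=1)

# Reshape (rows, cols) into (rows, cols // 2, 2), then convert to nested lists
L = np.reshape(df.to_numpy(), (len(df.index), len(df.columns) // 2, 2)).tolist()
print(L)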
[[[1, 2], [3, 4], [5, 6], [7, 8]],
 [[9, 10], [11, 12], [13, 14], [15, 16]],
 [[17, 18], [19, 20], [21, 22], [23, 24]]]
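
For completeness, here is a minimal end-to-end sketch of the same reshape idea, starting from the file contents. It assumes the file is comma-separated (the question shows the columns separated by spaces, so adjust `sep` if needed); `csv_text` and `nested` are illustrative names, not from the original post.

```python
import io

import pandas as pd

# Sample data copied from the question; a comma-separated layout is assumed.
csv_text = """id,a1,a2,b1,b2,c1,c2,d1,d2
1,1,2,3,4,5,6,7,8
2,9,10,11,12,13,14,15,16
3,17,18,19,20,21,22,23,24
"""

# Reading id as the index keeps it out of the value columns.
df = pd.read_csv(io.StringIO(csv_text), index_col="id")

# Shape (3, 8) becomes (3, 4, 2); -1 lets NumPy infer the number of pairs.
# This only works when the number of value columns is even.
nested = df.to_numpy().reshape(len(df), -1, 2).tolist()
print(nested)
```

Using `-1` for the middle dimension avoids computing `len(df.columns) // 2` by hand; NumPy raises an error if the columns cannot be split evenly into pairs.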

CodePudding user response:

This appears to assume that id is the DataFrame index; if id were a regular column, filter(like='d') would also match it. Group the DataFrame by its index, i.e. level=0 (this step can be skipped if the index has no duplicate entries). For each group, filter the columns to get each pair and apply list on axis=1 to each filtered DataFrame. Then concatenate the results on axis=1, apply list on axis=1 again, and finally call tolist():

(
    df.groupby(level=0)
    .apply(lambda x: pd.concat(
        [x.filter(like=c).apply(list, axis=1) for c in 'abcd'], axis=1))
    .apply(list, axis=1)
    .tolist()
)

OUTPUT:

[
    [
        [1, 2], [3, 4], [5, 6], [7, 8]
    ], 
    [
        [9, 10], [11, 12], [13, 14], [15, 16]
    ], 
    [
        [17, 18], [19, 20], [21, 22], [23, 24]
    ]
]
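
As a rough illustration of the column filtering step used above, the sketch below shows what filter(like=...) selects and why id should be the index. It rebuilds the sample data by hand and pairs the columns without groupby, which is a simplification rather than the answer's exact method; `rows`, `indexed`, `per_prefix`, and `nested` are illustrative names.

```python
import pandas as pd

# Sample data from the question, built by hand for a self-contained example.
rows = [
    [1, 1, 2, 3, 4, 5, 6, 7, 8],
    [2, 9, 10, 11, 12, 13, 14, 15, 16],
    [3, 17, 18, 19, 20, 21, 22, 23, 24],
]
cols = ["id", "a1", "a2", "b1", "b2", "c1", "c2", "d1", "d2"]
df = pd.DataFrame(rows, columns=cols)

# filter(like=...) does substring matching, so "d" also matches "id".
print(list(df.filter(like="d").columns))

# With id moved to the index, only d1 and d2 match.
indexed = df.set_index("id")
print(list(indexed.filter(like="d").columns))

# One list of row pairs per prefix, then zip them back together row by row.
per_prefix = [indexed.filter(like=c).to_numpy().tolist() for c in "abcd"]
nested = [list(pairs) for pairs in zip(*per_prefix)]
print(nested)
```

Substring matching is convenient here but fragile: it relies on each prefix letter appearing only in its own pair of column names.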