How to create a nested array from CSV rows using pandas?

I am trying to figure out how to load my CSV rows into a nested array.

For example, my CSV file:

id	a1	a2	b1	b2	c1	c2	d1	d2
1	1	2	3	4	5	6	7	8
2	9	10	11	12	13	14	15	16
3	17	18	19	20	21	22	23	24
...

How do I turn it into a nested array? For each row, I want to group every two columns into a pair, like this:

[ 
  [[1, 2], [3, 4], [5, 6], [7, 8]],          #row 1
  [[9, 10], [11, 12], [13, 14], [15, 16]],   #row 2
  [[17, 18], [19, 20], [21, 22], [23, 24]],  #row 3
  ...
]

CodePudding user response:

To get a nested list with all columns grouped in pairs, excluding the id column, use:

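import numpy as np

# df is the DataFrame loaded from the CSV file
df = df.drop('id', axis=1)

# reshape to (rows, column pairs, 2), then convert to nested lists
L = np.reshape(df.to_numpy(), (len(df.index), len(df.columns) // 2, 2)).tolist()
print(L)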
[[[1, 2], [3, 4], [5, 6], [7, 8]],
 [[9, 10], [11, 12], [13, 14], [15, 16]],
 [[17, 18], [19, 20], [21, 22], [23, 24]]]
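For anyone who wants to run the reshape approach end to end, here is a minimal self-contained sketch. It assumes the file is tab-separated (as the sample appears to be) and reads the sample rows from a string instead of a file; CSV_TEXT and the helper rows_to_pairs are hypothetical names used only for illustration.

```python
import io

import numpy as np
import pandas as pd

# Sample rows copied from the question; a tab separator is assumed.
CSV_TEXT = (
    "id\ta1\ta2\tb1\tb2\tc1\tc2\td1\td2\n"
    "1\t1\t2\t3\t4\t5\t6\t7\t8\n"
    "2\t9\t10\t11\t12\t13\t14\t15\t16\n"
    "3\t17\t18\t19\t20\t21\t22\t23\t24\n"
)


def rows_to_pairs(df, id_col="id"):
    """Return each row as a list of [x, y] pairs (hypothetical helper)."""
    values = df.drop(columns=id_col).to_numpy()
    # Target shape is (rows, pairs per row, 2). With an odd number of
    # value columns the sizes do not match and reshape raises ValueError.
    return np.reshape(values, (len(values), values.shape[1] // 2, 2)).tolist()


df = pd.read_csv(io.StringIO(CSV_TEXT), sep="\t")
nested = rows_to_pairs(df)
print(nested[0])  # [[1, 2], [3, 4], [5, 6], [7, 8]]
```

The reshape relies only on column order, not on column names, so it pairs whichever columns happen to be adjacent.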

CodePudding user response:

Group the DataFrame by its index, i.e. level=0 (the grouping can be skipped if the index has no duplicate entries). For each group, filter the columns by letter to get each pair, apply list along axis=1 to turn each pair into a list, and concatenate the results along axis=1. Then apply list along axis=1 again to collect the pairs of each row, and finally call tolist(). Note that filter(like='d') also matches a column named id, so this assumes id is the index rather than a regular column:

(df.groupby(level=0)
   .apply(lambda x: pd.concat([x.filter(like=c).apply(list, axis=1)
                               for c in 'abcd'], axis=1))
   .apply(list, axis=1)
   .tolist())

OUTPUT:

[
    [
        [1, 2], [3, 4], [5, 6], [7, 8]
    ], 
    [
        [9, 10], [11, 12], [13, 14], [15, 16]
    ], 
    [
        [17, 18], [19, 20], [21, 22], [23, 24]
    ]
]
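A name-based grouping can also be sketched without groupby. This is a variation on the idea above, not the answer's exact code: it selects columns with an anchored regex per letter, so a column named id is not picked up by the d group. The tab separator and the names CSV_TEXT and per_prefix are assumptions made for illustration.

```python
import io

import pandas as pd

# Sample rows copied from the question; a tab separator is assumed.
CSV_TEXT = (
    "id\ta1\ta2\tb1\tb2\tc1\tc2\td1\td2\n"
    "1\t1\t2\t3\t4\t5\t6\t7\t8\n"
    "2\t9\t10\t11\t12\t13\t14\t15\t16\n"
    "3\t17\t18\t19\t20\t21\t22\t23\t24\n"
)

df = pd.read_csv(io.StringIO(CSV_TEXT), sep="\t")

# For each letter, collect the columns whose name starts with it.
# The "^" anchor keeps "id" out of the "d" group.
per_prefix = [df.filter(regex=f"^{p}").to_numpy().tolist() for p in "abcd"]

# per_prefix holds one list of rows per letter; zip regroups it by row.
nested = [list(pairs) for pairs in zip(*per_prefix)]
print(nested[1])  # [[9, 10], [11, 12], [13, 14], [15, 16]]
```

Unlike the reshape approach, this pairs columns by name, so it still works if the columns are not stored in a1, a2, b1, b2 order.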