I am trying to figure out how to load the rows of my CSV file into a nested array.
For example, my CSV file looks like this:
id | a1 | a2 | b1 | b2 | c1 | c2 | d1 | d2 |
---|---|---|---|---|---|---|---|---|
1 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 |
2 | 9 | 10 | 11 | 12 | 13 | 14 | 15 | 16 |
3 | 17 | 18 | 19 | 20 | 21 | 22 | 23 | 24 |
... |
How do I turn it into a nested array?
For each row, I want to group every two columns (ignoring id), as shown below:
[
[[1, 2], [3, 4], [5, 6], [7, 8]], #row 1
[[9, 10], [11, 12], [13, 14], [15, 16]], #row 2
[[17, 18], [19, 20], [21, 22], [23, 24]], #row 3
...
]
CodePudding user response:
To get a nested list that pairs up all columns except the id column, use:
import numpy as np

# df is the DataFrame read from the CSV (e.g. via pd.read_csv)
df = df.drop('id', axis=1)
L = np.reshape(df.to_numpy(), (len(df.index), len(df.columns) // 2, 2)).tolist()
print(L)
[[[1, 2], [3, 4], [5, 6], [7, 8]],
[[9, 10], [11, 12], [13, 14], [15, 16]],
[[17, 18], [19, 20], [21, 22], [23, 24]]]
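For anyone who wants to try this end to end, here is a minimal self-contained sketch of the same reshape idea. It assumes the file is comma-separated with the header names shown in the question; `CSV_TEXT` and `rows_to_pairs` are hypothetical names used only for illustration, and the in-memory string stands in for the real file.

```python
import io

import numpy as np
import pandas as pd

# Hypothetical in-memory stand-in for the CSV file from the question;
# a comma delimiter is assumed.
CSV_TEXT = """id,a1,a2,b1,b2,c1,c2,d1,d2
1,1,2,3,4,5,6,7,8
2,9,10,11,12,13,14,15,16
3,17,18,19,20,21,22,23,24
"""


def rows_to_pairs(df, id_col="id"):
    """Return each row's non-id values grouped into consecutive pairs."""
    values = df.drop(columns=id_col).to_numpy()
    # -1 lets NumPy infer the number of pairs per row; reshape raises a
    # ValueError if the number of value columns is odd.
    return values.reshape(len(values), -1, 2).tolist()


df = pd.read_csv(io.StringIO(CSV_TEXT))
nested = rows_to_pairs(df)
print(nested)
```

With a real file, passing its path to `pd.read_csv` in place of the `io.StringIO` object should work the same way, provided the delimiter matches.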
CodePudding user response:
Group the dataframe by its index, i.e. level=0 (this step can be skipped if the index has no duplicate entries). For each group, filter the dataframe to get each pair of columns and apply list on axis=1 to each filtered dataframe. Then concatenate the results on axis=1, apply list again on axis=1, and finally call tolist(). This assumes id is the index rather than a column; otherwise filter(like='d') would also match id:
(df.groupby(level=0)
   .apply(lambda x: pd.concat([x.filter(like=c)
                                .apply(list, axis=1) for c in 'abcd'], axis=1)
          )
   .apply(list, axis=1)
   .tolist()
)
OUTPUT:
[
[
[1, 2], [3, 4], [5, 6], [7, 8]
],
[
[9, 10], [11, 12], [13, 14], [15, 16]
],
[
[17, 18], [19, 20], [21, 22], [23, 24]
]
]
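For comparison, the same grouping can also be done without pandas or NumPy. The sketch below is an alternative to the answers above, not their method: it uses only the standard-library `csv` module and assumes a comma-separated file whose first column is id and whose remaining cells are integers. `CSV_TEXT` and `load_pairs` are hypothetical names.

```python
import csv
import io

# Hypothetical in-memory stand-in for the CSV file from the question;
# a comma delimiter and integer cells are assumed.
CSV_TEXT = """id,a1,a2,b1,b2,c1,c2,d1,d2
1,1,2,3,4,5,6,7,8
2,9,10,11,12,13,14,15,16
3,17,18,19,20,21,22,23,24
"""


def load_pairs(handle):
    """Read CSV rows and group each row's non-id values two at a time."""
    reader = csv.reader(handle)
    next(reader)  # skip the header row
    nested = []
    for row in reader:
        values = [int(cell) for cell in row[1:]]  # row[0] is the id column
        # Step through the values two at a time to form the pairs.
        nested.append([values[i:i + 2] for i in range(0, len(values), 2)])
    return nested


nested_rows = load_pairs(io.StringIO(CSV_TEXT))
print(nested_rows)
```

Unlike the reshape approach, a row with an odd number of values does not raise an error here; its last group simply contains a single element.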