If this is my data frame, how do I convert each row to an array?
3 4 5 6 97 98 99 100
0 1.0 2.0 3.0 4.0 95.0 96.0 97.0 98.0
1 50699.0 16302.0 50700.0 16294.0 50735.0 16334.0 50737.0 16335.0
2 57530.0 33436.0 57531.0 33438.0 NaN NaN NaN NaN
3 24014.0 24015.0 34630.0 24016.0 NaN NaN NaN NaN
4 44933.0 2611.0 44936.0 2612.0 44982.0 2631.0 44972.0 2633.0
1792 46712.0 35340.0 46713.0 35341.0 46759.0 35387.0 46760.0 35388.0
1793 61283.0 40276.0 61284.0 40277.0 61330.0 40323.0 61331.0 40324.0
1794 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0
1795 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0
1796 27156.0 48331.0 27157.0 48332.0 NaN NaN NaN NaN
For example, I want the first row to become [1.0, 2.0, 3.0, 4.0, 95.0, 96.0, 97.0, 98.0].
CodePudding user response:
You can loop over the dataframe's rows and assign the NumPy arrays dynamically to names in the global symbol table dict (globals()).
To loop over rows, you can loop over the transposed dataframe's columns.
import numpy as np
import pandas as pd
# sample frame
df = pd.DataFrame({'col1' : [np.nan, 1.0, 4.5, 1.3, np.nan, 6.7],
                   'col2' : [-0.4, 0.5, -2.3, np.nan, 1.2, 2.4]})
# transpose so that each original row becomes a column
df = df.transpose()
# dynamic assignment -> global symbol table
for i in df.columns:
    globals()['v{}'.format(i + 1)] = np.array(df[i])
v1
>array([ nan, -0.4])
v2
>array([1. , 0.5])
EDIT: Added `transpose()` to provide rows rather than columns as in the initial answer. Thanks, BeRT2me.
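If you would rather not write into globals(), one possible alternative is to keep the arrays in a dictionary keyed by row label. This is only a minimal sketch, not part of the answer above: the name row_arrays is made up for illustration, and the sample frame is a shortened copy of the one used above.

```python
import numpy as np
import pandas as pd

# shortened sample frame (assumption: same shape of data as in the answer above)
df = pd.DataFrame({'col1': [np.nan, 1.0, 4.5],
                   'col2': [-0.4, 0.5, -2.3]})

# map each row label to that row's values as a 1-D NumPy array
# (row_arrays is a hypothetical name chosen for this sketch)
row_arrays = {idx: row.to_numpy() for idx, row in df.iterrows()}

second_row = row_arrays[1]  # the row with label 1
```

Looking up row_arrays[1] then plays the role of v2 in the answer above, without creating new global names.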
CodePudding user response:
>>> import numpy as np
>>> out = df.apply(np.array, axis=1) # df.apply(list, axis=1)
>>> print(out.to_frame('arrays'))
arrays
0 [1.0, 2.0, 3.0, 4.0, 95.0, 96.0, 97.0, 98.0]
1 [50699.0, 16302.0, 50700.0, 16294.0, 50735.0, ...
2 [57530.0, 33436.0, 57531.0, 33438.0, nan, nan,...
3 [24014.0, 24015.0, 34630.0, 24016.0, nan, nan,...
4 [44933.0, 2611.0, 44936.0, 2612.0, 44982.0, 26...
1792 [46712.0, 35340.0, 46713.0, 35341.0, 46759.0, ...
1793 [61283.0, 40276.0, 61284.0, 40277.0, 61330.0, ...
1794 [0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]
1795 [0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]
1796 [27156.0, 48331.0, 27157.0, 48332.0, nan, nan,...
>>> print(df.to_numpy().tolist())
[[1.0, 2.0, 3.0, 4.0, 95.0, 96.0, 97.0, 98.0],
[50699.0, 16302.0, 50700.0, 16294.0, 50735.0, 16334.0, 50737.0, 16335.0],
[57530.0, 33436.0, 57531.0, 33438.0, nan, nan, nan, nan],
[24014.0, 24015.0, 34630.0, 24016.0, nan, nan, nan, nan],
[44933.0, 2611.0, 44936.0, 2612.0, 44982.0, 2631.0, 44972.0, 2633.0],
[46712.0, 35340.0, 46713.0, 35341.0, 46759.0, 35387.0, 46760.0, 35388.0],
[61283.0, 40276.0, 61284.0, 40277.0, 61330.0, 40323.0, 61331.0, 40324.0],
[0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0],
[0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0],
[27156.0, 48331.0, 27157.0, 48332.0, nan, nan, nan, nan]]
CodePudding user response:
What about
>>> rows = [*df.to_numpy()] # list of arrays
>>> rows[0]
array([ 1., 2., 3., 4., 95., 96., 97., 98.])
or, since you seem to be using the words list and array interchangeably,
>>> [*rows] = map(list, df.to_numpy()) # list of lists
>>> rows[0]
[1.0, 2.0, 3.0, 4.0, 95.0, 96.0, 97.0, 98.0]
?
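For anyone who wants to try both variants without the original data, here is a minimal self-contained sketch. The two-row frame is a stand-in modeled on rows from the question (an assumption, with a default 0..1 index), and the names rows_as_arrays and rows_as_lists are made up for illustration.

```python
import numpy as np
import pandas as pd

# small stand-in frame (assumption: two rows copied from the question's data)
df = pd.DataFrame(
    [[1.0, 2.0, 3.0, 4.0, 95.0, 96.0, 97.0, 98.0],
     [57530.0, 33436.0, 57531.0, 33438.0, np.nan, np.nan, np.nan, np.nan]],
    columns=[3, 4, 5, 6, 97, 98, 99, 100],
)

rows_as_arrays = list(df.to_numpy())    # list of 1-D NumPy arrays, one per row
rows_as_lists = df.to_numpy().tolist()  # list of plain Python lists, one per row

first = rows_as_lists[0]
```

Both hold the same values; the difference is only whether each row is an ndarray or a plain list.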