Assuming I have the following input:
table = pd.DataFrame({'a':[0,0,0,0],'b':[1,1,1,3,],'c':[2,2,5,4],'d':[3,6,6,6]},dtype='float64')
list = [[55,66],
[77]]
#output of the table
a b c d
0 0.0 1.0 2.0 3.0
1 0.0 1.0 2.0 6.0
2 0.0 1.0 5.0 6.0
3 0.0 3.0 4.0 6.0
I want to combine list
with table
so the final shape would be like:
a b c d ID_0 ID_1
0 0.0 1.0 2.0 3.0 55.0 66.0
1 0.0 1.0 2.0 6.0 77.0 NaN
2 0.0 1.0 5.0 6.0 NaN NaN
3 0.0 3.0 4.0 6.0 NaN NaN
I found a way but it looks a bit long and might be a shorter way to do it.
Step1:
x = pd.Series(list, name ="ID")
new = pd.concat([table, x], axis=1)
# output
a b c d ID
0 0.0 1.0 2.0 3.0 [5, 6]
1 0.0 1.0 2.0 6.0 [77]
2 0.0 1.0 5.0 6.0 NaN
3 0.0 3.0 4.0 6.0 NaN
step2:
ID = new['ID'].apply(pd.Series)
ID = ID.rename(columns = lambda x : 'ID_' str(x))
new_x = pd.concat([new[:], ID[:]], axis=1)
# output
a b c d ID ID_0 ID_1
0 0.0 1.0 2.0 3.0 [5, 6] 5.0 6.0
1 0.0 1.0 2.0 6.0 [77] 77.0 NaN
2 0.0 1.0 5.0 6.0 NaN NaN NaN
3 0.0 3.0 4.0 6.0 NaN NaN NaN
step3:
new_x = new_x.drop(columns=['ID'], axis = 1)
Any shorter way to achieve the same result?
CodePudding user response:
Try pd.Series
:
table = pd.DataFrame({'a':[0,0,0,0],'b':[1,1,1,3,],'c':[2,2,5,4],'d':[3,6,6,6]},dtype='float64')
list = [[55,66],
[77]]
table['ID_0'] = pd.Series(list)
print(table)
Result:
a b c d ID_0
0 0.0 1.0 2.0 3.0 [55, 66]
1 0.0 1.0 2.0 6.0 [77]
2 0.0 1.0 5.0 6.0 NaN
3 0.0 3.0 4.0 6.0 NaN
CodePudding user response:
Assuming a default index on table
(as shown in the question), we can simply create a DataFrame (either from_records or with the constructor) and join back to table
and allow the indexes to align. (add_prefix is an easy way to add the 'ID_' prefix to the default numeric columns)
new_df = table.join(
pd.DataFrame.from_records(lst).add_prefix('ID_')
)
new_df
:
a b c d ID_0 ID_1
0 0.0 1.0 2.0 3.0 55.0 66.0
1 0.0 1.0 2.0 6.0 77.0 NaN
2 0.0 1.0 5.0 6.0 NaN NaN
3 0.0 3.0 4.0 6.0 NaN NaN
Working with 2 DataFrames is generally easier than a DataFrame and a list. Here is what from_records does to lst
:
pd.DataFrame.from_records(lst)
0 1
0 55 66.0
1 77 NaN
Index (rows) 0 and 1 will now align with the corresponding index values in table
(0 and 1 respectively).
add_prefix fixes the column names before joining:
pd.DataFrame.from_records(lst).add_prefix('ID_')
ID_0 ID_1
0 55 66.0
1 77 NaN
Setup and imports:
import pandas as pd # v1.4.4
table = pd.DataFrame({
'a': [0, 0, 0, 0],
'b': [1, 1, 1, 3, ],
'c': [2, 2, 5, 4],
'd': [3, 6, 6, 6]
}, dtype='float64')
lst = [[55, 66],
[77]]