How can I insert a column of a dataframe in pandas as a list of a cell into another dataframe?

Time:11-07

I have several dataframes (df, tmp_df and sub_df), and I want to insert a column of tmp_df into a cell of sub_df as a list. My code and dataframes are shown below, but the loop part is not working correctly:

import pandas as pd
df = pd.read_csv('myfile.csv')
tmp_df = pd.DataFrame()
sub_df = pd.DataFrame()
tmp_df = df[df['Type'] == True]
for c in tmp_df['Category']:
    sub_df['Data'] , sub_df ['Category'], sub_df['Type'] = [list(set(tmp_df['Data']))],
    tmp_df['Category'], tmp_df['Type']

df:

Data Category Type
30275 A True
35881 C False
28129 C True
30274 D False
30351 D True
35886 A True
39900 C True
35887 A False
35883 A True
35856 D True
35986 C False
30350 D False
28129 C True
31571 C True

tmp_df:

Data Category Type
30275 A True
28129 C True
30351 D True
35886 A True
39900 C True
35883 A True
35856 D True
28129 C True
31571 C True

What should I do if I want the following result?

sub_df:

Data Category Type
[30275,35886,35883] A True
[28129,39900,28129,31571] C True
[30351,35856] D True

CodePudding user response:

You can select the rows with query, then group by Category and aggregate with agg:

(df.query('Type') # or 'Type == "True"' if strings
   .groupby('Category', as_index=False)
   .agg({'Data': list, 'Type': 'first'})
)

output:

  Category                          Data  Type
0        A         [30275, 35886, 35883]  True
1        C  [28129, 39900, 28129, 31571]  True
2        D                [30351, 35856]  True
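
For anyone who wants to reproduce this without myfile.csv, here is a minimal self-contained sketch. It assumes the Type column holds real booleans (not strings) and builds df inline from the sample rows in the question. It uses a boolean mask instead of query, which should select the same rows, and reorders the columns to match the layout requested for sub_df.

```python
import pandas as pd

# Sample rows copied from the question; built inline as a stand-in for myfile.csv
df = pd.DataFrame({
    "Data": [30275, 35881, 28129, 30274, 30351, 35886, 39900,
             35887, 35883, 35856, 35986, 30350, 28129, 31571],
    "Category": ["A", "C", "C", "D", "D", "A", "C",
                 "A", "A", "D", "C", "D", "C", "C"],
    "Type": [True, False, True, False, True, True, True,
             False, True, True, False, False, True, True],
})

# Keep only the rows where Type is True (assumes a boolean column)
tmp_df = df[df["Type"]]

# One row per Category: collect Data into a list, take Type from the first row
sub_df = (
    tmp_df.groupby("Category", as_index=False)
          .agg({"Data": list, "Type": "first"})
)

# Reorder the columns to match the layout asked for in the question
sub_df = sub_df[["Data", "Category", "Type"]]
print(sub_df)
```

No explicit loop is needed, because groupby already visits each Category once. Note that the lists keep duplicates (28129 appears twice for C), which matches the desired result in the question; the list(set(...)) call in the original code would have dropped them.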