Pandas/Python: Store values of columns into list based on value in another column

Time:02-15

I have the following problem:

I want to store the values of four columns (Age_1 to Age_4) of a dataframe in a list, grouped by the value in the first column, 'Year'.

Year Age_1 Age_2 Age_3 Age_4
2000 18 20 25 56
2000 17 32 24 41
2001 20 26 24 39

...

So basically I want one list per year that contains all the ages in the dataset for that year. For example, the first list would be list_2000 = [18, 20, 25, 56, 17, 32, 24, 41, ...] and the second would be list_2001 = [20, 26, 24, 39, ...].

I assume this should be easy to do, but my attempts haven't been successful so far. Any help is appreciated.

CodePudding user response:

IIUC, group by 'Year', take each group's underlying NumPy array, flatten it with ravel, and convert it to a list with tolist:

dic = (
    df.set_index('Year').groupby(level='Year')
      .apply(lambda d: d.to_numpy().ravel().tolist())
      .to_dict()
)

output:

{2000: [18, 20, 25, 56, 17, 32, 24, 41], 2001: [20, 26, 24, 39]}
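For anyone who wants to run this end to end, here is a minimal self-contained sketch. It assumes the dataframe holds only the three rows shown in the question, and it swaps the apply call for a plain dict comprehension over the groups, which should give the same result. The names age_cols and ages_by_year are made up for illustration.

```python
import pandas as pd

# Sample data copied from the question (only the three rows shown there).
df = pd.DataFrame({
    "Year":  [2000, 2000, 2001],
    "Age_1": [18, 17, 20],
    "Age_2": [20, 32, 26],
    "Age_3": [25, 24, 24],
    "Age_4": [56, 41, 39],
})

# Hypothetical helper name: the columns whose values should be collected.
age_cols = ["Age_1", "Age_2", "Age_3", "Age_4"]

# For each year, flatten that year's block of age columns row by row
# into a single Python list.
ages_by_year = {
    int(year): group[age_cols].to_numpy().ravel().tolist()
    for year, group in df.groupby("Year")
}

print(ages_by_year)
```

A dict keyed by year is usually easier to work with than separate variables such as list_2000 and list_2001, since ages_by_year[2000] gives the same list without creating variable names dynamically.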

CodePudding user response:

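IIUC, reshape the frame to long format with melt, then group by 'Year' and collect the values into lists: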

(
    df.melt('Year', value_vars=['Age_1', 'Age_2', 'Age_3', 'Age_4'])
      .groupby('Year')['value'].agg(list)
      .to_dict()
)
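One thing worth checking with the melt approach is the order of the values. melt stacks the columns one after another, so each year's list comes out column by column rather than row by row. The sketch below assumes the same three sample rows from the question; by_year_melt is a made-up name.

```python
import pandas as pd

# Sample data copied from the question (only the three rows shown there).
df = pd.DataFrame({
    "Year":  [2000, 2000, 2001],
    "Age_1": [18, 17, 20],
    "Age_2": [20, 32, 26],
    "Age_3": [25, 24, 24],
    "Age_4": [56, 41, 39],
})

# Long format: one row per (Year, column, value), ordered column by column.
melted = df.melt("Year", value_vars=["Age_1", "Age_2", "Age_3", "Age_4"])

# Collect the melted values for each year into a list.
by_year_melt = melted.groupby("Year")["value"].agg(list).to_dict()

print(by_year_melt)
# The 2000 list starts with both Age_1 values (18, 17), then both Age_2
# values (20, 32), and so on, instead of following the original row order.
```

If only the set of ages per year matters, this makes no difference. If the row-by-row order from the question is needed, the ravel-based answer keeps it.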