Home > Back-end >  Add to a column in dataframe string value from another dataframe
Add to a column in dataframe string value from another dataframe

Time:10-27

I have the following dataframes and Want to column score in df to the column summary in df1. when the year and names are equal in both dataframes, I was not able to add 2 values in the same field ( in ths case i want to add to the row of df1 when name ="juli" and year = "2010" the score "25,54" in the summary column: the code is as following :

import pandas as pd

lst = [['juli', 25,2010], ['krish', 30,2020],
       ['nick', 26, 2021], ['juli', 22,2020],['juli', 54,2010]]
    
df = pd.DataFrame(lst, columns =['Name', 'score',"year"])
lst1 = [['juli', 2,2010], ['krish', 1,2020],
       ['nick', 1,2021], ['juli', 1,2020] ]
    
df1 = pd.DataFrame(lst1, columns =['Name', 'm_count',"year"])

df1.insert(3,"summary",0)

for i in range(len(df1)):
        for j in range(len(df)):
            if df1["m_count"].values[i]==1:
                if df1["Name"].values[i] == df["Name"].values[j] and df1["year"].values[i] == df["year"].values[j]:
                    df1["summary"].values[i]= df["score"].values[j]
            else:
                ...

print(df1)

the current result is :

    Name  m_count  year  summary
0   juli        2  2010        0
1  krish        1  2020       30
2   nick        1  2021       26
3   juli        1  2020       22

the needed result :

    Name  m_count  year  summary
0   juli        2  2010       25,54
1  krish        1  2020       30
2   nick        1  2021       26
3   juli        1  2020       22

CodePudding user response:

Use a groupby to aggregate the scores for each ['Name', 'year'], in this case just string join them to get your desired output. Then merge the result back to the other DataFrame. That column of 0s is necessary, and problematic for the merge, so I drop it.

s = (df['score'].astype(str)
       .groupby([df['Name'], df['year']])
       .agg(','.join)
       .rename('summary'))

df1 = df1.drop(columns='summary').merge(s, on=['Name', 'year'], how='left')

    Name  m_count  year summary
0   juli        2  2010   25,54
1  krish        1  2020      30
2   nick        1  2021      26
3   juli        1  2020      22
  • Related