Home > front end >  How do I create a new column in a dataframe using values from another dataframe?
How do I create a new column in a dataframe using values from another dataframe?

Time:02-13

I have two dataframes, df_1 and df_2, where df_1 has several columns of "codes" and df_2 has the definitions for all of those codes:

df_1 = pd.DataFrame({
    'Age': [42, 35, 64, 53],
    'Code 1': [1234, 3452, 9583, 8753],
    'Code 2': [3857, np.nan, np.nan, 1234]})

df_2 = pd.DataFrame({
    'Code': [3452, 8753, 3857, 1234, 9583],
    'Code Def':['a', 'b', 'c', 'd', 'e']})

How do I create a new column in df_1 that contains the definitions of all the codes from df_2 to look something like this?

Age  Code 1  Code 2  Code def
42    1234    3857    d, c
35    3452    NaN      a
64    9583    NaN      e
53    8753    1234    b, d

I've tried using merge() to combine the two dataframes, but that doesn't work since I want to join on multiple columns in df_1 and just one column in df_2. I also tried creating empty columns in df_1 and filling them using if statements, but that got quite complicated.

Thanks!

CodePudding user response:

You could first stack and groupby agg to form the new column, then merge with the original dataset:

s = df_2.set_index(['Code'])['Code Def']

df_1.merge(df_1.set_index('Age')
               .stack().map(s)
               .groupby(level='Age').agg(','.join)
               .rename('Code def'),
           left_on='Age', right_index=True
           )

output:

   Age  Code 1  Code 2 Code def
0   42    1234  3857.0      d,c
1   35    3452     NaN        a
2   64    9583     NaN        e
3   53    8753  1234.0      b,d

CodePudding user response:

Here anther approach

x= dict(zip(df_2["Code"], df_2['Code Def']))
tmp = df_1[["Code 1","Code 2"]].replace({"Code 1":x,'Code 2':x}).fillna('')
df_1["Code Def"] = tmp["Code 1"]  " "   tmp["Code 2"] 

output

   Age  Code 1  Code 2 Code Def
    0   42    1234  3857.0      d c
    1   35    3452     NaN       a 
    2   64    9583     NaN       e 
    3   53    8753  1234.0      b d
  • Related