Home > Back-end >  Conditional Replace in two data frame
Conditional Replace in two data frame

Time:01-03

I have two data frames with several thousand lines, which are like the following two data frames:

enter image description here enter image description here

I want the value of the goal column in the first data frame to be moved to the second data frame wherever the names of the campaigns are the same in the first and second data frames. I mean something like the following data frame:

enter image description here

CodePudding user response:

You have to do left join with df_1 in the left side and then use the existing goal column in df_1 to fill the nulls produced by join.

df_1 = pd.DataFrame()
df_2 = pd.DataFrame()

df_1['campaign'] = ['a', 'b', 'c', 'd']
df_1['goal'] =['order', 'order', 'off', 'order']

df_2['campaign'] = ['a', 'b', 'c']
df_2['goal'] = ['Subscription', 'order', 'Subscription']

# left join
df = df_1.merge(df_2.rename(columns={'goal': 'new_goal'}), on=['campaign'], how='left')
# replace nulls 
df['new_goal'].fillna(df['goal'], inplace=True)

df

 --- ---------- ------- -------------- 
|   | campaign | goal  |   new_goal   |
 --- ---------- ------- -------------- 
| 0 |    a     | order | Subscription |
| 1 |    b     | order |    order     |
| 2 |    c     |  off  | Subscription |
| 3 |    d     | order |    order     |
 --- ---------- ------- -------------- 

You can select the columns you need and rename them as per your need

df_final = df[['campaign', 'new_goal']].rename(columns={'new_goal': 'goal'})

CodePudding user response:

This will overwrite values in df1

import pandas as pd

df1 = pd.DataFrame({'campaign':['a','b','c','d'],'goal':['order','order','off','order',]})
df2 = pd.DataFrame({'campaign':['a','b','c','d'],'goal':['Subscription','order','Subscription','order',]})
df2.merge(df1, how= 'left')

>>  campaign    goal
    0   a   Subscription
    1   b   order
    2   c   Subscription
    3   d   order

CodePudding user response:

You can use pandas.DataFrame.merge for that:

https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.merge.html

df1 = pd.DataFrame({'campaign': ['a','b','c','d'], 'goal': ['order','order','off','order']})
df2 = pd.DataFrame({'campaign': ['a','b','c'], 'goal': ['subscription','order','subscription']})

df_out=pd.merge(df2,df1,on='campaign',how='left',suffixes=('_df2','_df1'))

Result:

  campaign      goal_df2 goal_df1
0        a  subscription    order
1        b         order    order
2        c  subscription      off
  • Related