Home > database >  Adding new data to a Dataframe from another Dataframe based on condition
Adding new data to a Dataframe from another Dataframe based on condition

Time:01-28

So my question here is how can I add data in new column to dataframe based on conditions from another dataframe. It is kinda difficult to say it so I am giving an example here

df1

columns  a   b  c
         0   10  1
         10  15  3
         15  20  5


df2
columns  d      e  
         3.3   10   
         5.5   20
         14.5  11
         17.2  5
   

What I want to do here is to add another column f to df2, and its value is from df1 such that if d[i] is between a[j] and b[j], then copy the value c[j] to the new column f[i] in df2. for example: d[1] = 5.5 so 0< 5.5< 10 hence, the value of f[1] = c[0] = 1

the final results should look like

df2
columns  d      e    f
         3.3   10    1 
         5.5   20    1
         14.5  11    3
         17.2  5     5
   

Any help is greatly appreciated!

Regards,

Steve

CodePudding user response:

Assuming non-overlapping intervals in df1 a and b, you can use pd.cut with a pd.IntervalIndex:

import pandas as pd

# Your dfs here
df1 = pd.read_clipboard()
df2 = pd.read_clipboard()

idx = pd.IntervalIndex.from_arrays(df1["a"], df1["b"])
mapping = df1["c"].set_axis(idx)

df2["f"] = pd.cut(df2["d"], idx).map(mapping)

df2:

      d   e  f
0   3.3  10  1
1   5.5  20  1
2  14.5  11  3
3  17.2   5  5

CodePudding user response:

You could use:

result = []
for item in df2['d']:
    for row in df1.iterrows():
        if row[1]['a'] <= item <= row[1]['b']:
            val = (row[1]['c'])
            break
        else:
            val = None
    result.append(val)
            
df2['f'] = result

print(df2)

CodePudding user response:

import pandas as pd
df1 = pd.DataFrame({'a':[0,10,15],'b':[10,15,20],'c':[1,3,5]})
df2 = pd.DataFrame({'d':[3.3,5.5,9.5,17.2],'e':[10,20,11,5]})
df2['f']=0
for i in range(df2.shape[0]):
    for j in range(df1.shape[0]):
        if df2.d[i]>=df1.a[j] and df2.d[i]<=df1.b[j]:
            df2.f[i]=df1.c[j]
df2

CodePudding user response:

What about this option ?

# merge the two dfs
df = pd.merge(df2, df1, left_on='d', right_on='b', how='left')
df2['f'] = None
df2['f'] = df.apply(lambda x: x['c'] if x['a_x'] <= x['d'] <= x['b_x'] else None, axis=1)
  • Related