Home > Mobile >  Join 2 data frame with special columns matching new
Join 2 data frame with special columns matching new

Time:05-17

i want to join two Dataframe and get result as bellow, i tried many ways but it fails

i want only texts on df2 ['A'] which contain text on df1 ['A']. please help me change the codes

I wanted :

0  A0_link0
1  A1_link1
2  A2_link2
3  A3_link3
import pandas as pd
df1 = pd.DataFrame(
    {
        "A": ["A0", "A1", "A2", "A3"],
    })


df2 = pd.DataFrame(
    {        "A": ["A0_link0", "A1_link1", "A2_link2", "A3_link3", "A4_link4", 'An_linkn'],
             "B" : ["B0_link0", "B1_link1", "B2_link2", "B3_link3", "B4_link4", 'Bn_linkn']
    })
result = pd.concat([df1, df2], ignore_index=True, join= "inner", sort=False)
print(result)

CodePudding user response:

Create an intermediate dataframe and map:

d = (df2.assign(key=df2['A'].str.extract(r'([^_] )'))
     .set_index('key'))

df1['A'].map(d['A'])

Output:

0    A0_link0
1    A1_link1
2    A2_link2
3    A3_link3
Name: A, dtype: object

Or merge if you want several columns from df2 (df1.merge(d, left_on='A', right_index=True))

CodePudding user response:

You can set the index as An and pd.concat on columns

result = (pd.concat([df1.set_index(df1['A']),
                     df2.set_index(df2['A'].str.split('_').str[0])],
                    axis=1, join="inner", sort=False)
          .reset_index(drop=True))
print(result)

    A         A         B
0  A0  A0_link0  B0_link0
1  A1  A1_link1  B1_link1
2  A2  A2_link2  B2_link2
3  A3  A3_link3  B3_link3
  • Related