DataFrame merge on specific columns

Time:12-24

I have a basic question about DataFrame merges. After I merge two DataFrames, is there a way to keep only a few columns in the result?

For Example:

import pandas as pd

left = pd.DataFrame({'key1': ['K0', 'K0', 'K1', 'K2'],
                     'key2': ['K0', 'K1', 'K0', 'K1'],
                     'A': ['A0', 'A1', 'A2', 'A3'],
                     'B': ['B0', 'B1', 'B2', 'B3']})

right = pd.DataFrame({'key1': ['K0', 'K1', 'K1', 'K2'],
                      'key2': ['K0', 'K0', 'K0', 'K0'],
                      'C': ['C0', 'C1', 'C2', 'C3'],
                      'D': ['D0', 'D1', 'D2', 'D3']})

result = pd.merge(left, right, on=['key1', 'key2'])

RESULT :

    A   B key1 key2   C   D
0  A0  B0   K0   K0  C0  D0
1  A2  B2   K1   K0  C1  D1
2  A2  B2   K1   K0  C2  D2
None

Is there a way to choose only column 'C' from the 'right' DataFrame and only column 'A' from the 'left' DataFrame? For example, I would like my result to look like this:

    A key1 key2   C
0  A0   K0   K0  C0
1  A2   K1   K0  C1
2  A2   K1   K0  C2
None

CodePudding user response:

Sure, first filter each DataFrame down to the columns you need plus the columns used for the join:

result = pd.merge(left[['A','key1', 'key2']], 
                  right[['C','key1', 'key2']], 
                  on=['key1', 'key2'])

Or:

keys = ['key1', 'key2']
result = pd.merge(left[['A'] + keys], right[['C'] + keys], on=keys)
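
For comparison, here is a minimal sketch of the other order of operations: merge everything first and select the columns afterwards. It uses the question's data; the names wanted, merged_then_selected and selected_then_merged are only illustrative. Both routes should give the same rows. Filtering before the merge just avoids carrying 'B' and 'D' through the join.

```python
import pandas as pd

left = pd.DataFrame({'key1': ['K0', 'K0', 'K1', 'K2'],
                     'key2': ['K0', 'K1', 'K0', 'K1'],
                     'A': ['A0', 'A1', 'A2', 'A3'],
                     'B': ['B0', 'B1', 'B2', 'B3']})

right = pd.DataFrame({'key1': ['K0', 'K1', 'K1', 'K2'],
                      'key2': ['K0', 'K0', 'K0', 'K0'],
                      'C': ['C0', 'C1', 'C2', 'C3'],
                      'D': ['D0', 'D1', 'D2', 'D3']})

keys = ['key1', 'key2']
wanted = ['A'] + keys + ['C']  # illustrative name: the columns to keep, in display order

# Option 1: merge the full frames, then keep only the wanted columns.
merged_then_selected = pd.merge(left, right, on=keys)[wanted]

# Option 2: filter each frame first (as in the answer above), then merge.
# The trailing [wanted] only pins the column order for the comparison.
selected_then_merged = pd.merge(left[['A'] + keys],
                                right[['C'] + keys],
                                on=keys)[wanted]

print(merged_then_selected)
print(merged_then_selected.equals(selected_then_merged))
```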

CodePudding user response:

mergeDF = pd.merge(left[['key1','key2','A']], right[['key1','key2','C']], on=['key1', 'key2'])
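
If you do this often, the pattern could be wrapped in a small helper. This is only a sketch: merge_columns is a hypothetical name, not a pandas function, and it assumes left_cols and right_cols do not already contain the join keys. Note that selecting several columns needs a list inside the brackets (df[['a', 'b']]); with a single pair of brackets pandas looks for one column labelled by the tuple and raises a KeyError.

```python
import pandas as pd

def merge_columns(left, right, on, left_cols, right_cols):
    """Hypothetical helper: inner-merge two frames on `on`, keeping only
    `left_cols` from `left` and `right_cols` from `right` plus the keys.
    Assumes the column lists do not repeat the join keys."""
    on = list(on)
    left_part = left[list(left_cols) + on]     # list selection returns a DataFrame
    right_part = right[list(right_cols) + on]
    return pd.merge(left_part, right_part, on=on)

left = pd.DataFrame({'key1': ['K0', 'K0', 'K1', 'K2'],
                     'key2': ['K0', 'K1', 'K0', 'K1'],
                     'A': ['A0', 'A1', 'A2', 'A3'],
                     'B': ['B0', 'B1', 'B2', 'B3']})

right = pd.DataFrame({'key1': ['K0', 'K1', 'K1', 'K2'],
                      'key2': ['K0', 'K0', 'K0', 'K0'],
                      'C': ['C0', 'C1', 'C2', 'C3'],
                      'D': ['D0', 'D1', 'D2', 'D3']})

mergeDF = merge_columns(left, right, on=['key1', 'key2'],
                        left_cols=['A'], right_cols=['C'])
print(mergeDF)
```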