Home > Software design >  merge varying number of columns in pandas
merge varying number of columns in pandas

Time:11-30

i am trying to merge data frames with various columns. subsetting them i think requires different treatment depending if its with 1 or >1 columns so i tried with if statements, but its not working and im not sure why. any tips would be great thanks so much

edit - the index contains duplicate values so cant use pd.concat - i want to keep these

df1 = pd.DataFrame(data={'cat':[0,2,1], 'dog':[1,2,3]}).set_index([pd.Index([2, 2, 4])])
df2 = pd.DataFrame(data={'mouse':[1,2,3],'parrot':[0,1,2],'elephant':[0,1,2]}).set_index([pd.Index([1, 2, 4])])

# b can be varying but for this instance lets use two columns
# b = 'parrot'
b = 'parrot', 'mouse'

if len(b) > 0:
    if len(b) > 1:
        out = df1.merge(df2[[*b]],left_index=True, right_index=True)
    else:
        out = df1.merge(df2[b],left_index=True, right_index=True)

CodePudding user response:

Assuming the dataframe does not contains multiindex columns, you can use .loc to select the required columns in df2 then use pd.concat to concatenate dataframe df1 with the selected columns along axis=1

pd.concat([df1, df2.loc[:, b]], axis=1)

Sample run:

# b = 'mouse'
   cat  dog  mouse
0    0    1      1
1    1    2      2
2    2    3      3


# b = 'mouse', 'parrot'
   cat  dog  mouse  parrot
0    0    1      1       0
1    1    2      2       1
2    2    3      3       2

CodePudding user response:

b = 'parrot'
len(b) = 6

b = 'parrot', 'mouse'
len(b) = 2

You can fix this by using lists
b = ['parrot']
and 
b = ['parrot', 'mouse']

df2[[*b]] should become df2[b]
  • Related