i am trying to merge data frames with various columns. subsetting them i think requires different treatment depending if its with 1 or >1 columns so i tried with if statements, but its not working and im not sure why. any tips would be great thanks so much
edit - the index contains duplicate values so cant use pd.concat - i want to keep these
df1 = pd.DataFrame(data={'cat':[0,2,1], 'dog':[1,2,3]}).set_index([pd.Index([2, 2, 4])])
df2 = pd.DataFrame(data={'mouse':[1,2,3],'parrot':[0,1,2],'elephant':[0,1,2]}).set_index([pd.Index([1, 2, 4])])
# b can be varying but for this instance lets use two columns
# b = 'parrot'
b = 'parrot', 'mouse'
if len(b) > 0:
if len(b) > 1:
out = df1.merge(df2[[*b]],left_index=True, right_index=True)
else:
out = df1.merge(df2[b],left_index=True, right_index=True)
CodePudding user response:
Assuming the dataframe does not contains multiindex columns, you can use .loc
to select the required columns in df2
then use pd.concat
to concatenate dataframe df1
with the selected columns along axis=1
pd.concat([df1, df2.loc[:, b]], axis=1)
Sample run:
# b = 'mouse'
cat dog mouse
0 0 1 1
1 1 2 2
2 2 3 3
# b = 'mouse', 'parrot'
cat dog mouse parrot
0 0 1 1 0
1 1 2 2 1
2 2 3 3 2
CodePudding user response:
b = 'parrot'
len(b) = 6
b = 'parrot', 'mouse'
len(b) = 2
You can fix this by using lists
b = ['parrot']
and
b = ['parrot', 'mouse']
df2[[*b]] should become df2[b]