concatenate dataframe with overlapping keys non overlapping-CodePudding

I have a number of client rating flat files, one per business unit/division and it contains the rating for that specific client in that business unit. Some clients can cross multiple business units so could have one rating in one unit and a different rating in a different file or none at all.

I want a single data structure where I can look up a particular client and get all applicable ratings for that client.

example data

df1 = pandas.DataFrame({'client': ['Client_A', 'Client_B', 'Client_C'], 'bank_rating': ['A', 'A', 'B']})
df1.set_index('client', inplace=True)

df2 = pandas.DataFrame({'client': ['Client_A', 'Client_C'], 'insurance_rating': ['B', 'A']})
df2.set_index('client', inplace=True)

df3 = pandas.DataFrame({'client': ['Client_B', 'Client_D'], 'wealth_rating': ['D', 'A']})
df3.set_index('client', inplace=True)

frames = [df1, df2, df3] #in reality frames is a dynamic list built from x number of files in a folder

attempted solution

df = pandas.concat(frames)

          bank_rating  insurance_rating  wealth_rating
client
Client_A  A            NaN               NaN
Client_B  A            NaN               NaN
Client_C  B            NaN               NaN
Client_A  NaN          B                 NaN
Client_C  NaN          A                 NaN
Client_B  NaN          NaN               D
Client_D  NaN          NaN               A

what i want is a unique row per customer with each business unit rating as column

          bank_rating  insurance_rating  wealth_rating
client    
Client_A  A            B                 NaN
Client_B  A            NaN               D
Client_C  B            A                 NaN
Client_D  NaN          NaN               A

CodePudding user response：

You are almost there, it's just that concat assumes that you want to add new rows to the dataframes (the 0'th axis). So writing

df = pandas.concat(frames, axis=1)

should do the trick for you.