Join 2 pandas dataframes by adding rows with the same index-CodePudding

I am struggling to join two dataframes by index (I've made column FileName an index for both tables) which look like this:

Table 1

FileName	Transcriber	Transcription
612_000002.wav	100% (80/80)	Are we starting off?
612_000002.wav	100% (50/50)	shall we starting on
612_000002.wav	100% (2/2)	fast mode
612_000002.wav	100% (258/259)	Go and start it up
612_000002.wav	100% (20/20)	Are we starting off?
612_000003.wav	100% (258/259)	here we go, hey well woah woah woah
612_000003.wav	100% (23/23)	evening gulf air
612_000003.wav	100% (32/32)	And as the 1st group reached the bottom of the...
612_000003.wav	100% (80/80)	Happy to go off here, woah woah woah
612_000003.wav	100% (10/10)	Go boom we'll just

Table 2 is similar and looks like this:

FileName	Transcriber	Transcription
612_000002.wav	Quartznet	there was not inl
612_000002.wav	Transducer_M	don't start again
612_000002.wav	Transducer_L	do we start again
612_000003.wav	Transducer_L	anything off yeah i'm willing we'll just
612_000003.wav	Transducer_S	having gone here on will and wolf is

So I've looked into concat, merge, and join. But they don't seem to yield the output I am looking for. What I would like to have is all values from both tables for filename1, all values for filename2 and etc. Basically, adding rows from table2 to table1. Is there any way around it? Thank you <3

CodePudding user response：

Use:

pd.concat([df1,df2]).drop_duplicates()

CodePudding user response：

If I understand it correctly, then what you need is to firstly concatenate 2 dataFrames and afterwards use the function groupby (https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.groupby.html).

This can as well be implemented as in this question: Pandas Groupby of 2 dataframes.

Hope this could help.