How can I merge multiple different CSV files with foreign keys?

Time:11-26

Good Evening, I've been trying to work with the Instacart Dataset as a part of my online classes using Jupyter Notebook (Python); one of the requirements is to merge all of the files (that come mostly with different columns and one or two foreign keys) into one big CSV, like in this case:

https://github.com/gabrielhpr/InstacartClustering/blob/master/InstacartClustering.ipynb

However, I don't know how to accomplish that. Each file comes with a foreign key, so I guess that's the way to go, but how do you match those foreign keys to the correct rows and combine all the CSV files?

CodePudding user response:

Yes, you can do it very easily.

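  1. Set the foreign key column as the index of each dataframe. set_index returns a new dataframe, so assign the result:

    df1 = df1.set_index(foreign_key)
    df2 = df2.set_index(foreign_key)

  2. Use pd.concat([df1, df2], axis=1) to join the two dataframes side by side on that index. This aligns rows reliably only when each key value appears once per dataframe; if a key repeats (a one-to-many relationship), pd.merge(df1, df2, on=foreign_key) on the original, un-indexed dataframes is the safer choice.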

With these two steps you can merge two CSV files on a shared foreign key.
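For files where a key repeats across rows, which is likely the case for an orders/products dataset, a chained pd.merge may be a better fit than concat. The following is only a minimal sketch: the tiny in-memory tables and the column names (order_id, user_id, product_id, product_name) are assumptions standing in for the real CSV files, so adjust them to whatever your files actually contain.

```python
import io
import pandas as pd

# Tiny stand-ins for the real CSV files; contents and column names are assumed.
orders_csv = io.StringIO("order_id,user_id\n1,10\n2,11\n")
order_products_csv = io.StringIO("order_id,product_id\n1,100\n1,101\n2,100\n")
products_csv = io.StringIO("product_id,product_name\n100,Banana\n101,Milk\n")

orders = pd.read_csv(orders_csv)
order_products = pd.read_csv(order_products_csv)
products = pd.read_csv(products_csv)

# Start from the table that holds both foreign keys, then attach the
# other tables one at a time. how="left" keeps every order line even
# if a key has no match in the table being attached.
merged = (
    order_products
    .merge(orders, on="order_id", how="left")
    .merge(products, on="product_id", how="left")
)

# index=False keeps the row index out of the output.
# Pass a file path as the first argument to write a file instead of a string.
merged_csv_text = merged.to_csv(index=False)
```

With real files you would replace the io.StringIO objects with file paths in pd.read_csv, and repeat the .merge step for each additional file and its key.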
