Good evening. I've been working with the Instacart dataset for an online class, using Python in a Jupyter Notebook. One of the requirements is to merge all of the files (which mostly have different columns and share one or two foreign keys) into one big CSV, as in this notebook:
https://github.com/gabrielhpr/InstacartClustering/blob/master/InstacartClustering.ipynb
However, I don't know how to accomplish that. Each file comes with a foreign key, so I guess that's the way to go, but how do you match those foreign keys to the correct rows and combine all the CSV files?
CodePudding user response:
Yes, you can do it quite easily.
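Set the foreign key column as the index of each dataframe. Note that set_index returns a new dataframe rather than changing the original, so assign the result:
df1 = df1.set_index(foreign_key)
df2 = df2.set_index(foreign_key)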
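Then use
pd.concat([df1, df2], axis=1)
to join the two dataframes side by side, aligned on that index.
This only works reliably when the key is unique in both dataframes. If the key repeats in either one (for example, many order rows pointing at the same product), call df1.merge(df2, on=foreign_key) on the original, un-indexed dataframes instead, which matches rows by key value.
With either approach you can combine two CSV files that share a foreign key, then repeat the step for the remaining files.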
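For a dataset like Instacart, where the same key value appears on many rows, chaining merges is probably the more direct route. Below is a minimal sketch, not a definitive solution: the small dataframes stand in for the real CSV files (which you would load with pd.read_csv), and the column names (order_id, product_id, aisle_id and so on) are assumptions for illustration, so check them against your own files.

```python
import pandas as pd

# Tiny stand-ins for the real CSV files; in practice each of these would be
# loaded with pd.read_csv(...). Column names and values are illustrative.
orders = pd.DataFrame({"order_id": [1, 2], "user_id": [10, 11]})
order_products = pd.DataFrame(
    {"order_id": [1, 1, 2], "product_id": [100, 101, 100]}
)
products = pd.DataFrame(
    {
        "product_id": [100, 101],
        "product_name": ["Banana", "Milk"],
        "aisle_id": [24, 84],
    }
)
aisles = pd.DataFrame({"aisle_id": [24, 84], "aisle": ["fresh fruits", "milk"]})

# Start from the most detailed table (one row per product in an order) and
# attach the lookup tables one foreign key at a time. how="left" keeps every
# row of the left table; validate="many_to_one" raises an error if a key is
# unexpectedly duplicated in the lookup table on the right.
merged = (
    order_products
    .merge(orders, on="order_id", how="left", validate="many_to_one")
    .merge(products, on="product_id", how="left", validate="many_to_one")
    .merge(aisles, on="aisle_id", how="left", validate="many_to_one")
)

# With no path, to_csv returns the CSV text; pass a filename instead
# (for example "merged.csv") to write the combined file to disk.
csv_text = merged.to_csv(index=False)
```

Starting from the table with the most rows and using left merges keeps the row count stable, which makes it easy to spot a merge that accidentally duplicated or dropped rows.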