I am just starting to learn Python and I want to merge two data-rich .csv files. Can anyone help?
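import pandas as pd

csv1 = pd.read_csv('1.csv')
# A DataFrame has no .str accessor, so csv1.str.lower() raises AttributeError.
# Lowercasing the column names is assumed to be the intent here.
csv1.columns = csv1.columns.str.lower()
print(csv1.head())
print(csv1.shape)

csv2 = pd.read_csv('2.csv')
print(csv2.head())
print(csv2.shape)
print(csv1)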
I can't get any further than this code, as I am not very experienced. Can anyone give me some sample code?
The data is here. I have to merge on columns '2' and '3' in 1.csv and columns '2A' and '3A' in 2.csv.
I know I have to use the merge function with how='inner', but I can't figure out what to pass as on=.
CodePudding user response:
If the columns you want to merge on have different names in the two files, use the arguments left_on and right_on instead of on.
So the answer is something like this:
csv1.merge(csv2, how='inner', left_on=['2', '3'], right_on=['2A', '3A'])
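To see what that call returns, here is a minimal sketch on two small in-memory frames. The data and the extra columns '1' and '1A' are made up for illustration and only stand in for 1.csv and 2.csv. Note that the result keeps both sets of key columns, so you may want to drop the right-hand ones afterwards.

```python
import pandas as pd

# Hypothetical stand-ins for 1.csv and 2.csv (values invented for the example)
csv1 = pd.DataFrame({"1": ["a", "b", "c"], "2": [10, 20, 30], "3": ["x", "y", "z"]})
csv2 = pd.DataFrame({"1A": ["p", "q", "r"], "2A": [10, 20, 99], "3A": ["x", "y", "z"]})

# A row is kept only when '2' == '2A' AND '3' == '3A'
merged = csv1.merge(csv2, how="inner", left_on=["2", "3"], right_on=["2A", "3A"])

# After an inner merge '2A' and '3A' duplicate '2' and '3', so they can be dropped
merged = merged.drop(columns=["2A", "3A"])
print(merged)
```

The third row of each frame does not match on both keys (30 vs 99), so it is left out of the inner merge.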
CodePudding user response:
new_df = pd.merge(csv1, csv2, how='left', left_on=['2','3'], right_on = ['2A','3A'])
A left join returns all records from the left table and only the matching records from the right table. If you only want matching records, use how='inner' instead.
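A rough sketch of the difference, using invented data (the 'value' column is hypothetical). Passing indicator=True adds a _merge column that shows whether each row was found in both tables or only in the left one, which helps when checking how many rows failed to match.

```python
import pandas as pd

# Hypothetical data: the last row of `left` has no partner in `right`
left = pd.DataFrame({"2": [10, 20, 30], "3": ["x", "y", "z"]})
right = pd.DataFrame({"2A": [10, 20], "3A": ["x", "y"], "value": [1.5, 2.5]})

# Left join: every row of `left` is kept; unmatched rows get NaN on the right side
left_join = pd.merge(left, right, how="left",
                     left_on=["2", "3"], right_on=["2A", "3A"], indicator=True)

# Inner join: only rows whose keys match in both tables are kept
inner_join = pd.merge(left, right, how="inner",
                      left_on=["2", "3"], right_on=["2A", "3A"])

print(left_join)
print(inner_join)
```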