Home > Enterprise >  How to add feature values from one csv to another based on another feature rows
How to add feature values from one csv to another based on another feature rows

Time:10-11

I have 2 csv files as below. I have to add the Lat and Long features from one csv to another based on the ID

enter image description here

Based on the ID, I have to add the columns Latitude and Longitude to CSV 1 (ID 1 has different values and ID 2 has different values and so on.

Can anyone please tell me how to do this using python?

CodePudding user response:

I recommend you use the pandas library. If you've not come across it before, it's a very widely-used library for data manipulation that makes this task very easy. However, it's not part of the standard library that comes packaged with Python, so depending on your environment you may need to install it using PyPi. Guides like this might help if you've not done that before.

With pandas installed in your environment, you can run the following code (substitute in your file paths for csv1.csv and csv2.csv).

import pandas as pd

# Load the csv files into dataframes and set the index of each to the 'ID' column
df1 = pd.read_csv('csv1.csv').set_index('ID')
df2 = pd.read_csv('csv2.csv').set_index('ID')

# Join using how='outer' to keep all rows from both dataframes
joined = df1.join(df2, how='outer')
print(joined)

# Save to a new csv file
joined.to_csv('joined.csv')

I made some simple sample data to demonstrate the result of running the code:

csv1.csv:

ID,my_feature
1,banana
2,apple
3,pear

csv2.csv:

ID,latitude,longitude
1,7,-4
4,10,15

Printed output:

       my_feature  latitude  longitude
ID
1      banana       7.0       -4.0
2       apple       NaN        NaN
3        pear       NaN        NaN
4         NaN      10.0       15.0

Output to joined.csv:

ID,my_feature,latitude,longitude
1,banana,7.0,-4.0
2,apple,,
3,pear,,
4,,10.0,15.0
  • Related