I want to append columns from one csv file into another csv file. I have two csv files with the below data:
First file: names_dept.csv
Name Dept Area
John Smith candy 5
Diana Princ candy 5
Tyler Perry candy 5
Perry Plat wood 3
Jerry Springer clothes 2
Calvin Klein clothes 2
Mary Poppins clothes 2
Ivan Evans clothes 2
Lincoln Tun warehouse 7
Oliver Twist kitchen 6
Herman Sherman kitchen 6
Second file: name_subject.csv
Who Subject
Perry Plat EMAIL RECEIVED
Mary Poppins EMAIL RECEIVED
Ivan Evans EMAIL RECEIVED
Lincoln Tun EMAIL RECEIVED
Oliver Twist EMAIL RECEIVED
This is what I want my final output to look like:
Output file: output.csv
Name Dept Area Who Subject
John Smith candy 5 Perry Plat EMAIL RECEIVED
Diana Princ candy 5 Mary Poppins EMAIL RECEIVED
Tyler Perry candy 5 Ivan Evans EMAIL RECEIVED
Perry Plat wood 3 Lincoln Tun EMAIL RECEIVED
Jerry Springer clothes 2 Oliver Twist EMAIL RECEIVED
Calvin Klein clothes 2
Mary Poppins clothes 2
Ivan Evans clothes 2
Lincoln Tun warehouse 7
Oliver Twist kitchen 6
Herman Sherman kitchen 6
My code so far is:
import pandas as pd
import os, csv, sys
namedept_path = 'names_dept.csv'
namesubject_path = 'name_subject.csv'
output_path = 'output.csv'
df1 = pd.read_csv(namedept_path)
df2 = pd.read_csv(namesubject_path)
#this was my attempt
output = df1 ['Who'] = df2 ['Who']
output = df1 ['Subject'] = df2 ['Subject']
output.to_csv(output_path , index=False)
I get the error: TypeError: string indices must be integers
as the columns do contain strings.
I also tried:
with open(namedept_path, 'r') as name, open(namesubject_path, 'r') as email, \
open(output_path, 'w') as result:
name_reader = csv.reader(name)
email_reader = csv.reader(email)
result = csv.writer(result, lineterminator='\n')
result.writerows(x y for x, y in zip(name_reader , email_reader))
Almost what I needed, but the output ended up looking something like this instead:
Name Dept Area Who Subject
John Smith candy 5 Perry Plat EMAIL RECEIVED
Diana Princ candy 5 Mary Poppins EMAIL RECEIVED
Tyler Perry candy 5 Ivan Evans EMAIL RECEIVED
Perry Plat wood 3 Lincoln Tun EMAIL RECEIVED
Jerry Springer clothes 2 Oliver Twist EMAIL RECEIVED
CodePudding user response:
You can try pd.concat
on columns
out = pd.concat([df1, df2], axis=1)
print(out)
Name Dept Area Who Subject
0 John Smith candy 5 Perry Plat EMAIL RECEIVED
1 Diana Princ candy 5 Mary Poppins EMAIL RECEIVED
2 Tyler Perry candy 5 Ivan Evans EMAIL RECEIVED
3 Perry Plat wood 3 Lincoln Tun EMAIL RECEIVED
4 Jerry Springer clothes 2 Oliver Twist EMAIL RECEIVED
5 Calvin Klein clothes 2 NaN NaN
6 Mary Poppins clothes 2 NaN NaN
7 Ivan Evans clothes 2 NaN NaN
8 Lincoln Tun warehouse 7 NaN NaN
9 Oliver Twist kitchen 6 NaN NaN
10 Herman Sherman kitchen 6 NaN NaN