Home > Back-end >  Inserting multiple columns from one csv file into another and outputting as a new csv file
Inserting multiple columns from one csv file into another and outputting as a new csv file

Time:07-09

I want to append columns from one csv file into another csv file. I have two csv files with the below data:

First file: names_dept.csv

Name             Dept       Area       
John Smith       candy      5  
Diana Princ      candy      5  
Tyler Perry      candy      5  
Perry Plat       wood       3 
Jerry Springer   clothes    2  
Calvin Klein     clothes    2  
Mary Poppins     clothes    2  
Ivan Evans       clothes    2  
Lincoln Tun      warehouse  7  
Oliver Twist     kitchen    6  
Herman Sherman   kitchen    6

Second file: name_subject.csv

Who              Subject
Perry Plat       EMAIL RECEIVED                        
Mary Poppins     EMAIL RECEIVED     
Ivan Evans       EMAIL RECEIVED
Lincoln Tun      EMAIL RECEIVED
Oliver Twist     EMAIL RECEIVED

This is what I want my final output to look like:

Output file: output.csv

Name             Dept       Area   Who              Subject       
John Smith       candy      5      Perry Plat       EMAIL RECEIVED 
Diana Princ      candy      5      Mary Poppins     EMAIL RECEIVED 
Tyler Perry      candy      5      Ivan Evans       EMAIL RECEIVED
Perry Plat       wood       3      Lincoln Tun      EMAIL RECEIVED
Jerry Springer   clothes    2      Oliver Twist     EMAIL RECEIVED
Calvin Klein     clothes    2  
Mary Poppins     clothes    2  
Ivan Evans       clothes    2  
Lincoln Tun      warehouse  7  
Oliver Twist     kitchen    6  
Herman Sherman   kitchen    6

My code so far is:

import pandas as pd
import os, csv, sys

namedept_path = 'names_dept.csv'
namesubject_path = 'name_subject.csv'
output_path = 'output.csv'

df1 = pd.read_csv(namedept_path)
df2 = pd.read_csv(namesubject_path)

#this was my attempt
output = df1 ['Who'] = df2 ['Who']
output = df1 ['Subject'] = df2 ['Subject'] 
output.to_csv(output_path , index=False)

I get the error: TypeError: string indices must be integers as the columns do contain strings.

I also tried:

with open(namedept_path, 'r') as name, open(namesubject_path, 'r') as email, \
    open(output_path, 'w') as result:
    name_reader = csv.reader(name)
    email_reader = csv.reader(email)
    result = csv.writer(result, lineterminator='\n')
    result.writerows(x   y for x, y in zip(name_reader , email_reader))

Almost what I needed, but the output ended up looking something like this instead:

Name             Dept       Area   Who              Subject       
John Smith       candy      5      Perry Plat       EMAIL RECEIVED 
Diana Princ      candy      5      Mary Poppins     EMAIL RECEIVED 
Tyler Perry      candy      5      Ivan Evans       EMAIL RECEIVED
Perry Plat       wood       3      Lincoln Tun      EMAIL RECEIVED
Jerry Springer   clothes    2      Oliver Twist     EMAIL RECEIVED

CodePudding user response:

You can try pd.concat on columns

out = pd.concat([df1, df2], axis=1)
print(out)

              Name       Dept  Area           Who         Subject
0       John Smith      candy     5    Perry Plat  EMAIL RECEIVED
1      Diana Princ      candy     5  Mary Poppins  EMAIL RECEIVED
2      Tyler Perry      candy     5    Ivan Evans  EMAIL RECEIVED
3       Perry Plat       wood     3   Lincoln Tun  EMAIL RECEIVED
4   Jerry Springer    clothes     2  Oliver Twist  EMAIL RECEIVED
5     Calvin Klein    clothes     2           NaN             NaN
6     Mary Poppins    clothes     2           NaN             NaN
7       Ivan Evans    clothes     2           NaN             NaN
8      Lincoln Tun  warehouse     7           NaN             NaN
9     Oliver Twist    kitchen     6           NaN             NaN
10  Herman Sherman    kitchen     6           NaN             NaN
  • Related