Home > front end >  How to add new column to an existing csv file and find age from DOB of that csv file?
How to add new column to an existing csv file and find age from DOB of that csv file?

Time:11-04

My CSV file has three columns, the first column names, the second column is DOB(YYYYMM-DD) the third column is salary looks like this kind of

Name,DOB,Salary
Sam,2000-01-05,23000
Tyson,1989-09-11,29000
Lara,2002-11-19,19000
Brian,1990-04-20,21000
Tessa,2000-08-17,15000

Problem statement- Read the file and display the data and find their age in the terminal.

Therefore, I want to add a new column called 'Age', It'll print their age accordingly.

Name,DOB,Salary,Age
Sam,2000-01-05,23000,22
Tyson,1989-09-11,29000,51
Lara,2002-11-19,19000,20
Brian,1990-04-20,21000,32
Tessa,2000-08-17,15000,22

I did something like this.

import csv
import datetime


def getage(now, dob):
    years = now.year - dob.year
    months = now.month - dob.month
    if now.day < dob.day:
        months -= 1
        while months < 0:
            months  = 12
            years -= 1
    return '%sy%smo' % (years, months)


with open('emp_details.csv', 'r') as fin, open('emp_details_out.csv', 'w') as fout:
    csv_reader = csv.reader(fin)
    csv_writer = csv.writer(fout)

    for data in csv_reader:
        today = datetime.date.today()
        DOB = datetime.datetime.strptime(data["DOB"], "%Y-%m-%d").date()
        data["Age"] = getage(today, DOB)
        csv_writer.writerow(data)

Where it shows the error below

DOB = datetime.datetime.strptime(data["DOB"], "%Y-%m-%d").date()
TypeError: list indices must be integers or slices, not str

I'm new with csv files, That's why i need some help, If This Question looks like inappropriate please consider it. I genuinely need help.

CodePudding user response:

It is probably easiest to use pandas

import pandas as pd
import datetime


df = pd.read_csv('some/file/path/file_name.csv')  # read your csv using the file's path
df['DOB'] = pd.to_datetime(df['DOB'])  # convert date to datetime
df['Age'] = (datetime.datetime.now() - df['DOB']).dt.days // 365  # calculate the age
df.to_csv('/some/file/path/file_name.csv', index=False)  # create a csv file or update existing 

    Name        DOB  Salary  Age
0    Sam 2000-01-05   23000   22
1  Tyson 1989-09-11   29000   33
2   Lara 2002-11-19   19000   19
3  Brian 1990-04-20   21000   32
4  Tessa 2000-08-17   15000   22

CodePudding user response:

I would also recommend using pandas here, credit to @It_is_Chris.

In case you want to know why your code doesn't work, there are few little things you didn't consider.

csv.reader returns a list of lists of your data. The first element of this list is a list with the headers. You can't access elements of each row with the header name of the column, instead you need to access them by index.

You could also use csv.DictReader and csv.DictWriter to solve this. Then you need to add the key value pair of Age for each row (which is a dictionary) with the syntax you used, e.g row['Age']

I didn't change anything in your function getage.

# 1

with open('emp_details.csv', 'r') as fin, open('emp_details_out.csv', 'w') as fout:
    csv_reader = csv.reader(fin)
    csv_writer = csv.writer(fout)
    
    header = next(csv_reader)   ['Age'] # take first row of csv_reader as header and add 'Age'
    csv_writer.writerow(header)
    
    for data in csv_reader:
        # for 1st row, data is a list and looks like this: ['Sam', '2000-01-05', '23000']
        today = datetime.date.today()
        DOB = datetime.datetime.strptime(data[1], "%Y-%m-%d").date()
        data.append(getage(today, DOB))
        csv_writer.writerow(data)

# OR 
# 2

with open('emp_details.csv', 'r') as fin, open('emp_details_out.csv', 'w') as fout:
    csv_reader = csv.DictReader(fin)
    
    header = csv_reader.fieldnames   ['Age']
    csv_writer = csv.DictWriter(fout, fieldnames=header)
    csv_writer.writeheader()
    
    for data in csv_reader:
        # for 1st row, data is a dictionary and looks like this: {'Name': 'Sam', 'DOB': '2000-01-05', 'Salary': '23000'}
        today = datetime.date.today()
        DOB = datetime.datetime.strptime(data['DOB'], "%Y-%m-%d").date()
        data['Age'] = getage(today, DOB)
        csv_writer.writerow(data)
  • Related