Home > Software design >  How to Read Data from CSV row by row and store in database
How to Read Data from CSV row by row and store in database

Time:07-14

I have a csv file with 20 records that need to stored in a table inside a database. I am trying to read every row line by line but not sure how to pass it to a function which would check the connection with database and store in it. I have created a separate config file for connection object for database.

How should I read the csv line by line and pass every row data to function and store it and carry out the same operation for every other row of csv. My code is as follows:

# This variable stores the insert query to store data in database
query = """INSERT INTO product(product_id, product_name, description, product_value) 
values(%s, %s, %s, %s)"""

def create_product():
data = pd.read_csv('path/to/csv')
df = pd.DataFrame(data)
data_list = []

# How to Write This part?
# How will I pass an entire row in the function call and what to have in the argument like a 
# List or something
for row in df.iterrows():
    # print(row)
    input_data = ",".join(row)
    insert_data = output_data_to_DB(query, input_data, connect_db) # Calling Function
    data_list.append(insert_data)
    print(data_list)

# Called Function Here
def output_data_to_DB(insert_query, output_list, conn):
try:
    cur = conn.cursor()
    cur.execute(insert_query, output_list)
    print("row inserted with valueList : ", output_list)
    output_list.commit()
    cur.close()
    return ""
except Exception as e:
    connect_db.rollback()
    cur.close

I would appreciate any kind of help. I am not that familiar with python programs.

CodePudding user response:

Example: pandas

ref: https://www.listendata.com/2019/06/pandas-read-csv.html

import pandas as pd
# read csv
data = pd.read_csv("your-file.csv")
# read csv and skip the header
data = pd.read_csv("your-file.csv", header = 1)
# read csv, define col names
data = pd.read_csv("your-file.csv", skiprows=1, names=['product_id', 'product_name'])

for row in data.iterrows():
  print(row)
  # process row value as you want
  res = output_data_to_DB(query, res, connect_db)

Example: python CSV module (<- i recommend this)

csv library would be enough and simpler to pass every row data to function.

def create_product():
    data_list = []

    with open('your-file.csv', newline='') as csvfile:
        reader = csv.reader(csvfile)
        next(reader)  # discard header
        for row in reader:
            print(row)  # e.g. `['foo', 'bar']`
            insert_data = output_data_to_DB(query, row, connect_db)
            data_list.append(insert_data)
        print(data_list)

--

Edit

Primary key (auto-incremented column)

Some options to add an auto-incremented value to columns like id can be:

  • Related