How to overwrite existing rows in RDS when the same CSV file is uploaded again through Lambda

I am trying to load data from CSV files in S3 into a MySQL RDS database through Lambda. I have written Lambda code so that whenever a CSV file is uploaded to the S3 bucket, its data is imported into the database.

The code works and the CSV data is loaded into the RDS table. My problem is that when I upload the same CSV file to the S3 bucket again, the same data is added to the RDS table a second time instead of replacing the old data.

How can I solve this?

CODE:

import json
import boto3
import csv
import mysql.connector
from mysql.connector import Error
from mysql.connector import errorcode
s3_client = boto3.client('s3')

# Read CSV file content from S3 bucket
def lambda_handler(event, context):
    # TODO implement
    # print(event)
    bucket = event['Records'][0]['s3']['bucket']['name']
    csv_file = event['Records'][0]['s3']['object']['key']
    csv_file_obj = s3_client.get_object(Bucket=bucket, Key=csv_file)
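    # splitlines() handles both \n and \r\n line endings
    lines = csv_file_obj['Body'].read().decode('utf-8').splitlines()
    results = []
    for row in csv.DictReader(lines, skipinitialspace=True, delimiter=',', quotechar='"', doublequote=True):
        # Convert each row to a tuple so it can be passed as query parameters
        results.append(tuple(row.values()))
    print(results)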
    connection = mysql.connector.connect(host='xxxxxxxx.amazonaws.com',database='xxxxxxb',user='xxxxxxn',password='xxxxx')
    tables_dict = {
        'csvdata/sketching.csv': 'INSERT INTO table1 (empid, empname, empaddress) VALUES (%s, %s, %s)',
        'csvdata/profile.csv': 'INSERT INTO table2 (empid, empname, empaddress) VALUES (%s, %s, %s)'
    }
    if csv_file in tables_dict:
        mysql_empsql_insert_query = tables_dict[csv_file]
        cursor = connection.cursor()
        cursor.executemany(mysql_empsql_insert_query, results)
        connection.commit()
        print(cursor.rowcount, f"records inserted successfully from {csv_file}")
        cursor.close()
    # Close the connection so repeated invocations do not leak connections
    connection.close()
    return {
        'statusCode': 200,
        'body': json.dumps('Hello from Lambda!')
    }

CodePudding user response：

The query you are using is INSERT so it will always insert new rows.

What you really need to use is an "upsert" approach. You can use REPLACE INTO instead:

REPLACE INTO table1 (empid, empname, empaddress) VALUES (%s, %s, %s)

REPLACE works exactly like INSERT, except that if an old row in the table has the same value as a new row for a PRIMARY KEY or a UNIQUE index, the old row is deleted before the new row is inserted.

Reference: https://dev.mysql.com/doc/refman/8.0/en/replace.html
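
To see the delete-then-insert behaviour described above without a MySQL server, here is a minimal sketch that uses Python's built-in sqlite3 module as a stand-in. That substitution is an assumption made purely for illustration: SQLite also accepts REPLACE INTO, but it uses ? placeholders where mysql.connector uses %s. The table and column names come from the question, the sample rows are made up, and empid is assumed to be the primary key, since REPLACE has nothing to match on otherwise.

```python
import sqlite3

# In-memory SQLite database as a stand-in for the MySQL RDS instance.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE table1 (empid INTEGER PRIMARY KEY, empname TEXT, empaddress TEXT)"
)

# SQLite uses ? placeholders; with mysql.connector the same statement uses %s.
replace_query = "REPLACE INTO table1 (empid, empname, empaddress) VALUES (?, ?, ?)"


def load_rows(rows):
    """Load rows, overwriting any existing row that has the same empid."""
    conn.executemany(replace_query, rows)
    conn.commit()


# First upload of the CSV (made-up sample data).
load_rows([(1, "Alice", "Old Street 1"), (2, "Bob", "Main Road 5")])
# The same file uploaded again, with one changed address.
load_rows([(1, "Alice", "New Street 9"), (2, "Bob", "Main Road 5")])

rows_after = conn.execute(
    "SELECT empid, empname, empaddress FROM table1 ORDER BY empid"
).fetchall()
print(rows_after)
```

After the second load the table still holds two rows, and the row for empid 1 carries the new address. In the Lambda from the question, the equivalent change would be to swap INSERT INTO for REPLACE INTO in both entries of tables_dict.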

CodePudding user response：

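Set the ID column as the primary key. A plain INSERT of a row whose key already exists will then be rejected with a duplicate-key error instead of adding a second copy, and REPLACE INTO will have a key to match on.

Use (replacing Persons and ID with your own table and key column, for example table1 and empid): ALTER TABLE Persons ADD PRIMARY KEY (ID);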
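
Once the key exists, another option is MySQL's INSERT ... ON DUPLICATE KEY UPDATE, which updates the matching row in place rather than deleting and re-inserting it. It is not mentioned in the answers above, so treat it as a suggestion only. The sketch below builds such a statement for the two tables in the question; build_upsert_query is a hypothetical helper, and empid is assumed to be the primary key of both tables.

```python
def build_upsert_query(table, columns, key_columns):
    """Build a MySQL upsert statement with %s placeholders (hypothetical helper).

    Table and column names are interpolated directly into the SQL text,
    so they must come from trusted code, never from user input.
    """
    placeholders = ", ".join(["%s"] * len(columns))
    # Columns outside the key are overwritten with the incoming values.
    updates = ", ".join(
        f"{col} = VALUES({col})" for col in columns if col not in key_columns
    )
    return (
        f"INSERT INTO {table} ({', '.join(columns)}) "
        f"VALUES ({placeholders}) "
        f"ON DUPLICATE KEY UPDATE {updates}"
    )


columns = ["empid", "empname", "empaddress"]
tables_dict = {
    "csvdata/sketching.csv": build_upsert_query("table1", columns, ["empid"]),
    "csvdata/profile.csv": build_upsert_query("table2", columns, ["empid"]),
}
print(tables_dict["csvdata/sketching.csv"])
```

The resulting tables_dict has the same keys as the one in the question, so it could be dropped into the Lambda handler and passed to cursor.executemany unchanged.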