Get rid of characters when writing a dataframe to postgress-CodePudding

I have a large script that saves data to a postgres database. Tell me how can I get rid of [' and ']. I need the data in the storage to be saved without these characters

total.to_csv("C:/Users/.../total_type19.csv", index = False, sep =';')

conn5 = psycopg2.connect(dbname='', user='',
                       password='', host='', port = '')
cursor5 = conn5.cursor()


with open("C:/Users/.../total_type19.csv", "r",encoding = 'utf-8') as file:
    reader = csv.reader(file, delimiter = ";")
    header = next(reader )
    # print(f"HEADER {header}")
    counter = 1 
    for row in reader:
        # print(row)
        test_id =row[0]
        test_id = ''.join([i for i in test_id if i.isdigit()])
        if test_id == '':
            counter  =1
            test_id = counter
        else:
            test_id = int(test_id)
        # print(test_id)
        cursor5.execute(
            'INSERT INTO interaction_fillword (test_id, data_size, data_words_selection, data_matrix, data_colors, data_answers) VALUES(%s,%s, %s, %s, %s, %s)',
            (test_id, row[1], row[2], row[3], row[4], row[5]))

conn5.commit()

This is how the data looks in the database

id  test_id data_size                               data_words_selection                                        data_matrix                                        data_colors                                       data_answers
0   224   312229    ['12']  ['["v","b","c","c","a","h","i","e","r","s","f"...  ['[{from:[3,4],to:[7,4],color:2},{from:[3,6],t...  ['"#ff0000","#00fe00","#0000ff","#d2ea9a","#40...  ['"place","cartable","gomme","bureau","bibliot...
1   225   400606    ['12']  ['["v","b","c","c","a","h","i","e","r","s","f"...  ['[{from:[3,4],to:[7,4],color:2},{from:[3,6],t...  ['"#ff0000","#00fe00","#0000ff","#d2ea9a","#40...  ['"place","cartable","gomme","bureau","bibliot...
2   226   410051    ['12']  ['["v","b","c","c","a","h","i","e","r","s","f"...  ['[{from:[3,4],to:[7,4],color:2},{from:[3,6],t...  ['"#ff0000","#00fe00","#0000ff","#d2ea9a","#40...  ['"place","cartable","gomme","bureau","bibliot...

Tell me how to get rid of these characters, perhaps the problem occurs after parsing when writing to a dataframe. It is necessary to get rid of the first [' and '], the following brackets must be left Help please

this is an example of how i would like the data to look like

id  test_id data_size                               data_words_selection                                        data_matrix                                        data_colors                                       data_answers
0   224   312229    12  ["v","b","c","c","a","h","i","e","r","s","f"...  [{from:[3,4],to:[7,4],color:2},{from:[3,6],t...  "#ff0000","#00fe00","#0000ff","#d2ea9a","#40...  "place","cartable","gomme","bureau","bibliot...
1   225   400606    12  ["v","b","c","c","a","h","i","e","r","s","f"...  [{from:[3,4],to:[7,4],color:2},{from:[3,6],t...  "#ff0000","#00fe00","#0000ff","#d2ea9a","#40...  "place","cartable","gomme","bureau","bibliot...
2   226   410051    12  ["v","b","c","c","a","h","i","e","r","s","f"...  [{from:[3,4],to:[7,4],color:2},{from:[3,6],t...  "#ff0000","#00fe00","#0000ff","#d2ea9a","#40...  "place","cartable","gomme","bureau","bibliot...

Replace doesn't help.

CodePudding user response：

You could use dataframe module. You can use the df.replace() function to replace specific characters in a DataFrame before writing it to a PostgreSQL table. Below code ... Removes all instances of the character "#" in all columns

df = df.replace('#', '', regex=True)

Then use the to_sql() function to write the modified DataFrame to a PostgreSQL table

df.to_sql('table_name', engine)

CodePudding user response：

Try map with specific column and add lstrip and rstrip in lambada

df[col] = df[col].map(lambda x: str(x).lstrip("['").rstrip("']")).astype(float)

Hope this will work for you.