Python Replace special characters in a CSV with specified string

Time:08-05

I'm trying to replace multiple strings (containing characters from several languages) in a CSV file.

The following code works, provided I first rename my CSV files to use the .txt extension and then rename them back to .csv afterwards. Can the CSV be read and written directly instead?

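import io

# Map each string to find to its replacement.
match = {
    "太好奇了": "First String",
    "धेरै एक्लो": "Second String",
    "심각하게 생명이 필요하다": "Third String"
}


def replace_all(text, dic):
    """Return text with every key of dic replaced by its value."""
    for old, new in dic.items():
        text = text.replace(old, new)
    return text


# Context managers close both files, so the output is flushed to disk.
with io.open("input.txt", mode="r", encoding="utf-16") as f:
    data = f.read()

data = replace_all(data, match)

with open("updated.txt", "w", encoding="utf-16") as w:
    w.write(data)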

CodePudding user response:

A CSV file is nothing more than a plain text file that represents a data table by separating the values with commas. That convention lets libraries such as Python's csv module parse it into rows and fields.
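As a rough illustration of the csv module mentioned above, the sketch below applies the replacements cell by cell instead of on the raw text. The file names, the sample rows, and the helper name replace_in_csv are made up for the demo, and it assumes the files are UTF-16 as in the question.

```python
import csv
import os
import tempfile

# Replacement table borrowed from the question (shortened).
match = {"太好奇了": "First String", "धेरै एक्लो": "Second String"}


def replace_all(text, dic):
    """Return text with every key of dic replaced by its value."""
    for old, new in dic.items():
        text = text.replace(old, new)
    return text


def replace_in_csv(src, dst, dic, encoding="utf-16"):
    """Hypothetical helper: copy src to dst, replacing inside every cell."""
    # newline="" lets the csv module manage line endings itself.
    with open(src, newline="", encoding=encoding) as fin, \
         open(dst, "w", newline="", encoding=encoding) as fout:
        writer = csv.writer(fout)
        for row in csv.reader(fin):
            writer.writerow([replace_all(cell, dic) for cell in row])


# Demo on a throwaway file so nothing real is overwritten.
tmpdir = tempfile.mkdtemp()
src = os.path.join(tmpdir, "input.csv")
dst = os.path.join(tmpdir, "updated.csv")
with open(src, "w", newline="", encoding="utf-16") as f:
    csv.writer(f).writerows(
        [["id", "text"], ["1", "太好奇了!"], ["2", "धेरै एक्लो"]]
    )

replace_in_csv(src, dst, match)

with open(dst, newline="", encoding="utf-16") as f:
    rows = list(csv.reader(f))
print(rows)
```

Going through csv.reader and csv.writer keeps quoting intact if a replacement string happens to contain a comma, which a plain text replace would not guarantee.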

Since it is still just a text file, you can open it with the built-in open function and work with it exactly as you would a .txt file.

f = open("myfile.csv", mode="r", encoding="utf-16")
data = f.read()
f.close()

Note that a file extension changes nothing about the file's contents; it only signals how the file is meant to be used. You could name a text file myfile.kingkong and it would still behave the same with the open function. Likewise, renaming .csv to .txt has no effect on how Python reads the file.
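To make that concrete, here is a minimal sketch that runs the question's replace_all over the same content stored under two different extensions and checks that the results are identical. The file names and the helper replace_in_file are invented for the demo, and UTF-16 is assumed as in the question.

```python
import os
import tempfile

# One entry from the question's replacement table.
match = {"심각하게 생명이 필요하다": "Third String"}


def replace_all(text, dic):
    """Return text with every key of dic replaced by its value."""
    for old, new in dic.items():
        text = text.replace(old, new)
    return text


def replace_in_file(src, dst, dic, encoding="utf-16"):
    """Hypothetical helper: read src as text, replace, write to dst."""
    with open(src, encoding=encoding) as fin:
        data = fin.read()
    with open(dst, "w", encoding=encoding) as fout:
        fout.write(replace_all(data, dic))


tmpdir = tempfile.mkdtemp()
content = "id,text\n1,심각하게 생명이 필요하다\n"
results = {}

# Same bytes, different extensions: open() does not care.
for ext in (".csv", ".kingkong"):
    src = os.path.join(tmpdir, "input" + ext)
    dst = os.path.join(tmpdir, "updated" + ext)
    with open(src, "w", encoding="utf-16") as f:
        f.write(content)
    replace_in_file(src, dst, match)
    with open(dst, encoding="utf-16") as f:
        results[ext] = f.read()

print(results[".csv"] == results[".kingkong"])
```

Reading and writing with the same encoding argument is what matters here; the extension plays no part in how the text is decoded.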

CodePudding user response:

Use pandas.

Here's some code to help you get started.

import pandas as pd

filename = 'data.csv'

df = pd.read_csv(filename, encoding='utf-16')

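# Replace "太好奇了" with "First String"; regex=True also matches it inside longer cell values.
# replace() returns a new DataFrame, so assign the result.
df = df.replace(to_replace="太好奇了", value="First String", regex=True)

# Save the result in the same encoding, without the extra index column.
df.to_csv('update.csv', index=False, encoding='utf-16')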