I'm trying to replace multiple strings (containing characters from several languages) in a CSV file.
The following code works, provided that I first rename my CSV files to the .txt extension and then rename them back to .csv afterwards.
I'm wondering whether the CSV can be read and written directly.
import io

# Map each source string to its replacement.
match = {
    "太好奇了": "First String",
    "धेरै एक्लो": "Second String",
    "심각하게 생명이 필요하다": "Third String",
}

def replace_all(text, dic):
    # Apply every replacement in turn.
    for i, j in dic.items():
        text = text.replace(i, j)
    return text

# The with blocks close both files, so the output is flushed to disk.
with io.open("input.txt", mode="r", encoding="utf-16") as f:
    data = f.read()

data = replace_all(data, match)

with open("updated.txt", "w", encoding="utf-16") as w:
    w.write(data)
CodePudding user response:
A CSV file is nothing more than a plain text file that represents a data table by separating values with commas. This lets programs parse it efficiently into structured data using libraries like Python's csv module.
Since it is still just a text file, you can also open it like any other text file with a simple function such as open and use it exactly the way you would use a .txt file.
f = open("myfile.csv", mode="r", encoding="utf-16")
data = f.read()
f.close()
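Applied to the question, a minimal sketch might look like the following. It assumes the file is UTF-16 encoded, as in the question; the helper name replace_in_file and the file names are placeholders chosen for illustration, not part of the original code.

```python
def replace_in_file(src, dst, replacements, encoding="utf-16"):
    """Copy src to dst, applying every replacement to the raw text."""
    # newline="" leaves the file's line endings untouched on read and write.
    with open(src, mode="r", encoding=encoding, newline="") as f:
        data = f.read()
    for old, new in replacements.items():
        data = data.replace(old, new)
    with open(dst, mode="w", encoding=encoding, newline="") as w:
        w.write(data)


match = {
    "太好奇了": "First String",
    "धेरै एक्लो": "Second String",
    "심각하게 생명이 필요하다": "Third String",
}

# Hypothetical file names; uncomment to run against real files.
# replace_in_file("input.csv", "updated.csv", match)
```

The with blocks close both files automatically, so no explicit close() call is needed.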
Note that a file extension changes nothing about the file's contents; it only signals how the file is meant to be used. You could call a text file myfile.kingkong and it would still behave the same with the open function. In the same way, renaming .csv to .txt does absolutely nothing.
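If the replacements should be applied cell by cell rather than to the raw text, the csv module can do the parsing and writing. The following is only a sketch under assumptions: the function name replace_cells is made up for illustration, and the file is assumed to be comma-separated and UTF-16 encoded as in the question.

```python
import csv


def replace_cells(src, dst, replacements, encoding="utf-16"):
    """Rewrite a CSV file, applying the replacements to every cell."""
    # The csv module expects files opened with newline="".
    with open(src, mode="r", encoding=encoding, newline="") as f_in, \
         open(dst, mode="w", encoding=encoding, newline="") as f_out:
        writer = csv.writer(f_out)
        for row in csv.reader(f_in):
            new_row = []
            for cell in row:
                for old, new in replacements.items():
                    cell = cell.replace(old, new)
                new_row.append(cell)
            writer.writerow(new_row)
```

One reason to prefer this over plain text replacement: if a replacement string contains a comma, the writer quotes the cell instead of silently adding a column.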
CodePudding user response:
Use pandas.
Here's some code to help you get started.
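import pandas as pd

filename = 'data.csv'
df = pd.read_csv(filename, encoding='utf-16')
# replace() returns a new DataFrame, so assign the result back.
# regex=True replaces "太好奇了" with "First String" anywhere inside a cell;
# without it, only cells that equal the string exactly are replaced.
df = df.replace(to_replace="太好奇了", value="First String", regex=True)
# Save the result; index=False omits the row index column.
df.to_csv('update.csv', index=False, encoding='utf-16')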