I have multiple csv files in the same directory. Each csv file contains 3 columns, but the name of the 3rd column is missing from the header. To read all the csv files I had to use error_bad_lines=False. Now I want to add the column name c3 to the third column of every csv file.
Sample df:
v info
12 days 6
53 x a
42 y b
Expected output:
v info c3
0 12 days 6
1 53 x a
2 42 y b
CodePudding user response:
First, convert the index (which currently holds the "v" values) into a column:
df = df.reset_index()
Then you can simply rename the columns:
df.columns = ["v", "info", "c3"]
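Finally, putting it together for every csv file in the directory:
import os
import pandas as pd
directory = "."  # placeholder: the folder that contains the csv files
for file in os.listdir(directory):
    if file.endswith(".csv"):
        path = os.path.join(directory, file)  # os.listdir returns bare file names
        df = pd.read_csv(path)
        df = df.reset_index()  # optional: only needed if the first column was read as the index
        df.columns = ["v", "info", "c3"]
        df.to_csv(path, index=False)  # index=False avoids writing an extra index column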
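To see why reset_index() is needed here, the following is a minimal sketch on an in-memory csv. The file contents are an assumption (comma separated, two header names, three fields per row), modelled on the sample in the question. When the header has one fewer name than the data rows have fields, pandas uses the first field as the index, so the "v" values end up there.

```python
import io
import pandas as pd

# Assumed file contents: two header names, three data fields per row.
raw = "v,info\n12,days,6\n53,x,a\n42,y,b\n"

df = pd.read_csv(io.StringIO(raw))
# The first field of each row became the index; only two columns remain.
print(df)

# Move the index back into a regular column, then name all three columns.
fixed = df.reset_index()
fixed.columns = ["v", "info", "c3"]
print(fixed)
```

If your files use a different separator or already have three header fields, the reset_index() step may not apply.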
CodePudding user response:
I'm not sure what your csv looks like, so I assumed it is:
v,info,
12,days,6
53,x,a
42,y,b
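Either way, you can load all the csv files in a directory and change the column names as follows:
import glob
import pandas as pd
for f in glob.glob(r"C:\*.csv"):  # raw string so the backslash is kept literally
    print(f)  # f is a file path
    df = pd.read_csv(f)
    # With the trailing comma in the header, pandas already sees three columns.
    # If the header has only two names, use df = df.reset_index() here instead.
    df.columns = ['v', 'info', 'c3']  # change column names
    df.to_csv(f, index=False)  # save it (overwriting), without an extra index column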
For more detail on using glob to find files in a directory, see: https://www.geeksforgeeks.org/how-to-use-glob-function-to-find-files-recursively-in-python/
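Since it is unclear which header layout the files actually have, here is a hedged sketch of a small helper that handles both cases: a header with only two names (first field read as the index) and a header with a trailing comma (three columns, the last one auto-named by pandas). The function name name_third_column and the sample strings are hypothetical, for illustration only.

```python
import io
import pandas as pd

def name_third_column(df):
    """Return a copy of df whose three columns are named v, info, c3."""
    if df.shape[1] == 2:
        # Header had only two names, so the first field became the index.
        df = df.reset_index()
    out = df.copy()
    out.columns = ["v", "info", "c3"]
    return out

# Layout 1: third header name missing entirely.
missing_name = pd.read_csv(io.StringIO("v,info\n12,days,6\n53,x,a\n"))
# Layout 2: header ends with a trailing comma.
trailing_comma = pd.read_csv(io.StringIO("v,info,\n12,days,6\n53,x,a\n"))

a = name_third_column(missing_name)
b = name_third_column(trailing_comma)
print(a)
print(b)
```

Inside the glob loop, you could call this helper on each frame before writing it back with to_csv(f, index=False).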