Pandas: Add column name to third column for multiple CSV files

I have multiple CSV files in the same directory. Each file contains three columns, but the header has no name for the third column. To read the files at all I had to pass error_bad_lines=False. Now I want to add the column name c3 to the third column of every file.

Sample df:

v     info  
12    days    6
53    x       a
42    y       b

Expected output:

    v    info   c3
0  12    days    6
1  53    x       a
2  42    y       b

CodePudding user response:

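First, move the index back into a regular column. Because the header has only two names, pandas reads the first field of each row (the "v" values) as the index: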

df = df.reset_index()

Then assign the three column names:

df.columns = ["v", "info", "c3"]

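Finally, apply both steps to every CSV file in the directory:

import os
import pandas as pd

directory = "."  # assumption: the folder that holds the CSV files
for file in os.listdir(directory):
    if file.endswith(".csv"):
        path = os.path.join(directory, file)  # listdir returns bare file names
        df = pd.read_csv(path)
        df = df.reset_index()  # skip this line if the file already loads with three columns
        df.columns = ["v", "info", "c3"]
        df.to_csv(path, index=False)  # index=False avoids writing an extra index column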
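
To see why reset_index() matters in this answer, here is a minimal sketch that reads an in-memory copy of the sample data. It assumes the header line is literally `v,info` with no trailing comma, which is a guess about the asker's files. When the data rows have one more field than the header, pandas uses the first field as the index, which matches the shifted sample df in the question.

```python
import io
import pandas as pd

# Assumed file content: two header names, three fields per data row.
raw = "v,info\n12,days,6\n53,x,a\n42,y,b\n"

df = pd.read_csv(io.StringIO(raw))
index_before = df.index.tolist()      # first field of each row ended up here
columns_before = df.columns.tolist()  # only the two header names

df = df.reset_index()                 # move the index back into a regular column
df.columns = ["v", "info", "c3"]      # now three names fit three columns
print(df)
```

If the header already has three fields (for example a trailing comma), the frame loads with three columns and the reset_index() call should be left out.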

CodePudding user response:

I'm not sure what your CSV looks like, so I assumed it is:

v,info,
12,days,6
53,x,a
42,y,b

In any case, you can load every CSV file in a directory and change the column names as follows:

import pandas as pd
import glob

for f in glob.glob(r"C:\*.csv"):  # raw string so the backslash is taken literally
    print(f)  # f is a file path
    df = pd.read_csv(f)  # with the header above, this already yields three columns
    df.columns = ['v', 'info', 'c3']  # change column names
    df.to_csv(f, index=False)  # save it (overwriting), without writing the index

For more detail on using glob to find files in a directory, see: https://www.geeksforgeeks.org/how-to-use-glob-function-to-find-files-recursively-in-python/
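
Since the two answers assume different header layouts, a small helper can cover both. This is only a sketch: `name_third_column` is a hypothetical name, and the check on the number of columns assumes the files look like one of the two layouts discussed in this thread (header `v,info` or header `v,info,`).

```python
import io
import pandas as pd

NAMES = ["v", "info", "c3"]

def name_third_column(df):
    """Return df with its columns named v, info, c3 (hypothetical helper)."""
    if df.shape[1] == 2:
        # Header had two names, so read_csv used the first field as the index.
        df = df.reset_index()
    df.columns = NAMES
    return df

# Layout 1: header "v,info" (no trailing comma).
a = name_third_column(pd.read_csv(io.StringIO("v,info\n12,days,6\n53,x,a\n")))
# Layout 2: header "v,info," (trailing comma, third column is read as "Unnamed: 2").
b = name_third_column(pd.read_csv(io.StringIO("v,info,\n12,days,6\n53,x,a\n")))

csv_text = a.to_csv(index=False)  # index=False keeps the row numbers out of the output
print(csv_text)
```

In a loop over files, the helper call would sit between the read_csv and to_csv calls.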