Python code: Extract single columns from multiple csv files to save as a new csv file while column_h

Time:12-11

I would like to extract a specific column from each of several hundred CSV files and combine the columns into a new CSV file. Each column header should be renamed to the name of the source file the column was extracted from. I have written some code to do this, but I have been unable to rename the column header without the file extension (see attachment). I would appreciate some help. Thanks a lot.


import os
import csv
import pandas as pd

folder = "C:/Users/Doc/Data"

E2080 = []
with os.scandir(folder) as files:
    for file in files:
        #print(file)
        df = pd.read_csv(file, index_col=None)
        dist = {file: (df['lnt_dist'])}
        E = pd.DataFrame(dist)
        E2080.append(E)

dist = pd.concat(E2080, ignore_index=False, axis=1)
dist.head()
dist.to_csv('E2080', index=False)

CodePudding user response:

You should use file.name instead of file to get the file name as a string; os.scandir() yields os.DirEntry objects, not strings.

With a string you can use .split(".")[0] to get the name without the extension.

for file in os.scandir(folder):
    print(file.name, '=>', file.name.split(".")[0])

Or you could use pathlib.Path instead of os.scandir(); its path objects offer more helpers, such as .stem (the file name without its final extension).

import pathlib

for file in pathlib.Path('test').iterdir():
    print(file.name, '=>', file.stem)
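One thing to keep in mind: the two approaches above only agree when the file name contains a single dot. Here is a small sketch comparing them, together with os.path.splitext from the standard library. The helper column_names and the sample file names are made up for illustration.

```python
import os
from pathlib import Path


def column_names(filename):
    """Return the header each approach would produce for one file name."""
    return {
        # Keeps only the text before the FIRST dot.
        "split": filename.split(".")[0],
        # Drops only the LAST extension.
        "splitext": os.path.splitext(filename)[0],
        # Same idea as splitext: the name without its final suffix.
        "stem": Path(filename).stem,
    }


# Hypothetical file names, not taken from the question.
for name in ("site_A.csv", "run.2080.csv"):
    print(name, "=>", column_names(name))
```

For a name like run.2080.csv, split(".")[0] gives run while .stem gives run.2080, so .stem is probably the safer choice if your file names may contain extra dots.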

CodePudding user response:

This is the final code that worked for me (see output 1):

E2080 = []
with os.scandir(folder) as files:
    for file in files:
        #print(file)
        df = pd.read_csv(file, index_col=None)
        dist = {file: (df['lnt_dist'])}
        E = pd.DataFrame(dist)
        # Rename the header: replace the os.DirEntry key with the file name minus its .csv extension
        E_1 = E.rename(columns={file: file.name.split('.')[0]})
        E2080.append(E_1)

dist = pd.concat(E2080, ignore_index=False, axis=1)
#dist.head()
dist.to_csv('E2080.csv', index=False)
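For comparison, the same result could probably be reached without the rename step by using the file stem as the key from the start. This is only a sketch, not the code from the thread: the function name collect_column and its pattern parameter are my own, and it assumes every CSV in the folder has an lnt_dist column.

```python
from pathlib import Path

import pandas as pd


def collect_column(folder, column="lnt_dist", pattern="*.csv"):
    """Return a DataFrame with one column per CSV file, named after the file stem.

    Assumes every matching file contains `column`; pd.read_csv raises
    ValueError otherwise, and pd.concat raises ValueError if no file matches.
    """
    series_by_name = {}
    # sorted() makes the column order reproducible across runs.
    for path in sorted(Path(folder).glob(pattern)):
        # usecols avoids loading columns that are not needed.
        df = pd.read_csv(path, usecols=[column])
        series_by_name[path.stem] = df[column]
    # With a dict of Series and axis=1, the dict keys become the column headers.
    return pd.concat(series_by_name, axis=1)


# Example with the paths from the question:
# collect_column("C:/Users/Doc/Data").to_csv("E2080.csv", index=False)
```

Filtering with *.csv also skips anything else that happens to be in the folder, which os.scandir would pass straight to pd.read_csv.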