Rewrite csv file with python

I have a csv file with this structure:

code1     code2     code3      name1     name2    something1  something2

14355     12345     54133      part1     part12   aaaaaaaa    bbbbbbb
54782     57815     52781      part2     part22   ccccccc     ffffffff
14515     52495     52852      part3     part33   ddddddd     sssssss

I need to parse this csv file and create a new csv file with my own headers and only the columns I need, for example:

code_1    code_2    name_1    name_2   something_2

14355     12345     part1     part12   bbbbbbb
54782     57815     part2     part22   ffffffff
14515     52495     part3     part33   sssssss

I know that I can select the one column I need with pandas and then write it to another file:

import pandas as pd

df = pd.read_csv(file)
df1 = df['code1']

But how can I select multiple columns and write them to one file?

CodePudding user response：

You can select multiple columns by using a list:

df1 = df[['code1', 'code2', 'name1', 'name2', 'something2']]

You can then change the column names using another list:

df1.columns = ['code_1', 'code_2', 'name_1', 'name_2', 'something_2']

Then you can write the result back to a csv file, passing index=False so the row index is not written as an extra column:

df1.to_csv('new filename.csv', index=False)
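
Putting those steps together, a minimal end-to-end sketch might look like the following. It assumes the file is comma-separated with the headers from the question; the in-memory sample data and the names `new_names`, `result`, and `csv_text` are illustrative only and do not come from the original post. Passing a dict to `rename(columns=...)` keeps each old header next to its new name, which may be less error-prone than assigning a positional list.

```python
import io

import pandas as pd

# Hypothetical sample data standing in for the real file (assumed comma-separated).
raw = io.StringIO(
    "code1,code2,code3,name1,name2,something1,something2\n"
    "14355,12345,54133,part1,part12,aaaaaaaa,bbbbbbb\n"
    "54782,57815,52781,part2,part22,ccccccc,ffffffff\n"
    "14515,52495,52852,part3,part33,ddddddd,sssssss\n"
)

# Old header -> new header. Columns that are not listed here are dropped.
new_names = {
    "code1": "code_1",
    "code2": "code_2",
    "name1": "name_1",
    "name2": "name_2",
    "something2": "something_2",
}

df = pd.read_csv(raw)

# Select the wanted columns (in dict order), then rename them.
result = df[list(new_names)].rename(columns=new_names)

# Write to an in-memory buffer here; a file path works the same way.
out = io.StringIO()
result.to_csv(out, index=False)
csv_text = out.getvalue()
```

If the real file is whitespace-separated, as the layout in the question suggests it might be, `read_csv` would probably also need `sep=r"\s+"`.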

CodePudding user response：

The easiest way is to read only the columns you care about, which also saves some memory:

df = pd.read_csv(file, usecols=["code1", "code2", "name1", "name2", "something2"])
df.to_csv("other_file.csv", index=False)
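
One detail worth knowing with this approach: `usecols` ignores the order of the list and returns the columns in file order, so an explicit reindex is needed if the output order matters. The sketch below combines `usecols`, reordering, and renaming. The helper `select_and_rename`, the `SAMPLE` data, and the `mapping` dict are hypothetical names used for illustration, and the data is assumed to be comma-separated.

```python
import io

import pandas as pd

# Hypothetical sample data standing in for the real file (assumed comma-separated).
SAMPLE = (
    "code1,code2,code3,name1,name2,something1,something2\n"
    "14355,12345,54133,part1,part12,aaaaaaaa,bbbbbbb\n"
    "54782,57815,52781,part2,part22,ccccccc,ffffffff\n"
)


def select_and_rename(source, mapping):
    """Read only the columns named in `mapping`, in that order, and rename them."""
    wanted = list(mapping)
    df = pd.read_csv(source, usecols=wanted)  # only these columns are parsed
    df = df[wanted]                           # usecols keeps file order, so reorder
    return df.rename(columns=mapping)


# Deliberately listed in a different order than in the file.
mapping = {"name1": "name_1", "code1": "code_1", "something2": "something_2"}

# Without the reorder step, the columns come back in file order.
file_order = pd.read_csv(io.StringIO(SAMPLE), usecols=list(mapping)).columns

result = select_and_rename(io.StringIO(SAMPLE), mapping)
```

Calling `result.to_csv("other_file.csv", index=False)` afterwards would write the renamed columns in the order given by `mapping`.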

Another option, if you already have a df you want to subset, is to use a list to select the columns you care about.

df = df[["code1", "code2", "name1", "name2", "something2"]]