Split csv file into 2 list depending upon column name using python

Time:11-14

I want to split a CSV file into two lists, one per column, selecting each column by its name.

CSV file:

Molecule Name,SMILES
ZINC53 (Aspirin),CC(=O)Oc1ccccc1C(=O)O
ZINC7460 (Vatalanib),Clc1ccc(Nc2nnc(Cc3ccncc3)c3ccccc23)cc1
ZINC1493878 (Sorafenib),CNC(=O)c1cc(Oc2ccc(NC(=O)Nc3ccc(Cl)c(C(F)(F)F)c3)cc2)ccn1

Code:

namelist = list()
smileslist = list()
with open('./file.csv', 'r') as f:
    f = csv.reader(f, delimiter=',')
    columns = next(f)
    type_col1 = columns.index("Molecule Name")
    type_col2 = columns.index("SMILES")
    for column in f:
        if type_col1 == 'Molecule Name':
            namelist.append(column)
        elif type_col2 == 'SMILES':
            smileslist.append(column)

CodePudding user response:

With the pandas library, you can do it as easily as this:


import pandas as pd


df = pd.read_csv("./file.csv")
namelist = df["Molecule Name"].tolist()
smileslist = df["SMILES"].tolist()

print(namelist)
print(smileslist)

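Or, if you prefer using the csv reader, you can do it as follows. Note that in the question's code, type_col1 and type_col2 are integer positions, so comparing them with the column-name strings is always false and nothing is ever appended; use them to index into each row instead: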

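import csv

namelist = []
smileslist = []
# newline="" is what the csv module documentation recommends when opening files
with open("./file.csv", "r", newline="") as f:
    reader = csv.reader(f, delimiter=",")
    header = next(reader)  # first row holds the column names
    # Look up each column's position by its name
    index_col1 = header.index("Molecule Name")
    index_col2 = header.index("SMILES")
    for row in reader:
        # Pick the wanted field out of each row by position
        namelist.append(row[index_col1])
        smileslist.append(row[index_col2])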
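
As a possible variation on the csv reader approach above, csv.DictReader from the standard library reads the header row itself and lets you access each field by column name, so no index lookup is needed. The sketch below is only an illustration: the helper name split_columns and the in-memory sample (standing in for ./file.csv) are assumptions, not part of the original answer.

```python
import csv
import io


def split_columns(lines, name_col="Molecule Name", smiles_col="SMILES"):
    """Hypothetical helper: return (names, smiles) from CSV lines with a header row."""
    reader = csv.DictReader(lines)  # uses the first row as the field names
    names, smiles = [], []
    for row in reader:
        # Each row is a dict keyed by column name
        names.append(row[name_col])
        smiles.append(row[smiles_col])
    return names, smiles


# In-memory sample standing in for ./file.csv (assumption for illustration)
sample = io.StringIO(
    "Molecule Name,SMILES\n"
    "ZINC53 (Aspirin),CC(=O)Oc1ccccc1C(=O)O\n"
    "ZINC7460 (Vatalanib),Clc1ccc(Nc2nnc(Cc3ccncc3)c3ccccc23)cc1\n"
)
namelist, smileslist = split_columns(sample)
print(namelist)
print(smileslist)
```

To read from disk instead, pass an open file object, for example split_columns(open("./file.csv", newline="")) inside a with block. A missing column name raises KeyError on the first data row, which makes a misspelled header easy to spot.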