Python Pandas CSV - reading

I have a problem: I need to read a CSV file and take the values from each row.

For example:

Name Surname Sex Date
Franco Puppi Male 01/01/2022
Max   Pezzali Male 03/4/2022
Fuffi Fuffi  female 03/8/202

The content above is my CSV file. I want to read this kind of file and process each column separately. For example:

dfin = pd.read_csv("var_an.csv")
for index1 in dfin.iterrows():

Name = 
Surname = 
Sex = 
Date =

How would you extract these? I tried str(dfin["Name"]), but I got an error saying that tuple indices must be integers. I then replaced "Name" with 0, 1, 2, but already at the first column it says the index is out of range. What am I doing wrong? I had easy success with an xlsx file.

def analytics(var_an):
    from termcolor import colored, cprint
    import pandas as pd
    dfin = pd.read_csv(var_an)
    for index1 in dfin.iterrows():
        print(index1)
        cprint(f'Found on file : {var_an}', 'red')
        # cprint(f'Obd = {obd} | pallet = {pallet} | loggerid = {loggerid} | system_date = {system_date} | system_time = {system_time} | house = {house} | hub = {hub}', 'on_green')

When I do the above, it extracts the entire row, but I can't handle each field separately, like:

Name = 
Surname = 
Sex =

CodePudding user response：

I'm not sure what you want to accomplish, but to take the values from each row you can simply do:

for idx, row in dfin.iterrows():
    name = row['Name']
    surname = row['Surname']
    sex = row['Sex']

    ...
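
As a side note, the error in the question most likely comes from the loop variable: iterrows() yields (index, row) pairs, so a single loop variable is a tuple and cannot be indexed by column name. A minimal sketch of this, assuming the sample data is rewritten with commas and inlined so the snippet is self-contained (the variable names are only illustrative):

```python
import io
import pandas as pd

# Assumption: the question's sample data, rewritten with commas for this demo.
csv_text = """Name,Surname,Sex,Date
Franco,Puppi,Male,01/01/2022
Max,Pezzali,Male,03/4/2022
"""

dfin = pd.read_csv(io.StringIO(csv_text))

# With a single loop variable, each item is an (index, row) tuple.
first_item = next(dfin.iterrows())
item_is_tuple = isinstance(first_item, tuple)

try:
    first_item["Name"]  # tuples only accept integer indices
    lookup_error = None
except TypeError as exc:
    lookup_error = exc

# Unpacking the tuple gives the row as a Series, indexable by column name.
names = [row["Name"] for _, row in dfin.iterrows()]
print(item_is_tuple, type(lookup_error).__name__, names)
```

Unpacking into two variables, as in the loop above, is what makes row['Name'] work.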

CodePudding user response：

That's not a CSV file: read_csv expects comma-separated values on each line. When you used read_csv, you got a table with a single column named "Name Surname Sex Date". Turning your fragments into a running script:

import pandas as pd
import io

the_file = io.StringIO("""Name Surname Sex Date
Franco Puppi Male 01/01/2022
Max   Pezzali Male 03/4/2022
Fuffi Fuffi  female 03/8/202""")

dfin = pd.read_csv(the_file)
print(dfin.columns)

outputs

Index(['Name Surname Sex Date'], dtype='object')

So the file didn't parse correctly. You can change the separator from a comma to a regular expression that treats any run of whitespace as a column separator, and you'll get the right values for this sample data:

import pandas as pd
import io

the_file = io.StringIO("""Name Surname Sex Date
Franco Puppi Male 01/01/2022
Max   Pezzali Male 03/4/2022
Fuffi Fuffi  female 03/8/202""")

dfin = pd.read_csv(the_file, sep=r"\s+")
print(dfin.columns)

for i, row in dfin.iterrows():
    print(f"====\nRow {i}:\n{row}")
    Name = row["Name"]
    Surname = row["Surname"]
    Sex = row["Sex"]
    Date = row["Date"]
    print("Extracted:", Name, Surname, Sex, Date)

This gets the right stuff:

Index(['Name', 'Surname', 'Sex', 'Date'], dtype='object')
====
Row 0:
Name           Franco
Surname         Puppi
Sex              Male
Date       01/01/2022
Name: 0, dtype: object
Extracted: Franco Puppi Male 01/01/2022
====
Row 1:
Name             Max
Surname      Pezzali
Sex             Male
Date       03/4/2022
Name: 1, dtype: object
Extracted: Max Pezzali Male 03/4/2022
====
Row 2:
Name          Fuffi
Surname       Fuffi
Sex          female
Date       03/8/202
Name: 2, dtype: object
Extracted: Fuffi Fuffi female 03/8/202

Pretty good. But there is still a big problem: what if one of these people has a space in their name? pandas would split each part of the name into a separate column and the parsing would fail. You need a better file format than the one you've been given.
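
One option, if you control how the file is produced, is a real comma-separated file, where a space inside a field is just data. A rough sketch under that assumption: the second row, including the name "Anna Maria", is invented for illustration and is not from the original file.

```python
import io
import pandas as pd

# Hypothetical data: same columns as the question, but comma-separated.
# "Anna Maria" is an invented first name that contains a space.
csv_text = """Name,Surname,Sex,Date
Franco,Puppi,Male,01/01/2022
Anna Maria,Rossi,female,03/08/2022
"""

# dtype=str keeps every field as text, so the dates are not reinterpreted.
dfin = pd.read_csv(io.StringIO(csv_text), dtype=str)

for _, row in dfin.iterrows():
    print(row["Name"], "|", row["Surname"], "|", row["Sex"], "|", row["Date"])

# The name with a space stays in a single column.
second_name = dfin.loc[1, "Name"]
```

With the default comma separator the header still yields four columns, and the space in the name no longer matters.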