Modify all files in directory using Python (Loop)

I am having a lot of trouble writing a loop for my script. Originally the script modified a single CSV file, but I have edited it to loop through the folder "CSVtoGD" and modify every CSV file in it. So far with no luck. The modified script is:

from pathlib import Path 
import pandas as pd
dir = r'/users/krzysztofpaszta/CSVtoGD' 
csv_files = [f for f in Path(dir).glob('*.csv')] 



for csv in csv_files: #iterate list
   
    df = pd.read_csv(csv, sep=":+\s+", engine='python', names=['dane', 'wartosc'])
    # creating columns with names: ścieżka_do_pliku:czcionka.ttf 
    df['dana_czcionka'] = df['dane'].str.split(':').str[0]

    print('\n--- df ---\n')
    print(df.to_string())

    with open('csv', 'w') as f_out:
        writer = csv.writer(f_out)
    
# sorting data by columns: ścieżka_do_pliku:czcionka.ttf 
        for name, data in df.groupby('dana_czcionka'):
            print('\n---', name, '---\n')
        
            headers = (data['dane'] + ":").to_list()
            print(headers)
    
            values = data['wartosc'].to_list()
            print(values)
            values.insert(0, name) # - Adding name (path) to every row
            values.insert(0, name)
            #writer.writerow(headers) 
            writer.writerow(values)
            
# showing results in terminal, saving to file

    print(f'{csv.name} saved.')

I receive this error:

AttributeError                            Traceback (most recent call last)
/var/folders/zw/12ns4dw96zb34ktc_vfn0zp80000gp/T/ipykernel_49714/1288759270.py in <module>
     16 
     17     with open('csv', 'w') as f_out:
---> 18         writer = csv.writer(f_out)
     19 
     20 # grupowanie danych według kolumn ścieżka_do_pliku:czcionka.ttf

AttributeError: 'PosixPath' object has no attribute 'writer'

I tried to achieve this by modifying the 'writer', but I guess my knowledge is too limited for now. I think a simple loop should do the job, but I have no idea whether my loop is built wrong or what the problem is.

The original script, without the loop, works correctly. The original script (modifying one CSV file) looks like this:

import pandas as pd
import csv
df = pd.read_csv('TTF-Projects-INFO.csv', sep=":+\s+", engine='python', names=['dane', 'wartosc'])

# creating columns with names like: ścieżka_do_pliku:czcionka.ttf 
df['dana_czcionka'] = df['dane'].str.split(':').str[0]

print('\n--- df ---\n')
print(df.to_string())

with open('newTTF-Projects-INFO.csv', 'w') as f_out:
    writer = csv.writer(f_out)
    
# sorting data by columns: ścieżka_do_pliku:czcionka.ttf 
    for name, data in df.groupby('dana_czcionka'):
        print('\n---', name, '---\n')
        
        headers = (data['dane'] + ":").to_list()
        print(headers)
    
        values = data['wartosc'].to_list()
        print(values)
        values.insert(0, name) # - add name (path) to every row with data
        #writer.writerow(headers) 
        writer.writerow(values)
            
# showing effect in terminal, saving to file

print('\n--- file ---\n')
print(open('newTTF-Projects-INFO.csv').read())

CodePudding user response：

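Your problem is a name clash. The original code uses import csv to import the library, but in the loop you also use csv as the loop variable, so inside the loop csv refers to a Path object (hence the 'PosixPath' in the error) rather than the module: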

for csv in csv_files: #iterate list
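
The clash can be reproduced in isolation. This is only a minimal sketch: the file name example.csv is hypothetical and nothing is read from disk. It shows that after the loop header runs, the name csv no longer points at the module:

```python
import csv
from pathlib import Path

# The loop variable rebinds the name `csv`, hiding the imported module.
for csv in [Path("example.csv")]:  # hypothetical file name, never opened
    pass

try:
    csv.writer  # looked up on the Path object, not on the csv module
    error_message = None
except AttributeError as exc:
    error_message = str(exc)

print(type(csv).__name__)  # a Path subclass, not 'module'
print(error_message)
```
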

So change your variable name to something like this:

for csv_file in csv_files: #iterate list

and update the following occurrences of the variable:

...
for csv_file in csv_files: #iterate list
    df = pd.read_csv(csv_file, sep=":+\s+", engine='python', names=['dane', 'wartosc'])
    # creating columns with names: ścieżka_do_pliku:czcionka.ttf 
    df['dana_czcionka'] = df['dane'].str.split(':').str[0]

    print('\n--- df ---\n')
    print(df.to_string())

    with open(csv_file, 'w') as f_out:
        writer = csv.writer(f_out)
...
    print(f'{csv_file} saved.')

Finally, add the line import csv at the beginning of the script, since the modified version no longer imports it.
Now your code should work as expected.
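
Put together, the loop structure might look like the sketch below. It makes a few assumptions that are not in the question: the per-file work is reduced to a stand-in that only copies rows (the pandas grouping from the question would go in its place), the helper names rewrite_csv and rewrite_folder are made up for illustration, and each result is written to a new file with a "new" prefix, as in the original single-file script, instead of overwriting the input. It runs against a temporary directory rather than the real CSVtoGD folder.

```python
import csv
import tempfile
from pathlib import Path


def rewrite_csv(csv_file, out_file):
    """Stand-in for the per-file work: copy rows through csv.writer."""
    with open(csv_file, newline="") as f_in, open(out_file, "w", newline="") as f_out:
        writer = csv.writer(f_out)  # `csv` is still the module here
        for row in csv.reader(f_in):
            writer.writerow(row)


def rewrite_folder(folder):
    """Process every *.csv in `folder`; return the paths that were written."""
    written = []
    # sorted() builds the full list first, so files created below
    # are not picked up during this run.
    for csv_file in sorted(Path(folder).glob("*.csv")):
        out_file = csv_file.with_name(f"new{csv_file.name}")
        rewrite_csv(csv_file, out_file)
        print(f"{out_file.name} saved.")
        written.append(out_file)
    return written


# Demonstration on throwaway files in a temporary directory.
with tempfile.TemporaryDirectory() as tmp:
    demo_dir = Path(tmp)
    (demo_dir / "a.csv").write_text("x,1\n")
    (demo_dir / "b.csv").write_text("y,2\n")
    written_names = [p.name for p in rewrite_folder(demo_dir)]
    first_output = (demo_dir / "newa.csv").read_text()
```

Writing to a separate output name keeps the input files untouched, but a second run over the same folder would also process the "new" files, so in practice a separate output folder may be the safer choice.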

Alternatively you can use the line

import csv as cs

and change the writer line to

writer = cs.writer(f_out)

In this case you can keep csv as the variable name.
But IMHO this is less clear with respect to naming conventions.