This is the equivalent of my CSV file:
customer,quantity
a,250
a,166
c,354
b,185
a,58
d,68
c,263
c,254
d,320
b,176
d,127
...
This CSV file has 8000 rows. I want to split the rows by the value in the customer column ("a", "b", "c", ... "z"), keeping the quantity column with each row. This file is just an example; in reality there are many customers, and I don't know their names in advance. What I want is for each customer to have their own CSV file, and I have to do this using Python.
I'm sorry for my bad English.
CodePudding user response:
I am not very experienced with the pandas module, but I think I understand what you want. This code creates a .csv file named after each customer and writes that customer's name and quantities into it. Try it, and if you get any errors, please let me know in the comments.
Note: please try this on a copy of your data file.
# pip install pandas
import pandas as pd

data = pd.read_csv('abc.csv')  # write your CSV file name here
all_customer_value = list(data['customer'])
all_customer_quantity = list(data['quantity'])
all_customer_name = set(data['customer'])

# Create one file per customer and write the header row (edited).
for user in all_customer_name:
    with open(f'{user}.csv', 'w') as file:
        file.write('customer,quantity\n')

# Append each row to the file of the customer it belongs to.
for index, value in enumerate(all_customer_value):
    with open(f'{value}.csv', 'a') as file:
        file.write(f'{value},{all_customer_quantity[index]}\n')
CodePudding user response:
I believe the simplest way is to use a dict to store each customer name and all of its related quantities:
from csv import reader, writer

with open('the_file.csv', 'r', newline='') as file_in:
    csv_in = reader(file_in)
    header = next(csv_in)  # skip the header row, keeping it for the output files
    customers = {}
    for cust, qty in csv_in:
        # Collect every quantity under its customer name.
        customers.setdefault(cust, []).append(qty)

for cust in customers:
    with open(f'{cust}.csv', 'w', newline='') as file_out:
        csv_out = writer(file_out)
        csv_out.writerow(header)
        for qty in customers[cust]:
            csv_out.writerow([cust, qty])
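Since the customer names are not known in advance, some of them might contain characters that are not valid in a file name (a slash, for example). A minimal sketch of one way to guard against that is below. The helper name safe_filename and the placeholder 'unknown' are made up for illustration, and note that two different customer names can end up with the same file name after cleaning.

```python
import re


def safe_filename(customer: str) -> str:
    """Build a .csv file name from a customer name (hypothetical helper)."""
    # Replace anything that is not a word character, dot, or dash with "_".
    cleaned = re.sub(r"[^\w.-]", "_", customer.strip())
    # Fall back to a placeholder when the name is empty or only whitespace.
    return f"{cleaned or 'unknown'}.csv"
```

It would be used in place of the f-string, e.g. open(safe_filename(cust), 'w', newline='').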
CodePudding user response:
If you're okay with Pandas (replace file.csv with your input filename):
import pandas as pd

for customer, df in pd.read_csv("file.csv").groupby("customer"):
    df.to_csv(f"{customer}.csv", index=False)
If you want to do more or less the same without Pandas:
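import csv
from operator import itemgetter
from itertools import groupby

# newline="" is what the csv module recommends for files it reads or writes.
with open("file.csv", "r", newline="") as file:
    data = list(csv.reader(file))

header, data = data[0], data[1:]
key = itemgetter(0)  # group on the first column (customer)

# groupby only merges adjacent rows, so the data has to be sorted by customer first.
for customer, group in groupby(sorted(data, key=key), key=key):
    with open(f"{customer}.csv", "w", newline="") as file:
        writer = csv.writer(file)
        writer.writerow(header)
        writer.writerows(group)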
If you want to avoid the sorting, then a solution like the one from @gimix might be better.
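Another way to avoid the sort, as a rough sketch rather than a tested recipe: read the input once and keep one open output file per customer, using contextlib.ExitStack to close them all at the end. The function name split_by_customer and the out_dir parameter are made up here. It assumes the first row is the header, the customer is the first column, and the number of distinct customers stays below the operating system's limit on open files; with very many customers the dict-based approach is the safer choice.

```python
import csv
import os
from contextlib import ExitStack


def split_by_customer(in_path, out_dir="."):
    """Split in_path into one CSV per customer; return the customer names seen."""
    writers = {}  # customer name -> csv.writer for that customer's file
    with ExitStack() as stack:
        file_in = stack.enter_context(open(in_path, newline=""))
        rows = csv.reader(file_in)
        header = next(rows)  # assumes the first row is the header
        for row in rows:
            if not row:  # skip blank lines
                continue
            customer = row[0]
            if customer not in writers:
                # First row for this customer: open its file and write the header.
                path = os.path.join(out_dir, f"{customer}.csv")
                file_out = stack.enter_context(open(path, "w", newline=""))
                writers[customer] = csv.writer(file_out)
                writers[customer].writerow(header)
            writers[customer].writerow(row)
    # Leaving the with-block closes the input file and every output file.
    return sorted(writers)
```

Calling split_by_customer("file.csv") would write a.csv, b.csv, and so on into the current directory, with rows in each file in the same order as in the input.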