I have a directory of CSV files that list fruits and there quantities, and I want to search all those and create a new CSV that only has a particular fruit (e.g., "apple").
list1.csv
| Name | Qty |
| -------- | --- |
| apple |15 |
| apple |50 |
| mango |20 |
| grapes |49 |
list2.csv
| Name | Qty |
| -------- | --- |
| apple |25 |
| apple |50 |
| Banana |34 |
| mango |20 |
| grapes |49 |
list3.csv
| Name | Qty |
| -------- | --- |
| apple |125 |
| apple |530 |
| mango |20 |
| grapes |49 |
I want, for "apple":
new.csv
| Name | Qty |
| -------- | --- |
| apple |15 |
| apple |50 |
| apple |25 |
| apple |50 |
| apple |125 |
| apple |530 |
import pandas as pd
import glob, os
path = ("E:/Data/Fdata")
all_files = glob.glob(path "/*.csv")
li=[]
for filename in all_files:
df=pd.read_csv(filename, index_col=None, header=0)
ndf = df[df["Name"].str.contains("Apple")]
li.append(ndf)
ndf.to_csv("E:/Data/Fdata/onlyapple.csv", index=True)
CodePudding user response:
- Read all your csv files to a
master
DataFrame - Filter on the "Name" you want and write
to_csv
import os
import pandas as pd
master = pd.DataFrame()
for file in [f for f in os.listdir(".") if f.endswith("csv")]:
master = master.append(pd.read_csv(file), ignore_index=True)
master[master["Name"].eq("apple")].reset_index(drop=True).to_csv("onlyapple.csv")
onlyapple.csv:
,Name,quantity,price
0,apple,15,500
1,apple,50,400
2,apple,15,500
3,apple,50,400
CodePudding user response:
Here's how to do it without Pandas:
import csv
import glob
fruit = 'apple'
final = []
header = []
for file in glob.glob('./*.csv'):
with open(file, newline='') as f:
reader = csv.reader(f)
header = next(reader) # header should be same for each file
for row in reader:
if row[0] == fruit:
final.append(row)
with open('output.csv', 'w', newline='') as f:
writer = csv.writer(f)
writer.writerow(header) # use the last file's header
writer.writerows(final)
And here's how to do it without code using GoCSV's commands, stack (to stack your files one on top of each other) and filter (to filter out only the rows you want):
gocsv stack *.csv | gocsv filter -c Name -eq apple