I have a big CSV file, and my code prints all of its rows, but I want to print only, for example, 20 random rows out of 100000. I know this can be done somehow with random.sample, but I don't really know how. Any suggestions?
Here is my code:
import csv

with open(r'Z:/datasets/room-segmentation/labeling/test_examples_doors/labels.csv') as csvfile:
    data = csv.DictReader(csvfile)
    for row in data:
        if row['open'] == '1':
            print(row['image'], row['open'])
CodePudding user response:
I assume you want to randomly sample your data, rather than just take the first 20 rows?
In that case, you can convert data to a list and then sample it:
import csv
import random

with open(r'Z:/datasets/room-segmentation/labeling/test_examples_doors/labels.csv') as csvfile:
    data = csv.DictReader(csvfile)
    sampled_data = random.sample(list(data), 20)
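To connect this back to the question's filter on the open column, here is a minimal sketch that keeps only the rows where open is '1' and then samples from those. The function name sample_open_rows, the seed parameter, and the in-memory demo data are assumptions made for illustration; they are not part of the original code.

```python
import csv
import io
import random


def sample_open_rows(csvfile, k=20, seed=None):
    """Return up to k randomly chosen rows whose 'open' column is '1'."""
    rng = random.Random(seed)
    reader = csv.DictReader(csvfile)
    open_rows = [row for row in reader if row['open'] == '1']
    # random.sample raises ValueError when k exceeds the population size,
    # so cap k at the number of matching rows.
    return rng.sample(open_rows, min(k, len(open_rows)))


# Made-up in-memory data standing in for labels.csv (hypothetical values).
demo = io.StringIO(
    "image,open\n" + "".join(f"img{i}.png,{i % 2}\n" for i in range(100))
)
for row in sample_open_rows(demo, k=5, seed=0):
    print(row['image'], row['open'])
```

With the real file, you would pass the csvfile object from the with block instead of the StringIO stand-in.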
CodePudding user response:
If you don't need to code this yourself, GoCSV has the sample command which does just this:
gocsv sample -n 20 labels.csv
CodePudding user response:
I don't quite understand your question, but to get just the first 20 rows you can add a counter to the loop:
x = 0
for row in data:
    x += 1
    print(row['image'], row['open'])
    if x == 20:
        break
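EDIT: Okay, I get it. Convert the reader to a list, pick a random row, print it, then delete it so it can't be picked again, and repeat 20 times.
data = list(data)  # a DictReader has no len() and can't be indexed
for x in range(20):
    num = random.randrange(len(data))  # randint(0, len(data)) could return len(data), an invalid index
    print(data[num]['image'], data[num]['open'])
    del data[num]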
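The loop above needs every row in a list first. If the file were too large for that, one alternative (a different technique from the one in this answer) is reservoir sampling, which reads the rows once and keeps only k of them in memory. This is a rough sketch; the function name reservoir_sample and the demo data are made up for illustration.

```python
import csv
import io
import random


def reservoir_sample(rows, k, seed=None):
    """Pick k items uniformly at random from an iterable in a single pass."""
    rng = random.Random(seed)
    reservoir = []
    for i, row in enumerate(rows):
        if i < k:
            # Fill the reservoir with the first k rows.
            reservoir.append(row)
        else:
            # Keep the new row with probability k / (i + 1) by replacing
            # a randomly chosen slot. randint is inclusive on both ends.
            j = rng.randint(0, i)
            if j < k:
                reservoir[j] = row
    return reservoir


# Made-up in-memory data standing in for labels.csv (hypothetical values).
demo = io.StringIO(
    "image,open\n" + "".join(f"img{i}.png,{i % 2}\n" for i in range(1000))
)
for row in reservoir_sample(csv.DictReader(demo), k=20, seed=0):
    print(row['image'], row['open'])
```

If the input has fewer than k rows, this returns all of them instead of raising an error.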