I am struggling with an issue regarding CSV files and Python. How can I generate a random number in each row of a CSV file, based on a condition in that row?
Essentially, if the value in the first column is 'A', I want a random number between 1 and 5. If the value is 'B', I want a random number between 2 and 6, and if the value is 'C', a random number between 3 and 7.
Letter | Color | Random Number |
---|---|---|
A | Green | |
C | Red | |
B | Red | |
B | Green | |
C | Blue | |
Thanks in advance
The only thing I have found was creating a new random number dataframe. But I need to create a random number for an existing df.
CodePudding user response:
One way is to use numpy.random.randint with numpy.select:
import pandas as pd
import numpy as np

# Change the separator to match the actual format of your CSV.
df = pd.read_csv("inputfile.csv", sep=",")

categories = [df["Letter"].eq("A"),
              df["Letter"].eq("B"),
              df["Letter"].eq("C")]

# numpy.random.randint(low, high=None, size=None, dtype=int)
# high is exclusive, hence the + 1. size=n draws one value per row;
# without it, every row of a category would share a single number.
n = len(df)
choices = [np.random.randint(1, 5 + 1, size=n),
           np.random.randint(2, 6 + 1, size=n),
           np.random.randint(3, 7 + 1, size=n)]

# numpy.select(condlist, choicelist, default=0)
df["Random Number"] = np.select(categories, choices)

print(df)
# Output (values vary between runs):
Letter Color Random Number
0 A Green 5
1 C Red 6
2 B Red 5
3 B Green 5
4 C Blue 6
If needed, you can use pandas.DataFrame.to_csv to generate a new .csv file:
df.to_csv("output_file.csv", sep=",", index=False)
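The same idea can also be sketched with NumPy's Generator API, which accepts arrays for the lower and upper bounds and so draws one independent value per row. This is only a sketch: the `bounds` dictionary and the inline sample DataFrame are assumptions made for illustration, standing in for the CSV from the question.

```python
import numpy as np
import pandas as pd

# Illustrative data standing in for the CSV from the question.
df = pd.DataFrame({
    "Letter": ["A", "C", "B", "B", "C"],
    "Color": ["Green", "Red", "Red", "Green", "Blue"],
})

# Hypothetical lookup table: letter -> (low, high), both inclusive.
bounds = {"A": (1, 5), "B": (2, 6), "C": (3, 7)}

# Per-row lower and upper bounds, looked up from the letter.
low = df["Letter"].map(lambda letter: bounds[letter][0]).to_numpy()
high = df["Letter"].map(lambda letter: bounds[letter][1]).to_numpy()

rng = np.random.default_rng()  # pass a seed here for reproducible draws

# low and high are arrays, so each row gets its own draw;
# endpoint=True makes the upper bound inclusive.
df["Random Number"] = rng.integers(low, high, endpoint=True)

print(df)
```

Adding a new letter then only requires a new entry in `bounds`. A letter missing from the dictionary raises a KeyError, which may or may not be the behavior you want.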
CodePudding user response:
Here is a simple way to do it without pandas. This program fills the third column of a CSV file with a random number, following the rule from the question:
If the value in the first column is 'A', I want a random number between 1 and 5. If the value is 'B', I want a random number between 2 and 6, and if the value is 'C', a random number between 3 and 7.
import csv
import random

# Inclusive (low, high) bounds for each letter.
letters_randoms = {
    'A': [1, 5],
    'B': [2, 6],
    'C': [3, 7],
}

rows = []  # result
with open('file.csv', 'r', newline='', encoding='utf-8') as file:
    reader = csv.reader(file)
    rows.append(next(reader))  # keep the header row unchanged
    for row in reader:
        letter = row[0].upper()  # accept lowercase letters too
        # random.randint includes both endpoints
        row[2] = random.randint(*letters_randoms[letter])
        rows.append(row)

# Overwrite the CSV file with the updated rows.
with open('file.csv', 'w', newline='', encoding='utf-8') as file:
    writer = csv.writer(file)
    writer.writerows(rows)
Result (file.csv):
LETTER,COLOR,Random Number
A,Green,3
c,Red,5
B,Red,2
B,Green,2
c,Blue,5
A,Purple,5
B,Green,3
A,Orange,3
c,Black,4
c,Red,5
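As a variation on the csv approach, here is a minimal sketch that addresses columns by header name and leaves the cell empty for letters it does not know. The function name `fill_random_column`, the `RANGES` table, and the in-memory sample data are assumptions made for illustration; with real files you would pass file objects opened with newline=''.

```python
import csv
import io
import random

# Hypothetical lookup table: letter -> (low, high), both inclusive.
RANGES = {"A": (1, 5), "B": (2, 6), "C": (3, 7)}


def fill_random_column(source, target, column="Random Number"):
    """Copy CSV rows from source to target, filling `column` per row."""
    reader = csv.DictReader(source)
    fieldnames = list(reader.fieldnames)
    first = fieldnames[0]  # assumed to be the column holding the letter
    if column not in fieldnames:
        fieldnames.append(column)
    writer = csv.DictWriter(target, fieldnames=fieldnames)
    writer.writeheader()
    for row in reader:
        limits = RANGES.get(row[first].strip().upper())
        # Unknown letters get an empty cell instead of raising KeyError.
        row[column] = random.randint(*limits) if limits else ""
        writer.writerow(row)


# In-memory stand-ins for real files, so the sketch runs as-is.
source = io.StringIO("Letter,Color,Random Number\nA,Green,\nc,Red,\nZ,Blue,\n")
target = io.StringIO()
fill_random_column(source, target)
print(target.getvalue())
```

Writing to a separate target rather than overwriting the input keeps the original file intact if something fails partway through.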