Generate values in R and Python

I am trying to generate dummy data with given probabilities. Let's say I want dummy data about people by gender. I have already done this in R; my code is below.

gender = sample(x=c("M","F"), prob = c(.6, .4),size=100,replace=TRUE)

Now I want to do the same thing in Python, in a pandas DataFrame. Can anybody help me solve this?

CodePudding user response：

You can use numpy.random.choice. It samples with replacement by default (replace=True), and p gives the probability of each value.

>>> import numpy as np
>>> np.random.choice(a=["M", "F"], size=100, p=[0.6, 0.4])

array(['F', 'M', 'F', 'M', 'M', 'M', 'M', 'M', 'M', 'F', 'M', 'F', 'M',
       'M', 'F', 'M', 'M', 'M', 'F', 'F', 'M', 'M', 'F', 'F', 'M', 'F',
       'F', 'M', 'M', 'F', 'F', 'M', 'F', 'F', 'M', 'F', 'M', 'M', 'F',
       'M', 'M', 'F', 'F', 'M', 'M', 'F', 'M', 'M', 'F', 'M', 'M', 'M',
       'F', 'F', 'M', 'F', 'M', 'M', 'M', 'M', 'M', 'M', 'F', 'F', 'M',
       'M', 'F', 'F', 'M', 'F', 'M', 'F', 'M', 'M', 'F', 'M', 'F', 'F',
       'M', 'F', 'M', 'F', 'M', 'F', 'M', 'M', 'F', 'M', 'M', 'M', 'F',
       'F', 'F', 'F', 'F', 'M', 'M', 'F', 'F', 'F'], dtype='<U1')
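
Since the question asks for a pandas DataFrame, here is a minimal sketch of one way to wrap the sampled array in a column. The column name "gender", the variable names, and the seed value 42 are my own assumptions for illustration; it also uses NumPy's newer Generator interface (np.random.default_rng) rather than the legacy np.random.choice shown above, which should behave the same way for this purpose.

```python
import numpy as np
import pandas as pd

# Seeded generator so the sample is reproducible; the seed value is arbitrary.
rng = np.random.default_rng(seed=42)

# Draw 100 values with replacement: "M" with probability 0.6, "F" with 0.4.
gender = rng.choice(["M", "F"], size=100, replace=True, p=[0.6, 0.4])

# Put the sampled values into a single-column DataFrame.
df = pd.DataFrame({"gender": gender})

# Observed proportions; these should be near 0.6 / 0.4 but not exactly equal.
print(df["gender"].value_counts(normalize=True))
```

Dropping the seed argument gives a different sample on each run, which is closer to what the R sample() call does without set.seed().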

CodePudding user response：

Try this. random.choices draws k elements from the given sequence with replacement, using the given weights:

import random
print(random.choices("MF", weights=[.6,.4], k=100))

Testing:

>>> l = random.choices("MF", weights=[.6,.4], k=100)
>>> l
['M', 'F', 'F', 'M', 'M', 'M', 'M', 'M', 'F', 'M', 'M', 'F', 'M', 'M', 'M', 'F', 'M', 'M', 'M', 'M', 'F', 'M', 'M', 'M', 'F', 'F', 'M', 'F', 'F', 'M', 'M', 'F', 'M', 'M', 'M', 'M', 'M', 'M', 'M', 'M', 'M', 'M', 'F', 'M', 'M', 'F', 'M', 'F', 'M', 'F', 'F', 'F', 'F', 'F', 'F', 'F', 'M', 'F', 'F', 'M', 'M', 'M', 'F', 'F', 'M', 'M', 'M', 'F', 'F', 'F', 'M', 'F', 'F', 'M', 'M', 'F', 'F', 'M', 'M', 'M', 'F', 'M', 'M', 'F', 'M', 'M', 'M', 'M', 'M', 'F', 'M', 'M', 'M', 'F', 'F', 'F', 'M', 'F', 'F', 'M']
>>> l.count("M")
60
>>> l.count("F")
40
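
Note that the exact 60/40 split in the test above comes from that particular run. The weights only set the probability of each draw, so the counts will usually land near 60 and 40 rather than exactly on them. A small sketch to see the variation, using only the standard library (the seed value 0 and the three repetitions are arbitrary choices of mine):

```python
import random
from collections import Counter

random.seed(0)  # arbitrary seed, only so repeated runs print the same numbers

# Repeat the same weighted draw a few times and compare the counts.
for _ in range(3):
    sample = random.choices("MF", weights=[0.6, 0.4], k=100)
    counts = Counter(sample)
    print(counts["M"], counts["F"])
```

If you need exactly 60 "M" and 40 "F", weighted sampling is not the right tool; you would build the list explicitly and shuffle it instead.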