Pandas different random string for each row

Time:10-02

I'm trying to fill a dataframe column with random string characters. Using the code below, I get a new 10-character string each time I run it, but it's the same for every row.

How do I generate a new string for each row?

print(df)

0          eqFSwEJQqD
1          eqFSwEJQqD
2          eqFSwEJQqD
3          eqFSwEJQqD
4          eqFSwEJQqD
              ...    
1019920    eqFSwEJQqD
1019921    eqFSwEJQqD
1019922    eqFSwEJQqD
1019923    eqFSwEJQqD
1019924    eqFSwEJQqD

I'd like, e.g.:

0 fGtryghjuYt
1 jUiKlOpYtrd

etc...

The code:

import random
from string import ascii_letters

df['ff'] = ''.join(random.choice(ascii_letters) for x in range(10))

CodePudding user response:

Use:

df['ff'] = [''.join(random.choice(ascii_letters) for x in range(10)) for _ in range(len(df))]
print(df)

Like in the example below:

import random
from string import ascii_letters
import pandas as pd

df = pd.DataFrame(data=list(range(10)), columns=["id"])
df['ff'] = [''.join(random.choice(ascii_letters) for x in range(10)) for _ in range(len(df))]
print(df)

Output

   id          ff
0   0  UKCYsUXRYi
1   1  vvYweriLfb
2   2  eYcCCnXfhW
3   3  xiyPisioWt
4   4  cjMOxAcULS
5   5  lgkxtFCbBx
6   6  pPeEOmfgkB
7   7  EBhBfticnM
8   8  hdQxePBmCq
9   9  KCPosrHfgz

The problem with your approach is that the expression on the right-hand side is evaluated only once, so it produces a single string, and pandas then assigns that one value to every row of the 'ff' column.
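
To see the difference side by side, here is a minimal sketch. The small frame and the column names 'same' and 'different' are made up for illustration and are not from the question:

```python
import random
from string import ascii_letters

import pandas as pd

# Small illustrative frame (hypothetical data, not the asker's dataframe).
df = pd.DataFrame({"id": range(5)})

# The join expression runs once and yields a single str.
single = ''.join(random.choice(ascii_letters) for _ in range(10))
# Assigning a scalar makes pandas repeat it on every row.
df['same'] = single

# The outer comprehension re-runs the join once per row,
# so each row gets its own independently drawn string.
df['different'] = [
    ''.join(random.choice(ascii_letters) for _ in range(10))
    for _ in range(len(df))
]

print(df)
print(df['same'].nunique())  # 1: every row holds the same string
```

The strings in 'different' are drawn independently, so duplicates are possible in principle, just very unlikely at this length.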

CodePudding user response:

Use:

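# apply needs a callable, so wrap the expression in a lambda ('col' is any existing column)
df['ff'] = df['col'].apply(lambda _: ''.join(random.choice(ascii_letters) for x in range(10)))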

Or:

df['ff'] = [''.join(random.choice(ascii_letters) for x in range(10)) for _ in df.index]

Or:

import numpy as np
from string import ascii_letters
a = np.random.choice(list(ascii_letters), size=(10, len(df)))
df['ff'] = np.apply_along_axis(''.join, 0, a)
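
Another possible variant, as a sketch only: random.choices (Python 3.6+) can draw all 10 characters in one call, which avoids the inner generator. The helper name random_string and the small example frame are hypothetical, added here for illustration:

```python
import random
from string import ascii_letters

import pandas as pd


def random_string(length=10):
    """Return one random string of ASCII letters (hypothetical helper)."""
    # random.choices draws `length` characters with replacement in one call.
    return ''.join(random.choices(ascii_letters, k=length))


# Small illustrative frame (not the asker's dataframe).
df = pd.DataFrame({"id": range(5)})

# Call the helper once per row so each row gets its own string.
df['ff'] = [random_string() for _ in range(len(df))]
print(df)
```

Note that the random module is not suitable for security-sensitive values such as tokens or passwords; the standard-library secrets module is intended for that.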