Sorry if this seems like a simple question, but I am new to Python. I would like to create a DataFrame containing 10 family names and 10 cities of birth and, for each family name/city of birth pair, 3 members of that family, each with a "name" that is a random string of up to 8 characters. How can I create such a DataFrame? I don't really know how to reuse the same family name/city of birth pair for more than one "member" value.
CodePudding user response:
There are a few ways to go about this, but here's a simple one that's easy to follow (with 5 values instead of the required 10, but you get the idea):
import random
import string
import pandas as pd
cities = ["New York", "London", "Paris", "Beijing", "Casablanca"]
names = ["Smith", "Heston", "Dupont", "Torvalds", "Clooney"]
df = pd.DataFrame(
    [
        {
            "city": cities[i],
            "family_name": names[i],
            "first_name": "".join([random.choice(string.ascii_lowercase) for _ in range(8)]),
        }
        for i in range(5)
        for _ in range(3)
    ]
)
print(df)
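If you want the full 10 pairs and names that are "up to" 8 characters rather than exactly 8, here is a rough sketch of one way to extend the idea. The extra family names and cities, the `random_name` helper, and the two constants are made up for illustration, and I'm assuming a name must have at least 1 character and that the i-th family name goes with the i-th city.

```python
import random
import string

import pandas as pd

# Hypothetical sample data: any 10 family names and 10 cities would do.
family_names = ["Smith", "Heston", "Dupont", "Torvalds", "Clooney",
                "Garcia", "Tanaka", "Okafor", "Novak", "Rossi"]
cities = ["New York", "London", "Paris", "Beijing", "Casablanca",
          "Madrid", "Tokyo", "Lagos", "Prague", "Rome"]

MEMBERS_PER_FAMILY = 3
MAX_NAME_LENGTH = 8


def random_name(max_length=MAX_NAME_LENGTH):
    """Return a random lowercase string of 1 to max_length characters."""
    length = random.randint(1, max_length)  # assumption: at least 1 character
    return "".join(random.choices(string.ascii_lowercase, k=length))


# zip() pairs the i-th family name with the i-th city; the inner loop
# repeats each pair once per family member.
rows = [
    {"family_name": family, "city": city, "member": random_name()}
    for family, city in zip(family_names, cities)
    for _ in range(MEMBERS_PER_FAMILY)
]
df = pd.DataFrame(rows)
print(df)
```

This gives 30 rows (10 pairs times 3 members). If you instead need every family name combined with every city (100 pairs), swapping `zip(family_names, cities)` for `itertools.product(family_names, cities)` should do it.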