I have a dataframe with a column with item names and a column with a number. I would like to create a list with the item names repeated the number of times in the column.
item | number |
---|---|
cat | 2 |
dog | 3 |
parrot | 4 |
My desired output is
item |
---|
cat |
cat |
dog |
dog |
dog |
parrot |
parrot |
parrot |
parrot |
I feel like I'm quite close with this code:
    for index in df.iterrows():
        for x in range(2):
            print(df.item)
However, I can't find a way to replace the 2 in range with the number from the dataframe. df.numbers doesn't seem to work.
CodePudding user response:
Since you said your desired output is a list, you can use @Michael's comment and do this:
list(df.item.repeat(df.number))
The output would be:
['cat', 'cat', 'dog', 'dog', 'dog', 'parrot', 'parrot', 'parrot', 'parrot']
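For a self-contained version, here is a minimal sketch that rebuilds the example dataframe from the question. The names `repeated`, `result` and `result_df` are my own, and the last step assumes you might also want the one-column table shown in the question rather than a list:

```python
import pandas as pd

# Example data from the question
df = pd.DataFrame({"item": ["cat", "dog", "parrot"], "number": [2, 3, 4]})

# Series.repeat repeats each element by the matching count in df["number"]
repeated = df["item"].repeat(df["number"])

# As a plain Python list
result = repeated.tolist()

# As a one-column dataframe with a fresh 0..n-1 index
result_df = repeated.reset_index(drop=True).to_frame()

print(result)
print(result_df)
```

Without `reset_index(drop=True)`, the repeated rows keep the index labels of the rows they came from, which may or may not be what you want.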
CodePudding user response:
If you were keen on using .iterrows(), then you might do something like:
    import pandas

    df = pandas.DataFrame([
        {"item": "cat", "number": 2},
        {"item": "dog", "number": 3},
        {"item": "parrot", "number": 4},
    ])

    new_list = []
    for index, row in df.iterrows():
        new_list.extend([row["item"]] * row["number"])

    print(new_list)
This relies on the fact that, in Python, ["x"] * 3 == ["x", "x", "x"].
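The same list-multiplication idea can be sketched without an explicit loop. This is only an illustration: the plain lists `items` and `numbers` are stand-ins I've assumed for the two columns, and with the dataframe the same pairs could be obtained from `zip(df["item"], df["number"])`.

```python
from itertools import chain

# Plain lists standing in for the "item" and "number" columns
items = ["cat", "dog", "parrot"]
numbers = [2, 3, 4]

# [item] * n builds a list with n copies of item;
# chain.from_iterable flattens those small lists into one sequence
new_list = list(chain.from_iterable([item] * n for item, n in zip(items, numbers)))

print(new_list)
```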