I have a pandas dataframe where I'm trying to add a new column that repeats the number in the score column as many times as there are words in the titles column (which contains lists of words). I have something like this:
In [7]: df
Out[7]:
titles score
0 [cat, father, bakery] 43
1 [brick, swordsmith, park, apple] 68
And I want something like this:
In [8]: df
Out[8]:
titles score score_repeat
0 [cat, father, bakery] 43 [43, 43, 43]
1 [brick, swordsmith, park, apple] 68 [68, 68, 68, 68]
I am not very experienced but I have tried something like:
df['score_repeat'] = len(df['titles'])*df['score']
But that just gives me a column where each score is multiplied by the length of the whole column (the number of rows), not by the number of words in that row.
CodePudding user response:
Use a list comprehension that multiplies a one-element list by the number of words in each row:
df['score_repeat'] = [[y] * x for x, y in zip(df['titles'].str.len(), df['score'])]
print (df)
titles score score_repeat
0 [cat, father, bakery] 43 [43, 43, 43]
1 [brick, swordsmith, park, apple] 68 [68, 68, 68, 68]
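For anyone who wants to run this end to end, here is a minimal sketch that rebuilds the example frame from the question and applies the same comprehension. How the frame is constructed is my assumption, since the question only shows its printed form.

```python
import pandas as pd

# Example frame rebuilt from the question (the construction is assumed).
df = pd.DataFrame({
    "titles": [["cat", "father", "bakery"],
               ["brick", "swordsmith", "park", "apple"]],
    "score": [43, 68],
})

# .str.len() gives the length of the list in each row, unlike
# len(df["titles"]), which is the number of rows in the frame.
lengths = df["titles"].str.len()

# Repeat each score once per word in the same row.
df["score_repeat"] = [[score] * n for n, score in zip(lengths, df["score"])]

print(df)
```

The comprehension builds plain Python lists, so the new column has object dtype just like titles.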
CodePudding user response:
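Check the code below, which uses apply (len counts the words because titles holds lists; counting commas would only work if each cell were a single string):
df['score_repeat'] = df.apply(lambda x: [x.score] * len(x.titles), axis=1)
df
Output:
titles score score_repeat
0 [cat, father, bakery] 43 [43, 43, 43]
1 [brick, swordsmith, park, apple] 68 [68, 68, 68, 68]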
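The right word count depends on how titles is stored, which the question does not fully settle: counting commas only works when each cell is a single string, while len is what you want for lists or tuples. A small sketch covering both cases follows. The helper `n_words` and the two example frames are hypothetical, made up for illustration.

```python
import pandas as pd

def n_words(titles):
    """Hypothetical helper: count the words in one `titles` cell."""
    if isinstance(titles, str):
        # A single string such as "cat, father, bakery": commas + 1.
        return titles.count(",") + 1
    # A list or tuple of words: its length.
    return len(titles)

# Same data stored two ways (both frames are assumed examples).
as_lists = pd.DataFrame({
    "titles": [["cat", "father", "bakery"],
               ["brick", "swordsmith", "park", "apple"]],
    "score": [43, 68],
})
as_strings = pd.DataFrame({
    "titles": ["cat, father, bakery", "brick, swordsmith, park, apple"],
    "score": [43, 68],
})

for frame in (as_lists, as_strings):
    # axis=1 passes one row at a time; each call returns a list of scores.
    frame["score_repeat"] = frame.apply(
        lambda row: [row.score] * n_words(row.titles), axis=1
    )
```

Row-wise apply is usually slower than a list comprehension on large frames, so this is more about readability than speed.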
CodePudding user response:
Why not make things easier for you and those working with you (if that's the case)? You can modify the original data before building the dataframe. I suppose it resembled something like this:
data = [
    {
        "titles": ['cat', 'father', 'bakery'],
        "score": 43
    },
    {
        "titles": ['brick', 'swordsmith', 'park', 'apple'],
        "score": 64
    }
]
You can add a "score_repeat" key:value pair like this:
for d in data:
    d['score_repeat'] = [d['score'] for title in d['titles']]
Now your data looks like this:
data = [
    {
        "titles": ['cat', 'father', 'bakery'],
        "score": 43,
        'score_repeat': [43, 43, 43]
    },
    {
        "titles": ['brick', 'swordsmith', 'park', 'apple'],
        "score": 64,
        'score_repeat': [64, 64, 64, 64]
    }
]
And you just have to create the dataframe:
import pandas as pd
df = pd.DataFrame(data)
And the result:
titles score score_repeat
0 [cat, father, bakery] 43 [43, 43, 43]
1 [brick, swordsmith, park, apple] 64 [64, 64, 64, 64]
This is easier and way more readable, because the repeat logic is plain Python that runs before the dataframe exists.
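Put together as one script, the steps in this answer might look like the sketch below. It assumes the data really does start out as a list of dicts, and it swaps the comprehension for `[score] * len(titles)`, which builds the same list.

```python
import pandas as pd

# Records as guessed in this answer (the original data source is unknown).
data = [
    {"titles": ["cat", "father", "bakery"], "score": 43},
    {"titles": ["brick", "swordsmith", "park", "apple"], "score": 64},
]

# Add the repeated score to each record before pandas is involved.
for d in data:
    d["score_repeat"] = [d["score"]] * len(d["titles"])

# Each dict becomes one row; the keys become the column names.
df = pd.DataFrame(data)
print(df)
```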