Is there a way to repeat the number in one column as many times as there are words in another?

I have a pandas dataframe where I'm trying to add a new column that repeats the number in the score column as many times as there are words in the titles column (which contains lists of words). I have something like this:

In [7]: df
Out[7]:
                             titles  score
0             [cat, father, bakery]     43
1  [brick, swordsmith, park, apple]     68

And I want something like this:

In [8]: df
Out[8]:
                             titles  score      score_repeat
0             [cat, father, bakery]     43      [43, 43, 43]
1  [brick, swordsmith, park, apple]     68  [68, 68, 68, 68]

I am not very experienced but I have tried something like:

df['score_repeat'] = len(df['titles'])*df['score']

But that just gives me a column where each score is multiplied by the length of the whole column (the number of rows), rather than repeated once per word.

CodePudding user response：

Use a list comprehension that multiplies a one-element list by the number of words in each row:

df['score_repeat'] = [[y] * x for x, y in zip(df['titles'].str.len(), df['score'])]
print(df)
                             titles  score      score_repeat
0             [cat, father, bakery]     43      [43, 43, 43]
1  [brick, swordsmith, park, apple]     68  [68, 68, 68, 68]
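
To make the difference from the original attempt concrete, here is a minimal self-contained sketch. It rebuilds the question's dataframe by hand (an assumption about how the data was created) and compares `len(df['titles'])`, which is the number of rows, with `df['titles'].str.len()`, which returns one length per row.

```python
import pandas as pd

# Rebuild the example frame from the question (assumed construction).
df = pd.DataFrame({
    "titles": [["cat", "father", "bakery"],
               ["brick", "swordsmith", "park", "apple"]],
    "score": [43, 68],
})

# len() of a Series is the number of rows, so this is plain multiplication.
wrong = len(df["titles"]) * df["score"]

# .str.len() also works on list-valued cells and gives one length per row.
lengths = df["titles"].str.len()

# Repeat each score once per word in the same row.
df["score_repeat"] = [[score] * n for n, score in zip(lengths, df["score"])]

print(wrong.tolist())
print(lengths.tolist())
print(df["score_repeat"].tolist())
```

The first print shows the scores multiplied by 2 (the row count), while the last shows each score repeated 3 and 4 times respectively.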

CodePudding user response：

Check the code below, which uses apply (it counts commas, so it assumes each titles value is a comma-separated string):

df['score_repeat'] = df.apply(lambda x: [x.score] * (x.titles.count(',') + 1), axis=1)
df
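
If the titles column holds actual Python lists, as the printed frame in the question suggests, `count(',')` on a list counts elements equal to `','` and returns 0. A hedged variant of the same apply approach that uses `len()` on each row's list might look like this; the dataframe construction is again an assumption.

```python
import pandas as pd

# Assumed reconstruction of the question's data, with real lists in "titles".
df = pd.DataFrame({
    "titles": [["cat", "father", "bakery"],
               ["brick", "swordsmith", "park", "apple"]],
    "score": [43, 68],
})

# On a list, count(",") counts elements equal to ",", which is 0 here.
comma_counts = df["titles"].apply(lambda words: words.count(","))

# len() gives the number of words in each row's list.
df["score_repeat"] = df.apply(
    lambda row: [row["score"]] * len(row["titles"]), axis=1
)

print(df)
```

Row-wise apply is usually slower than a list comprehension on large frames, so this is mainly useful when readability matters more than speed.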

CodePudding user response：

Why not make things easier for yourself and anyone working with you? You can modify the original data before building the dataframe. I suppose it resembled something like this:

data = [
    {
        "titles": ['cat', 'father', 'bakery'],
        "score": 43
    },
    {
        "titles": ['brick','swordsmith','park','apple'],
        "score": 64
    }
]

You can add a "score_repeat" key to each record like this:

for d in data:
    d['score_repeat'] = [d['score'] for title in d['titles']]

Now your data looks like this:

data = [
    {
        "titles": ['cat', 'father', 'bakery'],
        "score": 43,
        'score_repeat': [43, 43, 43]
    },
    {
        "titles": ['brick','swordsmith','park','apple'],
        "score": 64,
        'score_repeat': [64, 64, 64, 64]
    }
]

And you just have to create the dataframe:

import pandas as pd
df = pd.DataFrame(data)

And the result:

                             titles  score      score_repeat
0             [cat, father, bakery]     43      [43, 43, 43]
1  [brick, swordsmith, park, apple]     64  [64, 64, 64, 64]

This is easier and far more readable, since the repeated list is built in plain Python before pandas is involved.