Change number in left column based on number of apostrophes in right column pandas df

I have a pandas Dataframe like the following.

Num	Words
1	hello
2	can't
3 3 3	i like this
4 4 4 4 4	i don't can't do this

Basically, each number corresponds to a word right now.

For every apostrophe in the 'Words' column, I want to add the same number twice to the 'Num' column, like the following. So if there are two apostrophes, we would add 4 of the same number to the Num column.

Num	Words
1	hello
2 2 2	can't
3 3 3	i like this
4 4 4 4 4 4 4 4 4	i don't can't do this

I tried to do the following:

if "'" in str(x["Word"]):
        x["Number"] = repeat(x["Number"])

This only detects whether there is an apostrophe, not how many there are, so it can't add numbers based on the count.

CodePudding user response：

You can use str.count. NB: the following code is vectorized (i.e., fast).

df['Num'] + (' ' + df['Num'].str[0].astype(str)) * df['Words'].str.count("'") * 2

Output:

0                    1
1                2 2 2
2                3 3 3
3    4 4 4 4 4 4 4 4 4

To assign to the original dataframe:

df['Num'] = df['Num'] + (' ' + df['Num'].str[0].astype(str)) * df['Words'].str.count("'") * 2

Output:

                 Num                  Words
0                  1                  hello
1              2 2 2                  can't
2              3 3 3            i like this
3  4 4 4 4 4 4 4 4 4  i don't can't do this
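
For anyone who wants to run this end to end, here is a minimal self-contained sketch. It rebuilds the sample frame from the question (with 'Num' stored as strings, which is an assumption) and uses Series.str.repeat instead of multiplying a string Series by an integer Series; the names extra and suffix are only illustrative.

```python
import pandas as pd

# Sample data from the question; 'Num' is assumed to hold strings.
df = pd.DataFrame({
    "Num": ["1", "2", "3 3 3", "4 4 4 4 4"],
    "Words": ["hello", "can't", "i like this", "i don't can't do this"],
})

# Two extra copies of the number for every apostrophe in 'Words'.
extra = df["Words"].str.count("'") * 2

# Build " <first character>" per row, repeat it `extra` times, and append it.
# Like the answer above, .str[0] assumes single-digit numbers.
suffix = (" " + df["Num"].str[0]).str.repeat(extra)
df["Num"] = df["Num"] + suffix

print(df)
```

Rows without an apostrophe get a repeat count of 0, so their 'Num' value is left unchanged.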

CodePudding user response：

You can use str.count("'"), and then add it that many times.

# x is a single row; use a separate loop variable so the row is not overwritten
for _ in range(str(x["Word"]).count("'")):
    x["Number"] = repeat(x["Number"])
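
Since repeat() from the question isn't shown, here is a rough row-by-row sketch of the same idea with a hypothetical helper, add_per_apostrophe, standing in for it. It assumes the column names 'Num' and 'Words' from the sample table, and that 'Num' holds non-empty, space-separated strings.

```python
import pandas as pd


def add_per_apostrophe(num: str, words: str, per_apostrophe: int = 2) -> str:
    """Append the first number in `num` `per_apostrophe` times per apostrophe in `words`.

    Hypothetical helper, not part of pandas; assumes `num` is non-empty.
    """
    first = num.split()[0]                    # works for multi-digit numbers too
    extra = words.count("'") * per_apostrophe
    return " ".join([num] + [first] * extra)


df = pd.DataFrame({
    "Num": ["1", "2", "3 3 3", "4 4 4 4 4"],
    "Words": ["hello", "can't", "i like this", "i don't can't do this"],
})

# Plain Python loop over both columns; slower than vectorized string methods
# on large frames, but easy to follow and to adapt.
df["Num"] = [add_per_apostrophe(n, w) for n, w in zip(df["Num"], df["Words"])]
print(df)
```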

CodePudding user response：

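This is a bit workaround-ish, but it seems to work. Note that it appends the number once per apostrophe; multiply cont by 2 to append it twice per apostrophe, as the question asks.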

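import pandas as pd

df = pd.DataFrame({'Num': ['1', '2', '3 3 3', '4 4 4 4 4'],
                   'Words': ["hello", "can't", "i like this", "i don't can't do this"]})

# number of apostrophes in each row
cont = df['Words'].str.count("'")
# first character of 'Num' (assumes single-digit numbers)
num = df['Num'].str.slice(0, 1)
df['Num'] = df['Num'] + (" " + num) * cont
df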

	Num	Words
0	1	hello
1	2 2	can't
2	3 3 3	i like this
3	4 4 4 4 4 4 4	i don't can't do this

CodePudding user response：

You can use Series.str.count to count the number of apostrophes first, and then append the first number of the 'Num' column twice for each apostrophe.

# get the number of apostrophes of each word 
num_apostrophes = df['Words'].str.count("'")
# get the first word of the 'Num' column,
# assuming that it can be anything (and not just 1-digit numbers)
first_num = df['Num'].str.extract(r"^(\w+)", expand=False)
# update 'Num'
df['Num'] += (" " + first_num) * 2 * num_apostrophes
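
To see why extracting the first word matters, here is a small sketch with multi-digit numbers. The data is made up for illustration, and it uses Series.str.repeat in place of the string multiplication above.

```python
import pandas as pd

# Made-up rows with multi-digit numbers, stored as strings.
df = pd.DataFrame({
    "Num": ["7", "12", "105 105"],
    "Words": ["hello", "can't", "don't won't"],
})

num_apostrophes = df["Words"].str.count("'")

# First whitespace-delimited token, e.g. "105" rather than just "1".
first_num = df["Num"].str.extract(r"^(\w+)", expand=False)

# Append " <first_num>" twice per apostrophe.
df["Num"] = df["Num"] + (" " + first_num).str.repeat(num_apostrophes * 2)
print(df)
```

With .str[0] instead of the regex, the second row would get " 1" appended rather than " 12".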