I have a pandas DataFrame like the following.
Num | Words |
---|---|
1 | hello |
2 | can't |
3 3 3 | i like this |
4 4 4 4 4 | i don't can't do this |
Basically, each number corresponds to a word right now.
For every apostrophe in the 'Words' column, I want to append the row's number twice to the 'Num' column, like the following. So if there are two apostrophes, four more copies of the number get added to 'Num'.
Num | Words |
---|---|
1 | hello |
2 2 2 | can't |
3 3 3 | i like this |
4 4 4 4 4 4 4 4 4 | i don't can't do this |
I tried to do the following:
if "'" in str(x["Word"]):
    x["Number"] = repeat(x["Number"])
This only detects whether there is an apostrophe at all. It doesn't count how many there are, so it can't add numbers based on that count.
CodePudding user response:
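You can use str.count. NB: the following code is vectorized (i.e. fast). It takes the first character of 'Num' as the number to repeat, so it assumes single-digit numbers.
df['Num'] + (' ' + df['Num'].str[0].astype(str)) * df['Words'].str.count("'") * 2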
Output:
0 1
1 2 2 2
2 3 3 3
3 4 4 4 4 4 4 4 4 4
To assign to the original dataframe:
df['Num'] = df['Num'] + (' ' + df['Num'].str[0].astype(str)) * df['Words'].str.count("'") * 2
Output:
Num Words
0 1 hello
1 2 2 2 can't
2 3 3 3 i like this
3 4 4 4 4 4 4 4 4 4 i don't can't do this
CodePudding user response:
You can use str.count("'") to count the apostrophes, and then add the number that many times.
for _ in range(str(x['Word']).count("'")):
    x["Number"] = repeat(x["Number"])
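Since `repeat` isn't defined anywhere in the thread, here is a rough sketch of what a row-wise version could look like. The helper name `add_copies` is made up for illustration, and it assumes the column names from the question's table ('Num' and 'Words') and that each 'Num' cell is a space-separated string.

```python
import pandas as pd


def add_copies(row: pd.Series) -> str:
    """Append two copies of the row's number for every apostrophe in its words."""
    tokens = row["Num"].split()            # e.g. ["4", "4", "4", "4", "4"]
    number = tokens[0]                     # the number being repeated
    extra = 2 * row["Words"].count("'")    # two extra copies per apostrophe
    return " ".join(tokens + [number] * extra)


df = pd.DataFrame({
    "Num": ["1", "2", "3 3 3", "4 4 4 4 4"],
    "Words": ["hello", "can't", "i like this", "i don't can't do this"],
})

# apply the helper to each row (axis=1) and overwrite 'Num'
df["Num"] = df.apply(add_copies, axis=1)
print(df)
```

This runs a Python function per row, so it will likely be slower than a vectorized approach on large frames, but it is easy to adapt if the rule changes.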
CodePudding user response:
Looks like this might work. It's a bit of a workaround, but it seems to do the job.
import pandas as pd

df = pd.DataFrame({'Num': ['1', '2', '3 3 3', '4 4 4 4 4'],
                   'Words': ["hello", "can't", "i like this", "i don't can't do this"]})
# number of apostrophes in each row
cont = df['Words'].str.count("'")
# first character of 'Num' (assumes single-digit numbers)
num = df['Num'].str.slice(0, 1)
# append two copies of the number per apostrophe
df['Num'] = df['Num'] + (" " + num) * cont * 2
df
 | Num | Words |
---|---|---|
0 | 1 | hello |
1 | 2 2 2 | can't |
2 | 3 3 3 | i like this |
3 | 4 4 4 4 4 4 4 4 4 | i don't can't do this |
CodePudding user response:
You can use Series.str.count to count the number of apostrophes first, and then append two copies of the number to the 'Num' column for each one.
# get the number of apostrophes of each word
num_apostrophes = df['Words'].str.count("'")
# get the first word of the 'Num' column,
# assuming that it can be anything (and not just 1-digit numbers)
first_num = df['Num'].str.extract(r"^(\w+)", expand=False)
# update 'Num': append two copies of the number per apostrophe
df['Num'] += (" " + first_num) * 2 * num_apostrophes
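To see why taking the whole first token matters, here is a small sketch with multi-digit numbers. The data is invented for illustration, and it uses `Series.str.repeat` in place of the `*` operator to build the extra copies, which is a substitution on my part rather than what the answer above does.

```python
import pandas as pd

# hypothetical data with multi-digit numbers (not from the question)
df = pd.DataFrame({
    "Num": ["10", "25 25", "7 7 7"],
    "Words": ["can't", "don't won't", "i like this"],
})

# number of apostrophes in each row
num_apostrophes = df["Words"].str.count("'")

# whole first token of 'Num', so "10" stays "10" rather than "1"
first_num = df["Num"].str.extract(r"^(\w+)", expand=False)

# repeat " <number>" twice per apostrophe; zero apostrophes gives ""
extra = (" " + first_num).str.repeat(2 * num_apostrophes)

df["Num"] = df["Num"] + extra
print(df["Num"].tolist())
```

Taking only the first character would turn "10" into "10 1 1" here, which is the case the regex version avoids.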