Is it possible to replace "firstword' with 'BBBBB' in all rows that contain 'firstword' ?
Please, what is wrong in my code ?
It would fiind the word I select from the most_occur1, and would print 'found it' as many times as this word is present (I use this just for test), but it would not replace it.
def finds_firts_word():
# find first word
for cell in range(1, 2000):
data = sheet.cell(row=cell, column=2).value
if data is not None:
data=data.lower()
data_no_pct = data.translate(str.maketrans('', '', string.punctuation))
big_data1.append(data_no_pct)
x1 = " ".join(big_data1)
split_it1 = x1.split()
Count1 = Counter(split_it1)
most_occur1 = Count1.most_common(40)
print(most_occur1)
firstword = input('Please type word from list: ')
print('this is: ' firstword)
print('Replacing ' firstword ' with bbbbbbb')
REPLACE_TXTS = {
firstword: 'BBBBBBBB',
}
for n in split_it1:
if n == firstword:
print('found it')
for search_txt, replace_Txt in REPLACE_TXTS.items():
x = str(split_it1)
x.replace(search_txt, replace_Txt)
print('done')
CodePudding user response:
The main thing I see here is that you're splitting the text, then replacing things in the list of split texts, which won't affect the original one.
Like, this:
text = "one, two, three, four, things"
for x in text.split(", "):
x.replace("o", "q")
won't change the variable text
. It just changes the elements in the list ["one", "two", "three", "four", "things"]
that was generated to be used within the for
.
Not sure if that alone is enough to point you to the right direction.
One workaround would be:
text = "one, two, three, four, things"
new_text = []
for x in text.split(", "):
x.replace("o", "q")
new_text.append(x)
text = ", ".join(new_text)
Then you'd have "qne, twq, three, fqur, things"
in the end.
CodePudding user response:
I hope this code helps you:
from typing import Counter
import pandas as pd
import re
df = pd.read_excel (r'yourexcelfilename.xlsx', engine='openpyxl', header=None)
most_occur = Counter(df[1]).most_common(40)
print (most_occur)
firstword = input('Please type word from list: ')
print('this is: ' firstword)
print('Replacing ' firstword ' with bbbbbbb')
for row in df[1].index:
df.loc[row, 1] = re.sub(firstword,"bbbbbbb", df.loc[row,1] )
print(df)
with regex
it doesn't matter what is in column B. You can replace firstword
with anything anywhere.