Getting rows that contain specific keywords from a column

Time:06-03

I have a dataset of text messages. I want to copy the rows that contain specific keywords into another CSV file.

Please find the sample dataset here: https://docs.google.com/spreadsheets/d/1B7LgkNn2pLchbmjRggAWkq6O7GrWi79aiJJOUpmDGIc/edit?usp=sharing

I wrote the code below, but it's not working well. I need some help pointing me in the right direction.

import pandas as pd

keywords = ["SBI", "HDFC", "Canara", "HSBC", "KTK"]
listMatchPosition = []
listMatchDescription = []

df = pd.read_csv("SMS.csv", sep=",")

for i in range(len(df.index)):
    if any(df['text'][i] for x in keywords):
        listMatchDescription.append(df['text'][i])

output = pd.DataFrame({'senderAddress':listMatchDescription})
output.to_csv("new_data.csv", index=False)

CodePudding user response:

SO won't let me post sample code w/bit.ly links in it, so here's a solution w/o the sample data setup.

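I had to add Zomato to the keywords list so that there is an actual match, as none of the other keywords you have are present in your sample text.

The check in your loop, any(df['text'][i] for x in keywords), never compares x with the message. It only tests whether the message itself is truthy, which holds for any non-empty value, so practically every row passes. Each keyword has to be tested against the message with the in operator instead.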

keywords = ["Zomato", "HDFC", "Canara", "HSBC", "KTK"]

matches = df.loc[df['text'].apply(lambda x: any(k in x for k in keywords)), ['senderAddress', 'text']]

print(matches)

Output

 senderAddress                                               text
0     JK-SmplPL  Rs.95.15 on Zomato charged via Simpl.\r\n--\r\...
3     BP-ACKOGI  Mohd,\nCheck the incredible Acko insurance pol...
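
A vectorized variant of the same filter, as a rough sketch rather than a definitive version: it joins the keywords into one regular expression and uses Series.str.contains. The three sample rows below are made up to stand in for SMS.csv, and the case-insensitive matching is an assumption about what is wanted.

```python
import re

import pandas as pd

# Hypothetical sample rows standing in for SMS.csv
df = pd.DataFrame({
    "senderAddress": ["JK-SmplPL", "AX-HDFCBK", "VM-OFFERS"],
    "text": [
        "Rs.95.15 on Zomato charged via Simpl.",
        "Your HDFC Bank a/c was debited.",
        None,  # an empty message cell
    ],
})

keywords = ["Zomato", "HDFC", "Canara", "HSBC", "KTK"]

# Escape each keyword so it is matched literally, then join with "|" (regex OR)
pattern = "|".join(re.escape(k) for k in keywords)

# case=False ignores letter case; na=False treats missing messages as non-matches
mask = df["text"].str.contains(pattern, case=False, na=False)
matches = df.loc[mask, ["senderAddress", "text"]]

print(matches)
```

Note that this still matches substrings, so a short keyword such as KTK would also match inside a longer word.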

CodePudding user response:

Edit: use apply to execute a custom function:

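def check_string(x):
    # True if any keyword occurs in the message text
    return any(k in x for k in keywords)

# Apply the check to the 'text' column and keep only the matching rows
output = df[df['text'].apply(check_string)]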

I hope this works
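
To get all the way to the new CSV file from the question, the filtered rows still have to be written out. Here is a minimal end-to-end sketch, assuming the columns are named senderAddress and text as in the sample. The in-memory buffers and the two sample rows are placeholders so the snippet runs on its own, and has_keyword is just an illustrative helper name.

```python
import io

import pandas as pd

# Hypothetical two-row stand-in for SMS.csv; a real run would call pd.read_csv("SMS.csv")
csv_in = io.StringIO(
    "senderAddress,text\n"
    "JK-SmplPL,Rs.95.15 on Zomato charged via Simpl.\n"
    "VM-OFFERS,Flat 50% off this weekend\n"
)
df = pd.read_csv(csv_in)

keywords = ["Zomato", "HDFC", "Canara", "HSBC", "KTK"]


def has_keyword(message):
    # Non-string values (e.g. NaN for an empty cell) never match
    return isinstance(message, str) and any(k in message for k in keywords)


# Boolean mask from the custom function keeps whole rows, not just the text
output = df[df["text"].apply(has_keyword)]

# Written to an in-memory buffer here; pass a path such as "new_data.csv" instead
csv_out = io.StringIO()
output.to_csv(csv_out, index=False)

print(csv_out.getvalue())
```

Filtering the DataFrame with a mask keeps every column of the matching rows, so there is no need to build separate lists inside a loop.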
