Checking if a word from dataset A is present in the sentence of dataset B

Time:01-14

I am a beginner in Python, so I sometimes get stuck on easy stuff.

Please consider a column of names (securities) and a column of tickers in df1:

Security  Tickers
Google    GOOG
Twitter   TWTR
Logitech  LOGI

and then consider a column of news headlines in df2:

headlines
Twitter bought by rich entrepreneur
Netflix lost 5m subscribers
Amazon stocks raised 3 percent

I want to create a new column in df2 holding the ticker for each headline in which a security from df1 appears in df2["headlines"]. Otherwise, delete that row from df1.
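To make the example reproducible, the two tables above could be built roughly as follows. This is only a sketch: the column names are taken from the tables shown, while the variable `wanted_tickers` is a hypothetical name used here to spell out the result I am after for this small sample.

```python
import pandas as pd

# df1: securities and their tickers, as listed in the first table
df1 = pd.DataFrame({
    "Security": ["Google", "Twitter", "Logitech"],
    "Tickers": ["GOOG", "TWTR", "LOGI"],
})

# df2: news headlines, as listed in the second table
df2 = pd.DataFrame({
    "headlines": [
        "Twitter bought by rich entrepreneur",
        "Netflix lost 5m subscribers",
        "Amazon stocks raised 3 percent",
    ]
})

# Hypothetical illustration of the goal: only the first headline mentions
# a security from df1, so only that row should receive a ticker.
wanted_tickers = ["TWTR", None, None]
```

In this sample only Twitter appears in a headline, so Google and Logitech are the rows that have no match.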

I tried several versions of code.

The simplest one was:

for i in range(len(df2["headlines"])):
    if df1["Security"][i] in df2["Headlines"][i]: 
        df2["Tickers"] = df1["Tickers"][i]
        
    else:
        data.drop(labels=[i],axis=0)

The problem here is that df1 has 500 rows while df2 has 30k rows, so indexing both with the same i cannot work. The loop over df1 should restart for every headline, since I want to check whether any security appears in any of the headlines of df2.

After that I tried other things, including df.isin, but nothing worked. What do you suggest? Thanks!

CodePudding user response:

Try this:

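# Create the new column in df2 (starts empty so it exists even if nothing matches)
df2["Tickers"] = None
for security, ticker in zip(df1["Security"], df1["Tickers"]):
    # Boolean mask of the headlines that mention this security.
    # regex=False treats the name as literal text, not as a pattern.
    # "headlines" is the column name shown in the question's df2.
    mask = df2["headlines"].str.contains(security, regex=False)
    df2.loc[mask, "Tickers"] = ticker

# Restrict df1 to the companies that appear in df2
df1 = df1[df1["Tickers"].isin(df2["Tickers"])]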
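With 500 securities and 30k headlines, a single pass with one combined pattern may be faster than looping. The following is a minimal sketch of that alternative, not a tested solution for the real data: it assumes the column names shown in the question, and `pattern`, `matched` and `name_to_ticker` are names introduced here for illustration.

```python
import re
import pandas as pd

# Sample data copied from the question
df1 = pd.DataFrame({
    "Security": ["Google", "Twitter", "Logitech"],
    "Tickers": ["GOOG", "TWTR", "LOGI"],
})
df2 = pd.DataFrame({
    "headlines": [
        "Twitter bought by rich entrepreneur",
        "Netflix lost 5m subscribers",
        "Amazon stocks raised 3 percent",
    ]
})

# One alternation pattern covering every security name. re.escape keeps
# characters such as "." or "&" in a name from being read as regex syntax.
pattern = "(" + "|".join(re.escape(name) for name in df1["Security"]) + ")"

# First security name found in each headline (NaN when none is found)
matched = df2["headlines"].str.extract(pattern, expand=False)

# Translate the matched name into its ticker
name_to_ticker = dict(zip(df1["Security"], df1["Tickers"]))
df2["Tickers"] = matched.map(name_to_ticker)

# Keep only the headlines that mention a known security ...
df2 = df2.dropna(subset=["Tickers"]).reset_index(drop=True)
# ... and only the securities that appear in at least one headline
df1 = df1[df1["Tickers"].isin(df2["Tickers"])].reset_index(drop=True)
```

Note that this records only the first security found in each headline, and the match is case-sensitive and substring-based, so a name can also match inside a longer word. Whether to drop the unmatched headlines from df2 is an assumption here; remove the dropna line if they should stay.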