Filter rows based on value

I want to keep only the rows of dataframe df whose string in column "b" contains at least one of the strings in column "b2" of dataframe df2.

import pandas as pd

d = {'a': [100, 125, 300, 235], 'b': ["abc", "ghf", "dfg", "hij"]}
df = pd.DataFrame(data=d, index=[1, 2, 3, 4])

print(df)

     a    b
1  100  abc
2  125  ghf
3  300  dfg
4  235  hij


d2 = {'a2': [10, 25, 30], 'b2': ["bc", "fg", "op"]}
df2 = pd.DataFrame(data=d2, index=[1, 2, 3])

print(df2)
   a2  b2
1  10  bc
2  25  fg
3  30  op

The output should look like this:

     a    b
1  100  abc
3  300  dfg

I tried the following but it did not work.

for majstring in df.b:
    for substring in set(df2.b2):
        if substring in majstring:
            pass
        else:
            df.drop(df.loc[df['b'] == majstring], inplace=True)

CodePudding user response:

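Your loop drops a row as soon as a single substring from df2 is missing, which is the opposite of keeping rows that match any substring. Build one boolean mask per substring and combine them instead:

# One mask per substring (literal match, not regex); a row is kept if any mask is True
mask = sum([df['b'].str.contains(v, regex=False) for v in df2['b2']]).astype(bool)
filtered_df = df[mask]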

Output:

>>> filtered_df
     a    b
1  100  abc
3  300  dfg
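
As an alternative to building one mask per substring, the values of df2['b2'] can be joined into a single regular expression and passed to str.contains once. This is a minimal sketch, assuming the same df and df2 as in the question; re.escape and na=False are additions of mine, used so that special characters are matched literally and missing values in "b" count as non-matches.

```python
import re

import pandas as pd

# Same data as in the question
df = pd.DataFrame(
    {'a': [100, 125, 300, 235], 'b': ["abc", "ghf", "dfg", "hij"]},
    index=[1, 2, 3, 4],
)
df2 = pd.DataFrame(
    {'a2': [10, 25, 30], 'b2': ["bc", "fg", "op"]},
    index=[1, 2, 3],
)

# Escape each substring so regex metacharacters match literally,
# then join with "|" so the pattern matches any one of them
pattern = "|".join(re.escape(s) for s in df2['b2'])

# na=False treats missing values in "b" as non-matches
mask = df['b'].str.contains(pattern, na=False)
filtered_df = df[mask]

print(filtered_df)
```

The original index labels are kept. If a fresh 0-based index is wanted, filtered_df.reset_index(drop=True) can be applied afterwards.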