Get the substring between two given words in a dataframe

Time:04-25

I have a dataframe, as shown below. I want to get the substring between the two words given in columns [word1] and [word2] and save the result in a new column [result]:

index   string                                      word1   word2   result
1       This is a very simple string!               is      string  a very simple
2       Again! This is a very simple string         Again   very    !This is a

I know how to do this for a single string, as shown below, but I want to apply it to a dataframe.

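s = "123ddddddabc"

def find_between(s, first, last):
    # Return the text between the first occurrence of `first` and the next `last`.
    try:
        start = s.index(first) + len(first)  # position just past `first`
        end = s.index(last, start)           # look for `last` only after that position
        return s[start:end]
    except ValueError:
        # str.index raises ValueError when either word is missing
        return ""

print(find_between(s, "123", "abc"))  # prints "dddddd"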

CodePudding user response:

Using apply with a lambda, row by row (axis=1), should do the trick:

df['result'] = df.apply(lambda row: find_between(row['string'], row['word1'], row['word2']), axis=1)
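For reference, here is a minimal end-to-end sketch of that one-liner, rebuilding the question's two-row dataframe so it can be run as is. The trailing .strip() is my own addition (an assumption, not part of the answer above) to trim the spaces left around the extracted text.

```python
import pandas as pd


def find_between(s, first, last):
    """Return the text between the first occurrence of `first` and the next `last`."""
    try:
        start = s.index(first) + len(first)
        end = s.index(last, start)
        return s[start:end]
    except ValueError:
        # Either word is missing from the string
        return ""


# Sample data copied from the question
df = pd.DataFrame({
    "string": ["This is a very simple string!", "Again! This is a very simple string"],
    "word1": ["is", "Again"],
    "word2": ["string", "very"],
})

# axis=1 passes each row to the lambda, so the three columns are read per row.
# .strip() is an assumption added here to drop surrounding whitespace.
df["result"] = df.apply(
    lambda row: find_between(row["string"], row["word1"], row["word2"]).strip(),
    axis=1,
)

print(df["result"].tolist())
```

One thing to watch: str.index matches substrings, not whole words, so in the first row "is" is found inside "This" and the result comes out as "is a very simple" rather than "a very simple". If whole-word matching matters, the helper would probably need a regex with word boundaries instead.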