How to only select the very next string after target string but the next string after the target str

Time:05-27

I have a df that looks like this:

id        query
1         select * from table1 where col1 = 1
2         select a.columns FROM table2 a

I want to extract only the string that follows FROM (the table name, if you know SQL) into a new column. FROM can appear in different capitalizations (i.e. From, from, FROM, etc.).

How do I select only the string directly after FROM, and not the strings that come after it?

I tried:

df['tableName'] = df['query'].str.extract('[^from]*$')

but this is not working. I am not sure if I should make the entire df lowercase right off the bat.

New df should look like this:

id        query                                 tableName      
1         select * from table1 where col1 = 1   table1
2         select a.columns FROM table2 a        table2

Thank you in advance.

CodePudding user response:

You can try

df['tableName'] = df['query'].str.extract('(?i)from ([^ ]*)')

(?i) means ignore case, and the group ([^ ]*) captures everything after "from " up to the next space.

print(df)

   id                                query tableName
0   1  select * from table1 where col1 = 1    table1
1   2       select a.columns FROM table2 a    table2
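
To see why the pattern from the question fails and this one works, here is a small sketch using Python's re module directly on the two sample queries (plain strings stand in for the df column). [^from] is a character class meaning "any one character other than f, r, o or m", not the word "from", so [^from]*$ only matches a trailing run of such characters. It also has no capture group, which str.extract requires (pandas raises a ValueError without one).

```python
import re

queries = [
    "select * from table1 where col1 = 1",
    "select a.columns FROM table2 a",
]

# Pattern from the question: a character class, not the word "from".
attempt = re.compile(r"[^from]*$")

# Pattern from the answer: case-insensitive "from ", then capture
# everything up to the next space.
answer = re.compile(r"(?i)from ([^ ]*)")

# What each pattern matches on the sample queries.
attempt_results = [attempt.search(q).group(0) for q in queries]
answer_results = [answer.search(q).group(1) for q in queries]

print(attempt_results)  # trailing fragments, not table names
print(answer_results)   # the token right after "from"
```

If the queries may contain tabs or repeated spaces after "from", a pattern such as r"(?i)\bfrom\s+(\S+)" is one possible variation. That is an assumption about the data, not something the question states.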

CodePudding user response:

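This will get you your answer without a regex and should account for all capitalizations of "from". Note that it also lowercases the extracted table name and raises an IndexError for any query that does not contain "from".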

df['Table_Name'] = df['query'].apply(lambda x : x.lower().split('from')[1]).apply(lambda x : x.split()[0])
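
If the original capitalization of the table name matters, or some rows may have no "from" at all, a variation on the same no-regex idea is to split on whitespace and compare tokens case-insensitively. This is only a sketch: the helper name table_after_from and the extra sample queries ("Table2", "show tables") are made up for illustration.

```python
def table_after_from(query):
    """Return the token right after the first standalone "from", or None."""
    tokens = query.split()
    # Skip the last token: nothing can follow it.
    for i, token in enumerate(tokens[:-1]):
        if token.lower() == "from":
            return tokens[i + 1]  # original capitalization is preserved
    return None


# Hypothetical sample queries, including one without "from".
queries = [
    "select * from table1 where col1 = 1",
    "select a.columns FROM Table2 a",
    "show tables",
]

results = [table_after_from(q) for q in queries]
print(results)
```

With the df from the question, this could be applied as df['Table_Name'] = df['query'].apply(table_after_from).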