I have a df that looks like this:
id query
1 select * from table1 where col1 = 1
2 select a.columns FROM table2 a
I want to extract only the string that comes right after the keyword FROM (the table name, if you know SQL) into a new column.
FROM can be spelled with different capitalizations (e.g. From, from, FROM, etc.).
How do I select only the string directly after FROM, and nothing that follows it?
I tried:
df['tableName'] = df['query'].str.extract('[^from]*$')
but this is not working. I am not sure if I should make the entire df lowercase right off the bat.
New df should look like this:
id query tableName
1 select * from table1 where col1 = 1 table1
2 select a.columns FROM table2 a table2
Thank you in advance.
CodePudding user response:
You can try
df['tableName'] = df['query'].str.extract('(?i)from ([^ ]*)')
(?i) makes the match case-insensitive, and ([^ ]*) captures everything after "from " up to the next space.
print(df)
id query tableName
0 1 select * from table1 where col1 = 1 table1
1 2 select a.columns FROM table2 a table2
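A slightly more defensive variant of the same idea, as a self-contained sketch. It assumes FROM may be followed by tabs or several spaces, and it uses \b so that a longer word merely ending in "from" is not matched. Those assumptions go beyond what the question states. Passing expand=False makes str.extract return a Series instead of a one-column DataFrame.

```python
import pandas as pd

# Sample data copied from the question
df = pd.DataFrame({
    "id": [1, 2],
    "query": [
        "select * from table1 where col1 = 1",
        "select a.columns FROM table2 a",
    ],
})

# (?i)      -> case-insensitive match
# \bfrom\b  -> "from" as a whole word
# \s+       -> one or more whitespace characters
# (\S+)     -> capture the next run of non-whitespace characters
df["tableName"] = df["query"].str.extract(r"(?i)\bfrom\s+(\S+)", expand=False)

print(df)
```

Rows with no FROM clause get NaN in tableName rather than raising an error.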
CodePudding user response:
This will get you your answer without a regex and should account for all capitalizations of "from". Note that it lowercases the query first, so the table name comes back in lowercase.
df['Table_Name'] = df['query'].apply(lambda x : x.lower().split('from')[1]).apply(lambda x : x.split()[0])
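If the original capitalization of the table name matters, a token-based sketch along these lines may help. The helper name table_after_from is hypothetical, and it assumes FROM is separated from the table name by whitespace. Unlike the one-liner above, it returns None instead of raising IndexError when a query has no FROM.

```python
import pandas as pd

def table_after_from(query):
    """Return the token that follows the first FROM keyword, or None."""
    tokens = query.split()
    # Stop one short of the end so tokens[i + 1] always exists
    for i, tok in enumerate(tokens[:-1]):
        if tok.lower() == "from":
            return tokens[i + 1]
    return None

# Sample data copied from the question
df = pd.DataFrame({
    "id": [1, 2],
    "query": [
        "select * from table1 where col1 = 1",
        "select a.columns FROM table2 a",
    ],
})

df["Table_Name"] = df["query"].apply(table_after_from)
print(df)
```

Because it compares whole tokens, a column name that merely contains "from" does not trigger a false match.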