Home > Net >  Snowflake regexp_substr to fetch folder name as a column from s3 bucket
Snowflake regexp_substr to fetch folder name as a column from s3 bucket

Time:12-31

Can someone help me fetch "Client name" using regexp_substr from below path in Snowflake. The client names here can have spaces or underscores

s3://student-com/student-to-employee/student1/Student_Party/20221223/Client ABC/Employee_1st_data_20221223115427_4048241.csv

s3://student-com/student-to-employee/student1/Student_Party/20221223/Client_XYZ/Employee_1st_data_20221223115427_4048241.csv

I used something like this:

select regexp_substr(METADATA$FILENAME, '(. \/)*(. .csv)$', 1,1,'e') as name

CodePudding user response:

You can use

select regexp_substr(METADATA$FILENAME, '([^/] )/[^/]*\.csv$', 1,1,'e') as name

See the regex demo.

Details:

  • ([^/] ) - Group 1: any one or more chars other than /
  • / - a / char
  • [^/]* - zero or more chars other than /
  • \.csv - .csv
  • $ - end of string.

CodePudding user response:

If client name is always the subfolder preceding the file name, you can use split_part

select split_part(col,'/',-2)
  • Related