Regex for extracting certain information from a string

Time:11-11

Below is the string that I have -

vdp_plus_forecast_aucc_VDP_20221024_variance_analysis_20221107_backcasting_actuals_asp_True_vlt_True.csv

I need a regex to extract the following items from the string -

20221107
vlt_True

I need help writing the right regex for these two extractions. I'm performing the operation on a PySpark DataFrame.

CodePudding user response:

I'm assuming each value is identified by the name in front of it, so this captures the digits that follow variance_analysis, or the vlt flag:

(?<=_variance_analysis_)[0-9]+|vlt_(True|False)

This should capture the values you wanted. If you only need the value of vlt, you can replace vlt_ with (?<=vlt_), which will capture just the value without the variable name.
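Since the question is about a PySpark DataFrame, here is a minimal sketch of how the two extractions could look there. It uses capture groups instead of lookbehind, which is a swap from the pattern above, so that each value can go into its own column. The column names filename, analysis_date and vlt_flag and the helper add_extracted_columns are made up for illustration, and the date is assumed to always be 8 digits. The patterns are checked with Python's re first, so that part runs without Spark:

```python
import re

# Capture-group versions of the patterns (no lookbehind needed).
# Assumption: the date after "_variance_analysis_" is always 8 digits.
DATE_PATTERN = r"_variance_analysis_(\d{8})"
VLT_PATTERN = r"(vlt_(?:True|False))"

filename = (
    "vdp_plus_forecast_aucc_VDP_20221024_variance_analysis_20221107"
    "_backcasting_actuals_asp_True_vlt_True.csv"
)

# Quick check of the patterns on the sample string from the question.
date_value = re.search(DATE_PATTERN, filename).group(1)
vlt_value = re.search(VLT_PATTERN, filename).group(1)
print(date_value, vlt_value)  # 20221107 vlt_True


def add_extracted_columns(df, source_col="filename"):
    """Hypothetical helper: add one column per extracted value.

    `df` is assumed to be a PySpark DataFrame with a string column
    named `source_col` holding the file names.
    """
    # Imported here so the plain-re check above runs without Spark installed.
    from pyspark.sql import functions as F

    return (
        df.withColumn(
            "analysis_date", F.regexp_extract(F.col(source_col), DATE_PATTERN, 1)
        ).withColumn(
            "vlt_flag", F.regexp_extract(F.col(source_col), VLT_PATTERN, 1)
        )
    )
```

regexp_extract returns an empty string for rows where the pattern does not match, so file names without a vlt flag end up with an empty vlt_flag rather than an error.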
