I created a regular expression in Java that handles 2 links at once:
- https://downloads.test.test.testagain.tes/test-test/test/te25st24w/te43s5t25x/0twt42ts/test0218.pdf
- https://downloads.test.test.testagain.tes/test-test/test/te25st24w/te43s5t25x/0twt42ts/TestTes-09-05-2018.pdf
Regex:
String REGEX_LINK = "https:..downloads.test.test.testagain.tes.test-test.test.";
Pattern pattern = Pattern.compile(REGEX_LINK + ".[\\w*/]*.((\\d{2}-\\d{2}-)?\\d{4}).pdf");
But now I have to create a regular expression that handles 3 links at once, and I don't know how to do that. I need help with these:
- https://downloads.test.test.testagain.tes/test-test/test/te25st24w/te43s5t25x/0twt42ts/test0218.pdf
- https://downloads.test.test.testagain.tes/test-test/test/te25st24w/te43s5t25x/0twt42ts/TestTes-09-05-2018.pdf
- https://downloads.test.test.testagain.tes/test-test/test/te25st24w/te43s5t25x/0twt42ts/01-01-18_Testt_Testing_ASB_Test_Final.pdf
I need a single regular expression that extracts "0218" from the 1st link, "09-05-2018" from the 2nd link, and "01-01-18" from the 3rd link.
Does anyone have an idea how to do this?
CodePudding user response:
You could match two pairs of digits with an optional hyphen between them, and then optionally a hyphen followed by 4 or 2 digits.
Note that the pattern by itself does not verify a valid date.
(?<!\d)(\d{2}-?\d{2}(?:-(?:\d{4}|\d{2}))?)\S*\.pdf\b
Explanation
- (?<!\d) Negative lookbehind, assert not a digit to the left
- ( Capture group 1
  - \d{2}-?\d{2} Match 2 digits, optional hyphen and 2 digits
  - (?:-(?:\d{4}|\d{2}))? Optionally match - and either 4 or 2 digits
- ) Close group 1
- \S* Match optional non whitespace chars
- \.pdf\b Match a dot and pdf followed by a word boundary
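To try the pattern above against the three links from the question, a minimal Java sketch could look like the following. The class name PdfDateExtractor and the helper extract are hypothetical names chosen for illustration. The sketch assumes the first match found by Matcher.find() is the one you want and reads capture group 1.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class, not part of the original question or answer.
public class PdfDateExtractor {
    // The pattern from the answer; backslashes are doubled for a Java string literal.
    private static final Pattern DATE_IN_PDF_LINK = Pattern.compile(
            "(?<!\\d)(\\d{2}-?\\d{2}(?:-(?:\\d{4}|\\d{2}))?)\\S*\\.pdf\\b");

    // Returns capture group 1 of the first match, or null when nothing matches.
    static String extract(String link) {
        Matcher matcher = DATE_IN_PDF_LINK.matcher(link);
        return matcher.find() ? matcher.group(1) : null;
    }

    public static void main(String[] args) {
        String base = "https://downloads.test.test.testagain.tes"
                + "/test-test/test/te25st24w/te43s5t25x/0twt42ts/";
        String[] links = {
            base + "test0218.pdf",
            base + "TestTes-09-05-2018.pdf",
            base + "01-01-18_Testt_Testing_ASB_Test_Final.pdf"
        };
        for (String link : links) {
            System.out.println(extract(link));
        }
        // Prints:
        // 0218
        // 09-05-2018
        // 01-01-18
    }
}
```

find() is used instead of matches() because the pattern only describes the tail of the link, not the whole string.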
Or, if no other digits may follow the match before the .pdf extension:
(?<!\d)(\d{2}-?\d{2}(?:-(?:\d{4}|\d{2}))?)[^\d\s]*\.pdf\b
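The difference between the \S* variant and the [^\d\s]* variant only shows up when digits appear earlier in the link. The sketch below compares both on a made-up link (the example.com URL, the class name, and the constant names LOOSE and STRICT are assumptions for illustration, not from the question).

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical comparison class for the two pattern variants.
public class PdfDatePatternComparison {
    // Variant with \S*: anything non-whitespace may follow the captured digits.
    static final Pattern LOOSE = Pattern.compile(
            "(?<!\\d)(\\d{2}-?\\d{2}(?:-(?:\\d{4}|\\d{2}))?)\\S*\\.pdf\\b");
    // Variant with [^\d\s]*: no further digits may follow before ".pdf".
    static final Pattern STRICT = Pattern.compile(
            "(?<!\\d)(\\d{2}-?\\d{2}(?:-(?:\\d{4}|\\d{2}))?)[^\\d\\s]*\\.pdf\\b");

    // Returns capture group 1 of the first match, or null when nothing matches.
    static String firstGroup(Pattern pattern, String input) {
        Matcher matcher = pattern.matcher(input);
        return matcher.find() ? matcher.group(1) : null;
    }

    public static void main(String[] args) {
        // Made-up link with a 4-digit folder name before the file name.
        String link = "https://example.com/0101/report2018.pdf";
        System.out.println(firstGroup(LOOSE, link));  // prints 0101
        System.out.println(firstGroup(STRICT, link)); // prints 2018
    }
}
```

With the three links from the question both variants return the same values, since no 4-digit run occurs in the path before the file name.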