I have a .csv file of strings containing paths to certain files. I want to capture all filenames from that file.
Example data:
/second/path/to/something-4-5_4.pdf,
/path/to/certain/file.pdf
randomnoise,
What I want to capture: all occurrences of words after a slash and ending with .pdf, in this case:
something-4-5_4.pdf
file.pdf
What I tried:
\/(.*)\.pdf
Unfortunately, this captures everything between the first / and .pdf, that is, the whole path. I am having trouble coming up with a condition that captures only the part I want.
CodePudding user response:
The point is that the . pattern matches any char other than line break chars, slashes included. You need to restrict the pattern so that it only matches chars other than a slash.
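To see the difference concretely, here is a minimal Python sketch run against the first sample line from the question. The variable names are only illustrative, and Python's re module is an assumption, since the question does not say which regex engine is used. In Python the slash needs no escaping.

```python
import re

# First sample line from the question, trailing comma included.
line = "/second/path/to/something-4-5_4.pdf,"

# The original attempt: "." also matches slashes, so the group
# starts right after the first slash and swallows the whole path.
greedy_dot = re.findall(r"/(.*)\.pdf", line)

# Negated character class: the group cannot contain a slash,
# so only the last path segment can match.
no_slash = re.findall(r"/([^/]*)\.pdf", line)

print(greedy_dot)  # ['second/path/to/something-4-5_4']
print(no_slash)    # ['something-4-5_4']
```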
There are several solutions, including
\/([^\/]*\.pdf)
[^\/]*\.pdf
[^\/]*\.pdf$
See the regex demo. Details:
\/([^\/]*\.pdf) matches /, then captures into Group 1 zero or more chars other than /, as many as possible, followed by .pdf.
[^\/]*\.pdf just matches zero or more chars other than /, as many as possible, followed by .pdf.
[^\/]*\.pdf$ works the same as above, but also makes sure the .pdf is at the end of string.
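As a rough check, the three patterns can be tried on the sample data from the question. This is a sketch that assumes Python's re module and that the file content has been read into a string; the names data, with_group, without_group and anchored are made up for the example.

```python
import re

# Sample data from the question, kept exactly as posted.
data = """/second/path/to/something-4-5_4.pdf,
/path/to/certain/file.pdf
randomnoise,"""

# Pattern 1: findall returns the contents of Group 1.
with_group = re.findall(r"/([^/]*\.pdf)", data)

# Pattern 2: no group, so findall returns the whole match.
without_group = re.findall(r"[^/]*\.pdf", data)

# Pattern 3: "$" anchors at the end of the string, so it is applied
# to each line separately here. The first line ends with a comma,
# which means the anchored pattern does not match it.
anchored = []
for line in data.splitlines():
    m = re.search(r"[^/]*\.pdf$", line)
    if m:
        anchored.append(m.group())

print(with_group)     # ['something-4-5_4.pdf', 'file.pdf']
print(without_group)  # ['something-4-5_4.pdf', 'file.pdf']
print(anchored)       # ['file.pdf']
```

Since the sample lines may end with a comma, the anchored variant is only suitable when .pdf is really the last thing on the line (or in the string).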