Regex: match string between mandatory and optional groups

Time:11-20

I'm trying to parse a file with a list of movies, where each line looks like:

id,title (year),genre1|genre2|genre3

The year field is optional, and some movies have other parts of the title in parentheses.

So I have this regex:

(?:^\s*(\d+)\s*,.*?)(?:.*?\((\d{4})\))?(?:.*,\s*(.*)$)


How can I improve it to capture the title, which sits between the id and the optional year (or the genres if there is no year)?

Data example:

1,Ace Ventura: When Nature Calls(1995),Comedy
20,Money Train (1995),Action|Comedy|Crime|Drama|Thriller
21,Get Shorty (1995),Comedy|Crime|Thriller
22,Copycat ,Crime|Drama|Horror|Mystery|Thriller
23,Assassins (1995),Action|Crime|Thriller
24,"Powder (1995)",Drama|Sci-Fi
25,Leaving (5) Las Vegas ,Drama|Romance

CodePudding user response:

The year, when present, always comes right before a comma, so don't put .* between the year and that comma. Capture the title with a lazy group instead:

^\s*(\d+)\s*,(.*?)(?:\((\d{4})\))?\s*,\s*(.*)$
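
To check the pattern against the sample data from the question, here is a minimal Python sketch. It assumes the quantifier in the id group is `\d+`. The names `PATTERN`, `SAMPLE`, `parse_line` and the dictionary keys are made up for illustration and are not part of the original question or answer.

```python
import re

# Groups: 1 = id, 2 = title (lazy), 3 = optional four-digit year, 4 = genres.
PATTERN = re.compile(r'^\s*(\d+)\s*,(.*?)(?:\((\d{4})\))?\s*,\s*(.*)$')

# Sample lines copied from the question.
SAMPLE = """\
1,Ace Ventura: When Nature Calls(1995),Comedy
20,Money Train (1995),Action|Comedy|Crime|Drama|Thriller
21,Get Shorty (1995),Comedy|Crime|Thriller
22,Copycat ,Crime|Drama|Horror|Mystery|Thriller
23,Assassins (1995),Action|Crime|Thriller
24,"Powder (1995)",Drama|Sci-Fi
25,Leaving (5) Las Vegas ,Drama|Romance
"""


def parse_line(line):
    """Return a dict for one movie line, or None if the line does not match."""
    m = PATTERN.match(line)
    if m is None:
        return None
    movie_id, title, year, genres = m.groups()
    return {
        "id": int(movie_id),
        # The lazy group can keep a trailing space before "(year)", so trim it.
        "title": title.strip(),
        # Group 3 is None when the optional year group did not participate.
        "year": int(year) if year else None,
        "genres": genres.split("|"),
    }


rows = [parse_line(line) for line in SAMPLE.splitlines()]
for row in rows:
    print(row)
```

With this sample, "Leaving (5) Las Vegas" keeps its parenthesised part in the title, because `(5)` is not four digits. The quoted line 24 is a case the pattern does not split: the closing quote sits between `)` and the comma, so the year stays inside the title and the year group is empty. A title that itself contains a comma would also be cut at its first comma, since the title group is lazy.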

