Match timestamps in WebVTT files with sed

Time:05-14

I have the following PCRE2 regex, which matches the timestamp lines in a WebVTT (.vtt) subtitle file (YouTube's default format) so that they can be removed:

^[0-9].*:[0-9].*:[0-9].*$

Removing the matched lines changes this:

00:00:00.126 --> 00:00:10.058
How are you today?

00:00:10.309 --> 00:00:19.272
Not bad, you?

00:00:19.559 --> 00:00:29.365
Been better.

To this:

How are you today?

Not bad, you?

Been better.

How would I convert this PCRE2 regex to an idiomatic (read: sane-looking) equivalent for sed's flavour of regex?

CodePudding user response:

Using your regex with sed, printing only the lines that do not match:

$ sed -En '/^[0-9].*:[0-9].*:[0-9].*$/!p' file
How are you today?

Not bad, you?

Been better.
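
The `-n` plus `!p` pairing can also be written with sed's `d` command, which deletes matching lines and lets everything else print by default. The following is only a sketch under assumptions: it uses a stricter pattern than the one in the question, assumes every timestamp has the `hh:mm:ss.mmm --> ` shape shown in the sample, and recreates part of the sample in a temporary file (the variable names `sample` and `deleted` are made up for illustration).

```shell
# Recreate part of the sample input in a temporary file (hypothetical data).
sample=$(mktemp)
printf '%s\n' \
  '00:00:00.126 --> 00:00:10.058' \
  'How are you today?' \
  '' \
  '00:00:10.309 --> 00:00:19.272' \
  'Not bad, you?' > "$sample"

# -E enables extended regex syntax, so {2} and {3} need no backslashes.
# The d command deletes every line that starts with hh:mm:ss.mmm followed by " --> ".
deleted=$(sed -E '/^[0-9]{2}:[0-9]{2}:[0-9]{2}\.[0-9]{3} --> /d' "$sample")

printf '%s\n' "$deleted"
rm -f "$sample"
```

Anchoring on the start of the line and on ` --> ` keeps the pattern from matching caption text that merely contains digits and colons.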

Or, more simply, skip every line that ends with a digit (note that this also drops any caption text that happens to end in a digit):

$ sed -n '/[0-9]$/!p' file
How are you today?

Not bad, you?

Been better.
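
The ends-with-a-digit shortcut works on the sample above, but it cannot tell a timestamp from a caption such as "See you at 5". As a hedged alternative, the sketch below keys on the ` --> ` separator instead. The input lines and the variable names `sample`, `by_digit`, and `by_arrow` are invented for illustration, and the pattern assumes a timestamp line starts with nothing but digits, colons, and dots before the arrow.

```shell
# Hypothetical input in which one caption ends with a digit.
sample=$(mktemp)
printf '%s\n' \
  '00:00:00.126 --> 00:00:10.058' \
  'See you at 5' \
  '' \
  '00:00:10.309 --> 00:00:19.272' \
  'Sounds good.' > "$sample"

# Shortcut from the answer: drops every line ending in a digit,
# which here includes the caption "See you at 5".
by_digit=$(sed -n '/[0-9]$/!p' "$sample")

# Alternative: delete only lines made of digits, colons, and dots
# followed by the " --> " separator.
by_arrow=$(sed -E '/^[0-9:.]+ --> /d' "$sample")

printf '%s\n' "$by_arrow"
rm -f "$sample"
```

Because the second pattern does not look at the end of the line, it leaves captions that end in a number untouched.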