Home > Enterprise >  Convert Python Regex to Bash Regex
Convert Python Regex to Bash Regex

Time:11-22

I am trying to write a bash script to convert files for streaming on the home network.

I am wondering if the community could recommend something that would allow me to use my existing regex to search a string for the presence of a pattern and replace the text following a pattern.

Part of this involves naming the file to include the quality, release year and episode information (if any of these are available).

I have some Python regex I am trying to convert to a bash regex search and replace.

There are a few options such as Sed, Grep or AWK but I am not sure what is best for my approach.

My existing python regex apparently uses an extended perl form of regex.

# Captures quality 1080p or 720p
determinedQuality = re.findall("[0-9]{3}[PpIi]{1}|[0-9]{4}[PpIi]{1}", next_line)

# Captures year (4 characters long and only numeric)
yearInitial = str(re.findall("[0-9]{4}[^A-Za-z]", next_line))

# Lazy programming on my part to clear up the string gathered from the year
determinedYear = re.findall("[0-9]{4}", yearInitial)

# If the string has either S00E00 or 1X99 present then its a TV show
determinedEpisode = re.findall("[Ss]{1}[0-9]{2}[Ee]{1}[0-9]{2}|[0-9]{1}[x]{1}[0-9]{2}", next_line)

My aim is to end up with a filename all in lowercase with underscores instead of spaces in the filename along with quality information if possible:

# Sample of desired file names
harry_potter_2001_720p_philosphers_stone.mkv

S01E05_fringe_1080p.mkv

CodePudding user response:

I simplified the regexs, for example if you need 3 or 4 you can use {3,4} and {1} is redundant you can remove it.

#!/bin/bash

INPUT="harry_potter_2001_720p_philosphers_stone.mkv"
#INPUT="S01E05_fringe_1080p.mkv"

determinedQuality=$(echo "$INPUT" | grep -Po '[0-9]{3,4}[PpIi]')
determinedYear=$(echo "$INPUT" | grep -Po '[0-9]{4}[^A-Za-z]' | grep -Po '[0-9]{4}')
determinedEpisode=$(echo "$INPUT" | grep -Po '[Ss]{1}[0-9]{2}[Ee][0-9]{2}|[0-9]x[0-9]{2}')

echo "quality: $determinedQuality"
echo "year: $determinedYear"
echo "episode: $determinedEpisode"

output for first one:

quality: 720p
year: 2001
episode: 

output for second one:

quality: 1080p
year: 
episode: S01E05
  • Related