put grep output at the end of another line into another file

Time:12-10

I have a list of movie names like this:

Film Name - Film.information.lanugage.2160p.more.info
Film Name - Film.info.information.1080p.more.info
Film Name - Film.information.lanugage.1080p.information.info
Film Name - Film.information.more.720p.more.info
Film Name - Film.more.lanugage.2160p.more.info

I am using grep '[0-9][0-9][0-9][0-9]p' list.txt > resolution.txt to filter for the resolution, and I am looking for a sed command that deletes the - and everything after it.

I think it should look something like sed 's/-.*$//g' list.txt > cleanList.txt. After that, I want to append the resolution from resolution.txt to the end of the corresponding line in cleanList.txt.

The final file should look like this:

Film Name 2160p
Film Name 1080p
Film Name 1080p
Film Name 720p
Film Name 2160p

CodePudding user response:

You can use

sed -E 's/(.*) - (.*[^0-9])?((480|720|1080|1440|2160|4320)p?)([^0-9].*)?/\1 \3/' list.txt > output.txt

Details:

  • (.*) - Group 1: as many of any characters as possible
  • " - " (without the quotes) - a space, a hyphen, and a space
  • (.*[^0-9])? - Group 2 (optional): any text and then a non-digit char
  • ((480|720|1080|1440|2160|4320)p?) - Group 3: any of the common resolution values (in Group 4) and then an optional p
  • ([^0-9].*)? - Group 5 (optional): a non-digit char and then any text.

The \1 \3 replacement replaces the matched line with the Group 1 value, a space, and the Group 3 value.
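To see what each group captures, one option is to print all five groups for a single sample line. This is only a sketch: the variable names line, pattern and groups and the g1..g5 labels are made up for illustration, while the regex is the one from the command above.

```shell
# Sample line taken from the question.
line='Film Name - Film.information.lanugage.2160p.more.info'

# Same regex as in the sed command above, kept in a variable
# (hypothetical name) for readability.
pattern='(.*) - (.*[^0-9])?((480|720|1080|1440|2160|4320)p?)([^0-9].*)?'

# Print every capture group with a label instead of only \1 and \3.
groups=$(printf '%s\n' "$line" |
  sed -E "s/$pattern/g1=[\1] g2=[\2] g3=[\3] g4=[\4] g5=[\5]/")
printf '%s\n' "$groups"
# g1=[Film Name] g2=[Film.information.lanugage.] g3=[2160p] g4=[2160] g5=[.more.info]
```

Only g1 and g3 end up in the final output; the other groups exist to anchor the resolution between non-digit characters.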

See the online demo:

#!/bin/bash
s='Film Name - Film.information.lanugage.2160p.more.info
Film Name - name name - Film.info.information.1080p.more.info
Star Wars - Episode V - Das Imperium schlägt zurück - Star.Wars.Episode.V.Das.Imperium.schlaegt.zurueck.1980.German.DL.2160p.UHD.BluRay.x265-ENDSTATiON
Film Name - Film.information.lanugage.1080p.information.info
Film Name - asfasfaf - Film.information.more.720p.more.info
Film Name - Film.more.lanugage.2160p.more.info
Boss Baby - Schluss mit Kindergarten - pso-bossbaby2_bd.1080p
Sicario 2 - encounters-si2so_1080p
Skyscraper - encounters-skyscraper_1080p
Unsere Zeit ist jetzt - roor-unserezeit-1080p
Schindlers Liste - d-schindlersliste-1080p
South Park: Der Film – größer, länger, ungeschnitten - in-southpark1080p
Ein Hund namens Palma - rf-ehnp2021.1080
Taxi Driver (1976) - d-taxidriver-1080p
The Taking of Deborah Logan - The.Taking.of.Deborah.Logan.2014.LIMITED.1080p.BluRay.X264-CADAVER
Die Feuerzangenbowle 1944 - d-feuerzangenbowle-1080p
Hooligans - rsg-hooligans-1080p
Geständnisse - Confessions - wombat-gestaendnisse-1080p
Greyhound - greyhound.2020.german.dl.1080p.web.h264-wayne'
 
sed -E 's/(.*) - (.*[^0-9])?((480|720|1080|1440|2160|4320)p?)([^0-9].*)?/\1 \3/' <<< "$s"

Output:

Film Name 2160p
Film Name - name name 1080p
Star Wars - Episode V - Das Imperium schlägt zurück 2160p
Film Name 1080p
Film Name - asfasfaf 720p
Film Name 2160p
Boss Baby - Schluss mit Kindergarten 1080p
Sicario 2 1080p
Skyscraper 1080p
Unsere Zeit ist jetzt 1080p
Schindlers Liste 1080p
South Park: Der Film – größer, länger, ungeschnitten 1080p
Ein Hund namens Palma 1080
Taxi Driver (1976) 1080p
The Taking of Deborah Logan 1080p
Die Feuerzangenbowle 1944 1080p
Hooligans 1080p
Geständnisse - Confessions 1080p
Greyhound 1080p
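
Note that the s command leaves a line unchanged when the pattern does not match, so a line without one of the listed resolutions passes through as-is. If you would rather drop such lines, one possible variation (not part of the command above, and using made-up sample lines) is to combine -n with the p flag:

```shell
# Two made-up lines: one with a resolution, one without.
input=$(printf '%s\n' \
  'Film Name - Film.information.lanugage.2160p.more.info' \
  'Other Film - other.film.without.resolution')

# -n suppresses the default output; the trailing p prints only the
# lines on which the substitution actually happened.
filtered=$(printf '%s\n' "$input" |
  sed -nE 's/(.*) - (.*[^0-9])?((480|720|1080|1440|2160|4320)p?)([^0-9].*)?/\1 \3/p')
printf '%s\n' "$filtered"
# Film Name 2160p
```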

CodePudding user response:

You can use the pipe operator '|' to pass the output of one command as the input of a second command. For example:

grep '[0-9][0-9][0-9][0-9]p' list.txt | sed 's/-.*$//g' > cleanList.txt

If you want to save the output of the first command to a file AND process it with the second, use the tee command to write the same output to both. Example: grep '...' list.txt | tee resolution.txt | sed '...' > cleanList.txt
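
A minimal sketch of that pipeline, run in a temporary directory so it touches no real files. The sample lines and the workdir variable are made up for illustration, and the sed expression is a small variation that also removes the space before the hyphen:

```shell
# Work in a throwaway directory so no real files are touched.
workdir=$(mktemp -d)

# Made-up sample input; the second line has no four-digit resolution.
printf '%s\n' \
  'Film Name - Film.information.lanugage.2160p.more.info' \
  'Other Film - other.film.without.resolution' \
  'Third Film - Film.information.more.1080p.more.info' > "$workdir/list.txt"

# grep selects the lines, tee saves a copy to resolution.txt and forwards
# them unchanged, sed strips " - " and everything after it.
grep '[0-9][0-9][0-9][0-9]p' "$workdir/list.txt" |
  tee "$workdir/resolution.txt" |
  sed 's/ - .*$//' > "$workdir/cleanList.txt"

cat "$workdir/cleanList.txt"
```

Because grep is used without -o here, resolution.txt holds the whole matching lines, not just the resolution itself.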

See: https://www.geeksforgeeks.org/tee-command-linux-example/ and the related questions "How to redirect output to a file and stdout" and "How does a pipe work in Linux?"

CodePudding user response:

I suggest using awk, which gives you a cleaner solution in one pass, rather than using grep and sed.

Try:

awk -F" - " '{match($2, "[0-9]+p"); print $1, substr($2, RSTART, RLENGTH)}' list.txt > cleanList.txt

I use the string " - " as field separator between $1 and $2 on each input line.

The match() function searches $2 for a regex that corresponds to digits followed by the letter p. It sets the variables RSTART and RLENGTH, which substr() then uses to extract the matched text so that it can be printed.
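
As a rough illustration of how match(), RSTART and RLENGTH work together, here is the same idea applied to a single line from the question. The shell variables line and result are hypothetical names, the regex assumes "one or more digits followed by p", and the else branch is an added guard for lines where nothing matches:

```shell
# Sample line taken from the question.
line='Film Name - Film.information.lanugage.2160p.more.info'

result=$(printf '%s\n' "$line" | awk -F' - ' '{
    # match() returns the start position (0 if there is no match)
    # and sets RSTART and RLENGTH as a side effect.
    if (match($2, /[0-9]+p/))
        print $1, substr($2, RSTART, RLENGTH)
    else
        print $1    # no resolution found: print the name only
}')
printf '%s\n' "$result"
# Film Name 2160p
```

One caveat with this approach: a title that itself contains " - " is split into more than two fields, so $2 would no longer hold the file name part.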
