I have a file that has multiple blocks of lines like so
line1
line1
-----
machine:chrome
purpose:
language:v2
request:v3
additional: v4
os:v4
-----
machine:firefox
purpose:
language:v2
request:v6
os:v4
-----
machine:helper
purpose:
language:v2
request:v8
os:v4
-----
another line
The blocks don't necessarily contain the same fields, but each one starts with a machine line and ends with an os line. I can only use shell commands. For each block, I want to parse the value on the machine line, pass that value to a shell command, and insert the command's output as the value of request in the same block.
As a challenge, I was wondering if this could be done using only sed and awk.
My expected output for the above would be:
line1
line1
-----
machine:chrome
purpose:
language:v2
request:[output of result of ps -ef chrome | awk '{print $2}']
additional: v4
os:v4
-----
machine:firefox
purpose:
language:v2
request:[output of result of ps -ef firefox | awk '{print $2}']
os:v4
-----
machine:helper
purpose:
language:v2
request:[output of result of ps -ef helper | awk '{print $2}']
os:v4
-----
another line
Update: Trying to do this in sed alone I got the following:
gsed -r '/machine/,/os/ {/machine/ {s/.*?:\s*([^\s]+)/ps -ef | grep \1\n/e}; /request/ {s/.*?:\s*([^\s]+)//}}' filename
This does not work, but it does run ps -ef | grep [machinename] and stores the result in the buffer. Can I use that buffer value in the request substitution, and if so, how?
CodePudding user response:
Edit: Because the requirements changed, I am updating the script. The following produces the required output:
#!/bin/bash
function processlines() {
    local line machine request
    # Skips first three lines, but it is not really necessary.
    #for k in `seq 1 3`; do read line; echo "$line"; done
    while true; do
        IFS= read -r line || return 0
        if echo "$line" | grep -q '^machine:'; then
            # Remember the machine name for the request line that follows.
            machine="$(echo "$line" | cut -d ':' -f 2)"
            echo "$line"
        elif echo "$line" | grep -q '^request:'; then
            # Replace the request value with the output of your command.
            request="$(echo YOUR COMMAND HERE "$machine")"
            echo "request:$request"
        else
            echo "$line"
        fi
    done
}
processlines < test.txt
Note: This works as long as the fields appear in the order you showed. If "request" appears before "machine", or if either of them is missing from a block, the script would break. Please let me know if this can be the case.
Old answer: You don't need sed or awk for that. It's doable with almost pure bash plus tail and cut:
# Assumes every block has exactly these five fields followed by a separator line.
# tail -n +4 skips the first three lines of the file.
tail -n +4 test.txt | while read -r machineline; do
    [[ "$machineline" == "another line" ]] && break
    read -r purposeline
    read -r languageline
    read -r requestline
    read -r osline
    read -r separatorline
    machine="$(echo "$machineline" | cut -d ':' -f 2)"
    purpose="$(echo "$purposeline" | cut -d ':' -f 2)"
    language="$(echo "$languageline" | cut -d ':' -f 2)"
    request="$(echo "$requestline" | cut -d ':' -f 2)"
    os="$(echo "$osline" | cut -d ':' -f 2)"
    separator="$(echo "$separatorline" | cut -d ':' -f 2)"
    # Here do anything with the variables...
    echo "machine is '$machine'" \
        "purpose is '$purpose'" \
        "language is '$language'" \
        "request is '$request'" \
        "os is '$os'" \
        "separator is '$separator'"
done
And if you need the "machine" value only, then it is way easier:
grep '^machine:' test.txt | cut -d ':' -f 2 | while read -r machinevalue; do
    # call your other command here...
    echo "machine value is '$machinevalue'"
done
A word of caution: If your values contain the character ":", this script would break, and you would then have to use sed 's/^machine://g' instead of cut -d ':' -f 2.
A possible optimization would be to use bash for extracting the parts of the string but I am too lazy for that and unless I need the performance, I prefer using shell commands because I remember them more easily.
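The bash-only extraction mentioned above could look something like the following minimal sketch. The expansion ${line#machine:} strips the leading machine: and keeps everything after it, so values that contain ":" survive intact. The function name machine_values and the sample input are made up for illustration.

```shell
#!/usr/bin/env bash
# Sketch: print the value of every "machine:" line without calling cut or sed.
# machine_values is a hypothetical helper name, not part of the script above.
machine_values() {
    # The "|| [ -n ... ]" part also handles a last line with no trailing newline.
    while IFS= read -r line || [ -n "$line" ]; do
        case $line in
            # Strip the literal prefix; later ":" characters are left alone.
            machine:*) printf '%s\n' "${line#machine:}" ;;
        esac
    done
}

# Example with made-up input; the second value contains a ":".
printf '%s\n' 'machine:chrome' 'purpose:' 'machine:host:8080' | machine_values
```

This avoids spawning grep and cut for every line, which is where most of the time goes in the loop-based versions above.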
CodePudding user response:
Regarding "I was wondering if this could be done using only sed and awk": no, it can't, because the task requires a shell to call ps, so any sed or awk script would need to spawn a subshell to call ps; they can't call it on their own. So if you tried to do that, then in terms of calls you'd end up with something like shell { awk { system { subshell { ps } } } } (which clearly isn't only using awk anyway) instead of simply shell { ps }.
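To make that nesting concrete, here is a rough sketch of what an awk-driven version might look like. awk builds a command string and reads its output with cmd | getline, which hands the string to sh; that is the extra subshell described above. In this sketch tr a-z A-Z is only a stand-in for the real command (chosen so the result is easy to predict), fill_requests_awk is an invented name, and the quoting is only safe when machine names contain no single quotes.

```shell
#!/usr/bin/env bash
# Sketch only: awk spawns a shell for every request line it rewrites.
fill_requests_awk() {
    awk '
    /^machine:/ { mac = substr($0, 9) }    # everything after "machine:"
    /^request:/ && mac != "" {
        # Build a shell command line; \047 is a single quote.
        cmd = "printf %s \047" mac "\047 | tr a-z A-Z"
        result = ""
        cmd | getline result               # awk runs cmd through sh
        close(cmd)
        $0 = "request:" result
    }
    { print }
    ' "$@"
}

# Example with made-up input.
printf '%s\n' 'machine:chrome' 'purpose:' 'request:v3' 'os:v4' | fill_requests_awk
```

Each request line costs one shell plus the processes in the pipeline, and building a command line from input data is fragile, which is part of why the approaches below keep the command call in the outer shell.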
Using md5sum (a very common application for this technique) for the example instead of ps -ef, which would produce different output on everyone's machines (you can tweak it to use ps -ef later), you COULD do the following (but don't; see the 2nd script below for a better approach):
$ cat tst.sh
#!/usr/bin/env bash
infile="$1"
while IFS= read -r line; do
    # Split "tag:value" at the first ":"; lines without a ":" pass through unchanged.
    if [[ "$line" =~ ^([^:]+):(.*) ]]; then
        tag="${BASH_REMATCH[1]}"
        val="${BASH_REMATCH[2]}"
        case "$tag" in
            machine )
                mac="$val"
                ;;
            request )
                val="$(printf '%s' "$mac" | md5sum | cut -d' ' -f1)"
                ;;
        esac
        line="${tag}:${val}"
    fi
    printf '%s\n' "$line"
done < "$infile"
$ ./tst.sh file
line1
line1
-----
machine:chrome
purpose:
language:v2
request:554838a8451ac36cb977e719e9d6623c
additional: v4
os:v4
-----
machine:firefox
purpose:
language:v2
request:d6a5c9544eca9b5ce2266d1c34a93222
os:v4
-----
machine:helper
purpose:
language:v2
request:fde5d67bfb6dc4b598291cc2ce35ee4a
os:v4
-----
another line
While the above would work, it'd be very inefficient (see "Why is using a shell loop to process text considered bad practice?") since it loops through every line of input in shell. The following is how I'd really approach a task like this. It's far more efficient because shell only loops through the machine: lines from the input (which is unavoidable, and is far fewer iterations than reading every input line); the rest is done with a single call to sed to generate the input for the shell loop and a single call to awk to produce the output:
$ cat tst.sh
#!/usr/bin/env bash
infile="$1"
sed -n 's/^machine://p' "$infile" |
while IFS= read -r mac; do
    printf '%s\n%s\n' "$mac" "$(printf '%s' "$mac" | md5sum | cut -d' ' -f1)"
done |
awk '
    NR==FNR {
        if ( NR % 2 ) {
            mac = $0
        }
        else {
            map[mac] = $0
        }
        next
    }
    {
        tag = val = $0
        sub(/:.*/,"",tag)
        sub(/[^:]*:/,"",val)
    }
    tag == "machine" { mac = val }
    tag == "request" { $0 = tag ":" map[mac] }
    { print }
' - "$infile"
$ ./tst.sh file
line1
line1
-----
machine:chrome
purpose:
language:v2
request:554838a8451ac36cb977e719e9d6623c
additional: v4
os:v4
-----
machine:firefox
purpose:
language:v2
request:d6a5c9544eca9b5ce2266d1c34a93222
os:v4
-----
machine:helper
purpose:
language:v2
request:fde5d67bfb6dc4b598291cc2ce35ee4a
os:v4
-----
another line
Here's what each of the above steps does:
- Get just the machine: lines and remove the machine: part so we can have shell loop through just the parts it needs to call some command (e.g. ps -ef or md5sum) on:
$ sed -n 's/^machine://p' "$infile"
chrome
firefox
helper
- Loop through each of those lines producing a mapping from that word to the output of the shell command you need to run on it (we generate the mapping in pairs of lines so the subsequent awk can parse it robustly even if the machine name from the input contained :s):
$ sed -n 's/^machine://p' "$infile" |
while IFS= read -r mac; do
    printf '%s\n%s\n' "$mac" "$(printf '%s' "$mac" | md5sum | cut -d' ' -f1)"
done
chrome
554838a8451ac36cb977e719e9d6623c
firefox
d6a5c9544eca9b5ce2266d1c34a93222
helper
fde5d67bfb6dc4b598291cc2ce35ee4a
- Pass that mapping to awk, which reads it as pairs of lines (the machine name on the odd-numbered line, the command output for it on the even-numbered line) and stores the mapping in an array:
NR==FNR {
    if ( NR % 2 ) {
        mac = $0
    }
    else {
        map[mac] = $0
    }
    next
}
- awk then reads the input file, splits each line into the part before the first : (which I'm calling a tag) and the part after it (which I'm calling a value, val), and uses that array to modify the request: lines before printing each line (populating tag and val this way instead of setting FS to : and using $1 and $2 so we can again handle any input that contains :s in other locations):
{
tag = val = $0
sub(/:.*/,"",tag)
sub(/[^:]*:/,"",val)
}
tag == "machine" { mac = val }
tag == "request" { $0 = tag ":" map[mac] }
{ print }
The above assumes the shell command output is a single line each time it's called.
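If the command can print several lines (as a ps listing filtered for a browser with many processes typically would), one option is to fold its output into a single line before it enters the mapping. A minimal sketch, assuming a comma is an acceptable separator; lookup is a hypothetical stand-in for the real command, and one_line and emit_pairs are helper names invented here:

```shell
#!/usr/bin/env bash
# Hypothetical stand-in for the real command: prints two lines per machine.
lookup() {
    printf '%s-1\n%s-2\n' "$1" "$1"
}

# Fold multi-line input into one comma-separated line.
one_line() {
    paste -sd, -
}

# Emit the same name/result line pairs that the awk script expects.
emit_pairs() {
    while IFS= read -r mac; do
        printf '%s\n%s\n' "$mac" "$(lookup "$mac" | one_line)"
    done
}

# Example with made-up machine names.
printf '%s\n' chrome firefox | emit_pairs
```

With the output folded this way, every machine still contributes exactly two lines, so the NR % 2 logic in the awk script keeps working unchanged.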
CodePudding user response:
In pure bash:
#!/bin/bash
while IFS= read -r line; do
    if [[ $line = machine:* ]]; then mach=${line#*:}
    elif [[ $line = os:* ]]; then mach=""; fi
    if [[ $line = request:* && $mach ]]; then
        printf 'request:'
        your_command "$mach"
    else
        printf '%s\n' "$line"
    fi
done < file
If the output of your command doesn't end with a newline character, then place an echo after your command.
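Another way to sidestep the trailing-newline question is to capture the output with command substitution, which strips trailing newlines, and then print exactly one. A minimal sketch of that variant; your_command is given a dummy definition here purely so the example runs, and fill_requests is an invented wrapper name:

```shell
#!/usr/bin/env bash
# Dummy stand-in so the sketch is runnable; deliberately prints no newline.
your_command() {
    printf 'pid-of-%s' "$1"
}

# Reads the file on stdin and writes the modified file to stdout.
fill_requests() {
    mach=""
    while IFS= read -r line || [ -n "$line" ]; do
        case $line in
            machine:*) mach=${line#*:} ;;   # remember the current machine
            os:*)      mach="" ;;           # end of block: forget it
        esac
        case $line in
            request:*)
                if [ -n "$mach" ]; then
                    # $(...) drops trailing newlines; printf below adds one back.
                    line="request:$(your_command "$mach")"
                fi ;;
        esac
        printf '%s\n' "$line"
    done
}

# Example with made-up input.
printf '%s\n' 'machine:chrome' 'request:v3' 'os:v4' | fill_requests
```

The trade-off is that the output is buffered in a variable before printing, which is fine for short results such as a PID list.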