This has been giving me a lot of trouble. The file looks like this:
URL: http://123.123.123.123
file: php
124.124.124.124|user1|email|phone

URL: http://1.2.3.4
file: php
2.1.3.1|userx|emailx|phonex
The file contains more sets of data just like these.
I used
grep http -A 3|tr '\n' ' '|tr '|' ' '|awk '{print $2,$7,$8}'|tr ' ' ':'
but the outcome only covers the first set of data:
123.123.123.123:email:phone
Intended outcome:
123.123.123.123:email:phone
1.2.3.4:emailx:phonex
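For reference, a self-contained reproduction of the pipeline above (the file name data.txt is an assumption); note that $2 keeps the http:// prefix here:

```shell
#!/bin/sh
# Recreate the sample file from the question (name data.txt assumed).
cat > data.txt <<'EOF'
URL: http://123.123.123.123
file: php
124.124.124.124|user1|email|phone

URL: http://1.2.3.4
file: php
2.1.3.1|userx|emailx|phonex
EOF

# The tr '\n' ' ' step collapses all of grep's output onto ONE line,
# so awk sees a single record and $2,$7,$8 only reach the first set.
grep http -A 3 data.txt | tr '\n' ' ' | tr '|' ' ' \
  | awk '{print $2,$7,$8}' | tr ' ' ':'
```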
CodePudding user response:
If you are using Awk anyway, you can get rid of grep and tr.
If you can rely on the empty line to separate records, try RS='\n\n'. Here's a refactoring which instead extracts the information from the third line of each block (the second line after the hit).
awk '/http/ { l=2; ip=$0; sub(/.*\/\//, "", ip); next }
l && --l == 0 { tail=$0; sub(/^[^|]*[|][^|]*[|]/, "", tail);
sub(/[|]/, ":", tail); print ip ":" tail }'
Perhaps /^URL:/ would be a better regex than /http/ for finding the beginning of a record.
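A quick way to exercise this answer's script against the question's sample (the file name data.txt is an assumption):

```shell
#!/bin/sh
# Sample data from the question (file name data.txt assumed).
cat > data.txt <<'EOF'
URL: http://123.123.123.123
file: php
124.124.124.124|user1|email|phone

URL: http://1.2.3.4
file: php
2.1.3.1|userx|emailx|phonex
EOF

# Each URL line arms a two-line countdown; on the data line the first
# two |-delimited fields are stripped and the rest is printed.
awk '/http/ { l=2; ip=$0; sub(/.*\/\//, "", ip); next }
     l && --l == 0 { tail=$0; sub(/^[^|]*[|][^|]*[|]/, "", tail);
                     sub(/[|]/, ":", tail); print ip ":" tail }' data.txt
```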
CodePudding user response:
gawk 'gsub("[|]", ":", $!(NF = NF))' RS= OFS= FS='.*//|\n[^|]*[|][^|]*'
123.123.123.123:email:phone
1.2.3.4:emailx:phonex
CodePudding user response:
I'd do it like this:
awk -F\| '
/^URL:/ { sub(/.*\/\//,""); url=$0; next }
NF==4 { printf "%s:%s:%s\n", url, $3, $4 }
' file
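Running this answer on the question's sample can be sketched as follows (the file name data.txt is an assumption):

```shell
#!/bin/sh
# Sample data from the question (file name data.txt assumed).
cat > data.txt <<'EOF'
URL: http://123.123.123.123
file: php
124.124.124.124|user1|email|phone

URL: http://1.2.3.4
file: php
2.1.3.1|userx|emailx|phonex
EOF

# With -F'|' the data lines have exactly 4 fields; the URL line is
# remembered (scheme stripped) and combined with fields 3 and 4.
awk -F\| '
/^URL:/ { sub(/.*\/\//,""); url=$0; next }
NF==4   { printf "%s:%s:%s\n", url, $3, $4 }
' data.txt
```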
CodePudding user response:
If ed is available/acceptable, the script (script.ed):
g/^$/d
g|^URL: http://|s|||\
.+1d
%s/^.*user[^|]*//
g/^[0-9]/j
%s/|/:/g
,p
Q
Run
ed -s file.txt < script.ed
CodePudding user response:
I would exploit the getline function for this task as follows. Let file.txt content be
URL: http://123.123.123.123
file: php
124.124.124.124|user1|email|phone
URL: http://1.2.3.4
file: php
2.1.3.1|userx|emailx|phonex
then
awk 'BEGIN{FS="|";OFS=":"}sub(/^URL: /,""){url=$0;getline;getline;print url,$3,$4}' file.txt
gives output
http://123.123.123.123:email:phone
http://1.2.3.4:emailx:phonex
Explanation: I inform GNU AWK that the field separator (FS) is a pipe (|), whilst the output field separator (OFS) is a colon (:). I use two effects of sub: alteration of the line and its return value. If an alteration occurred, I save the current line (with the leading URL: removed by sub) as url, call getline twice to reach the second line after it, and then print url and the 3rd and 4th columns.
(tested in GNU Awk 5.0.1)
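The run above can be checked end to end; the second command is an editorial variant (not from the answer) that also strips the http:// scheme to match the asker's intended output:

```shell
#!/bin/sh
# Recreate file.txt exactly as shown in this answer.
cat > file.txt <<'EOF'
URL: http://123.123.123.123
file: php
124.124.124.124|user1|email|phone
URL: http://1.2.3.4
file: php
2.1.3.1|userx|emailx|phonex
EOF

# As posted: the URL keeps its http:// prefix.
awk 'BEGIN{FS="|";OFS=":"}sub(/^URL: /,""){url=$0;getline;getline;print url,$3,$4}' file.txt

# Variant (assumption): strip "URL: http://" so only the IP remains.
awk 'BEGIN{FS="|";OFS=":"}sub(/^URL: http:\/\//,""){url=$0;getline;getline;print url,$3,$4}' file.txt
```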