I need your expertise again. I am trying to use a conditional in awk to choose which columns to print.
Field $5 holds a year on some lines and a time such as 05:17:27 on others.
When $5 is a year it is fine to print it, but when it is a time I need to print the last field instead.
These are the $5 values in my data:
2021
2021
05:17:27
20:33:17
05:17:20
2020
2020
2021
2020
2021
Below is my sample data.
data_file.
yogutdb01 Mon 28 Jun 2021 11:19:56 PM MST
yogutdb02 Thu 30 Sep 2021 02:02:53 AM MST
yogutdb03 Thu Jul 13 05:17:27 2017
yogutdb04 Fri Jun 23 20:33:17 2017
yogutdb05 Thu Jul 13 05:17:20 2017
yogutdb06 Wed 24 Jun 2020 03:49:16 PM MST
yogutdb07 Wed 24 Jun 2020 04:05:10 PM MST
yogutdb08 Sat 22 May 2021 04:19:14 AM MST
yogutdb09 Thu 09 Apr 2020 12:16:32 PM CEST
yogutdb10 Tue 11 May 2021 03:03:02 PM MST
My attempt: I am using the command below, but I get a syntax error on the else condition.
$ awk '{ ($5=="[^0-9]+$")print $1,$2,$3,$4,$5; else print $1,$2,$3,$4,$NF}' my_data.text
The desired output should be:
yogutdb01 2021
yogutdb02 2021
yogutdb03 2017
yogutdb04 2017
yogutdb05 2017
yogutdb06 2020
yogutdb07 2020
yogutdb08 2021
yogutdb09 2020
yogutdb10 2021
OR
yogutdb01 Mon 28 Jun 2021
yogutdb02 Thu 30 Sep 2021
yogutdb03 Thu Jul 13 2017
yogutdb04 Fri Jun 23 2017
yogutdb05 Thu Jul 13 2017
yogutdb06 Wed 24 Jun 2020
yogutdb07 Wed 24 Jun 2020
yogutdb08 Sat 22 May 2021
yogutdb09 Thu 09 Apr 2020
yogutdb10 Tue 11 May 2021
CodePudding user response:
- You cannot use the == operator to test a regex match. Instead you can use the match() function or the ~ operator.
- You should place the ^ anchor in front of [0-9], not inside the brackets.
Then would you please try:
awk '{if (match($5,/^[0-9]+$/)) print $1, $2, $3, $4, $5; else print $1, $2, $3, $4, $NF}' my_data.text
Output:
yogutdb01 Mon 28 Jun 2021
yogutdb02 Thu 30 Sep 2021
yogutdb03 Thu Jul 13 2017
yogutdb04 Fri Jun 23 2017
yogutdb05 Thu Jul 13 2017
yogutdb06 Wed 24 Jun 2020
yogutdb07 Wed 24 Jun 2020
yogutdb08 Sat 22 May 2021
yogutdb09 Thu 09 Apr 2020
yogutdb10 Tue 11 May 2021
Here is an alternative using the ~ operator:
awk '$5 ~ /^[0-9]+$/ {print $1, $2, $3, $4, $5; next} {print $1, $2, $3, $4, $NF}' my_data.text
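To illustrate the two points at the top of this answer, here is a small sketch. The function name check_field and the two input values are made up for the demonstration. It shows that == compares against the literal string, that ~ performs a regex match, and that [^0-9] means "a non-digit character" rather than "starts with a digit".

```shell
# Compare a string test (==) with regex tests (~) on the same value.
# check_field is a hypothetical helper; the inputs are sample values only.
check_field() {
    printf '%s\n' "$1" | awk '{
        eq    = ($1 == "^[0-9]+$") ? "yes" : "no"   # literal string comparison
        tilde = ($1 ~ /^[0-9]+$/)  ? "yes" : "no"   # regex: digits only
        neg   = ($1 ~ /[^0-9]/)    ? "yes" : "no"   # regex: contains a non-digit
        print "==:" eq, "~:" tilde, "[^0-9]:" neg
    }'
}

check_field 2021
check_field 05:17:27
```

For 2021 only the ~ test with the anchored pattern reports yes. For 05:17:27 only the [^0-9] test does, because of the colons.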
CodePudding user response:
You could print the first 4 fields and then check whether the 5th field consists only of digits. If it does, print it; otherwise print the last field.
awk '{print $1, $2, $3, $4, $5 ~ /^[0-9]+$/ ? $5 : $NF}' my_data.text
Output
yogutdb01 Mon 28 Jun 2021
yogutdb02 Thu 30 Sep 2021
yogutdb03 Thu Jul 13 2017
yogutdb04 Fri Jun 23 2017
yogutdb05 Thu Jul 13 2017
yogutdb06 Wed 24 Jun 2020
yogutdb07 Wed 24 Jun 2020
yogutdb08 Sat 22 May 2021
yogutdb09 Thu 09 Apr 2020
yogutdb10 Tue 11 May 2021
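The question also lists a shorter desired form with only the host name and the year. The same ternary can be cut down to print that. This is only a sketch: it assumes the year is either field 5 or the last field, as in the sample data, and the function name host_year is made up for illustration.

```shell
# Print the host name and the year only.
# Assumption: the year is $5 when $5 is all digits, otherwise the last field.
host_year() {
    awk '{ print $1, ($5 ~ /^[0-9]+$/ ? $5 : $NF) }' "$@"
}

# Two lines from the sample data, one in each date layout.
printf '%s\n' \
    'yogutdb01 Mon 28 Jun 2021 11:19:56 PM MST' \
    'yogutdb03 Thu Jul 13 05:17:27 2017' | host_year
```

The function reads standard input when called without arguments, so it can also be run as host_year my_data.text.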
CodePudding user response:
For your desired output, try the command below.
It uses the regex non-match operator !~ to check that $5 does not contain a colon, that is, that it is not a time.
$ awk '{ if ($5 !~ /:/) { print $1,$2,$3,$4,$5; next } { print $1,$2,$3,$4, $NF } }' exampl_data1
Result:
yogutdb01 Mon 28 Jun 2021
yogutdb02 Thu 30 Sep 2021
yogutdb03 Thu Jul 13 2017
yogutdb04 Fri Jun 23 2017
yogutdb05 Thu Jul 13 2017
yogutdb06 Wed 24 Jun 2020
yogutdb07 Wed 24 Jun 2020
yogutdb08 Sat 22 May 2021
yogutdb09 Thu 09 Apr 2020
yogutdb10 Tue 11 May 2021
Just to mention, as @tshiono also asked in the comments: to get the date fields in the same order on every line (day before month), you can swap $3 and $4 in the second print, as below.
$ awk '{ if ($5 !~ /:/) { print $1, $2, $3, $4, $5; next } { print $1, $2, $4, $3, $NF } }' exampl_data1
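To see what the swapped fields do, here is a sketch run on two of the sample lines. It uses if/else instead of next followed by a second block, which should behave the same way here. The function name normalize_date is made up for illustration.

```shell
# Print "host weekday day month year" for both date layouts.
# normalize_date is a hypothetical helper name.
normalize_date() {
    awk '{
        if ($5 !~ /:/)
            print $1, $2, $3, $4, $5      # $5 is the year; order is already day, month
        else
            print $1, $2, $4, $3, $NF     # $5 is a time; swap month and day, year is last
    }' "$@"
}

printf '%s\n' \
    'yogutdb01 Mon 28 Jun 2021 11:19:56 PM MST' \
    'yogutdb03 Thu Jul 13 05:17:27 2017' | normalize_date
```

With this change the second line comes out as yogutdb03 Thu 13 Jul 2017, matching the day-before-month order of the first line.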