awk get the data based on if else condition

Time:04-18

I need your expertise again. I am trying to use a conditional in awk to select columns.

In $5, some rows have a year and other rows have a time.

When $5 is a year it can be printed as is, but when it holds a time such as 05:17:27 I need to print the last field instead.

2021
2021
05:17:27
20:33:17
05:17:20
2020
2020
2021
2020
2021

Below is my sample data.

data_file.

yogutdb01   Mon 28 Jun 2021 11:19:56 PM MST
yogutdb02   Thu 30 Sep 2021 02:02:53 AM MST
yogutdb03   Thu Jul 13 05:17:27 2017
yogutdb04   Fri Jun 23 20:33:17 2017
yogutdb05   Thu Jul 13 05:17:20 2017
yogutdb06   Wed 24 Jun 2020 03:49:16 PM MST
yogutdb07   Wed 24 Jun 2020 04:05:10 PM MST
yogutdb08   Sat 22 May 2021 04:19:14 AM MST
yogutdb09   Thu 09 Apr 2020 12:16:32 PM CEST
yogutdb10   Tue 11 May 2021 03:03:02 PM MST

My attempt: I am using the command below, but I get a syntax error at the else.

$ awk '{ ($5=="[^0-9]+$")print $1,$2,$3,$4,$5; else print  $1,$2,$3,$4,$NF}' my_data.text

The desired output should be:

yogutdb01   2021 
yogutdb02   2021
yogutdb03   2017
yogutdb04   2017    
yogutdb05   2017
yogutdb06   2020
yogutdb07   2020
yogutdb08   2021
yogutdb09   2020
yogutdb10   2021

OR

yogutdb01   Mon 28 Jun 2021
yogutdb02   Thu 30 Sep 2021
yogutdb03   Thu Jul 13 2017
yogutdb04   Fri Jun 23 2017
yogutdb05   Thu Jul 13 2017
yogutdb06   Wed 24 Jun 2020 
yogutdb07   Wed 24 Jun 2020 
yogutdb08   Sat 22 May 2021 
yogutdb09   Thu 09 Apr 2020 
yogutdb10   Tue 11 May 2021 

CodePudding user response:

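  • You cannot use the == operator to test a regex match; it compares $5 with the pattern as a literal string. Use the match() function or the ~ operator instead. The if keyword is also missing, which is why the else triggers a syntax error.
  • The ^ anchor belongs in front of [0-9], not inside the brackets. Inside, [^0-9] means "any character that is not a digit".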

Then would you please try:

awk '{if (match($5,/^[0-9]+$/)) print $1, $2, $3, $4, $5; else print $1, $2, $3, $4, $NF}' my_data.text

Output:

yogutdb01 Mon 28 Jun 2021
yogutdb02 Thu 30 Sep 2021
yogutdb03 Thu Jul 13 2017
yogutdb04 Fri Jun 23 2017
yogutdb05 Thu Jul 13 2017
yogutdb06 Wed 24 Jun 2020
yogutdb07 Wed 24 Jun 2020
yogutdb08 Sat 22 May 2021
yogutdb09 Thu 09 Apr 2020
yogutdb10 Tue 11 May 2021

Here is an alternative using ~ operator:

awk '$5 ~ /^[0-9]+$/ {print $1, $2, $3, $4, $5; next} {print $1, $2, $3, $4, $NF}' my_data.text
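
To make the two points at the top of this answer concrete, here is a small sketch that runs the three tests side by side on one year value and one time value. The function name classify and the two sample values are assumptions made only for this illustration.

```shell
# classify is a hypothetical helper name; it feeds each argument to awk on its own line.
classify() {
  printf '%s\n' "$@" | awk '{
    eq = ($1 == "^[0-9]+$")    ? "yes" : "no"   # string comparison against the literal text
    op = ($1 ~ /^[0-9]+$/)     ? "yes" : "no"   # regex match with the ~ operator
    fn = match($1, /^[0-9]+$/) ? "yes" : "no"   # match() returns the start position, 0 if none
    print $1, "==:" eq, "~:" op, "match():" fn
  }'
}

classify 2021 05:17:27
```

For 2021 only the ~ and match() tests report yes; for 05:17:27 all three report no, so the else branch is taken.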

CodePudding user response:

You could print the first 4 fields, then check whether the 5th field consists only of digits. If it does, print it; otherwise print the last field.

awk '{print $1, $2, $3, $4, ($5 ~ /^[0-9]+$/ ? $5 : $NF)}' my_data.text

Output:

yogutdb01 Mon 28 Jun 2021
yogutdb02 Thu 30 Sep 2021
yogutdb03 Thu Jul 13 2017
yogutdb04 Fri Jun 23 2017
yogutdb05 Thu Jul 13 2017
yogutdb06 Wed 24 Jun 2020
yogutdb07 Wed 24 Jun 2020
yogutdb08 Sat 22 May 2021
yogutdb09 Thu 09 Apr 2020
yogutdb10 Tue 11 May 2021
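
The question also lists a shorter desired format with only the host name and the year. A minimal sketch of that variant, assuming the same test on $5 is what is wanted, drops fields 2 to 4 from the print. The function name host_year is hypothetical, and the two input lines are copied from the sample data.

```shell
# host_year is a hypothetical function name; it reads the data on standard input.
host_year() {
  # Print the host, then $5 if it is all digits (a year), else the last field.
  awk '{ print $1, ($5 ~ /^[0-9]+$/ ? $5 : $NF) }'
}

# Two sample lines from the question, one per date layout.
printf '%s\n' \
  'yogutdb01   Mon 28 Jun 2021 11:19:56 PM MST' \
  'yogutdb03   Thu Jul 13 05:17:27 2017' | host_year
```

This prints yogutdb01 2021 and yogutdb03 2017, separated by a single space because print joins its arguments with OFS.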

CodePudding user response:

To get your desired output, try the command below.

It uses the regex non-match operator !~ to check that $5 does not contain a colon, which means it is a year and not a time.

$ awk '{ if ($5 !~ /:/) { print $1,$2,$3,$4,$5; next } { print $1,$2,$3,$4, $NF } }'   exampl_data1

Result:

yogutdb01 Mon 28 Jun 2021
yogutdb02 Thu 30 Sep 2021
yogutdb03 Thu Jul 13 2017
yogutdb04 Fri Jun 23 2017
yogutdb05 Thu Jul 13 2017
yogutdb06 Wed 24 Jun 2020
yogutdb07 Wed 24 Jun 2020
yogutdb08 Sat 22 May 2021
yogutdb09 Thu 09 Apr 2020
yogutdb10 Tue 11 May 2021

As @tshiono also asked in the comments: to get the day and month in the same order on every line, swap $3 and $4 in the second print.

$ awk '{ if ($5 !~ /:/) { print $1, $2, $3, $4, $5; next } { print $1, $2, $4, $3, $NF } }'   exampl_data1
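
As a sketch of what that reordering produces, the same logic can also be written with an explicit else branch. The function name normalize_order is hypothetical, and the two input lines are copied from the sample data in the question.

```shell
# normalize_order is a hypothetical function name; it reads the data on standard input.
normalize_order() {
  awk '{
    if ($5 !~ /:/) {
      print $1, $2, $3, $4, $5     # already weekday, day, month, year
    } else {
      print $1, $2, $4, $3, $NF    # swap month and day, take the year from the last field
    }
  }'
}

printf '%s\n' \
  'yogutdb01   Mon 28 Jun 2021 11:19:56 PM MST' \
  'yogutdb03   Thu Jul 13 05:17:27 2017' | normalize_order
```

The second line comes out as yogutdb03 Thu 13 Jul 2017, matching the day-before-month layout of the first line.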