Use AWK with delimiter to print specific columns

Time:04-18

My file looks as follows:

+------------------------------------------+---------------+----------------+------------------+------------------+-----------------+
| Message                                  | Status        | Adress         | Changes          | Test             | Calibration     |
|------------------------------------------+---------------+----------------+------------------+------------------+-----------------|
| Hello World                              | Active        | up             |                1 |               up |            done |
| Hello Everyone Here                      | Passive       | up             |                2 |             down |            none |
| Hi there. My name is Eric. How are you?  | Down          | up             |                3 |         inactive |            done |
+------------------------------------------+---------------+----------------+------------------+------------------+-----------------+
+----------------------------+---------------+----------------+------------------+------------------+-----------------+
| Message                    | Status        | Adress         | Changes          | Test             | Calibration     |
|----------------------------+---------------+----------------+------------------+------------------+-----------------|
| What's up?                 | Active        | up             |                1 |               up |            done |
| Hi. I'm Otilia             | Passive       | up             |                2 |             down |            none |
| Hi there. This is Marcus   | Up            | up             |                3 |         inactive |            done |
+----------------------------+---------------+----------------+------------------+------------------+-----------------+

I want to extract specific columns using AWK. I can do it with cut, but because the width of each table varies with the number of characters in each column, I don't get the desired output.

cat File.txt | cut -c -44
+------------------------------------------+
| Message                                  |
|------------------------------------------+
| Hello World                              |
| Hello Everyone Here                      |
| Hi there. My name is Eric. How are you?  |
+------------------------------------------+
+----------------------------+--------------
| Message                    | Status
|----------------------------+--------------
| What's up?                 | Active
| Hi. I'm Otilia             | Passive
| Hi there. This is Marcus   | Up
+----------------------------+--------------

or

cat File.txt | cut -c 44-60
+---------------+
| Status        |
+---------------+
| Active        |
| Passive       |
| Down          |
+---------------+
--+--------------
  | Adress
--+--------------
  | up
  | up
  | up
--+--------------

I tried using AWK, but I don't know how to specify two different delimiters so that all the lines are handled.

cat File.txt | awk 'BEGIN {FS="|";}{print $2,$3}'

 Message                                    Status
------------------------------------------+---------------+----------------+------------------+------------------+-----------------
 Hello World                                Active
 Hello Everyone Here                        Passive
 Hi there. My name is Eric. How are you?    Down


 Message                      Status
----------------------------+---------------+----------------+------------------+------------------+-----------------
 What's up?                   Active
 Hi. I'm Otilia               Passive
 Hi there. This is Marcus     Up

The output I'm looking for:

+------------------------------------------+
| Message                                  |
|------------------------------------------+
| Hello World                              |
| Hello Everyone Here                      |
| Hi there. My name is Eric. How are you?  |
+------------------------------------------+
+----------------------------+
| Message                    |
|----------------------------+
| What's up?                 |
| Hi. I'm Otilia             |
| Hi there. This is Marcus   |
+----------------------------+

or

+------------------------------------------+---------------+
| Message                                  | Status        |
|------------------------------------------+---------------+
| Hello World                              | Active        |
| Hello Everyone Here                      | Passive       |
| Hi there. My name is Eric. How are you?  | Down          |
+------------------------------------------+---------------+
+----------------------------+---------------+
| Message                    | Status        |
|----------------------------+---------------+
| What's up?                 | Active        |
| Hi. I'm Otilia             | Passive       |
| Hi there. This is Marcus   | Up            |
+----------------------------+---------------+

or any other combination of columns:

+------------------------------------------+----------------+------------------+
| Message                                  | Adress         | Test             |
|------------------------------------------+----------------+------------------+
| Hello World                              | up             |               up |
| Hello Everyone Here                      | up             |             down |
| Hi there. My name is Eric. How are you?  | up             |         inactive |
+------------------------------------------+----------------+------------------+
+----------------------------+---------------+------------------+
| Message                    |Adress         | Test             |
|----------------------------+---------------+------------------+
| What's up?                 |up             |               up |
| Hi. I'm Otilia             |up             |             down |
| Hi there. This is Marcus   |up             |         inactive |
+----------------------------+---------------+------------------+

Thanks in advance.

CodePudding user response:

If GNU awk is available, please try markp-fuso's nice solution. If not, here is a POSIX-compliant alternative:

awk -v col=2 '
NR==FNR {
    if (match($0, /^[|+]/)) {           # the record contains a table
        if (match($0, /^[|+]-/))        # horizontally ruled line
            n = split($0, a, /[|+]/)    # split into columns
        else                            # "cell" line
            n = split($0, a, /\|/)
        len = 0
        for (i = 1; i < n; i++) {
            len += length(a[i]) + 1     # accumulated column position
            pos[FNR, i] = len
        }
    }
    next
}
{
    if (pos[FNR, col] && pos[FNR, col + 1])
        print(substr($0, pos[FNR, col], pos[FNR, col + 1] - pos[FNR, col] + 1))
}
' file.txt file.txt

Result with col=2 as shown above:

+---------------+
| Status        |
+---------------+
| Active        |
| Passive       |
| Down          |
+---------------+
+---------------+
| Status        |
+---------------+
| Active        |
| Passive       |
| Up            |
+---------------+

It detects the column positions in the first pass, then cuts each line at those positions in the second pass (which is why file.txt is passed twice). So far it prints only a single column. If you need to print multiple columns together, please let me know.
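
To make the position arithmetic concrete, here is a minimal sketch. It is not part of the script above: the helper name col_positions and the sample line are made up for illustration, and it assumes the columns are separated by single | or + characters. It prints the accumulated separator positions for one table line:

```shell
# col_positions: hypothetical helper that prints, for a single table line,
# the character position of each column separator.
col_positions() {
    printf '%s\n' "$1" | awk '{
        n = split($0, a, /[|+]/)          # fields between the separators
        len = 0; out = ""; gap = ""
        for (i = 1; i < n; i++) {
            len += length(a[i]) + 1       # field width plus one separator
            out = out gap len
            gap = " "
        }
        print out
    }'
}

col_positions '+----+--+'    # prints: 1 6 9
```

With these positions, the second column of the sample line runs from position 6 to position 9, so substr(line, 6, 9 - 6 + 1) yields +--+, which is the same calculation the second pass performs.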

CodePudding user response:

One idea using GNU awk:

awk -v fldlist='2,3' '
BEGIN { fldcnt=split(fldlist,fields,",") }                      # split fldlist into array fields[]

      { split($0,arr,/[|+]/,seps)                               # split current line on dual delimiters "|" and "+"
        for (i=1;i<=fldcnt;i++)                                 # loop through our array of fields (fldlist)
            printf "%s%s", seps[fields[i]-1], arr[fields[i]]    # print leading separator/delimiter and field
        printf "%s\n", seps[fields[fldcnt]]                     # print trailing separator/delimiter and terminate line
      }
' File.txt

NOTES:

  • requires GNU awk for the 4th argument to the split() function (seps == array of separators; see gawk string functions for details)
  • assumes our field delimiters (|, +) do not show up as part of the data
  • the input variable fldlist is a comma-delimited list of columns that mimics what would be passed to cut (e.g., when a line starts with a delimiter, field #1 is blank)
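
If gawk is not available, the same idea can probably be sketched in POSIX awk. Because every separator is a single character, its position can be recomputed from the field lengths instead of being read from seps[]. This is only a sketch under that assumption: pick_cols is a made-up helper name, and the field numbering follows the same cut-style convention as fldlist above.

```shell
# pick_cols FIELDS FILE: hypothetical helper, e.g. pick_cols 2,4 File.txt
pick_cols() {
    awk -v fldlist="$1" '
    BEGIN { fldcnt = split(fldlist, fields, ",") }
    {
        n = split($0, arr, /[|+]/)
        split("", sep)                    # clear separators left from the previous line
        pos = 0
        for (i = 1; i < n; i++) {
            pos += length(arr[i]) + 1     # separator i sits right after fields 1..i
            sep[i] = substr($0, pos, 1)
        }
        out = ""
        for (i = 1; i <= fldcnt; i++)     # leading separator, then the field
            out = out sep[fields[i] - 1] arr[fields[i]]
        print out sep[fields[fldcnt]]     # trailing separator
    }' "$2"
}
```

Calling pick_cols 2,3 File.txt should behave like the gawk version with fldlist='2,3'; the same caveat applies that | and + must not appear inside the data.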

For fldlist='2,3' this generates:

+------------------------------------------+---------------+
| Message                                  | Status        |
|------------------------------------------+---------------+
| Hello World                              | Active        |
| Hello Everyone Here                      | Passive       |
| Hi there. My name is Eric. How are you?  | Down          |
+------------------------------------------+---------------+
+----------------------------+---------------+
| Message                    | Status        |
|----------------------------+---------------+
| What's up?                 | Active        |
| Hi. I'm Otilia             | Passive       |
| Hi there. This is Marcus   | Up            |
+----------------------------+---------------+

For fldlist='2,4,6' this generates:

+------------------------------------------+----------------+------------------+
| Message                                  | Adress         | Test             |
|------------------------------------------+----------------+------------------+
| Hello World                              | up             |               up |
| Hello Everyone Here                      | up             |             down |
| Hi there. My name is Eric. How are you?  | up             |         inactive |
+------------------------------------------+----------------+------------------+
+----------------------------+----------------+------------------+
| Message                    | Adress         | Test             |
|----------------------------+----------------+------------------+
| What's up?                 | up             |               up |
| Hi. I'm Otilia             | up             |             down |
| Hi there. This is Marcus   | up             |         inactive |
+----------------------------+----------------+------------------+

For fldlist='4,3,2' this generates:

+----------------+---------------+------------------------------------------+
| Adress         | Status        | Message                                  |
+----------------+---------------|------------------------------------------+
| up             | Active        | Hello World                              |
| up             | Passive       | Hello Everyone Here                      |
| up             | Down          | Hi there. My name is Eric. How are you?  |
+----------------+---------------+------------------------------------------+
+----------------+---------------+----------------------------+
| Adress         | Status        | Message                    |
+----------------+---------------|----------------------------+
| up             | Active        | What's up?                 |
| up             | Passive       | Hi. I'm Otilia             |
| up             | Up            | Hi there. This is Marcus   |
+----------------+---------------+----------------------------+

Say that again? (fldlist='3,3,3'):

+---------------+---------------+---------------+
| Status        | Status        | Status        |
+---------------+---------------+---------------+
| Active        | Active        | Active        |
| Passive       | Passive       | Passive       |
| Down          | Down          | Down          |
+---------------+---------------+---------------+
+---------------+---------------+---------------+
| Status        | Status        | Status        |
+---------------+---------------+---------------+
| Active        | Active        | Active        |
| Passive       | Passive       | Passive       |
| Up            | Up            | Up            |
+---------------+---------------+---------------+

And if you make the mistake of trying to print the '1st' column, ie, fldlist='1':

+
|
|
|
|
|
+
+
|
|
|
|
|
+