How to remove the first n columns containing blanks from a text file with Linux shell scripting

Time:05-26

I want to remove the first 6 columns, which contain only blanks, from this text file sample.txt

      2022-05-26 Mary  Jane
                 foo   bar
      2022-05-27 Tom   Powels
                 lorem ipsum
                 bar   foo
      2022-05-28 Honky Tonk
      2022-05-28 Hill  Billy
      ...

using Linux shell scripting, e.g. sed, awk and/or cut.

Hence the expected output is

2022-05-26 Mary  Jane
           foo   bar
2022-05-27 Tom   Powels
           lorem ipsum
           bar   foo
2022-05-28 Honky Tonk
2022-05-28 Hill  Billy
...

I've searched in SE, but only found solutions to remove all blanks at the beginning of each line, e.g.

$ sed 's/^ *//' sample.txt > output.txt

which results in this file

2022-05-26 Mary  Jane
foo   bar
2022-05-27 Tom   Powels
lorem ipsum
bar   foo
2022-05-28 Honky Tonk
2022-05-28 Hill  Billy
...

where the formatting of the columns is lost.

Unfortunately this call of sed

$ sed 's/^ {6}//' sample.txt > output.txt

doesn't work.

So how can I remove the first 6 columns containing blanks with Linux shell scripting?

CodePudding user response:

Arbitrary character columns can be removed from a text file with colrm in a Linux shell. On Linux this command-line tool is part of the util-linux package; IBM also documents it for AIX.

Removing the first 6 columns from sample.txt can then be done with

$ colrm 1 6 < sample.txt > output.txt

resulting in the desired output

2022-05-26 Mary  Jane
           foo   bar
2022-05-27 Tom   Powels
           lorem ipsum
           bar   foo
2022-05-28 Honky Tonk
2022-05-28 Hill  Billy
...
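
For comparison, cut (which the question mentions) can drop the same leading columns by keeping everything from character 7 onward. The following is a minimal sketch, assuming a few lines of the sample from the question; the file names demo.txt and output_cut.txt are hypothetical and only used for illustration. Like colrm, it removes the first 6 characters unconditionally, whether or not they are blanks.

```shell
# Recreate a few lines of the sample (hypothetical file name demo.txt).
printf '%s\n' \
  '      2022-05-26 Mary  Jane' \
  '                 foo   bar' \
  '      2022-05-27 Tom   Powels' > demo.txt

# Keep characters 7 through end of line, i.e. drop columns 1-6.
cut -c 7- demo.txt > output_cut.txt
cat output_cut.txt
```

Note that cut -c counts characters (bytes in some implementations), so this sketch assumes the leading columns are plain ASCII blanks rather than tabs.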

CodePudding user response:

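With sed, the -E option enables extended regular expressions, so {6} is treated as a repetition count. With awk, sub can strip the same prefix (some older awk implementations do not support the {6} interval):

 sed -E 's/^ {6}//' sample.txt > output.txt
 awk '{sub(/^ {6}/,""); print > "output.txt"}' sample.txt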
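
The attempt in the question, sed 's/^ {6}//', fails because sed uses basic regular expressions by default, where unescaped braces match literally. The interval has to be written as \{6\}, or -E has to be given. A short sketch comparing both spellings follows; demo.txt and the two output file names are hypothetical. Unlike a column-based tool, both commands strip the prefix only when it really consists of six blanks, so the third demo line is left untouched.

```shell
# Two indented sample lines plus one line that does not start with blanks.
printf '%s\n' \
  '      2022-05-26 Mary  Jane' \
  '                 foo   bar' \
  'abc   not indented' > demo.txt

# Basic regular expressions (the default): the braces must be escaped.
sed 's/^ \{6\}//' demo.txt > output_bre.txt

# Extended regular expressions: -E makes {6} an interval.
sed -E 's/^ {6}//' demo.txt > output_ere.txt

# Both spellings should produce identical files.
cmp output_bre.txt output_ere.txt && cat output_bre.txt
```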
 

CodePudding user response:

If you need to remove the first n characters from each line, the awk substr function is handy (it is part of standard awk, not specific to GNU AWK). Let the content of file.txt be

      2022-05-26 Mary  Jane
                 foo   bar
      2022-05-27 Tom   Powels
                 lorem ipsum
                 bar   foo
      2022-05-28 Honky Tonk
      2022-05-28 Hill  Billy
      ...

then

awk '{print substr($0,7)}' file.txt

output

2022-05-26 Mary  Jane
           foo   bar
2022-05-27 Tom   Powels
           lorem ipsum
           bar   foo
2022-05-28 Honky Tonk
2022-05-28 Hill  Billy
...

Explanation: print the part of the current line ($0) starting at the 7th character, which drops the first 6.

(tested in gawk 4.2.1)
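
Since the title asks about the first n columns in general, the same idea can be parameterized. This is a minimal sketch, assuming the column count is held in a shell variable n and passed to awk with -v; demo.txt and output_awk.txt are hypothetical file names.

```shell
# Two sample lines with 6 leading blanks of indentation.
printf '%s\n' \
  '      2022-05-26 Mary  Jane' \
  '                 foo   bar' > demo.txt

# Number of leading character columns to drop.
n=6

# substr($0, n + 1) returns the line from character n+1 to the end.
awk -v n="$n" '{ print substr($0, n + 1) }' demo.txt > output_awk.txt
cat output_awk.txt
```

Lines shorter than n characters simply come out empty, because substr returns an empty string when the start position lies past the end of the line.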
