Removing first n character's from text file in Bash Unix

Time:09-27

I am trying to remove the first N characters from a text file, and what's important is that it is done NOT LINE BY LINE.

Currently the code I have written deletes the first 'i' characters from EACH line, but I want to delete from the whole text.

for FILE in *; do
    x=$(wc -c < "$FILE")
    for ((i = 1; i <= x; i++)); do
        sed "s/^.\{$i\}//" "$FILE" > "$i"
    done
done

For example, I have this XML file at xml/root.xml:

<ticket id="usa-001" REFUND="NO" TEST="TEST">
        <airline>Us Airlines</airline>
        <emptytag id="usa-001" REFUND="NO" TEST="TEST"/>
        <preis>30</preis><seat>
            <allseats>120</allseats>
</ticket>

What I want is to delete the first N characters and save the result into a new file. Let's say N is 5, so it would be:

et id="usa-001" REFUND="NO" TEST="TEST">
        <airline>Us Airlines</airline>
        <emptytag id="usa-001" REFUND="NO" TEST="TEST"/>
        <preis>30</preis><seat>
            <allseats>120</allseats>
</ticket>

CodePudding user response:

Using GNU sed:

$ sed -Ez 's/^.{5}//' root.xml > 5

$ cat 5
et id="usa-001" REFUND="NO" TEST="TEST">
        <airline>Us Airlines</airline>
        <emptytag id="usa-001" REFUND="NO" TEST="TEST"/>
        <preis>30</preis><seat>
            <allseats>120</allseats>
</ticket>

If you want to remove up to 5 characters from files that have fewer than 5, use {1,5} instead of {5}.
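For instance, a quick sketch of the bounded form (this needs GNU sed, since -E and -z are extensions; -z treats the input as one NUL-delimited record, which is why the match is not applied per line):

```shell
# remove up to 5 leading characters from the whole input, not per line
printf '<ticket>' | sed -Ez 's/^.{1,5}//'    # prints: et>

# an input shorter than 5 characters is emptied rather than left alone
printf 'ab' | sed -Ez 's/^.{1,5}//'          # prints nothing
```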

CodePudding user response:

With your shown samples, please try the following awk code, written and tested in GNU awk.

For single Input_file:

awk -i inplace -v RS='^.{5}' -v ORS='' 'END{print}'  Input_file

For multiple Input_file(s) with GNU awk, use the ENDFILE rule, which, as the name suggests, runs once at the end of each Input_file:

awk -i inplace -v RS='^.{5}' -v ORS='' 'ENDFILE{print}' *

CodePudding user response:

If you really just want to filter out the first n characters of a file, the tool you want is dd which allows you to specify the number of blocks to skip. If you want a block size of 1, specify that with bs. For example, to skip the first 2 characters of the input file, use:

$ echo foobarbaz | dd bs=1 skip=2 2> /dev/null
obarbaz

You can specify an input file with if, but it's probably simpler to redirect. dd writes a bunch of diagnostics to stderr, and the stderr redirection is just there to suppress those messages. This will be slow as dirt since the block size is so small, but (if you have a dd which supports this) you can be much faster than sed with:

dd iflag=skip_bytes skip=5
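Put together, a full invocation of that faster form might look like this (root.xml and new.xml are just placeholder names; skip_bytes is a GNU dd extension):

```shell
# skip the first 5 bytes of the input, counting bytes rather than blocks,
# so dd can still use its default block size for the copy itself
dd if=root.xml of=new.xml iflag=skip_bytes skip=5 2>/dev/null
```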

CodePudding user response:

You can also use tail:

# display from 4th byte
# in other words, remove first 3 bytes
$ printf 'apple\nbanana\nfig\ncherry\n' | tail -c +4
le
banana
fig
cherry
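Applied to the question (N=5, saving into a new file), that would be a one-liner along these lines (file names assumed):

```shell
# tail -c +K starts output at byte K, so +6 skips the first 5 bytes
tail -c +6 root.xml > new.xml
```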

CodePudding user response:

With cut

n=5; cut -c$n- file.txt

It looks like you also want to save the output into numbered files, as your loop does. To write each line to its own numbered file:

n=5; cut -c$n- file.txt | awk '{print $0 > NR}'

Or, to write only the first line and stop:

n=5; cut -c$n- file.txt | awk '{print $0 > NR; exit}'
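One caveat worth noting: cut works line by line, and -cK- keeps from column K onward, so -c$n- with n=5 drops only the first 4 characters of each line. To drop exactly n characters per line you would start at column n+1 (still per line, so it does not meet the original "not line by line" requirement):

```shell
n=5
# start at column n+1, i.e. drop the first n characters of every line
printf 'abcdefgh\n12345678\n' | cut -c$((n+1))-    # prints: fgh / 678
```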

CodePudding user response:

You know, you can also use hexdump:

hexdump -s 5 -ve '/1 "%c"' inputfile > outfile

CodePudding user response:

You could do something hacky and ugly like this -

awk 'BEGIN{ left=100 } { if (left>0) { len=length($0); if (len<left) { left-=len+1; next } else { print substr($0,left+1); left=0; next } } else print $0 }' infile

Don't, please... Use Ed's sed instead.

You could use Perl -

perl -e 'seek(STDIN,100,0) && print <>' < infile # simpler
perl -e '$/=undef; open(my $fh,$ARGV[0]); seek($fh,100,0) && print <$fh>' infile # cleaner

but William's dd works on binaries without requiring any code...

dd bs=1 skip=100 < infile > outfile 

and Sundeep's is probably most on target for text files if your version understands the option -

tail -c +101 infile # start at byte 101, having skipped the first 100