I have 500k lines of fixed-length data, but some lines have an enter (newline) character in the middle of the data.
E.g. each line is 27 characters long:
ABCDEFGHIJKLMNOUPQRSTUVWXTZ
ABCDEFGHIJKLM<BR>
NOUPQRSTUVWXYZ
ABCDEFGHIJKLMNOUPQRSTUVWXTZ
Line 2 contains an enter character. I want to remove the enter character from line 2 and join it with the line below it, e.g.:
ABCDEFGHIJKLMNOUPQRSTUVWXTZ
ABCDEFGHIJKLMNOUPQRSTUVWXYZ
ABCDEFGHIJKLMNOUPQRSTUVWXTZ
I tried awk and sed, but the result is not correct.
CodePudding user response:
If you have Perl on your system, you can simply do this:
$ perl -pe 's/<BR>\n//' your_file_name
It is a one-liner. You simply run it at your command line.
Or with awk:
awk '{ORS = sub(/<BR>/,"") ? "" : "\n"; print $0}' your_file_name
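To try the awk version before running it on the full file, here is a small sketch on the four sample lines from the question. It assumes that <BR> is literal text at the end of the broken line (if it only stands for the line break itself, nothing will match). The temp file and the variable name joined are made up for the demo. The regex is anchored with $ so that only a marker at the end of a line is removed, which is a small tightening of the one-liner above.

```shell
# Build a throwaway sample file (hypothetical name chosen by mktemp).
sample=$(mktemp)
printf '%s\n' \
  'ABCDEFGHIJKLMNOUPQRSTUVWXTZ' \
  'ABCDEFGHIJKLM<BR>' \
  'NOUPQRSTUVWXYZ' \
  'ABCDEFGHIJKLMNOUPQRSTUVWXTZ' > "$sample"

# sub() returns 1 when it removed a trailing <BR>; in that case the output
# record separator is set to "" so the next line is glued onto this one.
joined=$(awk '{ ORS = sub(/<BR>$/, "") ? "" : "\n"; print }' "$sample")

printf '%s\n' "$joined"
rm -f "$sample"
```

On the sample this should print three lines, with the second and third input lines joined into one.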
CodePudding user response:
This might work for you (GNU sed):
sed 'N;s/<BR>\n//;P;D' file
or:
sed -z 's/<BR>\n//g' file
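As a rough comparison of the two sed forms: the first keeps a two-line window (N appends the next line, P prints up to the first newline, D deletes up to the first newline and restarts), while -z makes GNU sed treat the whole file as one record, since the file contains no NUL bytes. The sketch below runs both on the sample data and checks that they agree. It assumes GNU sed and that <BR> is literal text in the file. The temp file and the variable names windowed and slurped are made up for the demo.

```shell
# Throwaway sample file with one broken record (hypothetical temp file).
sample=$(mktemp)
printf '%s\n' \
  'ABCDEFGHIJKLMNOUPQRSTUVWXTZ' \
  'ABCDEFGHIJKLM<BR>' \
  'NOUPQRSTUVWXYZ' \
  'ABCDEFGHIJKLMNOUPQRSTUVWXTZ' > "$sample"

# Two-line sliding window: only ever holds a couple of lines in memory.
windowed=$(sed 'N;s/<BR>\n//;P;D' "$sample")

# Whole-file form: the entire input is a single pattern space (GNU sed only).
slurped=$(sed -z 's/<BR>\n//g' "$sample")

if [ "$windowed" = "$slurped" ]; then
  echo "both sed forms agree"
fi
printf '%s\n' "$windowed"
rm -f "$sample"
```

For a file of this size either form should be fine; the -z form holds the whole file in memory, which is the main trade-off.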
CodePudding user response:
One, slightly off-the-wall, way of doing this is to:
- remove all existing linefeeds
- insert new linefeeds every 27 characters
That looks like this:
tr -d '\n' < YOURFILE | fold -w 27
ABCDEFGHIJKLMNOUPQRSTUVWXTZ
ABCDEFGHIJKLMNOUPQRSTUVWXYZ
ABCDEFGHIJKLMNOUPQRSTUVWXTZ
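A small sketch of the same idea with a sanity check added. It assumes the file contains only the stray line breaks (no literal <BR> text and no carriage returns) and that every record is exactly 27 characters, as in the sample. The temp files and the variable names width, rewrapped and bad are made up for the demo.

```shell
# Throwaway sample: the second record is split by a real newline.
sample=$(mktemp)
out=$(mktemp)
printf '%s\n' \
  'ABCDEFGHIJKLMNOUPQRSTUVWXTZ' \
  'ABCDEFGHIJKLM' \
  'NOUPQRSTUVWXYZ' \
  'ABCDEFGHIJKLMNOUPQRSTUVWXTZ' > "$sample"

width=27

# Remove every newline, then re-wrap at the fixed record width.
# Command substitution drops any trailing newline; printf then adds exactly one.
rewrapped=$(tr -d '\n' < "$sample" | fold -w "$width")
printf '%s\n' "$rewrapped" > "$out"

# Count output lines whose length is not the expected width.
bad=$(awk -v w="$width" 'length($0) != w { n++ } END { print n + 0 }' "$out")

cat "$out"
echo "lines with unexpected length: $bad"
rm -f "$sample" "$out"
```

If the count is not zero, the total number of characters is not a multiple of the width, which usually means some record is not really 27 characters long and the re-wrapped lines after that point are shifted.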