Delete every 2nd word of 3rd column of a text file

I am looking for a way to delete every 2nd word in the 3rd column of a text file, that is, the 3rd-column word on lines 1, 3, 5 and so on:

4444    pm  7654    army    3687    anywhere    5650    infection
7332    thesis  0638    nasa    3976    condition   0738    los
3549    partners    7584    fee 3930    move    6535    friends
5693    matter  8801    visits  5350    grid    8917    honest
4039    facing  5453    cp  6101    bedrooms    5268    ford

Expected output:

4444    pm      army    3687    anywhere    5650    infection
7332    thesis  0638    nasa    3976    condition   0738    los
3549    partners    fee 3930    move    6535    friends
5693    matter  8801    visits  5350    grid    8917    honest
4039    facing  cp  6101    bedrooms    5268    ford

I am aware of two commands.

awk '{print $3}' input.txt
sed '1~2d' input.txt

But I am not sure how to combine them.

Looking forward to any sort of help/suggestions

CodePudding user response：

This might work for you (GNU sed):

sed 's/\S\+//3;n' file

Delete the 3rd word (the 3rd run of non-whitespace) on the current line, print the result and fetch the next line, which is printed unchanged; repeat.
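
To see what the `3` flag and `n` do, here is a minimal sketch on made-up data. The three sample lines and the variable name `result` are assumptions for illustration, and `\S` and `\+` need GNU sed.

```shell
# Made-up three-line sample, not the asker's file.
result=$(printf '%s\n' 'a1 b1 c1 d1' 'a2 b2 c2 d2' 'a3 b3 c3 d3' |
  sed 's/\S\+//3;n')
# Odd lines: the 3rd run of non-whitespace is removed, the spaces around it stay.
# `n` prints that line and loads the next one, which reaches the end of the
# script unchanged and is printed as-is.
printf '%s\n' "$result"
```

Only the word itself is removed, not the whitespace around it, so a wider gap remains where the third column used to be.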

CodePudding user response：

With GNU awk you can select every second row (starting at the first) and use a pattern that captures the first two words in group 1, matches the whitespace and the third word that follow, and captures the rest of the line in group 2.

Then you can print the values of the 2 capture groups.

awk 'NR%2==1 && match($0, /^(\S+\s+\S+)\s+\S+(.*)/, a) {
    print a[1], a[2]
    next
}1' file

Output

4444    pm      army    3687    anywhere        5650    infection
7332    thesis  0638    nasa    3976    condition       0738    los
3549    partners        fee     3930    move    6535    friends
5693    matter  8801    visits  5350    grid    8917    honest
4039    facing  cp      6101    bedrooms        5268    ford

CodePudding user response：

Assuming your file is indeed called input.txt:

sed -r '1~2s/^(\w+\W+\w+\W+)\w+\W+(.*)/\1\2/' input.txt
4444    pm  army    3687    anywhere    5650    infection
7332    thesis  0638    nasa    3976    condition   0738    los
3549    partners    fee 3930    move    6535    friends
5693    matter  8801    visits  5350    grid    8917    honest
4039    facing  cp  6101    bedrooms    5268    ford

The address 1~2 (which, btw, is GNU sed specific) means "start at line 1, step 2", so the command runs on every odd-numbered line.
The substitution s/// captures the first two word/whitespace pairs in \1, matches the third pair, and captures everything after it in \2; the replacement \1\2 then rebuilds the line without the third column.
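
If GNU sed is not available, the same idea can probably be expressed without `1~2`, `\w` or `\W`. The following is only a sketch under assumptions: it uses `-E` (accepted by GNU and BSD sed), POSIX bracket classes, made-up sample data, and a hypothetical variable name `portable`.

```shell
# Made-up three-line sample, not the asker's file.
portable=$(printf '%s\n' 'a1 b1 c1 d1' 'a2 b2 c2 d2' 'a3 b3 c3 d3' |
  sed -E 's/^([^[:space:]]+[[:space:]]+[^[:space:]]+[[:space:]]+)[^[:space:]]+[[:space:]]+/\1/;$!n')
# \1 keeps the first two word/whitespace pairs; the third pair is dropped.
# `$!n` prints the edited line and loads the next one (skipped on the last
# line), so even-numbered lines pass through without being edited.
printf '%s\n' "$portable"
```

Here `n` takes over the job of the `1~2` address: every second line is consumed before the substitution can run on it.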

CodePudding user response：

How about:

 awk '{if (NR % 2 == 1){$3="";}print}' input.txt

NR => Number of the current record (row), starting at 1.
So (NR % 2 == 1) is true for every second row, starting at the first row.

$3="" => Empty the 3rd field (awk then rebuilds the line with single spaces between fields).

print => Print the line
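
One side effect worth knowing: assigning to a field makes awk rebuild the whole line using the output field separator (a single space by default), so the original spacing is lost on the modified rows only. A small sketch on made-up data (the sample lines and the variable name `out` are assumptions for illustration):

```shell
# Made-up sample with three spaces between words, not the asker's file.
out=$(printf '%s\n' 'a1   b1   c1   d1' 'a2   b2   c2   d2' |
  awk 'NR % 2 == 1 { $3 = "" } { print }')
# Row 1 is rebuilt with single spaces around the now-empty 3rd field.
# Row 2 is never modified, so it keeps its original spacing.
printf '%s\n' "$out"
```

If the exact column spacing matters, one of the regex-based answers may be a better fit.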