Sort and display unique IPs in a file containing lots of different IPs


I have a file containing many lines in the following format:

10.192.18.24   nfs3     tpq      export_policy_tpq 1m 52s 25230      0
10.192.18.25   nfs3     tpq      export_policy_tpq 5s 18222      0
10.192.18.26   nfs3     tpq      export_policy_tpq 5m 29s 116      0
10.192.18.41   nfs3     tpq      export_policy_tpq 4m 43s 9473      0
10.192.82.41   nfs3     tpq      export_policy_tpq 2m 22s 12183      0
10.192.82.42   nfs3     tpq      export_policy_tpq 2m 46s 33085      0
10.192.82.48   nfs3     tpq      export_policy_tpq 9m 20s 7213      0
10.192.96.21   nfs3     tpq      export_policy_tpq 8m 27s 49290      0
10.192.96.22   nfs3     tpq      export_policy_tpq 5m 13s 15502      0
10.192.98.15   nfs3     tpq      export_policy_tpq 10s 460387      0
128.59.30.7    nfs3     tpq      export_policy_tpq 6m 28s 10168      0
128.59.30.8    nfs3     tpq      export_policy_tpq 3m 44s 36638      0
128.59.30.9    nfs3     tpq      export_policy_tpq 3m 24s 27983      0
128.59.30.11   nfs3     tpq      export_policy_tpq 3m 6s 29637      0

The first column is an IP address, and IPs can be duplicated. The other columns don't need to be sorted. If the first column were just a number I could use "sort -u -k1,1". However, in this case an IP consists of four numbers. Can you please help sort the lines in IP order and remove duplicates, so that only lines with unique IPs are listed?

Thank you in advance!

CodePudding user response:

Let's say your file containing the data is called data.txt; you can do:

awk '{print $1}' data.txt | sort | uniq
  • awk: keep only the first column, the IP addresses
  • sort: sort the IPs
  • uniq: remove duplicates

If you need to know how many times each IP appears in the file, you can add the -c option to uniq.
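
For example, to see each IP together with the number of times it occurs (reusing the same pipeline on data.txt):

awk '{print $1}' data.txt | sort | uniq -c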

CodePudding user response:

This should work, sorting each octet of the IP individually and in numeric order:

awk '{print $1}' file.txt | sort -u -t . -k1,1n -k2,2n -k3,3n -k4,4n
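
If you need to keep the whole lines rather than just the IPs, the same key options can be applied to the file directly; when keys are given, sort -u compares only those keys, so lines sharing an IP collapse to a single line. A minimal sketch, assuming file.txt is the same placeholder name as above and the IP is the only dotted field:

sort -u -t . -k1,1n -k2,2n -k3,3n -k4,4n file.txt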

CodePudding user response:

Assuming you want to sort based on IP addresses while removing duplicate lines based only on the IP address, the code below, which sorts the file and then traverses the sorted output to drop duplicates, should work:

#!/bin/bash
originalFile=/path/to/original/file
outputFile=/path/to/intermediate/file
cleanFile=/path/to/final/file

# Sort the whole file so that lines with the same IP become adjacent
sort "$originalFile" > "$outputFile"

lastIP=""
while read -r line; do
  # Split the line into whitespace-separated words; words[0] is the IP
  read -r -a words <<< "$line"
  # Only keep the line if its IP differs from the previous line's IP
  if [ "${words[0]}" != "$lastIP" ]
  then
    printf "%s\n" "$line" >> "$cleanFile"
  fi
  lastIP="${words[0]}"
done < "$outputFile"
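
The same sort-then-scan idea can also be written as a single pipeline: after sorting, an awk filter that keeps only the first line seen for each value of the first column behaves like the while loop above. A minimal sketch using the same placeholder paths:

sort /path/to/original/file | awk '!seen[$1]++' > /path/to/final/file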

  • Tags: bash