Given the following output, derived from df -P | awk '!/udev|boot|tmpfs|none/ && NR>1 {printf ("%-10s\t%-10s\t%-10s\n", $1, $2, $6)}' | grep -wv /:
/dev/sda2 576075280 /hdd
/dev/sda1 1344681704 /home
/dev/vda2 2468687261 /media/user/backup
/dev/vda1 823581356 /media/user/movie
/dev/sdb2 676075280 /media/user/db2
/dev/sdb1 1691481049 /media/user/db1
I want to select the row with the largest storage from each partition; the desired output would be:
/dev/sda1 1344681704 /home
/dev/vda2 2468687261 /media/user/backup
/dev/sdb1 1691481049 /media/user/db1
CodePudding user response:
Solution
cat input.txt |
awk '{print substr($1, 1, match($1, "[[:digit:]]") - 1), $0}' |
sort -k1,1 -k3,3nr |
awk 'id!=$1{ print; id = $1}' | cut -d ' ' -f2-
Input
λ cat input.txt
/dev/sda2 576075280 /hdd
/dev/sda1 1344681704 /home
/dev/vda2 2468687261 /media/user/backup
/dev/vda1 823581356 /media/user/movie
/dev/sdb2 676075280 /media/user/db2
/dev/sdb1 1691481049 /media/user/db1
Output
/dev/sda1 1344681704 /home
/dev/sdb1 1691481049 /media/user/db1
/dev/vda2 2468687261 /media/user/backup
Explanation
Here we use a technique called the Schwartzian transform (decorate, sort, undecorate).
Your question is ambiguous because it does not say when two partitions should be treated as belonging to the same group. Here I use the command
awk '{print substr($1, 1, match($1, "[[:digit:]]") - 1), $0}'
but you can change it to suit your needs.
λ cat input.txt | awk '{print substr($1, 1, match($1, "[[:digit:]]") - 1), $0}'
/dev/sda /dev/sda2 576075280 /hdd
/dev/sda /dev/sda1 1344681704 /home
/dev/vda /dev/vda2 2468687261 /media/user/backup
/dev/vda /dev/vda1 823581356 /media/user/movie
/dev/sdb /dev/sdb2 676075280 /media/user/db2
/dev/sdb /dev/sdb1 1691481049 /media/user/db1
After adding an extra field as a partition identifier, we can easily solve your problem with a combination of sort, awk and cut.
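As a sketch of how the key could be adapted, the variant below wraps the same decorate, sort, undecorate pipeline in a function and derives the disk name by stripping the trailing partition number instead of cutting at the first digit. The function name largest_per_disk and the naming rules in the comments (sda2 style versus nvme0n1p2 style) are assumptions for illustration, not part of the question. It also assumes whitespace-separated columns and mount points without spaces.

```shell
# Hypothetical helper: reads "device size mountpoint" lines on stdin and
# prints the largest partition of each disk.
largest_per_disk() {
    awk '{
        disk = $1
        # Assumed naming: nvme0n1p2 / mmcblk0p1 style ends in "<digit>p<digits>",
        # sda2 / vda1 style ends in plain digits.
        if (disk ~ /[0-9]p[0-9]+$/) sub(/p[0-9]+$/, "", disk)
        else                        sub(/[0-9]+$/, "", disk)
        print disk, $0    # decorate: prepend the disk name as the sort key
    }' |
    sort -k1,1 -k3,3nr |                  # group by disk, largest size first
    awk '$1 != id { print; id = $1 }' |   # keep the first row of each group
    cut -d ' ' -f2-                       # undecorate: drop the key again
}
```

It could be used as largest_per_disk < input.txt. A whole-disk device whose name itself ends in a digit would be mis-keyed by this rule, so the pattern should be checked against the device names actually present.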
CodePudding user response:
On Linux you can just use lsblk instead of df to find the biggest partition of each disk:
lsblk -nPpbo KNAME,SIZE,PKNAME,MOUNTPOINT |
awk -F'="|" ?' -v OFS='\t' '
{
kname = $2 # device name, for ex. /dev/sda1
size = $4 # size of the device, in Bytes
pkname = $6 # parent device name, for ex. /dev/sda
mountpoint = $8 # where the device is mounted, absolute path
}
pkname !~ "^/" { next }
mountpoint !~ "^/" { next }
mountpoint == "/" { next } # not sure why you want to exclude /
size > sizes[pkname] {
knames[pkname] = kname
sizes[pkname] = size
mountpoints[pkname] = mountpoint
}
END {
for (pkname in knames)
print knames[pkname], sizes[pkname], mountpoints[pkname]
}
'
Remark: the size is displayed in bytes instead of 512-byte or 1024-byte blocks, and potentially problematic characters in the fields (mostly in the mount point) are escaped with a two-digit hexadecimal notation, \xHH. IMHO both are good points, because you'll be able to read and unescape the resulting TSV accurately with bash.
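To illustrate the unescaping step, here is a minimal sketch that assumes bash, whose printf builtin expands \xHH sequences when given the %b format. The function name print_devices and the wording of the output line are made up for this example.

```shell
# Hypothetical helper (bash): read "kname<TAB>size<TAB>mountpoint" lines on
# stdin and print them with the mount point unescaped.
print_devices() {
    local kname size mountpoint
    # read -r keeps the backslashes of the \xHH escapes intact.
    while IFS=$'\t' read -r kname size mountpoint; do
        # %b expands backslash escapes such as \x20 into the real character.
        printf -v mountpoint '%b' "$mountpoint"
        printf '%s is %s bytes, mounted on %s\n' "$kname" "$size" "$mountpoint"
    done
}
```

The output of the lsblk and awk pipeline above could be piped straight into print_devices.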
Here are the relevant options from the lsblk manual and help:
-b, --bytes
Print the SIZE column in bytes rather than in a human-readable format.
-n, --noheadings
Do not print a header line.
-P, --pairs
Produce output in the form of key="value" pairs.
All potentially unsafe characters are hex-escaped (\x<code>).
-p, --paths
Print full device paths.
-o, --output list
Specify which output columns to print. [...]
KNAME internal kernel device name
MOUNTPOINT where the device is mounted
SIZE size of the device
PKNAME internal parent kernel device name
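For orientation, the sketch below shows how the field separator used in the awk script splits one line in the key="value" layout. The sample line is invented for illustration and is not captured lsblk output. The function name split_pairs is likewise hypothetical.

```shell
# Hypothetical helper: split one line of key="value" pairs into the four
# values, using the same field separator as the awk script above.
split_pairs() {
    # With this separator the keys land in the odd fields ($1, $3, ...)
    # and the values in the even fields ($2, $4, ...).
    awk -F'="|" ?' -v OFS='\t' '{ print $2, $4, $6, $8 }'
}

# Made-up sample line in the KNAME, SIZE, PKNAME, MOUNTPOINT order.
sample='KNAME="/dev/sda1" SIZE="1376954064896" PKNAME="/dev/sda" MOUNTPOINT="/home"'
printf '%s\n' "$sample" | split_pairs
```

An empty value such as MOUNTPOINT="" simply yields an empty field, which is why the script can skip unmounted devices with a plain pattern test.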