How to split a column into two in Bash shell

I have a huge file with many columns. I want to count the number of occurrences of each value in one column, so I use cut -f 2 "file" | sort | uniq -c . I get the result I want. However, when I read the output into R, it shows that I have only one column, with data like this example:

123 Chelsea
65 Liverpool
77 Manchester city
2 Brentford

What I want is two columns, one for the counts and the other for the names, but I only get one. Can anyone help me split the column into two, or suggest a better method to extract this from the big file?

Thanks in advance !!!!

CodePudding user response：

If you simply want to count the unique values in each column, your best bet is the cut command with a custom delimiter, which in this case is a single space.

Keep in mind that a line can contain further spaces after the first one, e.g. in Manchester city.

So, in order to count the unique occurrences of the first column:

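cat <your_file> | cut -d ' ' -f1 | sort | uniq | wc -l

where -d sets the delimiter to a single space ' ' and -f1 gives you the first column; sort puts identical values next to each other so that uniq, which only merges adjacent duplicate lines, keeps one line per distinct value; and wc -l counts those lines.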
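
One caveat worth checking by hand: uniq only merges duplicate lines that are adjacent, so unsorted input generally needs a sort first. The sketch below illustrates this on a few made-up lines; the sample data and the helper name count_distinct_first are assumptions for illustration, not part of the original answer.

```shell
# Hypothetical helper: count distinct values in the first
# space-separated field of whatever arrives on stdin.
count_distinct_first() {
    # tr strips the padding some wc implementations add around the number
    cut -d ' ' -f1 | sort | uniq | wc -l | tr -d ' '
}

# Made-up sample: "123" appears twice, but not on adjacent lines.
sample='123 Chelsea
65 Liverpool
123 Brentford'

# With sort, the two "123" lines become adjacent and are merged: prints 2
printf '%s\n' "$sample" | count_distinct_first

# Without sort, uniq sees no adjacent duplicates: prints 3
printf '%s\n' "$sample" | cut -d ' ' -f1 | uniq | wc -l | tr -d ' '
```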

Similarly, to count the unique occurrences of the second column:

cat <your_file> | cut -d ' ' -f2- | sort | uniq | wc -l

where everything is the same except for -f2-, which selects from the second column through the last, so a name such as Manchester city stays in one piece (see the -f option in the cut man page: N- means from field N to the end of the line).
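
To see the difference between -f2 and -f2- on a name that contains a space, here is a minimal sketch using one line taken from the question's example output (the variable name line is just for illustration):

```shell
line='77 Manchester city'

# Only the second space-separated field: prints "Manchester"
printf '%s\n' "$line" | cut -d ' ' -f2

# Second field through the end of the line: prints "Manchester city"
printf '%s\n' "$line" | cut -d ' ' -f2-
```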

CodePudding user response：

Not a beautiful solution, but try this. Pipe the output of the previous command into this while loop:

"your program" | while read count city
 do
 printf '%s\t%s\n' "$count" "$city"
 done
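
As an alternative to the loop, the same conversion can be sketched with awk, which avoids a shell loop on large outputs. This assumes the input is the output of uniq -c (optional leading spaces, a count, one or more spaces, then the name); the function name counts_to_tsv and the sample data are made up for illustration.

```shell
# Hypothetical helper: turn "uniq -c" output into "count<TAB>name" lines.
counts_to_tsv() {
    awk '{
        count = $1                  # first whitespace-separated field is the count
        sub(/^ *[0-9]+ +/, "")      # drop the count and its padding, keep the full name
        printf "%s\t%s\n", count, $0
    }'
}

# Made-up tab-separated input whose second column holds the names.
printf 'a\tChelsea\nb\tManchester city\nc\tChelsea\n' |
    cut -f 2 | sort | uniq -c | counts_to_tsv
```

Because the two output columns are separated by a tab, names that contain spaces stay in a single column. The result could then be redirected to a file and read in R with something like read.delim("counts.tsv", header = FALSE), where the file name is only an example.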