I'm trying to rearrange the columns of a CSV file using awk.
Whenever a cell contains commas, e.g. (Ok, this is an example, thanks), awk splits that cell into extra columns, one per comma, and the CSV format gets messed up.
How can I make awk ignore the delimiter inside a cell while rearranging the columns?
The command I'm using is:
awk 'BEGIN {FS=","; OFS=","} {print $4, $2, $3, $1}' ols.csv > rearranged.csv
CodePudding user response:
Using csvkit:
$ cat file.csv
first,"the second","third, this is","fourth column"
$ csvcut -c 4,2,3,1 file.csv
fourth column,the second,"third, this is",first
Or Miller:
$ mlr -N --icsv --ocsv cut -o -f 4,2,3,1 file.csv
fourth column,the second,"third, this is",first
Or Ruby:
$ ruby -rcsv -e 'CSV.foreach(ARGV.shift) do |row|
  puts CSV.generate_line([row[3], row[1], row[2], row[0]])
end' file.csv
fourth column,the second,"third, this is",first
CodePudding user response:
If you have quoted CSV, you might use GNU AWK with FPAT, as suggested in the GNU AWK manual: FPAT="([^,]*)|(\"[^\"]+\")". A simple example: let the content of file.csv be
"uno","dos,dos","tres,tres,tres"
then
awk 'BEGIN{FPAT="([^,]*)|(\"[^\"]+\")";OFS=","}{print $1,$3,$2}' file.csv
outputs
"uno","tres,tres,tres","dos,dos"
Explanation: the field pattern (FPAT) tells GNU AWK that a column is either zero or more (*) repetitions of not-commas ([^,]), or (|) a " followed by one or more (+) not-" characters ([^\"]) followed by a ". Note that each " must be escaped, as otherwise it would be interpreted as terminating the string.
(tested in gawk 4.2.1)