Truncate CSV Header Names

Time:12-10

I'm looking for a relatively simple method for truncating CSV header names to a given maximum length. For example, a file like:

one,two,three,four,five,six,seven
data,more data,words,,,data,the end

could have all of its header names limited to a maximum of 3 characters and become:

one,two,thr,fou,fiv,six,sev
data,more data,words,,,data,the end

Requirements:

  • Only the first row is affected
  • I don't know what the headers are going to be, so it has to dynamically read and write the values and lengths

I tried a few things with awk and sed, but am not proficient at either. The closest I found was this snippet:

csvcut -c 3 file.csv |
sed -r 's/^"|"$//g' |
awk -F';' -vOFS=';' '{ for (i=1; i<=NF; i++) $i = substr($i, 0, 2) } { printf("\"%s\"\n", $0) }' >tmp-3rd

But it focuses on column values rather than the header row, and using csvcut for this feels more complicated than necessary.

Any help is appreciated.

CodePudding user response:

With GNU sed:

sed -E '1s/([^,]{1,3})[^,]*/\1/g' file

Output:

one,two,thr,fou,fiv,six,sev
data,more data,words,,,data,the end
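
If GNU sed is not available, or the maximum length needs to be a parameter, the same idea can be sketched in awk. This is only a rough equivalent under some assumptions: the file name file.csv and the variable name max are illustrative, and the header is assumed to be plain comma-separated text with no quoted fields containing commas.

```shell
# Hypothetical sample file, matching the example in the question.
printf '%s\n' \
  'one,two,three,four,five,six,seven' \
  'data,more data,words,,,data,the end' > file.csv

# Truncate every field of the first row to at most `max` characters;
# all other rows are printed unchanged.
# Assumption: header names contain no quoted, embedded commas.
result=$(awk -F',' -v OFS=',' -v max=3 '
  NR == 1 { for (i = 1; i <= NF; i++) $i = substr($i, 1, max) }
  { print }
' file.csv)

printf '%s\n' "$result"
```

Changing max=3 to another value adjusts the limit without touching the program itself. For headers that may contain quoted commas, a CSV-aware tool would be a safer choice than either one-liner.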

See: man sed and The Stack Overflow Regular Expressions FAQ
