Home > Blockchain >  Iterate over a csv and change the values of a column that meets a condition
Iterate over a csv and change the values of a column that meets a condition

Time:05-13

I have to use bash to iterate over a CSV file and replace the values of a column that meets a condition. Finally, the results have to be stored in an output file.

I have written this code, which reads the file and stores the content in an array. On iterating over the file, if the value at column 13 is equal to "NULL" then the value of this record has to be replaced by "0". Once the file is reviewed the output with the replaced values is stored at file_b.

#!/bin/bash
file="./2022_Accidentalidad.csv"
while IFS=; read -ra array
do
    if [[ ${array[13]} == "NULL" ]]; then
    echo "${array[13]}" | sed -n 's/NULL/0/g' 
    fi
done < $file > file_b.csv

The problem is that file_b is empty. Nothing is written there. How could I do this?

I cannot use AWK, and have to use or a FOR or a WHILE command to iterate over the file.

Sample input:

num_expediente;fecha;hora;localizacion;numero;cod_distrito;distrito;tipo_accidente;estado_meteorológico;tipo_vehiculo;tipo_persona;rango_edad;sexo;cod_lesividad;lesividad;coordenada_x_utm;coordenada_y_utm;positiva_alcohol;positiva_droga
2022S000001;01/01/2022;1:30:00;AVDA. ALBUFERA, 19;19;13;PUENTE DE VALLECAS;Alcance;Despejado;Turismo;Conductor;De 18 a 30 años;Mujer;NULL;NULL;443359,226;4472082,272;N;NULL

Expected output

num_expediente;fecha;hora;localizacion;numero;cod_distrito;distrito;tipo_accidente;estado_meteorológico;tipo_vehiculo;tipo_persona;rango_edad;sexo;cod_lesividad;lesividad;coordenada_x_utm;coordenada_y_utm;positiva_alcohol;positiva_droga
2022S000001;01/01/2022;1:30:00;AVDA. ALBUFERA, 19;19;13;PUENTE DE VALLECAS;Alcance;Despejado;Turismo;Conductor;De 18 a 30 años;Mujer;0;NULL;443359,226;4472082,272;N;NULL

Thanks a lot in advance.

CodePudding user response:

You don't need sed. Just replace $array[13] with 0. Then print the entire array with ; separators between the fields.

(             # in a subshell
    IFS=';'   # set IFS, that affects `read` and `"${array[*]}"`
    while read -ra array
    do
        if [[ ${array[13]} == "NULL" ]]; then
            array[13]=0
        fi
        echo "${array[*]}"
    done
) < $file > file_b.csv

echo uses the first character of $IFS as the output field separator.

CodePudding user response:

When awk is also possible:

awk 'BEGIN{FS=OFS=";"} NR==2 && $14=="NULL"{$14=0} {print}' "$file" > file_b.csv

See: 8 Powerful Awk Built-in Variables – FS, OFS, RS, ORS, NR, NF, FILENAME, FNR

CodePudding user response:

One idea using a regex and the BASH_REMATCH array:

regex='(([^;]*;){13})(NULL)(;.*)'

while read -r line
do
    [[ "${line}" =~ $regex ]] && 
    line="${BASH_REMATCH[1]}0${BASH_REMATCH[4]}"

    # uncomment following line to display contents of BASH_REMATCH[] array
    # declare -p BASH_REMATCH

    echo "${line}"

done < file.csv > file_b.csv

This generates:

$ cat file_b.csv
num_expediente;fecha;hora;localizacion;numero;cod_distrito;distrito;tipo_accidente;estado_meteorológico;tipo_vehiculo;tipo_persona;rango_edad;sexo;cod_lesividad;lesividad;coordenada_x_utm;coordenada_y_utm;positiva_alcohol;positiva_droga
2022S000001;01/01/2022;1:30:00;AVDA. ALBUFERA, 19;19;13;PUENTE DE VALLECAS;Alcance;Despejado;Turismo;Conductor;De 18 a 30 años;Mujer;0;NULL;443359,226;4472082,272;N;NULL
  • Related