How can I change the values in one column of a df without changing the ones in other columns? (with

Time:10-16

I've been trying to change the values in all the columns of my df. Several columns use the same numeric codes, but I want to translate them differently depending on the column (for example, I want the 1 in column "SEXO" to be "Varón" and the 1 in column "NIVEL_EDUCATIVO" to be "Primario incompleto"). I've tried this:

renombrada.SEXO.loc[renombrada.SEXO == 1] = "Varón"
renombrada.SEXO.loc[renombrada.SEXO == 2] = "Mujer"
renombrada.NIVEL_EDUCATIVO.loc[renombrada.NIVEL_EDUCATIVO == 1] = "Primario incompleto"
renombrada.NIVEL_EDUCATIVO.loc[renombrada.NIVEL_EDUCATIVO == 2] = "Primario completo"
renombrada.NIVEL_EDUCATIVO.loc[renombrada.NIVEL_EDUCATIVO == 3] = "Secundario incompleto"

But what happens is that it replaces all the 1s in my df with "Varón". What am I doing wrong? Thanks!

CodePudding user response:

An easier way is to use map with a dictionary to translate the values.

Consider the following sample data:

import pandas as pd
renombrada = pd.DataFrame({
  'SEXO': [1, 2, 1, 2, 1],
  'NIVEL_EDUCATIVO': [1, 1, 2, 3, 1]
})

Dataframe:

>>> renombrada
    SEXO    NIVEL_EDUCATIVO
0   1       1
1   2       1
2   1       2
3   2       3
4   1       1

Prepare map for SEXO:

ren_sexo = {
  1: "Varón",
  2: "Mujer"
}

Then use that to change the SEXO values as follows:

renombrada.SEXO = renombrada.SEXO.map(ren_sexo)

This will give us:

    SEXO    NIVEL_EDUCATIVO
0   Varón   1
1   Mujer   1
2   Varón   2
3   Mujer   3
4   Varón   1
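
The same idea extends to NIVEL_EDUCATIVO. Here is a minimal sketch, assuming one dictionary per column; the name mapas is made up for this example and the labels are the ones from the question:

```python
import pandas as pd

renombrada = pd.DataFrame({
    'SEXO': [1, 2, 1, 2, 1],
    'NIVEL_EDUCATIVO': [1, 1, 2, 3, 1],
})

# Hypothetical helper dict: one code-to-label mapping per column name
mapas = {
    'SEXO': {1: "Varón", 2: "Mujer"},
    'NIVEL_EDUCATIVO': {
        1: "Primario incompleto",
        2: "Primario completo",
        3: "Secundario incompleto",
    },
}

# Apply each mapping only to its own column
for columna, mapa in mapas.items():
    renombrada[columna] = renombrada[columna].map(mapa)

print(renombrada)
```

Keep in mind that map turns any value missing from the dictionary into NaN, so every code present in the column should have an entry.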

CodePudding user response:

@garagnoth provides a nice solution. This is what you were trying to do, but working:

from pandas import DataFrame

renombrada = DataFrame({
    'SEXO': [1, 2, 1, 2, 1, 2],
    'NIVEL_EDUCATIVO': [1, 1, 2, 2, 3, 3]
})

renombrada.loc[renombrada.SEXO == 1, 'SEXO'] = "Varón"
renombrada.loc[renombrada.SEXO == 2, 'SEXO'] = "Mujer"
renombrada.loc[renombrada.NIVEL_EDUCATIVO == 1, 'NIVEL_EDUCATIVO'] = "Primario incompleto"
renombrada.loc[renombrada.NIVEL_EDUCATIVO == 2, 'NIVEL_EDUCATIVO'] = "Primario completo"
renombrada.loc[renombrada.NIVEL_EDUCATIVO == 3, 'NIVEL_EDUCATIVO'] = "Secundario incompleto"

print(renombrada)

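You were missing the column selector: .loc takes the row condition first and the column name second, as in renombrada.loc[renombrada.SEXO == 1, 'SEXO'].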

Note the definition of renombrada in the answer - that's assuming your dataframe is structured similarly. In future questions, please include something similar as part of the code in your question, so people know what you're operating on.
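
If the repeated .loc assignments get long, DataFrame.replace with a nested dictionary may be a more compact alternative. This is only a sketch on the same sample data; the name codigos is made up for this example:

```python
import pandas as pd

renombrada = pd.DataFrame({
    'SEXO': [1, 2, 1, 2, 1, 2],
    'NIVEL_EDUCATIVO': [1, 1, 2, 2, 3, 3],
})

# Hypothetical nested dict: outer keys are column names,
# inner dicts map each code to its label for that column only
codigos = {
    'SEXO': {1: "Varón", 2: "Mujer"},
    'NIVEL_EDUCATIVO': {
        1: "Primario incompleto",
        2: "Primario completo",
        3: "Secundario incompleto",
    },
}

# replace returns a new DataFrame; the original is left untouched
resultado = renombrada.replace(codigos)

print(resultado)
```

Unlike map, replace leaves values that are not in the dictionary unchanged instead of turning them into NaN.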
