Python states Key Error when creating a new variable in a data frame

Time:02-08

I am cleaning several variables (columns in a data frame) so that I can perform text analysis on them.

I have a data frame called econ_data.

Here I create a 'list' of all the variables that need to be transformed, for example transforming all text to lower case and removing stop words.

open_responses = ['choice_open_1_f', 'choice_open_1_m', 'choice_open_2_f', 'choice_open_2_m']

Then I want to create a for loop that cleans up these variables so that I can perform text analysis.

for z in open_responses:
    econ_data[z] = econ_data[z].astype(str).str.replace('/',' ')
    econ_data[z] = econ_data[z].apply(lambda x: " ".join(x.lower() for x in x.split()))
    locals()[econ_data[f"{z}_stop"]] = econ_data[f"{z}"].apply(lambda x: " ".join(x for x in x.split() if x not in stop_words))
    

The first two lines in the for loop work. However, when I try to add a new variable to the data frame that holds each entry with the stop words removed, I receive a KeyError ("KeyError: 'choice_open_1_f_stop'").


Please can someone explain how I can solve this issue?

Many thanks!

CodePudding user response:

You get the error because locals()[econ_data[f"{z}_stop"]] first has to read econ_data[f"{z}_stop"] in order to use its value as a key, and that column does not exist yet. Use a simple assignment instead: econ_data[f"{z}_stop"] = ... The data frame handles this and creates the column if you assign to a key that does not exist.
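As a minimal sketch of that fix, the loop could look like the following. The sample rows and the stop_words set here are made up for illustration (the question does not show them), and only two of the columns are included to keep it short.

```python
import pandas as pd

# Hypothetical sample data and stop-word set, for illustration only.
econ_data = pd.DataFrame({
    "choice_open_1_f": ["I like THE price/quality", "It is a good deal"],
    "choice_open_1_m": ["The service/support was fast", "Not for me"],
})
stop_words = {"i", "the", "it", "is", "a", "was", "for"}
open_responses = ["choice_open_1_f", "choice_open_1_m"]

for z in open_responses:
    # Replace slashes with spaces and lower-case the text in place.
    econ_data[z] = (
        econ_data[z].astype(str).str.replace("/", " ", regex=False).str.lower()
    )
    # Plain assignment: pandas creates the "<name>_stop" column on the fly.
    econ_data[f"{z}_stop"] = econ_data[z].apply(
        lambda text: " ".join(w for w in text.split() if w not in stop_words)
    )

print(econ_data[["choice_open_1_f_stop", "choice_open_1_m_stop"]])
```

Reading econ_data["some_new_name"] raises KeyError when the column is missing, while assigning to it creates the column, so no locals() call is needed.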
