I am having trouble understanding a line of code that is used for one-hot encoding with pandas in Python:
dummies = pd.get_dummies(dataframe[each], prefix=each, drop_first=False)
I am completely new to this, so I can't figure out what the snippet does. I looked in the pandas documentation but didn't find a specific answer. If you understand this line, please let me know. Thanks in advance.
Here is the one-hot encoding function it comes from:
def one_hot(dataframe, col):
for each in col:
dummies = pd.get_dummies(dataframe[each], prefix=each, drop_first=False)
dataframe = pd.concat([dataframe, dummies], axis=1)
dataframe = dataframe.drop(each, 1)
return dataframe
CodePudding user response:
I've added comments to explain the code:
import pandas as pd

def one_hot(dataframe, col):
    # loop over the column names in the variable col
    # in other words, run the code below once for each column
    for each in col:
        # build the indicator (dummy) columns for this column;
        # prefix=each names them "<column>_<value>", drop_first=False keeps every category
        dummies = pd.get_dummies(dataframe[each], prefix=each, drop_first=False)
        # append the dummies DataFrame to the original DataFrame column-wise
        dataframe = pd.concat([dataframe, dummies], axis=1)
        # remove the original column (axis=1 means "drop a column")
        dataframe = dataframe.drop(each, axis=1)
    return dataframe
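To see what that one line produces on its own, here is a small sketch. The sample data and the column name "color" are made up for illustration and are not from the question.

```python
import pandas as pd

# Hypothetical sample data; "color" is an assumed column name.
dataframe = pd.DataFrame({"color": ["red", "green", "red"]})
each = "color"

# One indicator column per distinct value, named "<prefix>_<value>".
dummies = pd.get_dummies(dataframe[each], prefix=each, drop_first=False)
print(list(dummies.columns))  # ['color_green', 'color_red']

# drop_first=True omits the first category ("green" here, by sort order).
reduced = pd.get_dummies(dataframe[each], prefix=each, drop_first=True)
print(list(reduced.columns))  # ['color_red']
```

So prefix only controls the names of the new columns, and drop_first decides whether one category is left out to avoid a redundant column.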
CodePudding user response:
You are looping over the column names. On each iteration, "each" holds one column name, and that column is one-hot encoded.
Also, you are getting a NameError because your function is not indented correctly:
import pandas as pd

def one_hot(dataframe, col):
    for each in col:
        dummies = pd.get_dummies(dataframe[each], prefix=each, drop_first=False)
        dataframe = pd.concat([dataframe, dummies], axis=1)
        dataframe = dataframe.drop(each, axis=1)
    return dataframe
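As a quick check, the function can be called like this. The DataFrame and the column names "city" and "price" are hypothetical, chosen only for the example. It also compares the result with passing columns= to pd.get_dummies directly, which should give the same layout for this simple case.

```python
import pandas as pd

def one_hot(dataframe, col):
    for each in col:
        dummies = pd.get_dummies(dataframe[each], prefix=each, drop_first=False)
        dataframe = pd.concat([dataframe, dummies], axis=1)
        dataframe = dataframe.drop(each, axis=1)
    return dataframe

# Hypothetical sample data, not from the question.
df = pd.DataFrame({"city": ["Oslo", "Bergen", "Oslo"], "price": [10, 20, 15]})

encoded = one_hot(df, ["city"])
print(list(encoded.columns))  # ['price', 'city_Bergen', 'city_Oslo']

# pandas can encode selected columns of a DataFrame in one call.
builtin = pd.get_dummies(df, columns=["city"])
print(list(builtin.columns))  # ['price', 'city_Bergen', 'city_Oslo']
```

Note that the original df is not modified; the function returns a new DataFrame, so you need to assign the result.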