I'm trying to write a function that removes unwanted columns from a dataframe based on values from a list, and then splits the remaining columns into two dataframes by moving one column into a separate dataframe.
Unwanted columns are removed in the first for loop: if the Tkinter variable variabletype[i] is equal to 1, the column with index i is removed from the table. As columns are dropped, the index of each following column decreases by 1, so to avoid skipping any columns I added the count variable, which tracks the current position. If a column is not dropped during an iteration, the i-th element of variabletype is appended to a local list, usedvartypes, which is used in the second for loop.
The first loop works fine, but the second one keeps giving me the same error. It is supposed to iterate over the remaining columns using the length of usedvartypes, and if the i-th element of usedvartypes is equal to 0, copy the i-th column into a new dataframe and remove it from the previous one. However, whenever I run this, I get a KeyError for index i. I don't understand why. Am I accessing the pandas dataframe the wrong way?
def createFinalDataframe():
    global data
    global finaldata_x
    global finaldata_y
    global variabletype  # each value represents a single column in the dataframe; equal to 0 (y), 1 (unwanted) or 2 (x)
    finaldata_x = data
    count = 0
    usedvartypes = []
    for i in range(len(variabletype)):
        if (variabletype[i].get() == 1):
            finaldata_x = finaldata_x.drop(finaldata_x.columns[count], axis=1)
            count = count - 1
        else:
            usedvartypes.append(variabletype[i].get())
        count = count + 1
    for i in range(len(usedvartypes)):
        if (usedvartypes[i] == 0):
            finaldata_y = []
            print(finaldata_x[i])
            finaldata_y = finaldata_x[i].copy()
            finaldata_x = finaldata_x.drop(finaldata_x.columns[i], axis=1)
            break
CodePudding user response:
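Use iloc here. finaldata_x[i] looks up a column whose label is i, so it raises a KeyError when no column has that label. Change print(finaldata_x[i]) to print(finaldata_x.iloc[:, i]), which selects the column by position.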
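To illustrate the difference between label-based and position-based access, here is a minimal sketch. The sample dataframe and its column names are made up for illustration and are not taken from the question.

```python
import pandas as pd

# Hypothetical sample data; the column names are assumptions for illustration.
df = pd.DataFrame({"height": [1.0, 2.0], "weight": [3.0, 4.0], "age": [5, 6]})

# df[1] asks for a column *labelled* 1, which does not exist here.
try:
    df[1]
    raised = False
except KeyError:
    raised = True

# df.iloc[:, 1] asks for the column at *position* 1, whatever its label is.
second = df.iloc[:, 1]

# Equivalent label-based form: look the label up by position first.
same = df[df.columns[1]]
```

This is also why the drop calls in the question work: they go through finaldata_x.columns[i], which turns the position into a label before the lookup.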
Updated logic:
def createFinalDataframe():
    global data, finaldata_x, finaldata_y, variabletype
    finaldata_x = data
    count = 0
    usedvartypes = []
    for i in range(len(variabletype)):
        if (variabletype[i].get() == 1):
            finaldata_x = finaldata_x.drop(finaldata_x.columns[count], axis=1)
            count = count - 1
        else:
            usedvartypes.append(variabletype[i].get())
        count = count + 1
    for i in range(len(usedvartypes)):
        if (usedvartypes[i] == 0):
            finaldata_y = []
            print(finaldata_x.iloc[:, i])
            finaldata_y = finaldata_x.iloc[:, i].copy()
            finaldata_x = finaldata_x.drop(finaldata_x.columns[i], axis=1)
            break
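If the index bookkeeping with count becomes hard to follow, one possible alternative is to compute the wanted positions up front and select them with iloc in a single step. The sketch below is only an illustration under some assumptions: split_columns is a hypothetical helper, the type codes are passed as a plain list of ints (for example built with [v.get() for v in variabletype]), and at most one column is expected to be marked 0.

```python
import pandas as pd

def split_columns(df, types):
    """Split df by per-column codes: 0 = y, 1 = unwanted, 2 = x.

    Hypothetical helper; `types` is a plain list of ints, one per column.
    """
    if len(types) != df.shape[1]:
        raise ValueError("need exactly one type code per column")
    # Collect column positions for each role instead of dropping in a loop.
    x_pos = [i for i, t in enumerate(types) if t == 2]
    y_pos = [i for i, t in enumerate(types) if t == 0]
    x = df.iloc[:, x_pos].copy()
    # Like the loop above, only the first column marked 0 is used as y.
    y = df.iloc[:, y_pos[0]].copy() if y_pos else None
    return x, y

# Made-up example data, not from the question.
data = pd.DataFrame({"a": [1, 2], "b": [3, 4], "c": [5, 6], "d": [7, 8]})
x, y = split_columns(data, [2, 1, 0, 2])
```

Because nothing is dropped while iterating, the positions never shift and no counter is needed. Note one difference from the loop version: if several columns are marked 0, the extra ones are left out of x here, whereas the loop keeps them in finaldata_x.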