I don't know why I'm getting this error. My list has a length of 21, but when the loop gets to index 18 I get a "list index out of range" error. Help please.
import pandas as pd
import os
mainpath = r"D:\Epoca de Cambio\Curso Python Machine Learning\python-ml-course-master\datasets"
filename = r"customer-churn-model\Customer Churn Model.csv"
fullpath = os.path.join(mainpath,filename)
data = pd.read_csv(fullpath,sep=",")
col_desired = ["Account Length","Phone","Eve Charge","Day Calls"]
columns = data.columns.values.tolist()
print(len(columns))
for i in range(len(columns)):
    print(i)
    if (columns[i] in col_desired):
        columns.pop(i)
CodePudding user response:
Because your for-loop is destroying the list (columns.pop()) as it goes. This is not the right way to do it; see "How to remove items from a list while iterating?".
Here's why your code is not doing what you intended:
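- range(len(columns)) is evaluated once, when the list has length 21, so i will try to run from 0 to 20 no matter what happens to the list
- every time columns[i] is in col_desired, pop(i) shortens the list by one (21, then 20, then 19, ...) and shifts the remaining items left, so the item right after a removed one is never checked
- ... eventually i reaches an index that no longer exists, and you get the IndexError before the for-loop reaches i=20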
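The steps above can be reproduced on a tiny list. This is only a sketch: the five-item list and its names are made up for illustration and are not the real CSV columns.

```python
# Hypothetical stand-in for the real column list (made-up names).
columns = ["a", "b", "c", "d", "e"]
col_desired = ["b", "d"]

# range() is built once from the original length (5) and never updated.
indices = range(len(columns))

error_index = None
try:
    for i in indices:
        if columns[i] in col_desired:
            columns.pop(i)  # the list shrinks and later items shift left
except IndexError:
    error_index = i  # first index that no longer exists in the list

print(error_index, columns)
```

Here two items are popped, so the list is down to three items by the time i reaches 3, which is the same failure as in the question on a smaller scale.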
Anyway, don't write code like that. What are you trying to achieve? If you want a list of all columns that are not in col_desired, you don't even need a loop, just use a list comprehension:
[col for col in data.columns if col not in col_desired]
or, if the order of the columns doesn't matter, you could use:
set(data.columns) - set(col_desired)
But tell us what your code is trying to do, then rewrite it.
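A quick way to compare the two approaches is to run them on a small DataFrame. The DataFrame below is a made-up one-row stand-in for the CSV (the values are invented), so treat it as a sketch rather than the real data.

```python
import pandas as pd

# Hypothetical one-row DataFrame standing in for the real CSV.
data = pd.DataFrame({
    "State": ["KS"],
    "Account Length": [128],
    "Phone": ["382-4657"],
    "Day Calls": [110],
    "Eve Charge": [16.78],
    "Night Calls": [91],
})
col_desired = ["Account Length", "Phone", "Eve Charge", "Day Calls"]

# List comprehension: keeps the original column order.
remaining = [col for col in data.columns if col not in col_desired]

# Set difference: same columns, but a set has no guaranteed order.
remaining_set = set(data.columns) - set(col_desired)

print(remaining)
print(remaining_set)
```

The list comprehension is usually the safer choice when the result is later used to select columns, because the order stays the same as in the DataFrame.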
CodePudding user response:
This is because you are calling pop() inside the loop. Whenever columns[i] is present in col_desired, pop() reduces the length of the list by one, but range(len(columns)) was computed from the original length, so i eventually goes past the end of the list.
Instead you should do this:
for col in col_desired:
    if (col in columns):
        columns.remove(col)
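If you do want to keep working by index, a different technique from the one above is to walk the list backwards, so that a pop never moves an item you have not visited yet. A minimal sketch, with a shortened made-up column list:

```python
# Shortened, hypothetical column list for illustration.
columns = ["State", "Account Length", "Area Code", "Phone", "Day Calls"]
col_desired = ["Account Length", "Phone", "Eve Charge", "Day Calls"]

# Iterate from the last index down to 0; popping index i only shifts
# items at higher indices, which have already been checked.
for i in range(len(columns) - 1, -1, -1):
    if columns[i] in col_desired:
        columns.pop(i)

print(columns)
```

Note that "Eve Charge" is not in this shortened list, and nothing breaks: columns that are missing are simply never matched.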
CodePudding user response:
You can really simplify your code:
col_desired = ['Account Length', 'Phone', 'Eve Charge', 'Day Calls']
data = pd.read_csv(fullpath, usecols=col_desired)
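To try this without the real file, here is a sketch that reads a small in-memory CSV. The CSV text is invented for illustration. Note that usecols with a list keeps only the listed columns; if the goal is the opposite (everything except col_desired), usecols also accepts a callable that is evaluated against each column name.

```python
import io
import pandas as pd

# Hypothetical CSV content standing in for "Customer Churn Model.csv".
csv_text = (
    "State,Account Length,Phone,Day Calls,Eve Charge\n"
    "KS,128,382-4657,110,16.78\n"
)
col_desired = ["Account Length", "Phone", "Eve Charge", "Day Calls"]

# Keep only the desired columns (they come back in file order).
kept = pd.read_csv(io.StringIO(csv_text), usecols=col_desired)

# Inverse: keep every column that is NOT in col_desired.
others = pd.read_csv(
    io.StringIO(csv_text),
    usecols=lambda name: name not in col_desired,
)

print(list(kept.columns))
print(list(others.columns))
```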