I was reading this thread here
For reasons I don't understand, pandas will not read the value of the formula in the last column of my Excel.
Column W in the Excel is just the sum of columns D to V for each row.
I need to save this Excel as a .csv to import into MySQL. So I use pandas.
print('Now converting to csv .... ')
newname = target.split('.')
savename = newname[0] '.csv'
#read_file = pd.read_excel (pathTarget target)
df = pd.read_excel (pathTarget target, sheet_name=clas)
#read_file.to_csv (pathTarget savename, index = None, header=True)
df.to_csv(pathTarget savename, index=False)
But I don't get anything in the last column, so I thought, modify the dataframe then save it as csv.
I can easily get the sum of the columns I want:
new_val = df.index
for x in new_val:
print(x)
ys = df.iloc[x]['Week1':'Week19'].sum()
print('your total score is', ys)
I can easily set the whole column 'your_score' to 1 value, like 606, as in the thread in the link above.
df['your_score'] = ys
But I can't run a loop and assign each row sum to the last column 'your_score'
df.iloc[0]['your_score'] = ys
I get the error:
Warning (from warnings module): File "<pyshell#26>", line 1 SettingWithCopyWarning: A value is trying to be set on a copy of a slice from a DataFrame
See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
I tried various things, but in the end I just used Pandas to write the Excel as csv, then used Python's csv module to do what I want.
How can I put each row sum
ys = df.iloc[x]['Week1':'Week19'].sum()
in the last column at that iloc location in Pandas??
CodePudding user response:
One of the advantages of pandas is that you can often use vectorized operations instead of loops. Thus in your case it's possible to sum over a 2-dimensional slice of the dataframe like this:
df['your_score'] = df.loc[:, 'Week1':'Week19'].sum(axis=1)
The loc
operator allows indexing and slicing by labels. :
selects all the rows, so we get a sub-dataframe. The df.sum()
method is based on the NumPy function of the same name, so you can select a dimension to sum over with the axis
argument.