At the request of many, I have simplified the problem as far as I can imagine (my imagination doesn't go that far), and I think it's reproducable also. The two different Excel files I've been using are called: "first apples.xlsx" and "second apples.xlsx". I've been using the following data:
import os
import numpy as np
import pandas as pd
import glob
#%%
path = os.getcwd() r"\apples"
file_locations = glob.glob(path "\*.xlsx")
#%%
df = {}
for i, file in enumerate(file_locations):
df[i] = pd.read_excel(file, usecols=['Description', 'Price'])
#%%
price_standard_apple = []
price_special_apple = []
special_apple_description = ['Golden', 'Diamond', 'Titanium']
#%%
for file in range(len(df)):
df_description = pd.DataFrame(df[file].iloc[:,-2])
df_prices = pd.DataFrame(df[file].iloc[:,-1])
for description in df_description['Description']:
if description in special_apple_description:
description_index = df_description.loc[df_description['Description']==description].index.values
price = df_prices['Price'].iloc[description_index]
price_sum = np.sum(price)
price_special_apple.append(price_sum)
elif description not in special_apple_description:
description_index = df_description.loc[df_description['Description']==description].index.values
price = df_prices['Price'].iloc[description_index]
price_sum = np.sum(price)
price_standard_apple.append(price_sum)
I would expect the sum of the red colored cells (the special apples so to speak) to be 97 and that of the standard apples to be 224. This is not the case and the problem seems to be in the second loop. Python prints the following values: standard: 234 special: 209
CodePudding user response:
I think you are making this harder than it needs to be.
Given your test data from above you can simply do:
import pandas as pd
test_data = pd.read_csv(r'path\to\file', sep=',')
special = ['Golden', 'Diamond', 'Titanium']
mask = test_data['Description'].isin(special)
specials_price = sum(test_data[mask]['Price']) # -> 97
other = sum(test_data[~mask]['Price']) # -> 224
This is what test_data
looks like:
CodePudding user response:
I think the break statement would be useful in your case. It would allow you to break out of the inner for loop when you've retrieved the value you wanted, continuing the outer for loop.
Also covered in this question: Python: Continuing to next iteration in outer loop