I am wondering if there is a way to accelerate a double for loop in Python. Currently this is my code:
for i in range(len(newdata1)):
    for j in range(len(dataset)):
        if str(dataset['date'].values[j]) == str(newdata1['SALEDATE'].values[i]):
            newdata1['QUANTITY'].values[i], newdata1['PRICEWONKG'].values[i] = dataset['apple(kg)'].values[j], dataset['apple($/kg)'].values[j]
The code works correctly, but it takes a lot of time since the dataframes are really big. Is there any way I can reduce the execution time of this double loop?
Thanks
CodePudding user response:
I am not sure if it works well for your use case, but you could try the functools library to cache repeated lookups.
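A minimal sketch of that idea, assuming dates in dataset are unique (the sample frames below are made up to stand in for the question's data). Note that the cache only pays off when the same date appears many times; most of the speedup here actually comes from replacing the inner loop with a single indexed lookup:

```python
from functools import lru_cache

import pandas as pd

# Hypothetical stand-ins for the question's `dataset` and `newdata1`.
dataset = pd.DataFrame({
    'date': ['2021-01-01', '2021-01-02'],
    'apple(kg)': [10, 20],
    'apple($/kg)': [1.5, 1.6],
})
newdata1 = pd.DataFrame({
    'SALEDATE': ['2021-01-02', '2021-01-03'],
    'QUANTITY': [0, 0],
    'PRICEWONKG': [0.0, 0.0],
})

# Build the index once; then each lookup is O(1) instead of a scan.
lookup = dataset.set_index('date')

@lru_cache(maxsize=None)
def find_row(sale_date):
    # Memoized: repeated dates hit the cache instead of pandas.
    if sale_date in lookup.index:
        row = lookup.loc[sale_date]
        return row['apple(kg)'], row['apple($/kg)']
    return None

for i in range(len(newdata1)):
    match = find_row(str(newdata1['SALEDATE'].values[i]))
    if match is not None:
        newdata1.loc[i, ['QUANTITY', 'PRICEWONKG']] = match
```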
CodePudding user response:
You can create a mask for the matching dates, merge on the dates, and assign the values from the merged dataframe to the relevant rows in newdata1:
mask = newdata1['SALEDATE'].isin(dataset['date'])
newdata1.loc[mask, ['QUANTITY', 'PRICEWONKG', 'SALEDATE']] = (
    dataset.merge(newdata1.loc[mask, 'SALEDATE'],
                  left_on='date', right_on='SALEDATE')
           .drop('date', axis=1)
           .values
)
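To illustrate, here is a self-contained run of that approach on toy frames (column names taken from the question; the data itself is made up). One caveat to be aware of: the assignment via .values is positional, so it assumes the matching dates appear in the same relative order in both frames and that dates are unique in dataset:

```python
import pandas as pd

# Made-up stand-ins for the question's `dataset` and `newdata1`.
dataset = pd.DataFrame({
    'date': ['2021-01-01', '2021-01-02', '2021-01-03'],
    'apple(kg)': [10, 20, 30],
    'apple($/kg)': [1.5, 1.6, 1.7],
})
newdata1 = pd.DataFrame({
    'SALEDATE': ['2021-01-02', '2021-01-03', '2021-01-04'],
    'QUANTITY': [0, 0, 0],
    'PRICEWONKG': [0.0, 0.0, 0.0],
})

# Mask of rows in newdata1 whose SALEDATE exists in dataset.
mask = newdata1['SALEDATE'].isin(dataset['date'])

# Merge only the matching SALEDATEs against dataset, then write the
# merged columns back into the masked rows in one vectorized step.
merged = dataset.merge(newdata1.loc[mask, 'SALEDATE'],
                       left_on='date', right_on='SALEDATE')
newdata1.loc[mask, ['QUANTITY', 'PRICEWONKG', 'SALEDATE']] = (
    merged.drop('date', axis=1).values
)
```

Here rows with SALEDATE '2021-01-02' and '2021-01-03' pick up quantity/price from dataset, while '2021-01-04' (no match) keeps its original zeros.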