ServicePop has x and y coordinates, and I want to add a grid square ID (gid) to each row.
I wrote a nested for loop to assign the square ID, but ServicePop is so large that it takes several hours.
Is there a faster, more efficient way to do it?
When I searched Google, the advice was to use DataFrame.apply or vectorization, but I could not adapt my code to use either.
I need your help, please.
import pandas
import datetime
TotPopCenter = pandas.read_csv('TotalPopulationCurrentCenterShapeCoordinate_UTF8.csv', encoding='euckr')
ServicePop = pandas.read_csv('202101_Final.csv', encoding='euckr')
ServicePop.insert(9, 'gid', '')
Service_gid = ['' for _ in range(len(ServicePop))]
# For every service point, scan every square and keep the gid of the
# square whose 250 x 250 window (center +/- 125) contains the point.
for j in range(len(ServicePop)):
    for i in range(len(TotPopCenter)):
        if (ServicePop['X_COORD'][j] >= TotPopCenter['xcoord'][i] - 125) and \
           (ServicePop['X_COORD'][j] < TotPopCenter['xcoord'][i] + 125) and \
           (ServicePop['Y_COORD'][j] >= TotPopCenter['ycoord'][i] - 125) and \
           (ServicePop['Y_COORD'][j] < TotPopCenter['ycoord'][i] + 125):
            Service_gid[j] = TotPopCenter['gid'][i]
ServicePop['gid'] = Service_gid
TotPopCenter
   gid         lbl  val  xcoord   ycoord
0  LM87ab60ba  NaN  NaN  1087375  1760625

ServicePop
   STD_YMD     X_COORD       Y_COORD       HCODE    WKDY_CD  TIME  HPOP   WPOP  VPOP
0  2021-01-01  1.087484e+06  1.760579e+06  2207061  FRI      0     27.97  0.82  7.24
CodePudding user response:
If you want to flatten the nested loop specifically, you can use itertools.product. Note that it still visits every (j, i) pair, so it tidies the code rather than reducing the amount of work. Use:
import itertools
for j, i in itertools.product(range(len(ServicePop)), range(len(TotPopCenter))):
rather than:
for j in range(len(ServicePop)):
    for i in range(len(TotPopCenter)):
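As a quick check of what this rewrite does, the sketch below compares the pairs produced by both forms. The two sizes are small hypothetical stand-ins for len(ServicePop) and len(TotPopCenter), not the real lengths.

```python
import itertools

# Hypothetical stand-ins for len(ServicePop) and len(TotPopCenter).
n_service, n_centers = 3, 4

# Pairs visited by the nested loops.
nested_pairs = []
for j in range(n_service):
    for i in range(n_centers):
        nested_pairs.append((j, i))

# Pairs visited by the flattened loop.
product_pairs = list(itertools.product(range(n_service), range(n_centers)))

print(nested_pairs == product_pairs)  # compares pairs and their order
print(len(product_pairs))             # number of iterations performed
```

Both forms walk the same pairs in the same order, so the iteration count stays at the product of the two lengths.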
CodePudding user response:
I would store the values in local variables instead of looking them up repeatedly:
for j in range(len(ServicePop)):
    # Read the point's coordinates once per outer iteration.
    serviceX = ServicePop['X_COORD'][j]
    serviceY = ServicePop['Y_COORD'][j]
    for i in range(len(TotPopCenter)):
        totX = TotPopCenter['xcoord'][i]
        totY = TotPopCenter['ycoord'][i]
        if (serviceX >= totX - 125) and \
           (serviceX < totX + 125) and \
           (serviceY >= totY - 125) and \
           (serviceY < totY + 125):
            Service_gid[j] = TotPopCenter['gid'][i]
You may even be able to break out of the inner loop early once you know the remaining squares cannot overlap the point, for example by sorting TotPopCenter first.
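The vectorized route the question asks about could look roughly like the sketch below. It rests on an assumption inferred only from the single sample row: the squares form a regular 250 x 250 grid whose centers sit at a multiple of 250 plus 125 (1087375 - 125 = 1087250 is divisible by 250). Under that assumption, each point can be snapped to its square's center arithmetically and matched with one merge. The small frames are made-up stand-ins for the CSV data, the second gid is invented, and cx/cy are hypothetical helper column names.

```python
import numpy as np
import pandas as pd

CELL = 250        # side length implied by the +/-125 window (assumed regular grid)
HALF = CELL // 2

# Made-up stand-ins for the real CSV files.
TotPopCenter = pd.DataFrame({
    'gid': ['LM87ab60ba', 'EXAMPLE_B'],   # 'EXAMPLE_B' is an invented gid
    'xcoord': [1087375, 1087625],
    'ycoord': [1760625, 1760625],
})
ServicePop = pd.DataFrame({
    'X_COORD': [1087484.0, 1087500.0, 1090000.0],
    'Y_COORD': [1760579.0, 1760700.0, 1760579.0],
})

# Snap each point to the center of the square that contains it.
# floor() keeps the lower edge inclusive and the upper edge exclusive,
# matching the >= / < comparisons in the loop version.
ServicePop['cx'] = (np.floor(ServicePop['X_COORD'] / CELL) * CELL + HALF).astype('int64')
ServicePop['cy'] = (np.floor(ServicePop['Y_COORD'] / CELL) * CELL + HALF).astype('int64')

# One left merge replaces the nested loops; points without a square get NaN.
result = ServicePop.merge(
    TotPopCenter[['gid', 'xcoord', 'ycoord']],
    how='left',
    left_on=['cx', 'cy'],
    right_on=['xcoord', 'ycoord'],
).drop(columns=['cx', 'cy', 'xcoord', 'ycoord'])

print(result)
```

If the real grid is not aligned this way, the snapping step does not apply and a different lookup is needed. Duplicate centers in TotPopCenter would also duplicate rows in the merge, so it is worth checking for them first.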