Home > front end >  Optimizing WHILE and FOR loops in Python
Optimizing WHILE and FOR loops in Python

Time:05-20

i would like some help with optimizing a piece of code. First, I will provide some context. I have a DataFrame with 23k rows and 33 columns which represents reservations and another dataframe with 2k rows and 50 columns which represent properties. What I would like happenning is to understand monthly GBV (Gross Booking Value) and monthly booked nights. The tricky part is that "monthly" inclines the following:

If a reservation is from 25th of a month until the 5th of the next month and the ADR (Average Daily Rate) is 50 dollars, the result of this reservation is going to be the following:

Nights Booked in 1st month: 7 nights (from 25th to 31st) Nights Booked in 2nd month: 5 nights (from 1st to 5th) GBV for 1st month: 7 nights * 50 dollars = 350 dollars GBV for 2nd month: 5 nights * 50 dollars = 250 dollars

Sample data would be: In reservation dataframe: Arrival_date, Departure_date, Accommodation_amount In properties dataframe: GBV column for each month and Nights booked column for each month

Here is the piece of code that I am using currently:

    i=0
while i < len(kpi):
    if kpi['cod_reservation_status'][i] == 'CONF':
        dateRange = pd.date_range(kpi['arrival'][i], kpi['departure'][i], freq='d', closed='left')
        index = prop_index(int(kpi['id_room'][i]), props.id_room_type)
        nGBV = kpi['accommodation_total_amount'][i] / len(dateRange)
        for adate in dateRange:
            if adate.month == 1 and adate.year == 2022:
                props['January Booked'][index] = props['January Booked'][index]   1
                props['January GBV'][index] = props['January GBV'][index]   nGBV
            if adate.month == 2 and adate.year == 2022:
                props['February Booked'][index] = props['February Booked'][index]   1
                props['February GBV'][index] = props['February GBV'][index]   nGBV
            if adate.month == 3 and adate.year == 2022:
                props['March Booked'][index] = props['March Booked'][index]   1
                props['March GBV'][index] = props['March GBV'][index]   nGBV
            if adate.month == 4 and adate.year == 2022:
                props['April Booked'][index] = props['April Booked'][index]   1
                props['April GBV'][index] = props['April GBV'][index]   nGBV
            if adate.month == 5 and adate.year == 2022:
                props['May Booked'][index] = props['May Booked'][index]   1
                props['May GBV'][index] = props['May GBV'][index]   nGBV
            if adate.month == 6 and adate.year == 2022:
                props['June Booked'][index] = props['June Booked'][index]   1
                props['June GBV'][index] = props['June GBV'][index]   nGBV
            if adate.month == 7 and adate.year == 2022:
                props['July Booked'][index] = props['July Booked'][index]   1
                props['July GBV'][index] = props['July GBV'][index]   nGBV
            if adate.month == 8 and adate.year == 2022:
                props['August Booked'][index] = props['August Booked'][index]   1
                props['August GBV'][index] = props['August GBV'][index]   nGBV
            if adate.month == 9 and adate.year == 2022:
                props['September Booked'][index] = props['September Booked'][index]   1
                props['September GBV'][index] = props['September GBV'][index]   nGBV
            if adate.month == 10 and adate.year == 2022:
                props['October Booked'][index] = props['October Booked'][index]   1
                props['October GBV'][index] = props['October GBV'][index]   nGBV
            if adate.month == 11 and adate.year == 2022:
                props['November Booked'][index] = props['November Booked'][index]   1
                props['November GBV'][index] = props['November GBV'][index]   nGBV
            if adate.month == 12 and adate.year == 2022:
                props['December Booked'][index] = props['December Booked'][index]   1
                props['December GBV'][index] = props['December GBV'][index]   nGBV
    i =1

That works and does the job BUT it is quite slow. On 22k entries, it takes me 2 and a half minutes. I am sure there is a way to optimize this. Any ideas are welcome. What are the best practices in optimizing loops and how would you go about this specific task?

PS: kpi = reservations DataFrame, props = properties DataFrame

CodePudding user response:

With pandas you must not use while and for loop where it is possible to use pandas built-in function which are written in C code and compiled (quicker than interpreted).

Try to uses props.loc[:,"column"].apply("the function you want to apply")

With pandas you must think the entire dataset and not iterate objects.

Look into the documnetation for df.loc[row,col] and df.apply()

CodePudding user response:

I noticed that you have this:

i=0
while i < len(kpi):

A while loop is slower than a for loop. Because a while loop checks the condition after each iteration.

So you should replace it with this:

for i in range(len(kpi))
  • Related