Home > Net >  nested loop and if statement, improve code
nested loop and if statement, improve code

Time:01-31

I have a DF with 16 columns, I need to work on 2 of them to generate a table (creating a DF in this case), the first column is Type (has 2 type, Subscriber and one-time user, and the second is day (days of a week), following is the DS (part, the actual is more than 16K rows and 16 columns), I am working on:

Type,Trip_in_min,Day
One-time user,25,Sunday
Subscriber,11,Sunday
Subscriber,4,Sunday
One-time user,11,Sunday
Subscriber,9,Sunday
One-time user,4,Sunday
Subscriber,2,Sunday
One-time user,3,Sunday
Subscriber,2,Sunday
Subscriber,8,Sunday
One-time user,12,Monday
Subscriber,11,Monday
One-time user,2,Monday
Subscriber,2,Monday
One-time user,5,Tuesday
Subscriber,9,Tuesday
One-time user,6,Tuesday
Subscriber,16,Tuesday

I can get the following results, based on the above input

           Subscribers  One Time Users
Sunday               6               4
Monday               2               2
Tuesday              2               2
Wednesday            0               0
Thursday             0               0
Friday               0               0
Saturday             0               0

I have the following code:

import pandas as pd
file = pd.read_csv(r"D:\OneDrive\DataAnalysis\Test2.txt")
df = pd.DataFrame(file)
user_type = 0
regular = 0
regular1 = 0
regular2 = 0
regular3 = 0
regular4 = 0
regular5 = 0
regular6 = 0

one_time_user = 0
one_time_user1 = 0
one_time_user2 = 0
one_time_user3 = 0
one_time_user4 = 0
one_time_user5 = 0
one_time_user6 = 0
day_idx = 0
new_list = pd.DataFrame({'Subscribers': [0, 0, 0, 0, 0, 0, 0], 'One Time Users': [0, 0, 0, 0, 0, 0, 0]}, index=['Sunday', 'Monday', 'Tuesday', 'Wednesday', 'Thursday', 'Friday', 'Saturday'])
for index1, row in df.iterrows():
    for i in range(0, len(df)):
        user_type = df['Type'].iloc[i]
        day = df['Day'].iloc[i]
        if day == 'Sunday':
            if user_type == 'Subscriber':
                regular  = 1
            else:
                one_time_user  = 1
        elif day == 'Monday':
            if user_type == 'Subscriber':
                regular1  = 1
            else:
                one_time_user1  = 1
        elif day == 'Tuesday':
            if user_type == 'Subscriber':
                regular2  = 1
            else:
                one_time_user2  = 1
        elif day == 'Wednesday':
            if user_type == 'Subscriber':
                regular3  = 1
            else:
                one_time_user3  = 1
        elif day == 'Thursday':
            if user_type == 'Subscriber':
                regular4  = 1
            else:
                one_time_user4  = 1
        elif day == 'Friday':
            if user_type == 'Subscriber':
                regular5  = 1
            else:
                one_time_user5  = 1
        elif day == 'Saturday':
            if user_type == 'Subscriber':
                regular6  = 1
            else:
                one_time_user6  = 1
    break
#print('Regular:', regular)
#print('One Tine User:', one_time_user)
new_list.at["Sunday", "Subscribers"] = regular
new_list.at['Sunday', "One Time Users"] = one_time_user
new_list.at["Monday", "Subscribers"] = regular1
new_list.at['Monday', "One Time Users"] = one_time_user1
new_list.at["Tuesday", "Subscribers"] = regular2
new_list.at['Tuesday', "One Time Users"] = one_time_user2
new_list.at["Wednesday", "Subscribers"] = regular3
new_list.at['Wednesday', "One Time Users"] = one_time_user3
new_list.at["Thursday", "Subscribers"] = regular4
new_list.at['Thursday', "One Time Users"] = one_time_user4
new_list.at["Friday", "Subscribers"] = regular5
new_list.at['Friday', "One Time Users"] = one_time_user5
new_list.at["Saturday", "Subscribers"] = regular6
new_list.at['Saturday', "One Time Users"] = one_time_user6
print(new_list)

But as you can see, it is a not an efficient one, I have been trying on this for days, so it will not be easy for me to list (or even remember) what I tried

Any suggestion to improve the code please? I just could not figure out how to make the proper looping

Thanks

CodePudding user response:

The no-loop version, just incase you want it...

  • uses pandas' groupby to get the results you want
  • uses pandas' pivot to create the layout you want
import pandas as pd
import io

data="""Type,Trip_in_min,Day
One-time user,25,Sunday
Subscriber,11,Sunday
Subscriber,4,Sunday
One-time user,11,Sunday
Subscriber,9,Sunday
One-time user,4,Sunday
Subscriber,2,Sunday
One-time user,3,Sunday
Subscriber,2,Sunday
Subscriber,8,Sunday
One-time user,12,Monday
Subscriber,11,Monday
One-time user,2,Monday
Subscriber,2,Monday
One-time user,5,Tuesday
Subscriber,9,Tuesday
One-time user,6,Tuesday
Subscriber,16,Tuesday"""

# Using StringIO as a pretend file...
df = pd.read_csv(io.StringIO(data))
print()
print(df)

# count the number of rows per Day/Type
df_agg = df.groupby(['Day','Type'],as_index=False).size()
print()
print(df_agg)

# Pivot the tyoe to be columns instead
df_pivot = df_agg.pivot(index='Day', columns='Type')
print()
print(df_pivot)

Demo; https://trinket.io/python3/d24d2b1e73

  • Related