Home > Mobile >  For Loop Count Unique
For Loop Count Unique

Time:03-03

I have a table like this

month customer_id
2015-01-01 customer_A
2015-01-01 customer_B
2015-02-01 customer_A
2015-02-01 customer_C
2015-03-01 customer_A
2015-03-01 customer_C
2015-03-01 customer_D
2015-04-01 customer_C
2015-04-01 customer_D

I want to find out how active my customers are every month.

I tried to write a loop that can do like the example below:

  1. Filter to count the unique customer_id in Jan 2015 (in the table above: 2)

  2. Filter all the customer_id in Feb 2015 and check how many are in the customer list of Jan 2015. (in the table above: 1)

  3. Continue to the last month

  4. Loop back to start from Feb 2015

The result should be a table like this (the numbers are random)

month month_1 customers
2015-01-01 2015-02-01 2 (A&B)
2015-01-01 2015-02-01 1 (only A left)
2015-01-01 2015-03-01 1 (only A left)
2015-01-01 2015-04-01 0 (none in the original left)
2015-02-01 2015-03-01 2 (A&C)
2015-02-01 2015-04-01 1 (only C left)

Thank you so much!

CodePudding user response:

Here is some code to do what you are asking, assuming the input is in the form shown.

        table = [
            {'month':'2015-01-01','customer_id':'customer_A'},
            {'month':'2015-01-01','customer_id':'customer_B'},
            {'month':'2015-02-01','customer_id':'customer_A'},
            {'month':'2015-02-01','customer_id':'customer_C'},
            {'month':'2015-03-01','customer_id':'customer_A'},
            {'month':'2015-03-01','customer_id':'customer_C'},
            {'month':'2015-03-01','customer_id':'customer_D'},
            {'month':'2015-04-01','customer_id':'customer_C'},
            {'month':'2015-04-01','customer_id':'customer_D'}
        ]
        result = []
        months = sorted(list({row['month'] for row in table}))
        for start in range(len(months)):
            custsAtStart = {row['customer_id'] for row in table if row['month'] == months[start]}
            for end in range(start, len(months)):
                custsAtEnd = {row['customer_id'] for row in table if row['month'] == months[end] and row['customer_id'] in custsAtStart}
                result.append((months[start], months[end], list(custsAtEnd)))
        print(f"{'month' : <12} {'month_1' : <12} {'customers'}")
        [print(f"{row[0] : <12} {row[1] : <12} {row[2]}") for row in result]

For a given month, it identifies the active customers and shows how many of these customers persist over the coming months. It then advances the initial observation to the next month and repeats the analysis.

Here's the output:

month        month_1      customers
2015-01-01   2015-01-01   ['customer_A', 'customer_B']
2015-01-01   2015-02-01   ['customer_A']
2015-01-01   2015-03-01   ['customer_A']
2015-01-01   2015-04-01   []
2015-02-01   2015-02-01   ['customer_A', 'customer_C']
2015-02-01   2015-03-01   ['customer_A', 'customer_C']
2015-02-01   2015-04-01   ['customer_C']
2015-03-01   2015-03-01   ['customer_A', 'customer_D', 'customer_C']
2015-03-01   2015-04-01   ['customer_D', 'customer_C']
2015-04-01   2015-04-01   ['customer_D', 'customer_C']
  • Related