Home > Software design >  Using Postresql, identify IDs if they have three or more boolean columns matching the premises
Using Postresql, identify IDs if they have three or more boolean columns matching the premises

Time:08-25

I'm trying to identify customers by their recurring purchases. For that I have a table with the distinct customer IDs and several columns with the year and month (YYYY-MM) filled in by a boolean that identifies whether the customer made any purchases in the month (0 does not have a purchase and 1 does), as shown below:

customer_id 2020_03 2020_04 2020_05 2020_06 2020_07 2020_08
12345 1 1 1 0 0 0
12346 0 0 1 1 1 0
12347 0 0 0 1 0 1
12348 0 1 0 0 0 0
12349 0 0 0 1 0 1

In the case above, if we define that a recurring customer has purchases in three consecutive months, then only customers 12345 and 12346 fit this premise.

The question I'm stuck with is: is there any way to be able to identify these customers if they have three booleans in a row marked with 1, or if I need to process my data in some other way to achieve the goal?

CodePudding user response:

You can concatenate all columns into a single string and then look for the sequence '111' in it. For example:

select *
from t
where '' || "2020_03" || "2020_04" || "2020_05"
         || "2020_06" || "2020_07" || "2020_08" like '1%'
  • Related