I have a very large table which contains
I need to flag every row based on two rules;
If the condition is 1 then the flag should be negative
All the flags are positive except the one coming immediately after fifth one which should be negative, so for every 5 positive ones, the next should be negative.
The flag resets for each ID ( name ), so even if the last flag of John is Positive, Amy starts fresh with 5 positive flags.
I am trying to this on hive, so that I can not update the existing tables.
CodePudding user response:
Here we use RANK() in a CTE to get count the rows per user. We then use CASE and integer division by 5 to find the multiplies of 5 and apply the 2 tests.
create table outcomes( id serial , username varchar(10), dollars int, condition int, expected varchar(10));
insert into outcomes (username,condition) values ('John',1),('John',0),('John',0),('John',0), ('John',0),('John',0),('John',0),('John',0), ('John',0),('John',0),('John',1),('John',0), ('Amy',0),('Amy',0),('Amy',0),('Amy',0), ('Amy',0),('Amy',0),('Amy',0),('Amy',0), ('Amy',1),('Amy',0),('Amy',0);
with cte as (select *, rank() over (partition by username order by id) rn from Outcomes) select id, username, condition, rn, case when condition = 1 then '- negative' when 5*(rn/5)=rn then '- negative' else 'positive' end expectedOutcome from cte;
id | username | condition | rn | expectedoutcome -: | :------- | --------: | -: | :-------------- 13 | Amy | 0 | 1 | positive 14 | Amy | 0 | 2 | positive 15 | Amy | 0 | 3 | positive 16 | Amy | 0 | 4 | positive 17 | Amy | 0 | 5 | - negative 18 | Amy | 0 | 6 | positive 19 | Amy | 0 | 7 | positive 20 | Amy | 0 | 8 | positive 21 | Amy | 1 | 9 | - negative 22 | Amy | 0 | 10 | - negative 23 | Amy | 0 | 11 | positive 1 | John | 1 | 1 | - negative 2 | John | 0 | 2 | positive 3 | John | 0 | 3 | positive 4 | John | 0 | 4 | positive 5 | John | 0 | 5 | - negative 6 | John | 0 | 6 | positive 7 | John | 0 | 7 | positive 8 | John | 0 | 8 | positive 9 | John | 0 | 9 | positive 10 | John | 0 | 10 | - negative 11 | John | 1 | 11 | - negative 12 | John | 0 | 12 | positive
db<>fiddle here