Home > Mobile >  Collapse strata in a sampling process in SQL
Collapse strata in a sampling process in SQL

Time:11-18

To get variance in each strata in a sampling process, at least 2 elements are needed for each strata. I need to "collapse" (add the records of stratum 3 to those of some other stratum) stratum 3 with some other. If by default in these cases it is requested that such strata be collapsed with the one above it (in this case the first strata would collapse with the last one if necessary), then:

Is there a way to do this collapse in SQL?

Can I bring the first table to the second?

Strata Frequency
1 4
2 6
3 1
4 10
Strata Frequency
1 4
2 7
4 10

I will appreciate your answers very much.

I have a suspicion that I can use "analytic functions", particularly something along the lines of "ROWS BETWEEN 1 AND PRECEDING AND 1 FOLLOWING" together with "IF" to identify rows that have fewer than 2 records, but I've run into complications.

CodePudding user response:

Try this. For your database, replace the @t's with your table name.

--create temp table for testing
declare @t table
(
    strata int,
    frequency int
)

--insert values into temp table
insert into @t (strata, frequency) values
(1,4),(2,6),(3,1),(4,1),(5,10)

--intermediate table for final strata values
declare @tmp table
(
    s int
)

--final strata values are those with frequency bigger than/equal to 2
insert into @tmp (s) 
(
    select strata 
    from @t
    where frequency>=2
)


select 
normalisedStrata as strata, 
sum(frequency) as frequency 

from
(
    select 
    --normalise the strata by testing frequency
    case 
        when frequency<2 
        --if freq<2, get biggest element of temp table that's less than the row's strata
        then (select max(s) from @tmp where s<strata) 
        --otherwise just copy across
        else strata 
    end as normalisedStrata, 
    frequency
from @t
--subquery must be named, just put arbitrary string
) a
group by normalisedStrata

CodePudding user response:

You can do something like this. Use a case expression to check for Frequency, return 1 or 0 as indicator. Perform a cumulative sum on it to form the grp.

with 
cte1 as
(
    select *, f = case when Frequency > 1 then 1 else 0 end
    from   strata
),
cte2 as
(
    select *, grp = sum(f) over (order by Strata)
    from   cte1
)
select Strata = min(Strata), Frequency = sum(Frequency)
from   cte2
group by grp
order by Strata
  • Related