To get variance in each strata in a sampling process, at least 2 elements are needed for each strata. I need to "collapse" (add the records of stratum 3 to those of some other stratum) stratum 3 with some other. If by default in these cases it is requested that such strata be collapsed with the one above it (in this case the first strata would collapse with the last one if necessary), then:
Is there a way to do this collapse in SQL?
Can I bring the first table to the second?
Strata | Frequency |
---|---|
1 | 4 |
2 | 6 |
3 | 1 |
4 | 10 |
Strata | Frequency |
---|---|
1 | 4 |
2 | 7 |
4 | 10 |
I will appreciate your answers very much.
I have a suspicion that I can use "analytic functions", particularly something along the lines of "ROWS BETWEEN 1 AND PRECEDING AND 1 FOLLOWING" together with "IF" to identify rows that have fewer than 2 records, but I've run into complications.
CodePudding user response:
Try this. For your database, replace the @t
's with your table name.
--create temp table for testing
declare @t table
(
strata int,
frequency int
)
--insert values into temp table
insert into @t (strata, frequency) values
(1,4),(2,6),(3,1),(4,1),(5,10)
--intermediate table for final strata values
declare @tmp table
(
s int
)
--final strata values are those with frequency bigger than/equal to 2
insert into @tmp (s)
(
select strata
from @t
where frequency>=2
)
select
normalisedStrata as strata,
sum(frequency) as frequency
from
(
select
--normalise the strata by testing frequency
case
when frequency<2
--if freq<2, get biggest element of temp table that's less than the row's strata
then (select max(s) from @tmp where s<strata)
--otherwise just copy across
else strata
end as normalisedStrata,
frequency
from @t
--subquery must be named, just put arbitrary string
) a
group by normalisedStrata
CodePudding user response:
You can do something like this. Use a case
expression to check for Frequency
, return 1 or 0 as indicator. Perform a cumulative sum on it to form the grp
.
with
cte1 as
(
select *, f = case when Frequency > 1 then 1 else 0 end
from strata
),
cte2 as
(
select *, grp = sum(f) over (order by Strata)
from cte1
)
select Strata = min(Strata), Frequency = sum(Frequency)
from cte2
group by grp
order by Strata