since STDEV
is not supported under ARRAYFORMULA
there is either draggable solution:
or hardcoded solution:
but neither of them is dynamically scalable so is there any workaround?
from a quick google search, it looks like no one dared to dream of such lengths
data sample:
b 5
a 1
a 2
b 5
a 1
desired result:
a 0.5773502692
b 0
CodePudding user response:
the math says that standard deviation can be achieved as:
if we substitute for formulae we get:
knowing the above we can easily convert it into a multidimensional array... let's start with the first query where we pivot labels:
removing labels:
calculating the average:
removing labels:
for subtraction, we need to exclude empty cells so we use IF
:
and raise it on the 2nd power:
now we can sum it up per column with MMULT
or QUERY
again:
to make it more dynamic we can construct query selection with SEQUENCE
:
next is division by count:
then square root:
now we just add back labels:
and transpose it:
=ARRAYFORMULA(TRANSPOSE({INDEX(QUERY({A:B},
"select max(Col2) where Col1 is not null group by Col2 pivot Col1"), 1);
(QUERY(QUERY(IF(QUERY(QUERY({A:B, ROW(A:A)},
"select max(Col2) where Col2 is not null group by Col3 pivot Col1"), "offset 1", )="",,
(QUERY(QUERY({A:B, ROW(A:A)},
"select max(Col2) where Col2 is not null group by Col3 pivot Col1"), "offset 1", )-
QUERY(QUERY({A:B, ROW(A:A)/0},
"select avg(Col2) where Col2 is not null group by Col3 pivot Col1"), "offset 1", ))^2),
"select "&TEXTJOIN(",", 1, "sum(Col"&SEQUENCE(COUNTUNIQUE(A:A))&")")), "offset 1", )/
(INDEX(TRANSPOSE(QUERY({A:B},
"select Col1,count(Col1) where Col1 is not null group by Col1 label count(Col1)''")), 2)-1))^(1/2)}))
it works, but we can do better... simple IF
statement will do:
but to make it fast and smart we need some tweaks to counter empty cells and unsorted dataset:
we add errors:
subtract the average:
raise on the 2nd power:
remove errors with IFNA
and sum it:
divide by count-1:
take a square root:
transpose it and add back labels:
=INDEX(IFERROR({SORT(UNIQUE(FILTER(A:A, A:A<>""))), FLATTEN((
INDEX(QUERY(IFNA((IF(FILTER(A:A, A:A<>"")=TRANSPOSE(SORT(UNIQUE(
FILTER(A:A, A:A<>"")))), FILTER(B:B, B:B<>""), NA())-INDEX(QUERY({A:B, ROW(A:A)/0},
"select avg(Col2) where Col2 is not null group by Col3 pivot Col1"), 2))^2),
"select "&TEXTJOIN(",", 1, "sum(Col"&SEQUENCE(COUNTUNIQUE(A:A))&")")), 2)/TRANSPOSE(QUERY({A:B},
"select count(Col1) where Col1 is not null group by Col1 label count(Col1)''")-1))^(1/2))}))
CodePudding user response:
Outstanding solution and thanks for the clear walk-through, @player0!