I have the following DB structure:
ID, Datetime, JsonData
Within JsonData there is a field called party which is "," delimited and another field called Source.
I'm trying to run a simple select where I get the data grouped by (Source, Time)...
Datetime, Source, SourceCount, CountParty
12/12/2021, 1, 4, 7
12/12/2021, 2, 3, 5
I'm trying to run the following SQL, however, I'm struggling to get CountParty. I always tend to get SourceCount. Can anyone please shed some light on what Im doing wrong?
I always end up with
12/12/2021, 1, 4, 4
12/12/2021, 2, 3, 3
Thanks
select
json_value(JsonData,'$.Info.source') as Source
, Party.value
,count(json_value(JsonData,'$.Info.Party'))
, count(*)
from test
CROSS APPLY
STRING_SPLIT(json_value(JsonData,'$.Info.Party'), ',') Party
group by json_value(JsonData,'$.Info.Source'), value
order by [Source]
{ "cID": "CID1","Info": {"Party": "A,B,C","Source" : "1"}}
{ "cID": "CID2","Info": {"Party": "A, C","Source" : "2" }}
{ "cID": "CID3","Info": {"Party": "B, C","Source" : "2" }}
{ "cID": "CID4","Info": {"Party": "B","Source" : "1" }}
{ "cID": "CID5","Info": {"Party": "C,A","Source" : "1" }}
{ "cID": "CID6","Info": {"Party": "A","Source" : "1" }}
{ "cID": "CID7","Info": {"Party": "C","Source" : "2" }}
select
json_value(JsonData,'$.Info.source') as Source
, Party.value
,count(json_value(JsonData,'$.Info.Party'))
, count(*)
from test
CROSS APPLY
STRING_SPLIT(json_value(JsonData,'$.Info.Party'), ',') Party
group by json_value(JsonData,'$.Info.Source'), value
order by [Source]
CodePudding user response:
Your problem is that you because you are grouping by value
as well, which refers to the value
from STRING_SPLIT
. So you get a new group for each individual split party
value. You should remove that from the grouping. Instead, you can get the count of Party
values in a subquery, then SUM
that.
Note also, that count(json_value(JsonData,'$.Info.Party'))
does not count distinct values, it counts how many non-null values there are, which probably wasn't your intention.
You can also use OPENJSON
to pull out all the values at once, instead of using JSON_VALUE
again and again.
SELECT
t.Datetime,
j.Source,
COUNT(*) SourceCount,
SUM(p.PartyCount) PartyCount
FROM test t
CROSS APPLY
OPENJSON(t.JsonData, '$.Info')
WITH (
Party nvarchar(500),
Source nvarchar(100)
) j
CROSS APPLY (
SELECT COUNT(*) PartyCount
FROM STRING_SPLIT(j.Party, ',')
) p
group by
t.Datetime,
j.Source
order by
j.Source;