Home > Software design >  Count on CROSS APPLY with JSON Data
Count on CROSS APPLY with JSON Data

Time:12-13

I have the following DB structure:

ID, Datetime, JsonData

Within JsonData there is a field called party which is "," delimited and another field called Source.

I'm trying to run a simple select where I get the data grouped by (Source, Time)...

Datetime, Source, SourceCount, CountParty

12/12/2021, 1, 4, 7

12/12/2021, 2, 3, 5

I'm trying to run the following SQL, however, I'm struggling to get CountParty. I always tend to get SourceCount. Can anyone please shed some light on what Im doing wrong?

I always end up with

12/12/2021, 1, 4, 4

12/12/2021, 2, 3, 3

Thanks

select
json_value(JsonData,'$.Info.source') as Source
, Party.value
,count(json_value(JsonData,'$.Info.Party'))
,  count(*)

from test
CROSS APPLY 
    STRING_SPLIT(json_value(JsonData,'$.Info.Party'), ',') Party
group by json_value(JsonData,'$.Info.Source'), value
order by [Source]

{ "cID": "CID1","Info": {"Party": "A,B,C","Source" : "1"}}
{ "cID": "CID2","Info": {"Party": "A, C","Source" : "2" }}
{ "cID": "CID3","Info": {"Party": "B, C","Source" : "2" }}
{ "cID": "CID4","Info": {"Party": "B","Source" : "1" }}
{ "cID": "CID5","Info": {"Party": "C,A","Source" : "1" }}
{ "cID": "CID6","Info": {"Party": "A","Source" : "1" }}
{ "cID": "CID7","Info": {"Party": "C","Source" : "2" }}


select


json_value(JsonData,'$.Info.source') as Source

, Party.value

,count(json_value(JsonData,'$.Info.Party'))

,  count(*)


from test

CROSS APPLY 

    STRING_SPLIT(json_value(JsonData,'$.Info.Party'), ',') Party

group by json_value(JsonData,'$.Info.Source'), value

order by [Source]

CodePudding user response:

Your problem is that you because you are grouping by value as well, which refers to the value from STRING_SPLIT. So you get a new group for each individual split party value. You should remove that from the grouping. Instead, you can get the count of Party values in a subquery, then SUM that.

Note also, that count(json_value(JsonData,'$.Info.Party')) does not count distinct values, it counts how many non-null values there are, which probably wasn't your intention.

You can also use OPENJSON to pull out all the values at once, instead of using JSON_VALUE again and again.

SELECT
  t.Datetime,
  j.Source,
  COUNT(*) SourceCount,
  SUM(p.PartyCount) PartyCount
FROM test t
CROSS APPLY
  OPENJSON(t.JsonData, '$.Info')
  WITH (
    Party nvarchar(500),
    Source nvarchar(100)
  ) j
CROSS APPLY (
    SELECT COUNT(*) PartyCount
    FROM STRING_SPLIT(j.Party, ',')
) p
group by
  t.Datetime,
  j.Source
order by
  j.Source;
  • Related