Find Duplicate Objects in JSON Array

Time:02-10

I am relatively new to C# and need some guidance. I need to search through a large JSON array, find duplicate objects, and count how many times each one occurs.

Sample Array

{
    "data": 
        [
          {
            "date": "20220101",
            "someID": "ID1",
            "someType1": "SRVC",
            "someType2": "SEND"
          },
          {
            "date": "20220101",
            "someID": "ID1",
            "someType1": "SRVC",
            "someType2": "SEND"
          },
          {
            "date": "20220101",
            "someID": "ID2",
            "someType1": "SRVC",
            "someType2": "RECV"
          },
          {
            "date": "20220101",
            "someID": "ID2",
            "someType1": "SRVC",
            "someType2": "RECV"
          },
          {
            "date": "20220101",
            "someID": "ID2",
            "someType1": "SRVC",
            "someType2": "RECV"
          },
          {
            "date": "20220101",
            "someID": "ID2",
            "someType1": "SRVC",
            "someType2": "SEND"
          },          
          {
            "date": "20220101",
            "someID": "ID2",
            "someType1": "SRVC",
            "someType2": "SEND"
          },          
          {
            "date": "20220101",
            "someID": "ID2",
            "someType1": "SRVC",
            "someType2": "SEND"
          }
          
        ]
}

I have been able to find duplicate individual values, but not duplicates of the complete object. The output should be:

{
    "result": 
        [
          {
            "date": "20220101",
            "someID": "ID1",
            "someType1": "SRVC",
            "someType2": "SEND",
            "objCount": 2
          },
          {
            "date": "20220101",
            "someID": "ID2",
            "someType1": "SRVC",
            "someType2": "RECV",
            "objCount": 3
          },
          {
            "date": "20220101",
            "someID": "ID2",
            "someType1": "SRVC",
            "someType2": "SEND",
            "objCount": 3
          }
          
        ]
}

Any help to point me in the right direction would be great. Thanks in advance.

CodePudding user response:

You can create model classes that mirror the structure of your JSON data:

public class Datum
{
    public string date { get; set; }
    public string someID { get; set; }
    public string someType1 { get; set; }
    public string someType2 { get; set; }
}

public class Root
{
    public List<Datum> data { get; set; }
}

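Then use JsonConvert.DeserializeObject (from Newtonsoft.Json) to deserialize the JSON into those classes, and use LINQ's GroupBy to find the objects that occur more than once, together with their counts.

var root = JsonConvert.DeserializeObject<Root>(data);
var res = JsonConvert.SerializeObject(root.data
    // group on all four properties so only identical objects share a key
    .GroupBy(x => new { x.date, x.someID, x.someType1, x.someType2 })
    // keep only objects that appear more than once
    .Where(g => g.Count() > 1)
    .Select(g => new
    {
        g.Key.date,
        g.Key.someID,
        g.Key.someType1,
        g.Key.someType2,
        objCount = g.Count()
    }));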


CodePudding user response:

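Step 1 - declare a model. All four properties are strings in the sample JSON ("20220101" and "ID1" will not deserialize into DateTime or int with the default settings), and the array sits under a root "data" property, so a wrapper class is needed as well.

public class Data
{
    public string date { get; set; }
    public string someID { get; set; }
    public string someType1 { get; set; }
    public string someType2 { get; set; }
}

public class Root
{
    public List<Data> data { get; set; }
}

Step 2 - deserialize the JSON, in this case using Newtonsoft

var dataList = JsonConvert.DeserializeObject<Root>(json).data;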

Step 3 - produce result

var output = dataList.GroupBy(m => (m.date, m.someID, m.someType1, m.someType2))
    .Select(g => new 
        {
            date = g.Key.date,
            someID = g.Key.someID,
            someType1 = g.Key.someType1,
            someType2 = g.Key.someType2,
            objCount = g.Count()
        });

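Two things to note: the tuple syntax used in GroupBy requires C# 7.0 or later, so on older versions group by an anonymous type instead. And if you need to return the result from a method, you might need to declare a named model for the output rather than an anonymous type.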

CodePudding user response:

Another approach, using System.Text.Json and anonymous types:

using JsonDocument jsonDocument = JsonDocument.Parse(jsonString);

var output = jsonDocument.RootElement
    .GetProperty("data") // get root level "data" property
    .EnumerateArray()
    // group by an anonymous type; this works because the compiler
    // generates value-based Equals and GetHashCode for anonymous types
    .GroupBy(e => new { 
            date = e.GetProperty("date").GetString(),
            someID = e.GetProperty("someID").GetString(),
            someType1 = e.GetProperty("someType1").GetString(),
            someType2 = e.GetProperty("someType2").GetString(),
    }).Select(g => new {
            date = g.Key.date,
            someID = g.Key.someID,
            someType1 = g.Key.someType1,
            someType2 = g.Key.someType2,
            objCount = g.Count(),
    });
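
None of the snippets above produces the exact { "result": [...] } shape asked for in the question, so here is a minimal sketch of one way to get there with System.Text.Json. The helper name CountDuplicates and the shortened sample string are made up for illustration, and the sketch assumes every element of "data" carries all four string properties.

```csharp
using System;
using System.Linq;
using System.Text.Json;

// Hypothetical helper (not part of any answer above): groups the "data" array
// by all four properties and returns JSON shaped like { "result": [ ... ] }.
static string CountDuplicates(string jsonString)
{
    using JsonDocument doc = JsonDocument.Parse(jsonString);

    var result = doc.RootElement
        .GetProperty("data")
        .EnumerateArray()
        // anonymous types compare by value, so identical objects share a key
        .GroupBy(e => new
        {
            date = e.GetProperty("date").GetString(),
            someID = e.GetProperty("someID").GetString(),
            someType1 = e.GetProperty("someType1").GetString(),
            someType2 = e.GetProperty("someType2").GetString()
        })
        // keep only objects that occur more than once
        .Where(g => g.Count() > 1)
        .Select(g => new
        {
            date = g.Key.date,
            someID = g.Key.someID,
            someType1 = g.Key.someType1,
            someType2 = g.Key.someType2,
            objCount = g.Count()
        })
        .ToList(); // materialize the groups before serializing

    // wrap the list in an object so the output has a root "result" property
    return JsonSerializer.Serialize(new { result },
        new JsonSerializerOptions { WriteIndented = true });
}

// Shortened, made-up sample: two identical objects and one unique object.
string sample = @"{ ""data"": [
  { ""date"": ""20220101"", ""someID"": ""ID1"", ""someType1"": ""SRVC"", ""someType2"": ""SEND"" },
  { ""date"": ""20220101"", ""someID"": ""ID1"", ""someType1"": ""SRVC"", ""someType2"": ""SEND"" },
  { ""date"": ""20220101"", ""someID"": ""ID2"", ""someType1"": ""SRVC"", ""someType2"": ""RECV"" }
] }";

Console.WriteLine(CountDuplicates(sample));
```

Drop the Where clause if objects that occur only once should also be listed with objCount = 1. GroupBy yields groups in the order their keys first appear, so the result follows the order of the input array.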
