How to filter out duplicates while unmarshalling json into struct?

Time:12-31

I have this JSON, which I am trying to unmarshal into my struct.

{
  "clientMetrics": [
    {
      "clientId": 951231,
      "customerData": {
        "Process": [
          "ABC"
        ],
        "Mat": [
          "KKK"
        ]
      },
      "legCustomer": [
        8773
      ]
    },
    {
      "clientId": 1234,
      "legCustomer": [
        8789
      ]
    },
    {
      "clientId": 3435,
      "otherIds": [
        4,
        32,
        19
      ],
      "legCustomer": [
        10005
      ]
    },
    {
      "clientId": 9981,
      "catId": 8,
      "legCustomer": [
        13769
      ]
    },
    {
      "clientId": 12124,
      "otherIds": [
        33,
        29
      ],
      "legCustomer": [
        12815
      ]
    },
    {
      "clientId": 8712,
      "customerData": {
        "Process": [
          "College"
        ]
      },
      "legCustomer": [
        951
      ]
    },
    {
      "clientId": 23214,
      "legCustomer": [
        12724,
        12727
      ]
    },
    {
      "clientId": 119812,
      "catId": 8,
      "legCustomer": [
        14519
      ]
    },
    {
      "clientId": 22315,
      "otherIds": [
        32
      ],
      "legCustomer": [
        12725,
        13993
      ]
    },
    {
      "clientId": 765121,
      "catId": 8,
      "legCustomer": [
        14523
      ]
    }
  ]
}

I used a struct-generation tool to produce the structs shown below:

type AutoGenerated struct {
    ClientMetrics []ClientMetrics `json:"clientMetrics"`
}
type CustomerData struct {
    Process []string `json:"Process"`
    Mat     []string `json:"Mat"`
}
type ClientMetrics struct {
    ClientID     int          `json:"clientId"`
    CustomerData CustomerData `json:"customerData,omitempty"`
    LegCustomer  []int        `json:"legCustomer"`
    OtherIds     []int        `json:"otherIds,omitempty"`
    CatID        int          `json:"catId,omitempty"`
}

Now my confusion is this: I have a lot of string and int arrays, so how can I filter out duplicates? I believe there is no set data type in Go, so how can I achieve the same thing here? Basically, when I unmarshal the JSON into my struct, I need to make sure no duplicates are present at all. Is there a way to do this? If so, can someone provide an example for the JSON above and show how I should design my structs?

Update

So basically I just change my struct definitions to use these types, and that's all? Will json.Unmarshal call UnmarshalJSON internally and take care of the duplicates? I will pass the JSON string and the structure to the JSONStringToStructure function.

func JSONStringToStructure(jsonString string, structure interface{}) error {
    jsonBytes := []byte(jsonString)
    return json.Unmarshal(jsonBytes, structure)
}

type UniqueStrings []string

func (u *UniqueStrings) UnmarshalJSON(in []byte) error {
    var arr []string
    if err := json.Unmarshal(in, &arr); err != nil {
        return err
    }
    *u = UniqueStrings(dedupStr(arr))
    return nil
}

func dedupStr(in []string) []string {
    seen := make(map[string]struct{})
    w := 0
    for i := range in {
        if _, s := seen[in[i]]; !s {
            seen[in[i]] = struct{}{}
            in[w] = in[i]
            w++
        }
    }
    return in[:w]
}

Answer:

Ideally, you should post-process these arrays to remove duplicates. However, you can achieve this during unmarshaling using a custom type with an unmarshaler:

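// Requires: import "encoding/json"

// UniqueStrings is a string slice that drops duplicates while unmarshaling.
type UniqueStrings []string

func (u *UniqueStrings) UnmarshalJSON(in []byte) error {
    // Decode into a plain []string so this method is not called recursively.
    var arr []string
    // Pass a pointer; json.Unmarshal returns an error for a non-pointer target.
    if err := json.Unmarshal(in, &arr); err != nil {
        return err
    }
    *u = UniqueStrings(dedupStr(arr))
    return nil
}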

where

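// dedupStr removes duplicates in place, keeping the first occurrence of each
// value and preserving the original order.
func dedupStr(in []string) []string {
    seen := make(map[string]struct{}) // the map acts as a set
    w := 0                            // next write position
    for i := range in {
        if _, s := seen[in[i]]; !s {
            seen[in[i]] = struct{}{}
            in[w] = in[i]
            w++
        }
    }
    return in[:w]
}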

You may use a similar approach for []int.

Then use the custom types in your structs:

type CustomerData struct {
    Process UniqueStrings `json:"Process"`
    Mat     UniqueStrings `json:"Mat"`
}