how to find most frequent integer in a slice of struct with Golang

Time:09-30

*** Disclaimer: I'm not a professional developer. I've been tinkering with Golang for about 8 months (Udemy, YouTube) and still have no idea how to solve a simple problem like the one below.

Here's a summary of the problem:

  • I'm trying to find the most frequent "age" in a slice of structs decoded from a .json file (each entry contains a string "name" and an integer "age").

  • After that I need to print the "name" of everyone whose "age" has the maximum occurrence frequency.

  • The printed names need to be sorted alphabetically.

Input (.json) :

[
{"name": "John","age": 15},
{"name": "Peter","age": 12},
{"name": "Roger","age": 12},
{"name": "Anne","age": 44},
{"name": "Mary","age": 15},
{"name": "Nancy","age": 15}
]

Output : ['John', 'Mary', 'Nancy'].

Explanation: Because the most frequent age in the data is 15 (it occurs 3 times), the output should be an array of strings with those three people's names, in this case ['John', 'Mary', 'Nancy'].

Exception :

  • If multiple "age" values share the same maximum occurrence count, I need to split the data and print each group as a separate array (e.g. when Anne's age is 12, the result is: ['John', 'Mary', 'Nancy'], ['Anne', 'Peter', 'Roger']).

This is what I've tried (in Golang):

package main

import (
    "encoding/json"
    "fmt"
    "os"
    "sort"
)
// 1. preparing the empty struct for .json
type Passanger struct {
    Name string `json:"name"`
    Age  int    `json:"age"`
}
func main() {
    // 2. load the json file
    content, err := os.ReadFile("passanger.json")
    if err != nil {
        fmt.Println(err.Error())
    }
    // 3. parse json file to slice
    var passangers []Passanger
    err2 := json.Unmarshal(content, &passangers)
    if err2 != nil {
        fmt.Println("Error JSON Unmarshalling")
        fmt.Println(err2.Error())
    }
    // 4. find most frequent age numbers (?)
    for _, v := range passangers {
        // this code only show the Age on every line
        fmt.Println(v.Age)
    }
    // 5. print the names and sort them alphabetically (?)
       // use the sort.Slice function
       // implement grouping by "max-occurrence age"
}

I've been stuck since yesterday. I've also tried to adapt solutions from coding challenge questions, like:

func majorityElement(arr []int) int {
    sort.Ints(arr)
    return arr[len(arr)/2]
}

but I'm still struggling to understand how to pass the "age" values from the Passanger slice as the integer input (arr []int) to the code above.

Another solution I've found online iterates through a map[int]int to find the maximum frequency:

func main(){
    arr := []int{90, 70, 30, 30, 10, 80, 40, 50, 40, 30}
    freq := make(map[int]int)
    for _ , num :=  range arr {
        freq[num] = freq[num] + 1
    }
    fmt.Println("Frequency of the Array is : ", freq)
}

but then again, the .json file contains not only integers (age) but also strings (name), and I still don't know how to handle "name" and "age" separately.

I really need some proper guidance here.

*** Here's the source code (main.go) and the .json file that I mentioned above: https://github.com/ariejanuarb/golang-json

CodePudding user response:

What to do before implementing a solution

During my first years of college, my teachers would always repeat something to me and my classmates: don't write code first, especially if you are a beginner. Follow these steps instead:

  • Write what you want to happen
  • Break the problem down into small steps
  • Write all scenarios and cases where they branch out
  • Write the input and output of each step (method/function signature)
  • Check that they fit each other

Let's follow these steps...

Write what you want to happen

You have defined your problem well, so I will skip this step.

Let's detail this further

  1. You have a passenger list
  2. You want to group the passengers by their age
  3. You want to find which age groups are the most common, i.e. which have the most passengers.
  4. You want to print the names in alphabetical order

Branching out

  • Scenario one: one group is bigger than all the others.
  • Scenario two: two or more groups have the same size and are bigger than the others.

There might be more scenarios, but they are yours to find.

Input and output

Now that we have found out what we must do, we are going to check the input and output of each step to achieve this goal.

The steps:

  1. You have a passenger list
  • input => none or filename (string)
  • output => []Passenger
  2. You want to group the passengers by their age
  • input => []Passenger // passenger list
  • output => map[int][]int or map[int][]*Passenger // ageGroups

The first type, the one inside the brackets, is the age shared by the whole group.

The second type is a list that contains either:

  • the passenger's position within the list
  • the address of the passenger object in memory

Which one you pick is not important, as long as we can easily retrieve the passenger from the list without iterating over it again.
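To make the two candidate output types concrete, here is a minimal sketch of both groupings. The function names groupIndexesByAge and groupPointersByAge are made up for this illustration and are not part of the answer's code; the Passenger struct is the one from the question with the spelling corrected.

```go
package main

// Passenger mirrors the struct decoded from the .json file.
type Passenger struct {
	Name string `json:"name"`
	Age  int    `json:"age"`
}

// groupIndexesByAge (hypothetical name) maps each age to the positions
// of the matching passengers inside the original slice.
func groupIndexesByAge(passengers []Passenger) map[int][]int {
	groups := make(map[int][]int)
	for i, p := range passengers {
		// append works on a missing key because its zero value is a nil slice
		groups[p.Age] = append(groups[p.Age], i)
	}
	return groups
}

// groupPointersByAge (hypothetical name) maps each age to pointers
// into the original slice.
func groupPointersByAge(passengers []Passenger) map[int][]*Passenger {
	groups := make(map[int][]*Passenger)
	for i := range passengers {
		p := &passengers[i] // address of the slice element, not of a loop copy
		groups[p.Age] = append(groups[p.Age], p)
	}
	return groups
}
```

One practical difference: the index version keeps working if the slice is copied, while the pointer version only stays valid as long as the original slice's backing array is the one you keep using.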

  3. You want to find which age groups are the most common, i.e. which have the most passengers.
  • input => groups (ageGroups)

Here scenarios 1 and 2 branch out, which means the output must either be valid for all scenarios or use a condition to tell them apart.

  • output for scenario 1 => most common age (int)
  • output for scenario 2 => most common ages ([]int)

We can see that the output for scenario 1 can be merged into the output of scenario 2: a single age is just a slice of length one.

  4. You want to print the names in alphabetical order of all passengers in an ageGroup

    • input => groups (map[int][]int), ages ([]int), passenger list ([]Passenger)
    • output => string or []byte, or nothing if you just print it

    To be honest, you can skip this one if you want to.
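If you do want a function for this last step, a rough sketch could look like the following. It assumes the index-based grouping (a list of positions per age) and produces the bracketed format shown in the question. The name formatAgeGroup is invented for this example.

```go
package main

import (
	"sort"
	"strings"
)

// Passenger mirrors the struct decoded from the .json file.
type Passenger struct {
	Name string `json:"name"`
	Age  int    `json:"age"`
}

// formatAgeGroup (hypothetical name) looks up the passengers at the given
// positions and returns their names, sorted, as "['A', 'B', 'C']".
func formatAgeGroup(passengers []Passenger, indexes []int) string {
	names := make([]string, 0, len(indexes))
	for _, i := range indexes {
		names = append(names, passengers[i].Name)
	}
	// sort the raw names first, then add the quotes, so the quote
	// characters never influence the ordering
	sort.Strings(names)
	for i, name := range names {
		names[i] = "'" + name + "'"
	}
	return "[" + strings.Join(names, ", ") + "]"
}
```

Returning a string instead of printing directly keeps the function easy to check by hand or in a test.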

After checking, time to code

let's create functions that fit our signature first

type Passenger struct {
    Name string `json:"name"`
    Age  int    `json:"age"`
}

func GetPassengerList() []Passenger {
    // 2. load the json file
    content, err := os.ReadFile("passanger.json")
    if err != nil {
        fmt.Println(err.Error())
        return nil // nothing to parse if the file could not be read
    }

    // 3. parse json file to slice
    var passengers []Passenger

    err2 := json.Unmarshal(content, &passengers)
    if err2 != nil {
        fmt.Println("Error JSON Unmarshalling")
        fmt.Println(err2.Error())
        return nil // do not continue with partially decoded data
    }

    return passengers
}

// 4a. Group by Age
func GroupByAge(passengers []Passenger) map[int][]int {
    group := make(map[int][]int, 0)

    for index, passenger := range passengers {
        ageGroup := group[passenger.Age]
        ageGroup = append(ageGroup, index)
        group[passenger.Age] = ageGroup
    }

    return group
}

// 4b. find the most frequent age numbers

func FindMostCommonAge(ageGroups map[int][]int) []int {
    mostFrequentAges := make([]int, 0)
    biggestGroupSize := 0

    // find most frequent age numbers
    for age, ageGroup := range ageGroups {
        // is most frequent age
        if biggestGroupSize < len(ageGroup) {
            biggestGroupSize = len(ageGroup)
            mostFrequentAges = []int{age}
        } else if biggestGroupSize == len(ageGroup) { // is one of the most frequent age
            mostFrequentAges = append(mostFrequentAges, age)
        }
        // is not one of the most frequent age so does nothing
    }

    return mostFrequentAges
}

func main() {
    passengers := GetPassengerList()

    // I am lazy but if you want you could sort the
    // smaller slice before printing to increase performance
    sort.Slice(passengers, func(i, j int) bool {
        if passengers[i].Age == passengers[j].Age {
            return passengers[i].Name < passengers[j].Name
        }
        return passengers[i].Age < passengers[j].Age
    })

    // age => []position
    // The length of each slice is the number of occurrences of that age
    ageGrouper := GroupByAge(passengers)

    mostFrequentAges := FindMostCommonAge(ageGrouper)

    // print the passenger
    for _, age := range mostFrequentAges {
        fmt.Println("{")
        for _, passengerIndex := range ageGrouper[age] {
            fmt.Println("\t", passengers[passengerIndex].Name)
        }
        fmt.Println("}")
    }
}
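One thing to keep in mind with the code above: Go does not specify the iteration order of a map, so when two ages tie, the groups can come out in a different order from one run to the next. Below is a sketch of the whole pipeline that keeps the order stable. It makes two assumptions that are not in the original answer: groups are ordered by where each age first appears in the input (which happens to match the example in the question), and the JSON is passed in as bytes rather than read from a file. The name mostFrequentAgeGroups is made up.

```go
package main

import (
	"encoding/json"
	"sort"
)

// Passenger mirrors the struct decoded from the .json file.
type Passenger struct {
	Name string `json:"name"`
	Age  int    `json:"age"`
}

// mostFrequentAgeGroups (hypothetical name) decodes the JSON and returns one
// sorted name list per most frequent age, ordered by first appearance of
// each age in the input.
func mostFrequentAgeGroups(data []byte) ([][]string, error) {
	var passengers []Passenger
	if err := json.Unmarshal(data, &passengers); err != nil {
		return nil, err // let the caller decide how to report the error
	}

	namesByAge := make(map[int][]string)
	var ageOrder []int // ages in order of first appearance
	for _, p := range passengers {
		if _, seen := namesByAge[p.Age]; !seen {
			ageOrder = append(ageOrder, p.Age)
		}
		namesByAge[p.Age] = append(namesByAge[p.Age], p.Name)
	}

	// size of the biggest group
	biggest := 0
	for _, names := range namesByAge {
		if len(names) > biggest {
			biggest = len(names)
		}
	}

	// walk the ages in a fixed order instead of ranging over the map
	var result [][]string
	for _, age := range ageOrder {
		names := namesByAge[age]
		if len(names) == biggest {
			sort.Strings(names)
			result = append(result, names)
		}
	}
	return result, nil
}
```

If you prefer ascending ages instead, you could sort ageOrder with sort.Ints before the last loop.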


CodePudding user response:

It shouldn't be any more complicated than:

  • Sort the source slice by age and name
  • Break it up into sequences with a common age, and
  • As you go along, track the most common

Something like this:

https://goplay.tools/snippet/6pCpkTEaDXN

type Person struct {
    Age  int
    Name string
}

func MostCommonAge(persons []Person) (mostCommonAge int, names []string) {
  
  sorted := make([]Person, len(persons))
  copy(sorted, persons)
  
  // sort the slice by age and then by name
  sort.Slice(sorted, func(x, y int) bool {
    left, right := sorted[x], sorted[y]
    
    switch {
    case left.Age < right.Age:
      return true
    case left.Age > right.Age:
      return false
    default:
      return left.Name < right.Name
    }
  })

  updateMostCommonAge := func(seq []Person) (int, []string) {
    
    if len(seq) > len(names) {
      
      buf := make([]string, len(seq))
      for i := 0; i < len(seq); i++ {
        buf[i] = seq[i].Name
      }
      
      mostCommonAge = seq[0].Age
      names = buf
      
    }

    return mostCommonAge, names
  
  }

  for lo, hi := 0, 0; lo < len(sorted); lo = hi {
    
    for hi = lo; hi < len(sorted) && sorted[lo].Age == sorted[hi].Age; {
      hi++
    }
    
    mostCommonAge, names = updateMostCommonAge(sorted[lo:hi])
    
  }

  return mostCommonAge, names
}
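Note that the function above returns a single age, so when two ages tie, only the first run found (the lower age) is reported. The question also asks for every tied group, so here is one possible variation of the same sort-and-scan idea that keeps all of them. The name MostCommonAges and the two-slice return shape are my own choices for this sketch.

```go
package main

import "sort"

type Person struct {
	Age  int
	Name string
}

// MostCommonAges (hypothetical name) returns every age tied for the highest
// count, in ascending order, with the sorted names for each age.
func MostCommonAges(persons []Person) (ages []int, names [][]string) {
	// work on a copy so the caller's slice keeps its order
	sorted := make([]Person, len(persons))
	copy(sorted, persons)
	sort.Slice(sorted, func(x, y int) bool {
		if sorted[x].Age != sorted[y].Age {
			return sorted[x].Age < sorted[y].Age
		}
		return sorted[x].Name < sorted[y].Name
	})

	best := 0
	for lo, hi := 0, 0; lo < len(sorted); lo = hi {
		// advance hi to the end of the run that shares sorted[lo].Age
		hi = lo
		for hi < len(sorted) && sorted[hi].Age == sorted[lo].Age {
			hi++
		}
		size := hi - lo
		if size < best {
			continue
		}
		if size > best { // a new maximum discards the earlier, smaller runs
			best, ages, names = size, nil, nil
		}
		run := make([]string, 0, size)
		for _, p := range sorted[lo:hi] {
			run = append(run, p.Name)
		}
		ages = append(ages, sorted[lo].Age)
		names = append(names, run)
	}
	return ages, names
}
```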

Another approach uses more memory, but is a little simpler. Here, we build a map of names by age and then iterate over it to find the key with the longest list of names.

https://goplay.tools/snippet/_zmMys516IM

type Person struct {
    Age  int
    Name string
}

func MostCommonAge(persons []Person) (mostCommonAge int, names []string) {
    namesByAge := map[int][]string{}

    for _, p := range persons {
        value, found := namesByAge[p.Age]
        if !found {
            value = []string{}
        }
        namesByAge[p.Age] = append(value, p.Name)
    }

    for age, nameList := range namesByAge {
        if len(nameList) > len(names) {
            mostCommonAge, names = age, nameList
        }
    }

    return mostCommonAge, names
}
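Two caveats about the map version as written: the names come back in input order rather than alphabetical order, and because Go does not specify map iteration order, a tie between two ages can be resolved differently on different runs. A small sketch that addresses both, assuming you are happy to break ties in favour of the lowest age (that rule is my assumption, not something the question states). The name MostCommonAgeDeterministic is invented here.

```go
package main

import "sort"

type Person struct {
	Age  int
	Name string
}

// MostCommonAgeDeterministic (hypothetical name) returns the most common age
// and its names sorted alphabetically. On a tie, the lowest age wins.
func MostCommonAgeDeterministic(persons []Person) (mostCommonAge int, names []string) {
	namesByAge := map[int][]string{}
	for _, p := range persons {
		// append on a missing key works because the zero value is a nil slice
		namesByAge[p.Age] = append(namesByAge[p.Age], p.Name)
	}

	for age, nameList := range namesByAge {
		bigger := len(nameList) > len(names)
		tiedButLower := len(nameList) == len(names) && age < mostCommonAge
		if bigger || tiedButLower {
			mostCommonAge, names = age, nameList
		}
	}

	sort.Strings(names)
	return mostCommonAge, names
}
```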