Home > database >  Is it possible to find duplicate characters in a random unicode string using the bitwise operations?
Is it possible to find duplicate characters in a random unicode string using the bitwise operations?

Time:06-01

I was looking for a solution to find duplicate characters in a string and I was interested in a solution with bitwise operations.

I found such a variant with bitwise operations. But in it, the search occurs in the range a-z of the ASCII table.

func HasDuplicates(str string) (string, bool) {
    checker := 0
    for _, char := range str {
        val := char - 'a'
        fmt.Println(val)
        if (checker & (1 << val)) > 0 {
            fmt.Printf("'%c' is Duplicate\n", char)
            return str, false
        }
        checker |= 1 << val
    }
    return str, true
}

Is it possible to make a universal solution, like the example above, only for a random unicode string (hieroglyphs, emoji, etc.)?

CodePudding user response:

Use a big.Int as a bitset:

func HasDuplicates(str string) (string, bool) {
    var bits big.Int
    for _, char := range str {
        val := int(char)
        fmt.Println(val)
        if bits.Bit(val) != 0 {
            fmt.Printf("'%c' is Duplicate\n", char)
            return str, false
        }
        bits.SetBit(&bits, val, 1)
    }
    return str, true
}

https://go.dev/play/p/kS-OxYPts5G

How efficient this is will depend on the implementation of big.Int, you're not in control of that the way you are when using bitwise operations on a simple integer.

You could also use a map of booleans, although then it wouldn't be bitwise operations any more:

func HasDuplicates(str string) (string, bool) {
    var bits = make(map[int]bool)
    for _, char := range str {
        val := int(char)
        fmt.Println(val)
        if bits[val] {
            fmt.Printf("'%c' is Duplicate\n", char)
            return str, false
        }
        bits[val] = true
    }
    return str, true
}
  • Related