I am working on a "simple" base converter that converts a ULong (base 10) to a String in an arbitrary base. Here I use 64 characters. The use case is to shorten ULong values that are stored as strings anyway.
Public Class BaseConverter
    'base64, but any length would work
    Private Shared ReadOnly Characters() As Char = {"0"c, "1"c, "2"c, "3"c, "4"c, "5"c, "6"c, "7"c, "8"c, "9"c,
        "a"c, "b"c, "c"c, "d"c, "e"c, "f"c, "g"c, "h"c, "i"c, "j"c, "k"c, "l"c, "m"c, "n"c, "o"c, "p"c, "q"c, "r"c, "s"c, "t"c, "u"c, "v"c, "w"c, "x"c, "y"c, "z"c,
        "A"c, "B"c, "C"c, "D"c, "E"c, "F"c, "G"c, "H"c, "I"c, "J"c, "K"c, "L"c, "M"c, "N"c, "O"c, "P"c, "Q"c, "R"c, "S"c, "T"c, "U"c, "V"c, "W"c, "X"c, "Y"c, "Z"c,
        "+"c, "-"c}

    Public Shared Function Encode(number As ULong) As String
        Dim buffer = New Text.StringBuilder()
        Dim quotient = number
        Dim remainder As ULong
        Dim base = Convert.ToUInt64(Characters.LongLength)
        Do
            'peel off the least significant digit and prepend it
            remainder = quotient Mod base
            quotient = quotient \ base
            buffer.Insert(0, Characters(remainder).ToString())
        Loop While quotient <> 0
        Return buffer.ToString()
    End Function

    Public Shared Function Decode(str As String) As ULong
        If String.IsNullOrWhiteSpace(str) Then Return 0
        Dim result As ULong = 0
        Dim base = Convert.ToUInt64(Characters.LongLength)
        Dim nPos As ULong = 0
        'walk from the last character (position 0) to the first
        For i As Integer = str.Length - 1 To 0 Step -1
            Dim cPos As Integer = Array.IndexOf(Of Char)(Characters, str(i))
            result += (base ^ nPos) * Convert.ToUInt64(cPos)
            nPos += 1
        Next
        Return result
    End Function
End Class
It works fine with numbers of up to 16 digits, but with more digits the decoded values start to be rounded strangely.
17 digits result in jumps of 8, 18 digits result in jumps of 32, and 19 digits result in jumps of 256.
For example, the numbers around 42347959784570944 do not round-trip and collapse into blocks (number > encoded > decoded):
42347959784570939 > 2msPWXX00X > 42347959784570936 ERROR
42347959784570940 > 2msPWXX00Y > 42347959784570944 ERROR
42347959784570941 > 2msPWXX00Z > 42347959784570944 ERROR
42347959784570942 > 2msPWXX00+ > 42347959784570944 ERROR
42347959784570943 > 2msPWXX00- > 42347959784570944 ERROR
42347959784570944 > 2msPWXX010 > 42347959784570944
42347959784570945 > 2msPWXX011 > 42347959784570944 ERROR
42347959784570946 > 2msPWXX012 > 42347959784570944 ERROR
42347959784570947 > 2msPWXX013 > 42347959784570944 ERROR
42347959784570948 > 2msPWXX014 > 42347959784570944 ERROR
42347959784570949 > 2msPWXX015 > 42347959784570952 ERROR
The problem has to be in the Decode() function, since the encoded strings are all distinct while the decoded values are not.
I put some tests on https://dotnetfiddle.net/7wAGLh, but I just can't find the issue.
Answer:
The exponent operator, ^, always returns a Double. This means the entire expression (base ^ nPos) * Convert.ToUInt64(cPos) is evaluated and returned as a Double (and then silently crammed into a ULong). This brings in the Double imprecision that you are observing.
Using Option Strict On at all times is a good way to catch these errors.
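To make the precision loss concrete, here is a minimal sketch that pushes one of the failing values from the question through a Double and back. The module name DoublePrecisionDemo is made up for illustration. A Double has a 53-bit significand, so between 2^55 and 2^56 it can only represent multiples of 8, which matches the "jumps of 8" seen for 17-digit numbers.

```vbnet
Option Strict On
Imports System

Module DoublePrecisionDemo
    Sub Main()
        Dim original As ULong = 42347959784570941UL

        ' Explicit round trip through Double: ULong -> Double -> ULong.
        Dim viaDouble As Double = CDbl(original)
        Dim roundTripped As ULong = CULng(viaDouble)

        Console.WriteLine(original)      ' 42347959784570941
        Console.WriteLine(roundTripped)  ' 42347959784570944 (nearest multiple of 8)

        ' With Option Strict On the line below does not compile, because ^
        ' yields a Double and narrowing it to ULong has to be written explicitly:
        ' Dim r As ULong = (64UL ^ 2UL) * 5UL
    End Sub
End Module
```

With Option Strict Off the commented-out line compiles without complaint, which is how the Double sneaks into Decode() unnoticed.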
Provided that you know that (base ^ nPos) will not exceed the maximum value of ULong (which seems to be the assumption anyway), the fix is:
result += CULng(base ^ nPos) * Convert.ToUInt64(cPos)
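As an alternative, the exponent operator can be avoided altogether by accumulating the result with integer arithmetic only (Horner's scheme). The following is just a sketch of that idea, not the code from the question: the module name HornerDecodeSketch, the function DecodeInteger and its alphabet parameter are made up for illustration, and the example call uses a hexadecimal alphabet to keep the expected value easy to check.

```vbnet
Option Strict On
Imports System

Module HornerDecodeSketch
    ' Hypothetical helper: decodes text in the given alphabet using
    ' ULong arithmetic only, so no Double is ever involved.
    Function DecodeInteger(text As String, alphabet As String) As ULong
        Dim radix As ULong = CULng(alphabet.Length)
        Dim result As ULong = 0UL

        ' Read the digits from most significant to least significant.
        For Each ch As Char In text
            Dim digit As Integer = alphabet.IndexOf(ch)
            If digit < 0 Then
                Throw New FormatException("Unexpected character: " & ch)
            End If
            ' Horner's scheme: shift the accumulated value by one digit
            ' position, then add the new digit.
            result = result * radix + CULng(digit)
        Next

        Return result
    End Function

    Sub Main()
        Console.WriteLine(DecodeInteger("ff", "0123456789abcdef")) ' 255
    End Sub
End Module
```

Because every intermediate value stays a ULong, nothing is rounded. With the default project setting that keeps integer overflow checks enabled, an input that is too long for a ULong should raise an OverflowException rather than silently produce a wrong value.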