Consider the following code:
BigInteger value = BigInteger.Parse("9876543210987654321098765432109876543210");
Console.WriteLine(value.ToString().Length); // 40
Is there a more efficient way to get the length of the number without calling ToString
?
CodePudding user response:
Anytime a question asks about "performance", it is always necessary to benchmark the approaches. As madreflection pointed out, there is both cpu and memory costs to consider, so I ran benchmarks targeting .NET 6 with a memory diagnoser. The results are fascinating and Joel's answer performs very poorly compared to the highly optimized ToString()
and Log10
methods of BigInteger
.
Benchmark Code:
[SimpleJob(RuntimeMoniker.Net60)]
[MemoryDiagnoser]
public class BigIntegerBenchmarks
{
private static BigInteger _value = BigInteger.Parse("9876543210987654321098765432109876543210");
[Benchmark(Baseline = true)]
public int ToStringBenchmark()
{
return _value.ToString().Length;
}
[Benchmark]
public int GetLengthDivisionBenchmark()
{
return GetLengthDivision(_value);
}
[Benchmark]
public int GetLengthMultiplicationBenchmark()
{
return GetLengthMultiplication(_value);
}
[Benchmark]
public int GetLengthAbsPowBenchmark()
{
return GetLengthAbsPow(_value);
}
[Benchmark]
public int BigIntegerLog()
{
return (int)Math.Round(BigInteger.Log10(_value), MidpointRounding.AwayFromZero);
}
[Benchmark]
public int CountDigitsBenchmark()
{
return CountDigits(_value);
}
static int GetLengthDivision(BigInteger value)
{
int result = 1;
while (value >= 10)
{
result ;
value /= 10;
}
return result;
}
static int GetLengthMultiplication(BigInteger value)
{
BigInteger power = 10;
int result = 1;
while (power < value)
{
result ;
power *= 10;
}
return result;
}
static int GetLengthAbsPow(BigInteger value)
{
int result = 1;
BigInteger ten = new(10);
BigInteger absValue = BigInteger.Abs(value);
while (BigInteger.Pow(ten, result) < absValue)
{
result ;
}
return result;
}
internal static int CountDigits(BigInteger value)
{
int num = 1;
BigInteger num2;
if (value >= 10000000)
{
if (value >= 100000000000000L)
{
num2 = value / 100000000000000;
num = 14;
while (num2 >= 10)
{
num ;
num2 /= 10;
}
}
else
{
num2 = value / 10000000;
num = 7;
}
}
else
{
num2 = (uint)value;
}
if (num2 >= 10)
{
num = ((num2 < 100) ? (num 1) : ((num2 < 1000) ? (num 2) : ((num2 < 10000) ? (num 3) : ((num2 < 100000) ? (num 4) : ((num2 >= 1000000) ? (num 6) : (num 5))))));
}
return num;
}
}
Results:
BenchmarkDotNet=v0.13.1, OS=Windows 10.0.19043.1826 (21H1/May2021Update)
Intel Core i7-8565U CPU 1.80GHz (Whiskey Lake), 1 CPU, 8 logical and 4 physical cores
.NET SDK=7.0.100-preview.4.22252.9
[Host] : .NET Core 3.1.24 (CoreCLR 4.700.22.16002, CoreFX 4.700.22.17909), X64 RyuJIT
.NET 6.0 : .NET 6.0.5 (6.0.522.21309), X64 RyuJIT
Job=.NET 6.0 Runtime=.NET 6.0
Method | Mean | Error | StdDev | Ratio | RatioSD | Gen 0 | Allocated |
---|---|---|---|---|---|---|---|
ToStringBenchmark | 160.96 ns | 3.200 ns | 2.837 ns | 1.00 | 0.00 | 0.0668 | 280 B |
GetLengthDivisionBenchmark | 1,939.93 ns | 35.299 ns | 31.292 ns | 12.06 | 0.28 | 0.3052 | 1,288 B |
GetLengthMultiplicationBenchmark | 1,330.02 ns | 21.522 ns | 20.131 ns | 8.26 | 0.17 | 0.5569 | 2,336 B |
GetLengthAbsPowBenchmark | 10,928.05 ns | 123.547 ns | 115.566 ns | 67.89 | 1.25 | 3.4637 | 14,520 B |
BigIntegerLog | 58.34 ns | 1.179 ns | 0.985 ns | 0.36 | 0.01 | - | - |
CountDigitsBenchmark | 1,018.92 ns | 9.688 ns | 8.589 ns | 6.33 | 0.12 | 0.1774 | 744 B |
Notes:
- CountDigits is a clever method from System.Buffers.Text.FormattingHelpers that I adapted for BigInteger. It performs significantly better than Joel's methods, but far worse than either ToString or Log10.
- Matthew's adapted method was 5-10x worse than Joel's method, showing just how important benchmarking code can be when selecting an algorithm.
Conclusion:
BigInteger.Log10()
seems to be by far the fastest method and causes zero memory allocations, although the need to round concerns me. You would have to determine if that rounding method is acceptable to you and worth the risk.
In addition, it is worth noting that the dotnet core team and open source community continue to heavily optimize the framework apis and runtime. It is critical to prove that they do in fact have a slow algorithm via benchmarking before rolling your own solution.
To Theodor Zoulias' point, the benchmarks against a single value are incomplete.
As a result, I ran additional benchmarks against BigInteger values ranging from 10 digits to 200 digits. In all cases, the winner was consistently the Log10
implementation. For very small numbers (10 digits) the CountDigits
implementation beats ToString
but not Log10
.
Benchmark Code:
[SimpleJob(RuntimeMoniker.Net60)]
[MemoryDiagnoser]
public class BigIntegerBenchmarks
{
public IEnumerable<BigInteger> Values()
{
// 10
yield return BigInteger.Parse("1234512345");
// 20
yield return BigInteger.Parse("12345123451234512345");
// 30
yield return BigInteger.Parse("123451234512345123451234512345");
// 40
yield return BigInteger.Parse("1234512345123451234512345123451234512345");
// 50
yield return BigInteger.Parse("12345123451234512345123451234512345123451234512345");
// 75
yield return BigInteger.Parse("123451234512345123451234512345123451234512345123451234512345123451234512345");
// 100
yield return BigInteger.Parse("1234512345123451234512345123451234512345123451234512345123451234512345123451234512345123451234512345");
// 150
yield return BigInteger.Parse("123451234512345123451234512345123451234512345123451234512345123451234512345123451234512345123451234512345123451234512345123451234512345123451234512345");
// 200
yield return BigInteger.Parse("12345123451234512345123451234512345123451234512345123451234512345123451234512345123451234512345123451234512345123451234512345123451234512345123451234512345123451234512345123451234512345123451234512345");
}
[Benchmark(Baseline = true)]
[ArgumentsSource(nameof(Values))]
public int ToStringBenchmark(BigInteger value)
{
return value.ToString().Length;
}
[Benchmark]
[ArgumentsSource(nameof(Values))]
public int GetLengthDivisionBenchmark(BigInteger value)
{
return GetLengthDivision(value);
}
[Benchmark]
[ArgumentsSource(nameof(Values))]
public int GetLengthMultiplicationBenchmark(BigInteger value)
{
return GetLengthMultiplication(value);
}
[Benchmark]
[ArgumentsSource(nameof(Values))]
public int GetLengthAbsPowBenchmark(BigInteger value)
{
return GetLengthAbsPow(value);
}
[Benchmark]
[ArgumentsSource(nameof(Values))]
public int BigIntegerLog(BigInteger value)
{
return (int)Math.Floor(BigInteger.Log10(value) 1);
}
[Benchmark]
[ArgumentsSource(nameof(Values))]
public int CountDigitsBenchmark(BigInteger value)
{
return CountDigits(value);
}
static int GetLengthDivision(BigInteger value)
{
int result = 1;
while (value >= 10)
{
result ;
value /= 10;
}
return result;
}
static int GetLengthMultiplication(BigInteger value)
{
BigInteger power = 10;
int result = 1;
while (power < value)
{
result ;
power *= 10;
}
return result;
}
static int GetLengthAbsPow(BigInteger value)
{
int result = 1;
BigInteger ten = new(10);
BigInteger absValue = BigInteger.Abs(value);
while (BigInteger.Pow(ten, result) < absValue)
{
result ;
}
return result;
}
internal static int CountDigits(BigInteger value)
{
int num = 1;
BigInteger num2;
if (value >= 10000000)
{
if (value >= 100000000000000L)
{
num2 = value / 100000000000000;
num = 14;
while (num2 >= 10)
{
num ;
num2 /= 10;
}
}
else
{
num2 = value / 10000000;
num = 7;
}
}
else
{
num2 = (uint)value;
}
if (num2 >= 10)
{
num = ((num2 < 100) ? (num 1) : ((num2 < 1000) ? (num 2) : ((num2 < 10000) ? (num 3) : ((num2 < 100000) ? (num 4) : ((num2 >= 1000000) ? (num 6) : (num 5))))));
}
return num;
}
}
Results:
BenchmarkDotNet=v0.13.1, OS=Windows 10.0.19043.1826 (21H1/May2021Update)
Intel Core i7-8565U CPU 1.80GHz (Whiskey Lake), 1 CPU, 8 logical and 4 physical cores
.NET SDK=7.0.100-preview.4.22252.9
[Host] : .NET Core 3.1.24 (CoreCLR 4.700.22.16002, CoreFX 4.700.22.17909), X64 RyuJIT
.NET 6.0 : .NET 6.0.5 (6.0.522.21309), X64 RyuJIT
Job=.NET 6.0 Runtime=.NET 6.0
Method | value | Mean | Error | StdDev | Median | Ratio | RatioSD | Gen 0 | Allocated |
---|---|---|---|---|---|---|---|---|---|
ToStringBenchmark | 1234512345 | 53.48 ns | 1.085 ns | 1.333 ns | 53.10 ns | 1.00 | 0.00 | 0.0114 | 48 B |
GetLengthDivisionBenchmark | 1234512345 | 160.13 ns | 2.436 ns | 2.991 ns | 159.68 ns | 3.00 | 0.09 | - | - |
GetLengthMultiplicationBenchmark | 1234512345 | 130.25 ns | 2.035 ns | 1.904 ns | 129.94 ns | 2.42 | 0.07 | 0.0076 | 32 B |
GetLengthAbsPowBenchmark | 1234512345 | 1,602.62 ns | 14.585 ns | 13.643 ns | 1,603.14 ns | 29.79 | 0.76 | 0.3567 | 1,496 B |
BigIntegerLog | 1234512345 | 23.52 ns | 0.148 ns | 0.123 ns | 23.53 ns | 0.44 | 0.01 | - | - |
CountDigitsBenchmark | 1234512345 | 30.82 ns | 0.523 ns | 1.274 ns | 30.50 ns | 0.60 | 0.03 | - | - |
ToStringBenchmark | 12345123451234512345 | 96.01 ns | 1.962 ns | 1.927 ns | 95.17 ns | 1.00 | 0.00 | 0.0440 | 184 B |
GetLengthDivisionBenchmark | 12345123451234512345 | 609.47 ns | 11.889 ns | 9.928 ns | 608.20 ns | 6.32 | 0.13 | 0.0763 | 320 B |
GetLengthMultiplicationBenchmark | 12345123451234512345 | 568.80 ns | 8.009 ns | 7.100 ns | 566.88 ns | 5.91 | 0.15 | 0.1717 | 720 B |
GetLengthAbsPowBenchmark | 12345123451234512345 | 4,795.50 ns | 74.831 ns | 62.487 ns | 4,796.57 ns | 49.75 | 1.01 | 1.0910 | 4,584 B |
BigIntegerLog | 12345123451234512345 | 43.85 ns | 0.565 ns | 0.441 ns | 43.79 ns | 0.46 | 0.01 | - | - |
CountDigitsBenchmark | 12345123451234512345 | 156.70 ns | 1.055 ns | 0.881 ns | 156.89 ns | 1.63 | 0.04 | 0.0153 | 64 B |
ToStringBenchmark | 12345(...)12345 [30] | 130.06 ns | 0.931 ns | 0.825 ns | 130.38 ns | 1.00 | 0.00 | 0.0572 | 240 B |
GetLengthDivisionBenchmark | 12345(...)12345 [30] | 1,196.23 ns | 8.965 ns | 7.486 ns | 1,199.23 ns | 9.20 | 0.08 | 0.1888 | 792 B |
GetLengthMultiplicationBenchmark | 12345(...)12345 [30] | 926.82 ns | 13.441 ns | 11.915 ns | 927.25 ns | 7.13 | 0.12 | 0.3557 | 1,488 B |
GetLengthAbsPowBenchmark | 12345(...)12345 [30] | 9,131.78 ns | 324.146 ns | 955.750 ns | 8,980.04 ns | 61.96 | 2.28 | 2.1362 | 8,944 B |
BigIntegerLog | 12345(...)12345 [30] | 43.73 ns | 0.673 ns | 0.597 ns | 43.61 ns | 0.34 | 0.01 | - | - |
CountDigitsBenchmark | 12345(...)12345 [30] | 546.94 ns | 10.446 ns | 11.611 ns | 544.76 ns | 4.22 | 0.08 | 0.0706 | 296 B |
ToStringBenchmark | 12345(...)12345 [40] | 161.58 ns | 1.509 ns | 1.338 ns | 161.86 ns | 1.00 | 0.00 | 0.0668 | 280 B |
GetLengthDivisionBenchmark | 12345(...)12345 [40] | 1,900.12 ns | 37.730 ns | 33.446 ns | 1,896.37 ns | 11.76 | 0.24 | 0.2956 | 1,240 B |
GetLengthMultiplicationBenchmark | 12345(...)12345 [40] | 1,399.98 ns | 27.838 ns | 77.602 ns | 1,370.01 ns | 9.27 | 0.28 | 0.5569 | 2,336 B |
GetLengthAbsPowBenchmark | 12345(...)12345 [40] | 12,315.16 ns | 143.022 ns | 111.662 ns | 12,321.32 ns | 76.22 | 0.98 | 3.4637 | 14,520 B |
BigIntegerLog | 12345(...)12345 [40] | 59.83 ns | 2.957 ns | 8.625 ns | 57.83 ns | 0.34 | 0.02 | - | - |
CountDigitsBenchmark | 12345(...)12345 [40] | 1,087.25 ns | 18.482 ns | 17.288 ns | 1,086.55 ns | 6.74 | 0.11 | 0.1678 | 704 B |
ToStringBenchmark | 12345(...)12345 [50] | 197.47 ns | 1.815 ns | 1.417 ns | 197.59 ns | 1.00 | 0.00 | 0.0763 | 320 B |
GetLengthDivisionBenchmark | 12345(...)12345 [50] | 2,666.97 ns | 22.475 ns | 21.023 ns | 2,665.74 ns | 13.51 | 0.16 | 0.4196 | 1,768 B |
GetLengthMultiplicationBenchmark | 12345(...)12345 [50] | 1,750.60 ns | 17.269 ns | 13.483 ns | 1,749.85 ns | 8.87 | 0.11 | 0.7782 | 3,256 B |
GetLengthAbsPowBenchmark | 12345(...)12345 [50] | 15,840.75 ns | 315.818 ns | 569.485 ns | 15,549.61 ns | 79.38 | 2.94 | 5.0964 | 21,360 B |
BigIntegerLog | 12345(...)12345 [50] | 44.87 ns | 0.809 ns | 0.756 ns | 45.09 ns | 0.23 | 0.00 | - | - |
CountDigitsBenchmark | 12345(...)12345 [50] | 1,626.71 ns | 27.600 ns | 23.047 ns | 1,623.62 ns | 8.24 | 0.14 | 0.2747 | 1,152 B |
ToStringBenchmark | 12345(...)12345 [75] | 306.08 ns | 4.371 ns | 3.650 ns | 305.57 ns | 1.00 | 0.00 | 0.1030 | 432 B |
GetLengthDivisionBenchmark | 12345(...)12345 [75] | 5,192.75 ns | 34.686 ns | 30.749 ns | 5,195.60 ns | 16.96 | 0.17 | 0.7629 | 3,208 B |
GetLengthMultiplicationBenchmark | 12345(...)12345 [75] | 3,988.09 ns | 271.305 ns | 799.948 ns | 3,794.99 ns | 10.89 | 2.50 | 1.4191 | 5,944 B |
GetLengthAbsPowBenchmark | 12345(...)12345 [75] | 33,948.30 ns | 1,493.620 ns | 4,403.973 ns | 32,639.63 ns | 132.92 | 8.45 | 10.4370 | 43,848 B |
BigIntegerLog | 12345(...)12345 [75] | 43.95 ns | 0.908 ns | 1.360 ns | 43.35 ns | 0.14 | 0.00 | - | - |
CountDigitsBenchmark | 12345(...)12345 [75] | 3,529.34 ns | 68.576 ns | 60.791 ns | 3,531.11 ns | 11.54 | 0.20 | 0.5836 | 2,456 B |
ToStringBenchmark | 1234(...)2345 [100] | 552.12 ns | 32.468 ns | 95.732 ns | 533.47 ns | 1.00 | 0.00 | 0.1316 | 552 B |
GetLengthDivisionBenchmark | 1234(...)2345 [100] | 9,353.61 ns | 278.178 ns | 820.215 ns | 9,213.45 ns | 17.62 | 4.21 | 1.1902 | 5,000 B |
GetLengthMultiplicationBenchmark | 1234(...)2345 [100] | 3,949.08 ns | 49.055 ns | 45.886 ns | 3,945.01 ns | 8.32 | 1.34 | 2.1667 | 9,064 B |
GetLengthAbsPowBenchmark | 1234(...)2345 [100] | 40,856.02 ns | 442.861 ns | 414.252 ns | 40,816.28 ns | 86.03 | 13.74 | 17.7002 | 74,120 B |
BigIntegerLog | 1234(...)2345 [100] | 42.35 ns | 0.558 ns | 0.436 ns | 42.26 ns | 0.09 | 0.02 | - | - |
CountDigitsBenchmark | 1234(...)2345 [100] | 6,511.54 ns | 60.303 ns | 56.407 ns | 6,513.08 ns | 13.70 | 2.11 | 0.9766 | 4,112 B |
ToStringBenchmark | 1234(...)2345 [150] | 734.17 ns | 11.671 ns | 9.746 ns | 735.48 ns | 1.00 | 0.00 | 0.1831 | 768 B |
GetLengthDivisionBenchmark | 1234(...)2345 [150] | 18,963.26 ns | 291.798 ns | 272.948 ns | 18,927.82 ns | 25.76 | 0.43 | 2.2278 | 9,384 B |
GetLengthMultiplicationBenchmark | 1234(...)2345 [150] | 7,402.45 ns | 146.064 ns | 326.694 ns | 7,373.84 ns | 9.63 | 0.46 | 4.0283 | 16,848 B |
GetLengthAbsPowBenchmark | 1234(...)2345 [150] | 96,553.68 ns | 3,203.319 ns | 9,344.232 ns | 94,618.21 ns | 124.75 | 3.00 | 37.7197 | 157,928 B |
BigIntegerLog | 1234(...)2345 [150] | 43.73 ns | 0.847 ns | 0.792 ns | 43.57 ns | 0.06 | 0.00 | - | - |
CountDigitsBenchmark | 1234(...)2345 [150] | 13,799.37 ns | 181.226 ns | 169.519 ns | 13,831.45 ns | 18.75 | 0.27 | 1.9531 | 8,184 B |
ToStringBenchmark | 1234(...)2345 [200] | 1,170.38 ns | 22.340 ns | 18.655 ns | 1,167.00 ns | 1.00 | 0.00 | 0.2365 | 992 B |
GetLengthDivisionBenchmark | 1234(...)2345 [200] | 30,656.86 ns | 493.053 ns | 437.079 ns | 30,659.95 ns | 26.21 | 0.70 | 3.5400 | 14,896 B |
GetLengthMultiplicationBenchmark | 1234(...)2345 [200] | 9,459.47 ns | 115.420 ns | 102.317 ns | 9,419.28 ns | 8.09 | 0.17 | 6.3477 | 26,600 B |
GetLengthAbsPowBenchmark | 1234(...)2345 [200] | 124,083.76 ns | 2,093.310 ns | 1,748.008 ns | 124,681.07 ns | 106.03 | 1.39 | 65.1855 | 272,776 B |
BigIntegerLog | 1234(...)2345 [200] | 50.15 ns | 1.029 ns | 1.338 ns | 49.84 ns | 0.04 | 0.00 | - | - |
CountDigitsBenchmark | 1234(...)2345 [200] | 26,047.64 ns | 520.981 ns | 1,452.286 ns | 25,753.68 ns | 24.32 | 0.90 | 3.2043 | 13,416 B |
CodePudding user response:
There's this:
static int GetLength(this BigInteger value)
{
int result = 1;
while (value >= 10)
{
result ;
value /= 10;
}
return result;
}
I know it seems like the loop could be slow, but it's still probably faster than ToString()
, which thanks to localization/cultural issues is probably a LOT slower even than you'd expect.
Or we can invert it to use multiplication and count up, which might be faster depending on the instruction set of the computer, even though it needs an additional variable:
static int GetLength(this BigInteger value)
{
BigInteger power = 10;
int result = 1;
while (power < value)
{
result ;
power *= 10;
}
return result;
}
Finally, I need to point out both of these assume positive values. I feel this is justified, because the original ToString()
code would also return a count including the negative sign which is system dependent.
Windows lets me set the default negative sign to be any arbitrary string I want, and someone else logged into a different profile on the same computer could have a completely different setting here (thankfully, people hardly ever change this). The UI limits the number of characters you can use, but the registry location it writes to is much more forgiving for long negative strings.
Therefore, if the original code is allowed to be wrong for negative values, this code can to (and this is much easier to fix).
CodePudding user response:
DO NOT USE THIS!
Check the benchmarks produced by @David L - it's the worst performing.
Many thanks to @Joel Coehoorn, from whose answer I managed to derive this:
static int GetLength(BigInteger value)
{
int result = 1;
BigInteger ten = new(10);
BigInteger absValue = BigInteger.Abs(value);
while(BigInteger.Pow(ten, result) < absValue)
{
result ;
}
return result;
}
Leaving it here for feedback (again, don't use it!)
A Better Solution
As Log10
seems to be the fastest solution, but requires rounding, I've tested the following solution over 50,000 BigInteger values:
static int GetLength(BigInteger value)
{
double log10 = BigInteger.Log10(value);
return (int) Math.Ceiling(log10) 1;
}