PowerShell Implicit Type Conversions - does it happen? Is it safe?

I'm trying to get a handle on documentation/evidence around when PowerShell performs conversions, and when a user needs to be explicit about what they want. A somewhat related question, here, has a broken link that possibly explained the scenarios in which the type would be adjusted. There are plenty of instances of similar problems though (namely, comparing a string to an int) - for example, when you get a number via Read-Host, you're actually getting a string.

Specifically, I've found that certain mechanisms seem to handle a string representation of a number fine, but I want to understand whether that is truly because they handle it as a number, or whether the output merely looks correct while being wrong under the hood. Here's a specific example: how does Measure-Object calculate the Sum of a string property?

PS > $Example = Import-Csv .\Example.csv
PS > $Example

Procedure_Code : 123456789
CPT_Code       : J3490
Modifier       :
Service_Date   : 02/01/2020
Revenue_Code   : 259
Units          : -100.00
Amount         : 55.00

Procedure_Code : 123456789
CPT_Code       : J3490
Modifier       :
Service_Date   : 02/02/2020
Revenue_Code   : 259
Units          : 100.00
Amount         : 55.00

PS > [Math]::Sign($Example[0].Units)
-1

PS > $Example | Measure-Object Amount -Sum

Count             : 2
Sum               : 110
Property          : Amount

PS > $Example | ForEach-Object { $Sum = $Sum + $_.Amount} ; Write-Host "$($Sum)"
55.0055.00  #Plus sign concatenates strings

PS > $Example | ForEach-Object { $Sum = $Sum + [int]$_.Amount} ; Write-Host "$($Sum)"
110  #Expected result if casting the string into an integer type before adding

So basically it seems like [Math]::Sign() works (even though it doesn't accept strings as far as I can tell), and Measure-Object is smarter about strings than a simple plus sign. But is Measure-Object casting my string into an [int]? If I use an example with decimals, the answer is more precise, so maybe it is a [single]?

What is Measure-Object doing to get the right answer, and when shouldn't I trust it?

CodePudding user response：

I'm not sure this is the be-all and end-all answer, but it works for what I was curious about. For a lot of 'quick script' scenarios where you might use something like Measure-Object - you'll probably get the correct answers you're looking for, albeit maybe slower than other methods.

Measure-Object specifically seems to use [double] for Sum, Average, and StandardDeviation and will indeed throw an error if one of the string values from a CSV Object can't be converted.

I'm still a little surprised that [Math]::Sign() works at all with strings, but I'm glad it does.

            if (_measureAverage || _measureSum || _measureStandardDeviation)
            {
                double numValue = 0.0;
                if (!LanguagePrimitives.TryConvertTo(objValue, out numValue))
                {
                    _nonNumericError = true;
                    ErrorRecord errorRecord = new(
                        PSTraceSource.NewInvalidOperationException(MeasureObjectStrings.NonNumericInputObject, objValue),
                        "NonNumericInputObject",
                        ErrorCategory.InvalidType,
                        objValue);
                    WriteError(errorRecord);
                    return;
                }

                AnalyzeNumber(numValue, stat);
            }
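
To see the same behavior from the PowerShell side, here is a minimal sketch. The sample rows and their values are made up for illustration; they stand in for Import-Csv output, where every property value is a string.

```powershell
# Hypothetical rows standing in for Import-Csv output: every value is a [string].
$rows = @(
    [pscustomobject]@{ Amount = '55.00' }
    [pscustomobject]@{ Amount = '55.25' }
)

$result = $rows | Measure-Object -Property Amount -Sum

$result.Sum                  # the value 110.25
$result.Sum.GetType().Name   # Double

# A value that cannot be converted produces a non-terminating error,
# collected here in $measureErrors instead of being printed.
$bad = @([pscustomobject]@{ Amount = 'n/a' })
$null = $bad | Measure-Object -Property Amount -Sum -ErrorAction SilentlyContinue -ErrorVariable measureErrors

$measureErrors[0].FullyQualifiedErrorId   # starts with NonNumericInputObject
```

Because the sum is a [double], the usual floating-point caveats apply: if exact decimal arithmetic matters (as with currency), casting to [decimal] and summing explicitly is the safer route.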

CodePudding user response：

Measure-Object is a special case, because it is that cmdlet's specific logic that performs automatic type conversion, as opposed to PowerShell's own, general automatic conversions that happen in the context of using operators such as +, passing arguments to commands, calling .NET methods such as [Math]::Sign(), and using values as conditionals.

As your own answer hints at, Measure-Object calls the same code that PowerShell's general automatic type conversions use; these are discussed below.

Automatic type conversions in PowerShell:

In general, PowerShell always attempts automatic type conversion - and a pitfall to those familiar with C# is that even an explicit cast (e.g. [double] '1.2') may not be honored if the target type is ultimately a different one.[1]

Supported conversions:

In addition to supporting .NET's type conversions, PowerShell implements a few built-in custom conversions and supports user-defined conversions by way of custom conversion classes and attribute classes - see this answer.

To-number conversion and culture-sensitivity when converting to and from strings:

Typically, PowerShell uses culture-invariant conversion, so that, for instance, only . is recognized as the decimal mark in "number strings" that represent floating-point numbers.
String-to-number conversion, including how specific numeric data types are chosen, is covered in this answer.
To-string conversion (including from numbers) is covered in this answer.
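
A few casts illustrate the culture-invariant behavior described above. This is only a sketch; the contrast with .ToString() at the end is my own addition to show where the current culture does come into play.

```powershell
# String-to-number casts use the invariant culture: '.' is the decimal mark.
[double] '1.5'        # the value 1.5, whatever the session's current culture is

# ',' is read as a thousands-grouping character, not as a decimal mark.
[double] '1,5'        # the value 15

# Number-to-string casts and string interpolation are culture-invariant too.
[string] 1.5          # '1.5'
"$(1.5)"              # '1.5'

# By contrast, calling the .NET method directly respects the current culture.
(1.5).ToString()      # '1.5' or '1,5', depending on the current culture
```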

Automatic type conversions by context:

Operands (operator arguments):
- Some operators, such as -replace and -match, operate on strings only, in which case the operand(s) are invariably converted to strings (which always succeeds):
  - 42 -replace 2, '!' yields '4!': both 42 and 2 were implicitly converted to strings.
- Others, such as - and /, operate on numbers only, in which case the operand(s) are invariably converted to numbers (which may fail):
  - '10' - '2' yields 8, an [int]
- Yet others, such as + and *, can operate on either strings or numbers, and it is the type of the LHS that determines the operation type:
  - '10' + 2 yields '102', the concatenation of string '10' with the to-string conversion of number 2.
  - By contrast, 10 + '2' yields 12, the addition of the number 10 and the to-number conversion of string '2'.
- See this answer for more information.
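
The operand rules above can be tried directly at the prompt. The 'ab' * 3 and 3 * '2' lines are extra examples I added for the * operator.

```powershell
# String-only operators convert both operands to strings.
42 -replace 2, '!'            # '4!'

# Number-only operators convert both operands to numbers.
'10' - '2'                    # 8
('10' - '2').GetType().Name   # Int32

# For + and *, the type of the LHS decides the operation.
'10' + 2                      # '102'    (string concatenation)
10 + '2'                      # 12       (numeric addition)
'ab' * 3                      # 'ababab' (string replication)
3 * '2'                       # 6        (numeric multiplication)

# A to-number conversion can fail; try/catch can handle the resulting error.
try { 'abc' - 1 } catch { 'conversion failed' }
```
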
Command arguments:
- If the target parameter has a specific type (other than [object] or [psobject]), PowerShell will attempt conversion.
- Otherwise, for literal arguments, the type is inferred from the literal representation; e.g. 42 is passed as an [int], and '42' as a [string].
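
A small sketch of both cases follows. The function names Get-Half and Get-TypeName are made up for illustration.

```powershell
# Hypothetical helper with a typed parameter.
function Get-Half {
    param([int] $Number)
    $Number / 2
}

Get-Half '10'               # 5: the string '10' was converted to [int] 10
Get-Half ([double] '4.2')   # 2: the [double] was converted to [int] 4 first

# Hypothetical helper with an untyped parameter: the literal's own type is kept.
function Get-TypeName {
    param($Value)
    $Value.GetType().Name
}

Get-TypeName 42             # Int32
Get-TypeName '42'           # String

# A failed conversion surfaces as a parameter-binding error.
try { Get-Half 'ten' } catch { 'binding failed' }
```
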
Values serving as conditionals, such as in if statements:
- A conditional must by definition be a Boolean (type [bool]), and PowerShell automatically converts a value of any type to [bool], using the rules described in the bottom section of this answer.
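
Some of the to-Boolean results are surprising, so here is a short sketch of common cases, using explicit [bool] casts to show what an if statement would see.

```powershell
# A non-empty string is $true, whatever it says.
if ('false') { 'this branch runs' }

[bool] ''            # False: empty string
[bool] 'false'       # True:  any non-empty string
[bool] 0             # False: numeric zero
[bool] 42            # True:  non-zero number
[bool] $null         # False
[bool] @()           # False: empty collection
[bool] @(0)          # False: a single-element collection is judged by its element
[bool] @(0, 0)       # True:  two or more elements
```
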
Method arguments:

Automatic type conversion for .NET-method arguments can be tricky, because multiple overloads may have to be considered, and - if the arguments aren't of the expected type - PowerShell has to find the best overload based on supported type conversions - and it may not be obvious which one is chosen.

Executing [Math]::Sign (calling this as-is, without ()) reveals 8(!) different overloads, for various numeric types.

More insidiously, the introduction of additional .NET-method overloads in future .NET versions can break existing PowerShell code, if the new overload then happens to be the best match in a given invocation.

A high-profile example is the [string] type's Split() method - see the bottom section of this answer.

Therefore, for long-term code stability:

- Avoid .NET methods in favor of PowerShell-native commands, if possible.
- Otherwise, if type conversion is necessary, use casts (e.g. [Math]::Sign([int] '-42')) to guide method-overload resolution to avoid ambiguity.
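
Both recommendations can be sketched as follows. The -split line is my own example of a PowerShell-native alternative to the .Split() method; the input strings are arbitrary.

```powershell
# List the overloads PowerShell has to choose from; the count depends on the .NET version.
[Math]::Sign.OverloadDefinitions

# Works, but the overload is chosen implicitly after converting the string.
[Math]::Sign('-100.00')            # -1

# A cast makes the intended overload explicit.
[Math]::Sign([double] '-100.00')   # -1
[Math]::Sign([int] '-42')          # -1

# PowerShell-native alternative to a .NET method: -split instead of .Split().
'a,b;c' -split '[,;]'              # a, b, c
```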

[1] E.g., the explicit [double] is quietly converted to an [int] in the following statement: & { param([int] $i) $i } ([double] '1.2'). Also, casts to .NET interfaces generally have no effect in PowerShell - except to guide overload resolution in .NET method calls.