I've tried really hard not ask this question, but I keep coming back to it as I'm not sure if I'm doing everything as efficiently as I can or if there might be problems under the hood. Basically, I have a CSV file that contains a number field, but it includes a decimal and values out to the ten-thousandths place, e.g. 15.0000
. All I need to do is convert that to a whole number without the decimal place.
I came across a related question here, but the selected answer seems to cast doubt on casting the string representation directly to an integer data type - without explaining why.
Simply casting the string as an
int
won't work reliably. You need to convert it to anint32
.
I've haven't had much luck getting the [System.Convert]
method to work, or doing something like $StringNumber.ToInt32()
. I realize that once I save the data back to the PSCustomObject
they'll be stored as strings, so at the end of the day maybe I'm making this even more complicated than necessary for my use case and I just need to reformat $StringNumber
...but even that has caused me some problems.
Any ideas on why casting wouldn't be reliable or better ways to handle this in my case?
Examples of what I've tried:
PS > $StringNumber = '15.0000'
PS > [Convert]::ToInt32($StringNumber)
#MethodInvocationException: Exception calling "ToInt32" with "1" argument(s): "Input string was not in a correct format."
PS > [Convert]::ToInt32($StringNumber, [CultureInfo]::InvariantCulture)
#MethodInvocationException: Exception calling "ToInt32" with "2" argument(s): "Input string was not in a correct format."
PS > $StringNumber.ToInt32()
#MethodException: Cannot find an overload for "ToInt32" and the argument count: "0".
PS > $StringNumber.ToInt32([CultureInfo]::InvariantCulture)
#MethodInvocationException: Exception calling "ToInt32" with "1" argument(s): "Input string was not in a correct format."
PS > $StringNumber.ToString("F0")
#MethodException: Cannot find an overload for "ToString" and the argument count: "1".
PS > $StringNumber.ToString("F0", [CultureInfo]::CurrentCulture)
#MethodException: Cannot find an overload for "ToString" and the argument count: "2".
PS > "New format: {0:F0}" -f $StringNumber
#New format: 15.0000
So basically what I've come up with is:
- Someone in 2014 said casting my
string
to anint
wouldn't work reliably, even though it seems like the Cast operator is actually doing a conversion - The
ToInt32
methods don't like strings with decimals as the input - Apparently
String.ToString
Method is useless - Thanks to
String.ToString
and the processing order of composite formatting, simple "reformatting" of my string representation won't work
In summary: Is there a way to safely cast my $StringNumber
into a whole number, and, if so, what's the most efficient way to do it on a large dataset?
Bonus Challenge:
If anyone can make this work using the ForEach
magic method then I'll buy you a beer. Here's some pseudo code that doesn't work, but would be awesome if it did. As far as I can figure out, there's no way to reference the current item in the collection when setting the value of a string property
#This code DOES NOT work as written
PS > $CSVData = Import-Csv .\somedata.csv
PS > $CSVData.ForEach('StringNumberField', [int]$_.StringNumberField)
CodePudding user response:
If your string representation can be interpreted as a number, you can cast it to an integer, as long as the specific integer type used is large enough to accommodate (the integer portion of) the value represented (e.g.
[int] '15.0000'
)A string that can not be interpreted as a number or represents a number that is too large (or small, for negative numbers) for the target type, results in a statement-terminating error; e.g.
[int] 'foo'
or[int] '444444444444444'
Note that PowerShell's casts and implicit string-to-number conversions use the invariant culture, which means that only ever
.
is recognized as the decimal mark (and,
is effectively ignored, because it is interpreted as the thousands-grouping symbol), irrespective of the culture currently in effect (as reflected in$PSCulture
).As for integer types you can use (all of them - except the open-ended
[bigint]
type - support::MinValue
and::MaxValue
to determine the range of integers they can accommodate; e.g.[int]::MaxValue
)- Signed integer types:
[sbyte]
,[int16]
,[int]
([int32]
),[long]
([int64]
),[bigint]
- Unsigned integer types:
[byte]
,[uint16]
,[uint]
([uint32]
),[ulong]
([uint64]
) - but note that PowerShell itself uses only signed types natively in its calculations.
- Signed integer types:
Casting to an integer type performs half-to-even midpoint rounding, which means that a string representing a value whose fractional part is
.5
is rounded to the nearest even integer; e.g.[int] '1.5'
and[int] '2.5'
both round to2
.To choose a different midpoint rounding strategy, use
[Math]::Round()
with aSystem.MidpointRounding
argument; e.g.:[Math]::Round('2.5', [MidPointRounding]::AwayFromZero) # -> 3
To unconditionally round up or down to the nearest integer, use
[Math]::Ceiling()
,[Math]::Floor()
, or[Math]::Truncate()
; e.g.:[Math]::Ceiling('2.5') # -> 3 [Math]::Floor('2.5') # -> 2 [Math]::Truncate('2.5') # -> 2 # [Math]::Ceiling('-2.5') # -> -2 [Math]::Floor('-2.5') # -> -3 [Math]::Truncate('-2.5') # -> -2
Note: While the resulting number is conceptually an integer, technically it is a
[double]
or - with explicit[decimal]
or integer-number-literal input - a[decimal]
.
As for the bonus challenge:
- With an integer-type cast:
[int[]] (Import-Csv .\somedata.csv).StringNumberField
Note: (Import-Csv .\somedata.csv).StringNumberField.ForEach([int])
would work too, but offers no advantage here.
- With a
[Math]::*()
call and the.ForEach()
array method:
(Import-Csv .\somedata.csv).StringNumberField.ForEach(
{ [Math]::Round($_, [MidPointRounding]::AwayFromZero) }
)
CodePudding user response:
Casting [int]
as you explained, is something that would work in most cases, however it is also prone to errors. What if the number is higher than [int]::MaxValue
? The alternative you could use to avoid the exceptions would be to use the -as [int]
operator however there is another problem with this, if the value cannot be converted to integer you would be getting $null
as a result.
To be safe that the string will be converted and you wouldn't get null as a result first you need to be 100% sure that the data you're feeding is correct or assume the worst and use [math]::Round(..)
in combination with -as [decimal]
or -as [long]
or -as [double]
(∞
) to round your numbers:
[math]::Round('123.123' -as [decimal]) # => 123
[math]::Round('123.asd' -as [decimal]) # => 0
Note: I'm using round but [math]::Ceiling(..)
or [math]::Floor(..)
or [math]::Truncate(..)
are valid alternatives too, depending on your expected output.
Another alternative is to use [decimal]::TryParse(..)
however this would throw if there ever be something that is not a number:
$StringNumber = '15.0000'
$ref = 0
[decimal]::TryParse( $StringNumber, ([ref]$ref) )
[math]::Round($ref) # => 15
Using Hazrelle's advise would work too but again, would throw an exception for invalid input or "Value was either too large or too small for an Int32."
[System.Decimal]::ToInt32('123123123.123') # => 123123123
As for the Bonus Challenge, I don't think it's possible to cast and then set the rounded values to your CSV on just one go using ForEach(type convertToType)
, and even if it was, it could also bring problems because of what was mentioned before:
$csv = @'
"Col1","Col2"
"val1","15.0000"
"val2","20.123"
"val3","922337203685477.5807"
'@ | ConvertFrom-Csv
$csv.Col2.ForEach([int])
Cannot convert argument "item", with value: "922337203685477.5807", for "Add" to type "System.Int32": "Cannot convert value "922337203685477.5807" to type "System.Int32".
Using .foreach(..)
array method combined with a script block would work:
$csv.ForEach({
$_.Col2 = [math]::Round($_.Col2 -as [decimal])
})
In case you wonder why not just use [math]::Round(..)
over the string and forget about it:
[math]::Round('123.123') # => 123 Works!
But what about:
PS /> [math]::Round([decimal]::MaxValue -as [string])
7.92281625142643E 28
PS /> [math]::Round([decimal]([decimal]::MaxValue -as [string]))
79228162514264337593543950335