I would like to search for files whose names contain a given character more or less often than a defined number of times. For example:
ABC_2019_02_01_blabla_05.pdf <- valid
ABC_DEF_192_1111_oaoaoa.pdf <- invalid
For me, the deciding factor is the number of "_" characters used. For example, the character _ may be used only 5 times.
Get-ChildItem -af -recurse | Where-Object { $_.Name -notmatch '_*_*_*_*_' } | % { $_.FullName }
This doesn't work for that.
CodePudding user response:
You can use the .Split() method to count the number of instances of a specific character.
$Files = Get-ChildItem -af -recurse
$Files | Where-Object {$_.Name.Split('_').Length-1 -gt 5}
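To see why the "- 1" is needed, here is a minimal sketch that runs the same count on plain strings. The $names and $counts variables are only stand-ins for illustration and are not part of the original code.

```powershell
# Hypothetical sample names standing in for the Name property of real files
$names = 'ABC_2019_02_01_blabla_05.pdf', 'ABC_DEF_192_1111_oaoaoa.pdf'

# Splitting on '_' yields one more piece than there are underscores,
# so subtracting 1 gives the number of '_' characters.
$counts = foreach ($name in $names) {
    $name.Split('_').Length - 1
}

$counts   # -> 5, 4
```

Swap -gt for -lt, -eq or -ne in the Where-Object filter, depending on whether you are looking for too many, too few, exactly 5, or anything other than 5.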
Alternatively...
# Using regex... careful with special characters
$Files | Where-Object { [regex]::matches($_.Name, "_").count -gt 5}
# Grouping the char array representation of the string
$Files | Where-Object { ($_.Name.ToCharArray() | Group-Object | Where-Object Name -eq '_').Count -gt 5 }
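On the "careful with special characters" remark: one way to stay safe is to pass the character through [regex]::Escape() first. The Get-CharCount function below is a hypothetical helper written only for this illustration.

```powershell
# Hypothetical helper, not part of the original answer
function Get-CharCount {
    param(
        [string] $Text,
        [string] $Char
    )
    # Escape the input so regex metacharacters such as '.' are matched literally
    [regex]::Matches($Text, [regex]::Escape($Char)).Count
}

Get-CharCount -Text 'ABC_2019_02_01_blabla_05.pdf' -Char '_'   # -> 5
Get-CharCount -Text 'a.b.c.pdf' -Char '.'                      # -> 3
```

Without the escaping, a pattern of "." would match every character in the name rather than only the literal dots.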
And just for fun...
If you want the valid and the invalid items in separate arrays, you can achieve that via the .Where() method, which accepts an additional parameter to further define the search.
In the sample below, invalid items (more than 5 _ characters) end up in the first array ($Invalid), while the valid items (the ones not picked up by the condition) end up in the second array ($Valid).
$Invalid,$Valid = $Files.Where({$_.Name.Split('_').Length-1 -gt 5},'split')
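A minimal sketch of the same split on plain strings, so you can try it without any files on disk. The sample names and the $tooMany / $rest variables are made up for this illustration.

```powershell
# Hypothetical sample names; the first one contains 6 underscores
$names = 'a_b_c_d_e_f_g.pdf',
         'ABC_2019_02_01_blabla_05.pdf',
         'ABC_DEF_192_1111_oaoaoa.pdf'

# In 'Split' mode, .Where() returns two collections:
# the items that satisfy the condition, then the items that do not.
$tooMany, $rest = $names.Where({ $_.Split('_').Length - 1 -gt 5 }, 'Split')

$tooMany   # -> a_b_c_d_e_f_g.pdf
$rest      # -> the two remaining names (5 and 4 underscores)
```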
CodePudding user response:
To complement Sage Pourpre's helpful solutions:
Two more ways to count the number of occurrences of a given character in a string:
# -> 5, 4
@{ Name = 'ABC_2019_02_01_blabla_05.pdf' },
@{ Name = 'ABC_DEF_192_1111_oaoaoa.pdf' } |
ForEach-Object {
($_.Name.ToCharArray() -eq '_').Count
}
This relies on the -eq operator acting as a filter when the LHS is an array: in effect, it returns the subarray of characters that are _, whose element count is therefore the number of _ chars. in the input.
# -> 5, 4
@{ Name = 'ABC_2019_02_01_blabla_05.pdf' }, # Sample input
@{ Name = 'ABC_DEF_192_1111_oaoaoa.pdf' } |
ForEach-Object {
($_.Name -replace '[^_]').Length
}
This relies on using the regex-based -replace operator for removing all characters other than _ ([^_]) from the input string, which leaves a string composed only of the _ chars., whose length is therefore the number of _ chars. in the input.
As for what you tried:
Your attempt, if corrected, has the potential to perform additional validation, such as requiring that _ characters be surrounded by at least one other character, such that, say, 'a_b_c_d_e_f.pdf' is valid, but 'abcdef_____.pdf' is not.
# -> $true, $false, $false
@{ Name = 'ABC_2019_02_01_blabla_05.pdf' }, # OK
@{ Name = 'abcdef_____.pdf' }, # Correct number of "_", but in the wrong places
@{ Name = 'ABC_DEF_192_1111_oaoaoa.pdf' } | # Not enough "_"
ForEach-Object {
$_.Name -match '^([^_]+_){5}[^_]+$'
}
For an explanation of the regex and the ability to experiment with it, see this regex101.com page.
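As a side note on why the pattern from the question filters so little: in a regex, * is a quantifier meaning "zero or more of the preceding element", not a wildcard, and -match looks for the pattern anywhere in the string. The sketch below uses made-up sample names to show the effect.

```powershell
# Hypothetical sample names; the last one contains no underscore at all
$names = 'ABC_2019_02_01_blabla_05.pdf',
         'ABC_DEF_192_1111_oaoaoa.pdf',
         'no-underscore.pdf'

# '_*' matches zero or more underscores, so the whole pattern is satisfied
# by any string that contains at least one '_' (the final, unquantified one).
$results = $names | ForEach-Object { $_ -match '_*_*_*_*_' }

$results   # -> $true, $true, $false
```

With -notmatch, that pattern therefore returns only names that contain no _ at all, regardless of how many underscores the other names have.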