foreach ($line in $test) {
    $line.GetType()
    $newline = $line -split ("<.*?>") -split ("{.*?}") # remove html and css tags
    $newline.GetType()
}
I came across this when trying to use the .Trim() method on $newline. It works, but IntelliSense did not indicate that it would. I thought .Trim() would only work on String objects (BaseType: System.Object), but in this instance it seems to work on String[] objects as well (BaseType: System.Array).
$line.GetType() returns:
IsPublic IsSerial Name       BaseType
-------- -------- ----       --------
True     True     String     System.Object
$newline.GetType() returns:
IsPublic IsSerial Name       BaseType
-------- -------- ----       --------
True     True     String[]   System.Array
First off, I would like to know why my original string was converted to an array, assuming it's the return value of -split... Is it now an array of characters? I am a little confused.
Secondly, if there is a good answer, why do the string methods work on what is technically an array?
Coming from Python and C/C++, thanks.
CodePudding user response:
Lasse V. Karlsen already provided the key information for understanding why the strings ($line) are converted to string[] after being split. What you most likely wanted in this case is the -replace operator, which also takes a regex but returns the modified string rather than an array of substrings.
Using below as an example:
$htmlcss = @'
<html>
<head>
<style>
table {
font-family: arial, sans-serif;
border-collapse: collapse;
width: 100%;
}
td, th {
border: 1px solid #dddddd;
text-align: left;
padding: 8px;
}
tr:nth-child(even) {
background-color: #dddddd;
}
</style>
</head>
<body>
<h2>HTML Table</h2>
<table>
<tr>
<th>Company</th>
<th>Contact</th>
<th>Country</th>
</tr>
<tr>
<td>Alfreds Futterkiste</td>
<td>Maria Anders</td>
<td>Germany</td>
</tr>
</table>
</body>
</html>
'@
Use -replace to remove the HTML tags and CSS blocks, then -split to get a string[], and finally filter the array to skip the empty lines:
$htmlcss -replace '(?s)<.*?>|\{.*?\}' -split '\r?\n' |
Where-Object { $_ -match '\S' }
Results in:
table
td, th
tr:nth-child(even)
HTML Table
Company
Contact
Country
Alfreds Futterkiste
Maria Anders
Germany
Note, regarding \{.*?\}: a CSS rule usually spans several lines, so this pattern can only match when the input is a single, multi-line string. It will not work on a string array (string[]), because -replace then processes each element (line) separately. You also need the (?s) flag, which lets . match line breaks. If you are reading the input from a file, use the -Raw switch on Get-Content to get it as one string.
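To illustrate the note above, here is a minimal sketch that compares the two ways of reading a file. The temporary file, its contents, and the variable names are made up for this illustration.

```powershell
# Hypothetical sample file, created only for this illustration.
$path = Join-Path ([System.IO.Path]::GetTempPath()) 'sample-style.txt'
Set-Content -Path $path -Value @(
    'table {'
    '  width: 100%;'
    '}'
    '<h2>HTML Table</h2>'
)

# Without -Raw, Get-Content returns one string per line, so -replace runs on
# each line separately and \{.*?\} never sees both braces at once.
$perLine = (Get-Content -Path $path) -replace '(?s)<.*?>|\{.*?\}'

# With -Raw, the file is read as a single multi-line string, so the pattern
# can span the line breaks between '{' and '}'.
$asOne = (Get-Content -Path $path -Raw) -replace '(?s)<.*?>|\{.*?\}'

# All four lines survive; only the <h2> tags were stripped.
$perLine -match '\S'

# Only 'table ' and 'HTML Table' remain.
$asOne -split '\r?\n' | Where-Object { $_ -match '\S' }

Remove-Item -Path $path
```

In the per-line case the CSS body is left behind, which is why the single-string form is the one to use here.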
CodePudding user response:
Santiago Squarzon's helpful answer provides an effective solution to what your code is trying to do.
To answer your questions as asked, building on Lasse V. Karlsen's helpful comments:
I would like to know why my original string was converted to an array, assuming it's the return value of -split... Is it now an array of characters?
The -split operator splits a string or an array of strings into substrings by a given separator regex and returns the substrings as a string array ([string[]]):
'foo|bar' -split '\|' # -> [string[]] ('foo', 'bar')
With an array as input, the splitting operation is performed on each element separately, and the per-element result arrays are concatenated to form a single, flat array.
'foo|bar', 'baz|quux' -split '\|' # -> [string[]] ('foo', 'bar', 'baz', 'quux')
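To address the "array of characters" part directly, a quick check along these lines (a sketch, with variable names chosen only for illustration) shows that the elements returned by -split are themselves strings, while a character array is a separate type that you have to ask for explicitly:

```powershell
$parts = 'foo|bar' -split '\|'
$parts.GetType().Name        # String[]
$parts[0].GetType().Name     # String
$parts.Count                 # 2

# A character array is a different type, obtained explicitly:
$chars = 'foo'.ToCharArray()
$chars.GetType().Name        # Char[]

# Even when the separator does not occur, -split still returns an array,
# here with a single element.
$single = 'foo' -split '\|'
$single.GetType().Name       # String[]
$single.Count                # 1
```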
Secondly, if there is a good answer, why do the string methods work on what is technically an array?
What you're seeing is a feature semi-officially known as member enumeration: The ability to access a member (a property or a method) on a collection and have it implicitly applied to each of its elements, with the results getting collected in an array (for two or more elements).
It is described in detail in this answer.
As for giving the feature an official name: As of this writing, it is described, but not named, in the conceptual about_Properties help topic. GitHub docs issue #8437 asks for it to be given an official name.
Quick example:
# .Trim() is called on *each element* of the input array.
PS> (' foo', 'bar ').Trim() | ForEach-Object { "[$_]" }
[foo]
[bar]
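One caveat worth knowing, shown here as a small sketch with illustrative variable names: member enumeration only applies when the collection itself does not have the requested member. Arrays have no Trim() method, so the call goes to the elements, but they do have a Length property, so that one is answered by the array. The .ForEach() form assumes PowerShell 4 or later.

```powershell
$items = ' foo', 'bar '

# Arrays have no Trim() method of their own, so PowerShell calls it on each element.
$trimmed = $items.Trim()
$trimmed -join ','                       # foo,bar

# Roughly equivalent explicit forms:
$viaPipeline = $items | ForEach-Object { $_.Trim() }
$viaMethod   = $items.ForEach({ $_.Trim() })

# Arrays DO have a Length property, so this is the element count,
# not the per-string lengths.
$items.Length                            # 2

# To get each string's length, enumerate explicitly.
$items | ForEach-Object { $_.Length }    # 4, 4
```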