One liner to find files that match two strings on different lines

Time:09-17

I am currently on PowerShell version 4.0 and can easily find files containing one string. Trying to do something slightly more complex, with two strings on different lines, has not worked out. The problem is apparently that lines are evaluated individually and results are piped as objects between commands, so the obvious solutions do not work.

# Works, so simple
Get-ChildItem | Select-String -Pattern "Test1" -List | Select Path

# Fail
Get-ChildItem | Select-String -Pattern "Test1. Test2" -List | Select Path

# Fail
Get-ChildItem | Select-String -Pattern "Test1" -List | Select-String -Pattern "Test2" -List | Select Path

Conceptually this is so simple, but I've been spinning my wheels on finding a solution. It might be easier in a newer version, but I am not able to update on the server involved. I could write a script to get to the solution in Python, but at this point I want to make PowerShell do it out of spite.

Knowing what is returned from each command, how to get at the underlying object data, how to then pipe that data to subsequent commands, etc. has repeatedly been an issue for me with PowerShell :(

CodePudding user response:

The following commands use Where-Object to test for our conditions.


# -File restricts the input to files, so directories are never tested
Get-ChildItem -File | Where-Object {
    ($_ | Select-String -Pattern 'test1' -Quiet) -and
    ($_ | Select-String -Pattern 'test2' -Quiet)
}

This command checks whether each file piped from Get-ChildItem (represented by $_) matches both 'test1' AND 'test2'. With the -Quiet parameter, Select-String returns true if it finds a match and a result that evaluates to false if it doesn't, instead of emitting match objects, so the two results can be combined with -and.
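The same idea can be generalized to any number of patterns. The following is only a sketch: the function name Test-FileMatchesAll and its parameters are hypothetical and not part of the answer above, and -LiteralPath is used so that file names containing wildcard characters are taken literally.

```powershell
# Hypothetical helper (not from the original answer): returns $true only if
# the file at $Path contains every pattern in $Patterns at least once.
function Test-FileMatchesAll {
    param(
        [Parameter(Mandatory = $true)][string]   $Path,
        [Parameter(Mandatory = $true)][string[]] $Patterns
    )
    foreach ($pattern in $Patterns) {
        # -Quiet gives $true on a match; no match evaluates to false.
        if (-not (Select-String -LiteralPath $Path -Pattern $pattern -Quiet)) {
            return $false
        }
    }
    return $true
}

# Usage: list files in the current directory that contain both strings.
Get-ChildItem -File |
    Where-Object { Test-FileMatchesAll -Path $_.FullName -Patterns 'Test1', 'Test2' } |
    Select-Object -ExpandProperty FullName
```

Stopping at the first pattern that is missing avoids reading the file again for the remaining patterns.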


Get-ChildItem -File | Where-Object {
    ($_ | Select-String -Pattern 'test1', 'test2' | 
        Group-Object -Property 'Pattern' ).Count -eq 2
}

Here I give Select-String two patterns to look for. On its own this does not ensure that both patterns are matched; it returns a result for every line that matches either one. The [Microsoft.PowerShell.Commands.MatchInfo] objects returned by Select-String have a property called Pattern. If I use Group-Object to group the MatchInfo results by that property, I can check whether I end up with 2 groups, one for each of my patterns. If so, Where-Object returns the file.


Just as with Python or any other languages there are commands/methods and parameters you need to discover and learn that will make your life easier and usually there is more than one way to do something.

Make use of the built-in help PowerShell provides to get to know the commands and the objects they produce. Get-Command, Get-Help and Get-Member are arguably 3 of the most useful cmdlets, especially for people new to or struggling with PowerShell.
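As a rough illustration of that discovery workflow, the sketch below pipes a sample string (made up for this example) into Select-String and inspects what comes out, without needing any files on disk.

```powershell
# A sample input string, purely for illustration.
$match = 'a line containing Test1' | Select-String -Pattern 'Test1'

# Which type does Select-String emit, and which properties does it have?
$match | Get-Member -MemberType Property

# Look at some of those properties directly.
$match | Select-Object LineNumber, Line, Pattern

# Show the parameter sets of a cmdlet.
Get-Command -Name Select-String -Syntax
```

Get-Member reports the type name at the top of its output, which is what tells you that the objects are MatchInfo instances with properties such as Path, Line and Pattern.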

CodePudding user response:

As Olaf points out, Select-String's -Pattern parameter accepts an array of patterns.

Such a command returns lines that match any of the given patterns.
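To see the any-of behaviour in isolation, here is a small sketch that feeds a few made-up strings through the pipeline instead of files. The Pattern property of each result records which pattern produced the match.

```powershell
# Sample lines, invented for this example.
$lines = 'alpha Test1', 'beta', 'gamma Test2'

# Each line that matches either pattern yields one MatchInfo object.
$found = $lines | Select-String -Pattern 'Test1', 'Test2'

$found | Select-Object LineNumber, Line, Pattern
```

The line 'beta' matches neither pattern and is dropped, while the other two lines each come back tagged with the pattern they matched.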

If you want to determine if a given file matches all of the given patterns (at least once), more work is needed:

$patterns = 'Test1', 'Test2'
Get-ChildItem -File |
  Select-String -Pattern $patterns | 
    Group-Object -Property Path | 
      Where-Object { 
        ($_.Group.Pattern | Select-Object -Unique).Count -eq $patterns.Count
      } | ForEach-Object Name

The above outputs the full paths of those files that match all patterns.

  • The Microsoft.PowerShell.Commands.MatchInfo instances that Select-String outputs ...

  • ... are grouped by their .Path property (the input file's full path) using the Group-Object cmdlet.

  • $_.Group.Pattern extracts the array of patterns that triggered each match using member enumeration, based on the Microsoft.PowerShell.Commands.GroupInfo instances output by Group-Object.

  • Select-Object -Unique shrinks the array of patterns that actually matched to the unique (distinct) patterns it contains.

  • The resulting array matching the count of input patterns (-eq $patterns.Count) implies that all input patterns were found (at least once), and using that in a script block passed to Where-Object means that only matching groups are output.

  • Finally, ForEach-Object Name outputs each matching group's .Name property, which contains the (stringified) value of the grouping property passed to Group-Object -Property, i.e. the full path of each input file in the case at hand.
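The steps above can be tried out stage by stage. The following sketch builds a scratch directory with two sample files (the directory, file names and contents are invented for illustration) and keeps each intermediate result in a variable so it can be inspected.

```powershell
# Scratch directory with hypothetical sample files, for illustration only.
$dir = Join-Path ([System.IO.Path]::GetTempPath()) ([guid]::NewGuid().ToString())
New-Item -ItemType Directory -Path $dir | Out-Null
Set-Content -Path (Join-Path $dir 'both.txt') -Value 'Test1', 'filler', 'Test2'
Set-Content -Path (Join-Path $dir 'one.txt')  -Value 'Test1', 'filler'

$patterns = 'Test1', 'Test2'

# Stage 1: one MatchInfo per matching line, across all files.
$hits = Get-ChildItem -LiteralPath $dir -File | Select-String -Pattern $patterns

# Stage 2: one GroupInfo per file; .Name holds the file path.
$groups = $hits | Group-Object -Property Path

# Stage 3: keep the groups whose distinct patterns cover every input pattern.
$result = $groups |
    Where-Object { ($_.Group.Pattern | Select-Object -Unique).Count -eq $patterns.Count } |
    ForEach-Object Name

$result   # the path of both.txt only

# Clean up the scratch directory.
Remove-Item -LiteralPath $dir -Recurse -Force
```

Inspecting $hits and $groups with Get-Member at each stage shows the MatchInfo and GroupInfo types described in the list above.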
