How to get searched strings (not lines) in a file using PowerShell?

Time:12-07

I have a file that contains multiple strings between parentheses that represent country names.

This (USA) is a bad text (France)
with countries (Luxembourg) between () (Germany)
Whith multiple (Luxembourg) countries (USA) per line
and some lines without countries
To search (France) and find (Belgique) duplicate
countries (USA)   

I want to extract all countries and display each country found on a new line.

What I'm expecting is the following:

USA
France
Luxembourg
Germany
Luxembourg
USA
France
Belgique
USA

Using a special editor named BS2EDT on a BS2000 mainframe, the solution can be:

list-string /(?<=\()[^)]+(?=\))/,from=lettre.pays.txt

Using PowerShell, what is the shortest solution?

CodePudding user response:

The following slightly tricky solution works on my PC:

$regex = '(?<=\()[^)]+(?=\))'
(Select-String .\lettre.pays.txt -Pattern $regex -AllMatches).Matches `
        | Select-String -Pattern '.*'

Select-String reads the input file directly from the given path.

The first Select-String command finds ALL matching strings, using the same regex as in the question.

The .Matches property and the second Select-String command display the strings found.
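
To try these steps without the file, they can be sketched on an inline sample. This is only an illustration: the variable names $sample and $countries are made up here, the sample is the first lines of the question's text, and the pipeline reads the values through ForEach-Object rather than a second Select-String.

```powershell
# Inline sample standing in for lettre.pays.txt (lines taken from the question).
$sample = @'
This (USA) is a bad text (France)
with countries (Luxembourg) between () (Germany)
and some lines without countries
'@

# Lookbehind for "(", one or more characters that are not ")", lookahead for ")".
$regex = '(?<=\()[^)]+(?=\))'

# Split into lines so Select-String treats each line as a separate input,
# then read the text of every match found on each line.
$countries = $sample -split '\r?\n' |
    Select-String -Pattern $regex -AllMatches |
    ForEach-Object { $_.Matches.Value }

$countries
```

On this sample the result is USA, France, Luxembourg and Germany, one per line. The empty () is skipped because + requires at least one character between the parentheses.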

It is also possible to sort all the countries found by adding the Sort-Object command:

$regex = '(?<=\()[^)]+(?=\))'
(Select-String .\lettre.pays.txt -Pattern $regex -AllMatches).Matches `
        | Select-String -Pattern '.*' `
            | Sort-Object

which displays the following result:

Belgique
France
France
Germany
Luxembourg
Luxembourg
USA
USA
USA

CodePudding user response:

You can accomplish it using Select-String, though this is definitely not shorter than what you already have:

Select-String .\lettre.pays.txt -Pattern '(?<=\()[^)]+(?=\))' -AllMatches |
    ForEach-Object { $_.Matches.Value }

As for your regex, I believe (?=\)) is not needed and could be removed from the pattern.
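
A quick way to check that claim is to run both patterns on the same input. The sketch below is only illustrative: the variable names are invented, and the unclosed-parenthesis line is a hypothetical malformed input that does not appear in the question's file. It suggests the two patterns agree whenever every "(" has a matching ")", and differ only when a closing parenthesis is missing.

```powershell
# A line from the question where every "(" is closed.
$line = 'To search (France) and find (Belgique) duplicate'

$withLookahead    = [regex]::Matches($line, '(?<=\()[^)]+(?=\))').Value
$withoutLookahead = [regex]::Matches($line, '(?<=\()[^)]+').Value

# Both patterns return the same two values here.
$withLookahead
$withoutLookahead

# Hypothetical malformed line: the closing parenthesis is missing.
$unclosed = 'countries (USA'

# With the lookahead, nothing matches because no ")" follows the text.
$countWith = [regex]::Matches($unclosed, '(?<=\()[^)]+(?=\))').Count

# Without it, [^)]+ runs to the end of the line and "USA" is returned.
$countWithout = [regex]::Matches($unclosed, '(?<=\()[^)]+').Count

"With lookahead: $countWith, without lookahead: $countWithout"
```

So dropping (?=\)) is safe for well-formed input like the file in the question, while keeping it guards against unbalanced parentheses.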
