How can I remove directory paths listed in a text file but keep the file paths?

Time:10-26

I have a text file with thousands of lines, containing both directory paths and file paths. I would like to loop through each line of the text file, remove any line containing a directory path, and keep every line containing a file path. Two example lines from the text file (one directory path and one file path):

exampleDirectoryPath/tags/10.0.0.8/tools/
exampleFilePath/tags/10.0.0.8/tools/hello.txt

So far, to loop through the text file, I have:

foreach ($line in [System.IO.File]::ReadLines("file.txt")) {
    if ($line -match ".*/.*$") {
        $line
    }
}

Goal output:

exampleFilePath/tags/10.0.0.8/tools/hello.txt

Note: I do not want to hardcode file extensions. There are thousands of files to traverse and I don't know which extensions are present, so I would like to return all of them.

CodePudding user response:

So, the basic logic here is easy:

Get-Content "file.txt" | where { $_ is a file path... }

How you fill in the condition depends on how you want to determine whether a line is a file path.

If all of your directory paths end in "/", you could simply do:

where { -not $_.EndsWith("/") }

or:

where { [System.IO.Path]::GetFileName($_) -ne "" }
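
As a minimal sketch of the trailing-slash check, assuming the two sample lines from the question stand in for the contents of file.txt (the $lines and $files variables are illustrative names, not part of the original):

```powershell
# Sample input taken from the question; in practice use: Get-Content "file.txt"
$lines = @(
    'exampleDirectoryPath/tags/10.0.0.8/tools/'
    'exampleFilePath/tags/10.0.0.8/tools/hello.txt'
)

# Keep only the lines that do not end with a forward slash.
$files = @($lines | Where-Object { -not $_.EndsWith('/') })

$files
```

With this input, only the hello.txt line remains.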

If not, but all of your file paths definitely have an extension, you could do:

where { [system.io.Path]::GetExtension($_) -ne "" }
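
One thing to watch with the extension check: it looks only at the last path segment, so a directory whose name contains a dot and that is written without a trailing slash would also pass. A small sketch, using a hypothetical directory line modelled on the question's sample data:

```powershell
# Hypothetical inputs: a file path, and a directory path written WITHOUT a trailing slash.
$filePath = 'exampleFilePath/tags/10.0.0.8/tools/hello.txt'
$dirPath  = 'exampleDirectoryPath/tags/10.0.0.8'

$fileExt = [System.IO.Path]::GetExtension($filePath)   # .txt
$dirExt  = [System.IO.Path]::GetExtension($dirPath)    # .8, so the filter would keep this directory

$fileExt
$dirExt
```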

If all of the paths actually exist, you could also do this:

where { Test-Path $_ -Type Leaf }
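
A self-contained sketch of the existence-based check, assuming a throwaway directory under the temp folder (the folder and file names are made up for the demonstration). It spells out -PathType, of which -Type is an alias, and uses -LiteralPath so that characters such as [ and ] in a path are not treated as wildcards:

```powershell
# Build a throwaway directory containing one file (names are illustrative).
$root = Join-Path ([System.IO.Path]::GetTempPath()) ([System.IO.Path]::GetRandomFileName())
$null = New-Item -ItemType Directory -Path (Join-Path $root 'tools')
$null = New-Item -ItemType File -Path (Join-Path $root 'tools/hello.txt')

# One directory path and one file path, as in the question.
$lines = @(
    (Join-Path $root 'tools')
    (Join-Path $root 'tools/hello.txt')
)

# -PathType Leaf is $true only for paths that exist and are files.
$files = @($lines | Where-Object { Test-Path -LiteralPath $_ -PathType Leaf })
$files

# Clean up the temporary directory.
Remove-Item -LiteralPath $root -Recurse -Force
```

Note that relative paths in the text file are resolved against the current location, so this check only works when run from the directory the paths are relative to.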

CodePudding user response:

To provide a concise solution that also performs well:

(Get-Content -ReadCount 0 file.txt) -notmatch '/$'

  • Using -ReadCount 0 with Get-Content is a performance optimization that returns all lines in the input file as a single array object rather than emitting the lines one by one.

    • Additionally, -ReadCount 0 ensures that an array is output even if the input file happens to have just one line.
  • -notmatch, the negated form of the regex-based -match operator, acts as a filter when its LHS is an array, returning the non-matching elements (lines) as a new array.

    • Regex /$ matches a verbatim / at the end ($) of each input string (line).

Note: As your question suggests, the solution above assumes that directories can be distinguished from files formally, based on whether the lines in the input file end in / or not.
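
To illustrate the array-filtering behavior of -notmatch without needing a file, here is a small sketch that uses the question's two sample lines in place of the Get-Content call and a pattern for a trailing forward slash (the variable names are illustrative):

```powershell
# Sample lines from the question stand in for: Get-Content -ReadCount 0 file.txt
$lines = @(
    'exampleDirectoryPath/tags/10.0.0.8/tools/'
    'exampleFilePath/tags/10.0.0.8/tools/hello.txt'
)

# With an array on the left-hand side, -notmatch returns the elements that do NOT match.
# The pattern '/$' matches a forward slash at the end of a line.
$files = @($lines -notmatch '/$')

$files
```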

CodePudding user response:

I personally would not use regex for this, for a simple reason: a regex can check whether a path looks like a file or a folder, but it cannot confirm that the path actually exists. Building on your code, I would use:

$result = foreach ($line in [System.IO.File]::ReadLines("file.txt"))
{
    # [System.IO.File]::Exists returns $true only for an existing file,
    # not for a directory or a missing path.
    if ([System.IO.File]::Exists($line))
    {
        $line
    }
}