Home > Software engineering >  Filtering and Combining CSV Files via Shared Column Values and Outputting to a New CSV
Filtering and Combining CSV Files via Shared Column Values and Outputting to a New CSV

Time:04-16

New to PowerShell, but starting to get the hang of some basics!
Appreciate everyone's patience :)

I'm trying to match some ZIP code data from one file with matching ZIP codes and Attraction ID's in a second file.

File 1 is my 'master' = "ZipResults.csv" = a list of ZIP codes that are within 50 miles of every other ZIP code. There are 3 columns, but no headers. However, they are organized as follows:

Example segment:

SearchedZip FoundZip Radius
----------- -------- ------
12345       12345    50
12345       12346    50
12345       12347    50
12346       12346    50
12346       12344    50
12346       12347    50
12347       12347    50
12347       12346    50
12347       12349    50

The second file = "AttractionIDsWithZips.csv", which has headers "ID,Zip" and looks something like:

ID       ZIP
--       ---
112484   12346
112486   12346
5548     12347
112491   12345
5583     12349
112480   12344

I'd like to use the values from the "Zip" column of the second file to find matching "SearchedZip" rows from the first file, then pair the AttractionID's with each of their matching "FoundZips" values from the first file, and output to a 3rd file... something to the effect of:

AttractionId MatchedZip Radius
------------ ---------- ------
112484       12346      50
112484       12347      50
112484       12348      50
112486       12346      50
112486       12348      50
112486       12344      50
5548         12347      50
5548         12348      50
5548         12349      50
112491       12345      50
112491       12346      50
112491       12344      50

My current code is like this, but it's not getting any output... seems to be in a super-long loop that's not getting any results:

$hauntIdStuff = Import-Csv -Path .\AttractionIDsWithZips.csv | Select-Object -ExpandProperty Zip<br>
Import-Csv -Path .\Zips1kTo2k.csv -Header "zipS","zipF","zipR" | Where-Object {$hauntIdStuff.Zip -contains $_.zipS} | ForEach-Object {<br>
    Add-Content -Path .\IDsWithMatchingZips.csv -Value "$($_.ID),$($_.zipF),$($_.zipR)"<br>
}

CodePudding user response:

You can use Group-Object -AsHashTable to generate a hash table of the AttractionIDsWithZips.csv, this helps allows for fast lookup when searching for matching Zips:

$refTable = Import-Csv Zips1kTo2k.csv -Header 'SearchedZip', 'FoundZip', 'Radius' |
    Group-Object SearchedZip -AsHashTable -AsString

& {
    foreach($line in Import-Csv AttractionIDsWithZips.csv) {
        if($values = $refTable[$line.zip]) {
            foreach($value in $values) {
                [pscustomobject]@{
                    AttractionId = $line.ID
                    MatchedZip   = $value.SearchedZip
                    Radius       = $value.Radius
                }
            }
        }
    }
} | Export-Csv IDsWithMatchingZips.csv -NoTypeInformation

The result I got using the example CSVs provided in the question looks like this:

AttractionId MatchedZip Radius
------------ ---------- ------
112484       12346      50
112484       12346      50
112484       12346      50
112486       12346      50
112486       12346      50
112486       12346      50
5548         12347      50
5548         12347      50
5548         12347      50
112491       12345      50
112491       12345      50
112491       12345      50
  • Related