Write a script to get the 10 longest words and put them in a separate file

Time:01-19

I need to sort the words in a text file and output them to a file

Function AnalyseTo-Doc{
    param ([Parameter(Mandatory=$true)][string]$Pad )

    $Lines = Select-String -Path $Pad -Pattern '\b[A-Za-zA-Яа-я]{2,}\b' -AllMatches
    $Words = ForEach($Line in $Lines){
        ForEach($Match in $Line.Matches){
            [PSCustomObject]@{
                LineNumber = $Line.LineNumber
                Word       = $Match.Value
            }
        }
    }
    $Words | Group-Object Word | ForEach-Object {
        [PSCustomObject]@{
            Count= $_.Count
            Word = $_.Name
            Longest= $_.Lenght
        }
    }
    | Sort-Object -Property Count | Select-Object -Last 10
}

AnalyseTo-Doc 1.txt
#Get-Content 1.txt | Sort-Bubble -Verbose | Write-Host Sorted Array: | Select-Object -Last 10 | Out-File .\dz11-11.txt

It doesn't work.

CodePudding user response:

Sort by the Longest property (consider renaming it to Length), which is intended to contain the word length but must be redefined as $_.Group[0].Word.Length (tip of the hat to Daniel):

    $Words | Group-Object Word | ForEach-Object {
        [PSCustomObject]@{
            Count= $_.Count
            Word = $_.Name            
            Longest = $_.Group[0].Word.Length
        }
    } | 
    Sort-Object -Descending -Property Longest | 
    Select-Object -First 10

Note that, for conceptual clarity, I've used -Descending to sort by longest words first, which then requires -First 10 instead of -Last 10 to get the top 10.
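To illustrate the point about -Descending with -First versus ascending order with -Last, here is a minimal sketch. The sample words and the variable names ($Sample, $TopDescending, $TopAscending) are made up for illustration and are not part of the original script.

```powershell
# Hypothetical sample objects shaped like the ones produced by the pipeline above.
$Sample = 'kiwi', 'watermelon', 'fig', 'banana' | ForEach-Object {
    [PSCustomObject]@{ Word = $_; Longest = $_.Length }
}

# Longest first, then take from the top.
$TopDescending = $Sample | Sort-Object -Descending -Property Longest | Select-Object -First 2

# Shortest first, then take from the bottom.
$TopAscending = $Sample | Sort-Object -Property Longest | Select-Object -Last 2

$TopDescending.Word   # watermelon, banana
$TopAscending.Word    # banana, watermelon
```

Both variants select the same two words; only the order of the output differs, which is why the descending form tends to read more naturally for a "top N" list.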


As for what you tried:

  • Sorting by the Count property sorts by frequency of occurrence instead, i.e. by how often each word appears in the input file, due to use of Group-Object.

  • Longest= $_.Length (note that your code had a typo there) accesses the length property of each group object, which is an instance of Microsoft.PowerShell.Commands.GroupInfo, not that of the word being grouped by.

    • (Such a GroupInfo instance has no type-native .Length property, but PowerShell automatically provides one as an intrinsic member, in the interest of unified handling of collections and scalars. Since a group object itself is considered a scalar (single object), .Length returns 1. PowerShell also provides an intrinsic .Count with the same value, unless a type-native .Count is present, which is indeed the case here: a GroupInfo object's type-native .Count property returns the count of elements in the group.)

    • The [pscustomobject] instances wrapping the word at hand are stored in the .Group property, and since they're all the same here, .Group[0].Word.Length can be used to return the length of the word at hand.
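The difference between these properties can be checked interactively. The following is a small sketch that assumes made-up sample words in place of the file contents; $Group is a hypothetical variable name.

```powershell
# Hypothetical sample data standing in for the $Words collection built from the file.
$Words = 'apple', 'fig', 'apple', 'banana' | ForEach-Object {
    [PSCustomObject]@{ Word = $_ }
}

# Pick the single group for 'apple' so that $Group is one GroupInfo object.
$Group = $Words | Group-Object Word | Where-Object Name -eq 'apple'

$Group.GetType().FullName     # Microsoft.PowerShell.Commands.GroupInfo
$Group.Count                  # type-native: number of elements in the group (2)
$Group.Length                 # intrinsic member on a scalar, not the word length
$Group.Group[0].Word.Length   # length of the word itself (5)
```

Only the last expression reflects the length of the word, which is why it is the one to store in the Longest property.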

CodePudding user response:

Function AnalyseTo-Doc{
    param ([Parameter(Mandatory=$true)][string]$Pad )

    $Lines = Select-String -Path $Pad -Pattern '\b[A-Za-zA-Яа-я]{2,}\b' -AllMatches
    $Words = ForEach($Line in $Lines){
        ForEach($Match in $Line.Matches){
            [PSCustomObject]@{
                LineNumber = $Line.LineNumber
                Word       = $Match.Value
            }
        }
    }
    $Words | Group-Object Word | ForEach-Object {
        [PSCustomObject]@{
            #Count= $_.Count
            Word = $_.Name
            Longest = $_.Group[0].Word.Length
        }
    } |
    Sort-Object -Descending -Property Longest | Select-Object -First 10 | Out-File .\dz11-11.txt
}

AnalyseTo-Doc 1.txt
