compare 2 text file lines and put # in new file lines present in old file

Time:08-05

I have two files with service details.

File 1:

Netlogon,OS,WARNING
#Opsware Agent,OS,WARNING
Server,OS,WARNING
VMware Tools,OS,WARNING
#Background Intelligent Transfer Service,OS,WARNING
#Background Tasks Infrastructure Service,OS,WARNING
#Base Filtering Engine,OS,WARNING
#Cb Defense WSC,OS,WARNING
#CDPUserSvc_214457e,OS,WARNING

File 2:

Netlogon,OS,WARNING
Opsware Agent,OS,WARNING
Server,OS,WARNING
VMware Tools,OS,WARNING
Background Tasks Infrastructure Service,OS,WARNING
Base Filtering Engine,OS,WARNING
Cb Defense WSC,OS,WARNING
CDPUserSvc_214457e,OS,WARNING

As you can see, some entries in File 1 are hashed out (prefixed with #) but are not hashed out in File 2, and File 2 has one entry fewer than File 1.

I need to put # in front of each line in the new file (File 2) that is hashed out in the old file (File 1). Entries that are not present in the new file (File 2) can be ignored. The expected output is:

File 2:

Netlogon,OS,WARNING
#Opsware Agent,OS,WARNING
Server,OS,WARNING
VMware Tools,OS,WARNING
#Background Tasks Infrastructure Service,OS,WARNING
#Base Filtering Engine,OS,WARNING
#Cb Defense WSC,OS,WARNING
#CDPUserSvc_214457e,OS,WARNING

Please let me know how I can do that. I am able to fetch the hashed-out entries from File 1, but I am not sure how to compare the files and modify File 2:

gc C:\Configs\Services.txt | % { if($_ -match "#") {write-host $_}}

CodePudding user response:

Do you mean you want to hash out any lines in file2 that are present as hashed-out lines in file1?

Try

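# Lines of the old file; @() keeps this an array even if the file has only one line
$file1 = @(Get-Content 'D:\Test\file1.txt')
$file2 = Get-Content 'D:\Test\file2.txt' | ForEach-Object {
    # Prefix the line with # if the old file contains it in hashed-out form
    if ($file1.Contains("#$_")) { "#$_" }
    else { $_ }
}

# Write the result to a new file
$file2 | Set-Content -Path 'D:\Test\file3.txt'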

Output in file3.txt:

Netlogon,OS,WARNING
#Opsware Agent,OS,WARNING
Server,OS,WARNING
VMware Tools,OS,WARNING
#Background Tasks Infrastructure Service,OS,WARNING
#Base Filtering Engine,OS,WARNING
#Cb Defense WSC,OS,WARNING
#CDPUserSvc_214457e,OS,WARNING
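
Note that the `.Contains()` check above is an exact comparison, so a line with a trailing space, or with a space after the #, would not match. The following is a minimal sketch of a more tolerant variant, assuming such whitespace differences should be ignored. The function name `Add-CommentFromOld` and the in-memory sample lines are hypothetical and only there to keep the example self-contained; with real files you would pass in the results of `Get-Content`.

```powershell
# Hypothetical helper: hashes out lines of $NewLines that are hashed out in $OldLines,
# ignoring leading/trailing whitespace and whitespace directly after the #.
function Add-CommentFromOld {
    param(
        [string[]] $OldLines,
        [string[]] $NewLines
    )

    # Collect the hashed-out entries of the old file without the # and surrounding whitespace
    $commented = @(
        $OldLines |
            Where-Object { $_ -match '^\s*#' } |
            ForEach-Object { ($_ -replace '^\s*#\s*', '').Trim() }
    )

    foreach ($line in $NewLines) {
        # -ccontains is case-sensitive, like .Contains() in the answer above
        if ($commented -ccontains $line.Trim()) { "#$line" } else { $line }
    }
}

# Sample data (assumption): one entry has a trailing space, one has a space after the #
$old = '#Opsware Agent,OS,WARNING ', 'Server,OS,WARNING', '# Base Filtering Engine,OS,WARNING'
$new = 'Opsware Agent,OS,WARNING', 'Server,OS,WARNING', 'Base Filtering Engine,OS,WARNING'

$result = Add-CommentFromOld -OldLines $old -NewLines $new
$result
```

Lines of the new file that are already hashed out are passed through unchanged, because only the entries stripped of their # are compared.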

CodePudding user response:

Complementing Theo's helpful answer, here is an approach using a HashSet to quickly find lines from the 2nd file in the 1st file, without having to search through all lines linearly. This can give a noticeable performance improvement if the actual input files consist of thousands of lines.

# Read the lines of the 1st file into a HashSet
[Collections.Generic.Hashset[string]] $file1Lines = Get-Content 'file1.txt'

# For each line of file 2
Get-Content 'file2.txt' | ForEach-Object {

    # HashSet.Contains() is a hash lookup, which is much faster than a linear search through an array
    if( $file1Lines.Contains("#$_") ) { "#$_" } else { $_ }

    # Alternative for PowerShell 7+: use the ternary operator
    # $file1Lines.Contains("#$_") ? "#$_" : $_
} |
Set-Content 'Output.txt'
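
A HashSet built this way compares strings case-sensitively. If the service names may differ in casing between the two files, one option is to pass a case-insensitive comparer to the HashSet constructor. This is a sketch under that assumption; the sample lines are made up for illustration, and the `::new()` syntax requires PowerShell 5 or later.

```powershell
# Sample lines standing in for Get-Content 'file1.txt' (assumption, for a self-contained example)
[string[]] $oldLines = 'Netlogon,OS,WARNING', '#Opsware Agent,OS,WARNING', '#Cb Defense WSC,OS,WARNING'

# Build the set with a comparer that ignores case
$set = [System.Collections.Generic.HashSet[string]]::new($oldLines, [System.StringComparer]::OrdinalIgnoreCase)

# Sample lines standing in for Get-Content 'file2.txt'; note the different casing of the 2nd line
$newLines = 'Netlogon,OS,WARNING', 'opsware agent,OS,WARNING', 'Cb Defense WSC,OS,WARNING'

$output = foreach ($line in $newLines) {
    # Hash lookup that ignores case because of the comparer chosen above
    if ($set.Contains("#$line")) { "#$line" } else { $line }
}
$output
```

The matched line keeps the casing it has in the 2nd file; only the # is added.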