How to extract the code between two tags using a regex in a Perl script

Time:11-28

I want to extract the markup between <ix:hidden> and </ix:hidden>. Please advise how to extract it. Here is a sample of the input:

<ix:hidden>
<ix:nonNumeric contextRef="Duration_4_1_2021_To_3_31_2022_IlKaMcQ2N0C41UxW3xo4zg" name="dei:DocumentType" id="Tc_evMsUKdlCEyCZtbxEMZIxg_1_1">DEF 14A</ix:nonNumeric>
<ix:nonNumeric contextRef="Duration_4_1_2021_To_3_31_2022_IlKaMcQ2N0C41UxW3xo4zg" name="dei:AmendmentFlag" id="Tc_nHcapE52UUqrWD0pLkbdag_2_1">false</ix:nonNumeric>
<ix:nonNumeric contextRef="Duration_4_1_2021_To_3_31_2022_IlKaMcQ2N0C41UxW3xo4zg" name="dei:EntityRegistrantName" id="Tc_ZXMW19KSmk2TfvdhMCMr_A_3_1">Walter Hamscher Co Number One</ix:nonNumeric>
<ix:nonNumeric contextRef="Duration_4_1_2021_To_3_31_2022_IlKaMcQ2N0C41UxW3xo4zg" name="dei:EntityCentralIndexKey" id="Tc_MybzAywpbUCU3LEGZc_Ftg_4_1">0000990667</ix:nonNumeric>
</ix:hidden>

Here is my current script:

use strict;
use warnings;

my @ar_sp;
my $string;
my @ar_out;

# Source File 
my $src = 'iXBRL-Tagged_tm213138-13_def14a.htm';

# open source file for reading
open(FHR, '<', $src);
  
# Destination File
my $des = 'output.txt';

# Open new file to write
open(FHW, '>', $des);
  
  
print("Copying content from $src to $des\n");
@ar_sp = <FHR>;

# Copy data from one file to another.
foreach $string ( @ar_sp ) 
{
    if ($string =~ m/<ix:hidden>(.*?)<\/ix:hidden>/)
    {
        print "Yes" . "\n";
        $string =~ m/<ix:hidden>(.*?)<\/ix:hidden>/;
        print FHW $string;
    }
    
}

# Closing the filehandles
close(FHR);
close(FHW);
   
print "File content copied successfully!\n";

===========================================

The regex defined in the script never matches.


CodePudding user response:

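Your regex never matches because the script tests one line at a time, while <ix:hidden> and </ix:hidden> sit on different lines. Rather than working around that, use one of the good XML parsers out there. Don't reinvent the wheel, and poorly at that.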

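use strict;
use warnings;

use XML::LibXML               qw( );
use XML::LibXML::XPathContext qw( );

# Parse the whole document as XML.
my $doc = XML::LibXML->new->parse_file( 'iXBRL-Tagged_tm213138-13_def14a.htm' );

# XPath needs the ix prefix bound to a namespace URI; replace 'http://...'
# with the URI the document itself declares for xmlns:ix.
my $xpc = XML::LibXML::XPathContext->new();
$xpc->registerNs( ix => 'http://...' );

# Print every child node of each <ix:hidden> element.
for my $hidden_node ( $xpc->findnodes( '//ix:hidden', $doc ) ) {
   print $_->toString() for $hidden_node->childNodes();
}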
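
If a regex is still wanted despite the advice above, the original script would probably need two changes: read the whole file into one string instead of testing it line by line, and add the /s modifier so that "." can match newlines. The following is only a minimal sketch under those assumptions. The helper names slurp_file and extract_hidden are made up for illustration, the inline sample is a shortened stand-in for the real file, and the pattern assumes the opening tag is written exactly as <ix:hidden> with no attributes. It is not a substitute for a real parser and will break on nested or commented-out tags.

```perl
use strict;
use warnings;

# Hypothetical helper: read an entire file into a single string.
sub slurp_file {
    my ($path) = @_;
    open my $fh, '<', $path or die "Cannot open $path: $!";
    local $/;    # undefined record separator: one read returns the whole file
    my $content = <$fh>;
    close $fh;
    return $content;
}

# Hypothetical helper: return the text between each
# <ix:hidden> ... </ix:hidden> pair found in the given string.
sub extract_hidden {
    my ($content) = @_;
    # /s lets "." match newlines, so a block spanning several lines can match;
    # /g in list context returns the captured text of every match.
    my @blocks = $content =~ m{<ix:hidden>(.*?)</ix:hidden>}sg;
    return @blocks;
}

# Shortened inline sample standing in for the real file.
my $sample = <<'END';
<body>
<ix:hidden>
<ix:nonNumeric name="dei:DocumentType">DEF 14A</ix:nonNumeric>
<ix:nonNumeric name="dei:AmendmentFlag">false</ix:nonNumeric>
</ix:hidden>
</body>
END

print for extract_hidden($sample);
```

To run it against the real file, replace the last line with something like: print for extract_hidden( slurp_file('iXBRL-Tagged_tm213138-13_def14a.htm') );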