Perl code to search and replace a pattern

output        [15:0] pin;                
output         [1:0] en;                
input          [6:0] dddr;            
input          [6:0] dbg;

I want to replace this with the following (the first column is the bus width):

16 : pin : output;                         
2 : en : output;                
7 : dddr : input;            
7 : dbg :input;

I tried this code after opening the file and storing its contents in $var, but I am not able to filter it as shown above.

if ($var =~ /(\w+)\[(\d+)\:/) {
    print "word=$1 number=$2\n";
}

I am also trying to add a : between the columns.

CodePudding user response：

You are missing the whitespace after the word characters in your pattern.

(\w+ )       \[(\d+):
      VVVVVVVV
output        [15:0] pin;

This is easily fixed. Add \s+ to the pattern between the two groups, like so:

use strict;
use warnings;
use feature 'say';

while (my $line = <DATA>) {
    if ($line =~ /(\w+)\s+\[(\d+)\:/) {
        say "word=$1 number=$2";
    }
}

__DATA__
output        [15:0] pin;
output         [1:0] en;
input          [6:0] dddr;
input          [6:0] dbg;

This produces:

word=output number=15
word=output number=1
word=input number=6
word=input number=6

To get to your desired output, you'll have to refine the pattern and probably do some incrementing too.
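Since the goal is a search and replace, one possible way to combine the refined pattern with the incrementing is a substitution with the /e modifier, which evaluates the replacement as Perl code. This is only a sketch: the helper name rewrite_decl is hypothetical, and it assumes the low index is always 0, so that the width is the high index plus one.

```perl
use strict;
use warnings;

# Hypothetical helper: rewrites one declaration line and returns it.
# Assumption: the low index is always 0, so width = high index + 1.
sub rewrite_decl {
    my ($line) = @_;
    # direction, spaces, [high:low], spaces, name, semicolon, trailing spaces
    $line =~ s/^(\w+)\s+\[(\d+):\d+\]\s+(\w+);\s*$/($2 + 1) . " : $3 : $1;"/e;
    return $line;    # lines that do not match come back unchanged
}

print rewrite_decl($_), "\n" for (
    'output        [15:0] pin;',
    'input          [6:0] dddr;',
);
```

Lines that do not look like a bus declaration are passed through untouched, which is usually what you want when rewriting a whole file.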

CodePudding user response：

You are not taking account of the whitespace between the (\w+) and the (\d+) parts of your regex.

while (<DATA>)
{
    if ( /(\w+)\s+\[(\d+)\:/) {
        print "word=$1 number=$2\n";
    }
}

__DATA__
output        [15:0] pin;                
output         [1:0] en;                
input          [6:0] dddr;            
input          [6:0] dbg;

That outputs this

word=output number=15
word=output number=1
word=input number=6
word=input number=6

To get close to your final requirement, the regex can be expanded to match the other parts you need, as follows

while (<DATA>)
{
    if ( /(\w+)\s+\[(\d+)\:\d+\]\s+(.*);/) {
        print "$2 : $3 : $1\n";
    }
}

__DATA__
output        [15:0] pin;                
output         [1:0] en;                
input          [6:0] dddr;            
input          [6:0] dbg;

which outputs this

15 : pin : output
1 : en : output
6 : dddr : input
6 : dbg : input

I am not sure how you calculate the value for the first column. It appears to be the number field + 1. Is that correct?
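If the first column is meant to be the bus width (the question says "counting the bus"), then high + 1 is only right when the low index is 0. A slightly more general sketch, under the assumption that the width is the number of indices in the range with both ends included, captures both numbers. The helper name bus_summary is hypothetical.

```perl
use strict;
use warnings;

# Hypothetical helper: returns "width : name : direction;" for a bus
# declaration, or nothing if the line does not match.
sub bus_summary {
    my ($line) = @_;
    my ($dir, $msb, $lsb, $name) =
        $line =~ /^\s*(\w+)\s+\[(\d+):(\d+)\]\s+(\w+)\s*;/
        or return;
    # Assumption: width counts every index in the range, ends included.
    my $width = abs($msb - $lsb) + 1;
    return "$width : $name : $dir;";
}

while (my $line = <DATA>) {
    my $summary = bus_summary($line);
    print "$summary\n" if defined $summary;
}

__DATA__
output        [15:0] pin;
output         [1:0] en;
input          [6:0] dddr;
input          [6:0] dbg;
```

For the sample data this gives the same numbers as adding 1 to the first index; the two approaches differ only for ranges that do not end at 0.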

CodePudding user response：

One way to parse the shown data

use warnings;
use strict;
use feature 'say';

while (<>) {             
    if ( /(\S+) \s+ \[ ([0-9]+):[0-9]+ \] \s+ (\w+)/x ) {
        say $2+1, ' : ', $3, ' : ', $1, ';';
    }
}

Comments

In most regex patterns a lot depends on details of the input data format, and on how much flexibility there is in what data to expect and allow.

The \S+ matches a string of non-whitespace characters; that assumes there is a single word at the beginning, and that it may contain any characters (other than whitespace). If there may be multiple words then use .+? instead; if only "word characters" ([a-zA-Z0-9_]) are expected, and we want or need to enforce that, then use the far more restrictive \w+
No space is allowed inside [], only numbers with a : between them. If it is OK for the data to have spaces there, use \[\s* and \s*\]
At the end, the \w+ matches a "word" consisting of one or more \w characters. If more than one word can be expected then use .+? followed by ; (this now allows any characters), which matches everything up to the first instance of the following pattern (here ;). If that part may even contain semicolons then use .+ which takes everything up to the very last ;
In all of this the + quantifier requires at least one occurrence of the preceding pattern. If it is acceptable for nothing to be in that place in the data (that last word missing, for example) then use the * quantifier instead, as in .*

So do your best to understand exactly what the data looks like. Failing that, thoughtfully articulate your requirements: what precisely to restrict and what to allow.
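To make the trade-offs above concrete, here is a sketch of a looser variant that allows spaces inside the brackets and takes everything up to the first semicolon as the name. The helper name parse_loose is hypothetical, and the choice of which parts to loosen is an assumption about the data, not something the question states.

```perl
use strict;
use warnings;

# Hypothetical helper showing the looser pattern variants.
# Assumption: the low index is 0, so width = high index + 1.
sub parse_loose {
    my ($line) = @_;
    my ($dir, $high, $rest) = $line =~ /
        ^\s* (\w+)                 # direction: word characters only
        \s+ \[ \s* ([0-9]+)        # high index, spaces allowed after the bracket
        \s* : \s* [0-9]+ \s* \]    # low index, spaces allowed before the bracket
        \s+ (.+?) \s* ;            # anything up to the first semicolon
    /x or return;
    my $width = $high + 1;
    return "$width : $rest : $dir;";
}

print parse_loose('output [ 15 : 0 ]  pin_a, pin_b ;'), "\n";
```

The stricter \w+ version rejects such a line outright, which may be preferable if unexpected input should be flagged rather than silently accepted.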