output [15:0] pin;
output [1:0] en;
input [6:0] dddr;
input [6:0] dbg;
I want to replace this with the following (I am counting the bus width):
16 : pin : output;
2 : en : output;
7 : dddr : input;
7 : dbg : input;
I tried this code after opening the file and storing its contents in $var, but I am not able to filter it as shown above:
if ($var =~ /(\w+)\[(\d+)\:/) {
    print "word=$1 number=$2\n";
}
I am also trying to add " : " between the columns.
CodePudding user response:
You are missing the whitespace after the word characters in your pattern.
 (\w+) \[(\d+):
      V
output [15:0] pin;
This is easily fixed. Add it into the pattern in between, like so:
use strict;
use warnings;
use feature 'say';
while (my $line = <DATA>) {
    if ($line =~ /(\w+)\s+\[(\d+)\:/) {
        say "word=$1 number=$2";
    }
}
__DATA__
output [15:0] pin;
output [1:0] en;
input [6:0] dddr;
input [6:0] dbg;
This produces:
word=output number=15
word=output number=1
word=input number=6
word=input number=6
To get to your desired output, you'll have to refine the pattern and probably do some incrementing too.
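As a rough sketch of that refinement: assuming the whole file sits in one string ($var, as in the question) and every range ends at 0, so the count is simply the first index plus one, a /g match in a while loop could walk through all the declarations. The helper name reformat_all and the inline sample text are made up for illustration.

```perl
use strict;
use warnings;

# Assumption: $var holds the whole file as one string, as in the question.
my $var = <<'END';
output [15:0] pin;
output [1:0] en;
input [6:0] dddr;
input [6:0] dbg;
END

# Hypothetical helper: returns one reformatted line per declaration found.
sub reformat_all {
    my ($text) = @_;
    my @out;
    # With /g in scalar context, each iteration picks up the next match.
    while ($text =~ /(\w+)\s+\[(\d+):\d+\]\s+(\w+);/g) {
        my ($dir, $msb, $name) = ($1, $2, $3);
        # Incrementing the upper index gives the count when the range ends at 0.
        push @out, ($msb + 1) . " : $name : $dir;";
    }
    return @out;
}

print "$_\n" for reformat_all($var);
```

With the sample text above this prints the four lines from the question, starting with "16 : pin : output;".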
CodePudding user response:
You are not taking account of the whitespace between the (\w+) and the (\d+) parts of your regex.
while (<DATA>)
{
    if ( /(\w+)\s+\[(\d+)\:/) {
        print "word=$1 number=$2\n";
    }
}
__DATA__
output [15:0] pin;
output [1:0] en;
input [6:0] dddr;
input [6:0] dbg;
That outputs the following:
word=output number=15
word=output number=1
word=input number=6
word=input number=6
To get closer to your final requirement, the regex can be expanded to match the other parts you need, as follows:
while (<DATA>)
{
    if ( /(\w+)\s+\[(\d+)\:\d+\]\s+(.*);/) {
        print "$2 : $3 : $1\n";
    }
}
__DATA__
output [15:0] pin;
output [1:0] en;
input [6:0] dddr;
input [6:0] dbg;
which outputs this
15 : pin : output
1 : en : output
6 : dddr : input
6 : dbg : input
Not sure how you calculate the value for the first column. It appears to be the number field plus 1. Is that correct?
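If the first column is meant to be the bus width, then for a range [msb:lsb] it would be the difference of the two indices plus one, which for ranges ending at 0 is the same as the first number plus 1. A minimal sketch under that assumption (bus_width and the @lines sample list are hypothetical names, not from the question):

```perl
use strict;
use warnings;

# Hypothetical helper: number of bits in a bus declared as [msb:lsb].
# abs() is used so that an ascending range such as [0:7] also gives 8.
sub bus_width {
    my ($msb, $lsb) = @_;
    return abs($msb - $lsb) + 1;
}

# Sample lines copied from the question.
my @lines = (
    'output [15:0] pin;',
    'output [1:0] en;',
    'input [6:0] dddr;',
    'input [6:0] dbg;',
);

for my $line (@lines) {
    # Capture both indices so the width does not rely on the range ending at 0.
    if ($line =~ /(\w+)\s+\[(\d+):(\d+)\]\s+(.*);/) {
        my ($dir, $msb, $lsb, $name) = ($1, $2, $3, $4);
        my $width = bus_width($msb, $lsb);
        print "$width : $name : $dir;\n";
    }
}
```

Capturing the lower index as well costs one extra group, and keeps the result right for a declaration such as [7:4], where adding 1 to the first number would give 8 instead of 4.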
CodePudding user response:
One way to parse the shown data
use warnings;
use strict;
use feature 'say';
while (<>) {
    if ( /(\S+) \s+ \[ ([0-9]+):[0-9]+ \] \s+ (\w+)/x ) {
        say $2+1, ' : ', $3, ' : ', $1, ';';
    }
}
Comments
In most regex patterns a lot depends on details of the input data format, and on how much flexibility there is in what data to expect and allow.
The \S+ matches a string of non-whitespace characters; that assumes that there is a single word at the beginning, and that it may contain any characters (other than whitespace). If there may be multiple words then use .+? instead; if only "word characters" ([a-zA-Z0-9_]) are expected, and we want or need to enforce that, then use the far more restrictive \w+ instead.
No spaces are allowed inside the [], only numbers with a colon between them. But if it is OK for the data to have spaces there, use \[\s* and \s*\] for the brackets.
At the end, the \w+ matches a "word" consisting of one or more \w characters. If more than one word can be expected then use .+? (which now allows any characters), which matches everything up to the first instance of the pattern that follows it (here, the semicolon). If that part may even contain semicolons then use .+ which takes everything up to the very last semicolon.
In all of this, the + quantifier requires at least one character to match. If a field may be empty, use the * quantifier instead, like .*
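To illustrate the looser variants mentioned in these comments, here is a sketch of a pattern that tolerates spaces inside the brackets and a multi-word name, stopping at the first semicolon. The helper name parse_decl and the sample lines are invented for the example; whether this much flexibility is wanted depends on the actual data.

```perl
use strict;
use warnings;

# Hypothetical helper: returns (direction, first index, name) for one line,
# or an empty list if the line does not look like a declaration.
sub parse_decl {
    my ($line) = @_;
    return unless $line =~ /
        (\S+) \s+                 # leading word, any non-whitespace characters
        \[ \s* ([0-9]+) \s* :     # opening bracket and first index, spaces allowed
        \s* [0-9]+ \s* \]         # second index and closing bracket
        \s+ (.+?) \s* ;           # name: as little as possible up to the first semicolon
    /x;
    return ($1, $2, $3);
}

for my $line ('output [15:0] pin;', 'input [ 6 : 0 ] dbg sig ;') {
    my ($dir, $msb, $name) = parse_decl($line) or next;
    print $msb + 1, " : $name : $dir;\n";
}
```

Because .+? is non-greedy, anything after the first semicolon is ignored; swapping it for .+ would instead run up to the last semicolon on the line.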
So do your best to understand exactly what the data looks like, as far as possible, or carefully articulate your requirements: what precisely to restrict and what to allow.