I have a data array like the one below, and I need to format it as shown.
a = ["8619 [EC006]", "9876 [ED009]", "1034 [AX009]"]
I need to format it like ["EC006", "ED009", "AX009"].
Can anyone please help?
CodePudding user response:
Input
a = ["8619 [EC006]", "9876 [ED009]", "1034 [AX009]"]
Code
p a.collect { |x| x[/\[(.*)\]/, 1] }
Output
["EC006", "ED009", "AX009"]
CodePudding user response:
arr = ["8619 [EC006]", "9876 [ED009]", "1034 [AX009]"]
To merely extract the strings of interest, assuming the data is formatted correctly, we may write the following.
arr.map { |s| s[/(?<=\[)[^\]]*/] }
#=> ["EC006", "ED009", "AX009"]
In the regular expression, (?<=\[) is a positive lookbehind that asserts the previous character is '['. The ^ at the beginning of the character class [^\]] means that any character other than ']' must be matched. Appending the asterisk ([^\]]*) causes the character class to be matched zero or more times.
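To see what the lookbehind and the negated character class do in practice, here is a small sketch. The two input strings are hypothetical and not from the question; they only illustrate where the match stops and what happens when there is no '[' at all.

```ruby
# Hypothetical inputs, not taken from the question.
two_groups  = "8619 [EC006] [ZZ999]"
no_brackets = "8619 EC006"

lookbehind = /(?<=\[)[^\]]*/

# [^\]]* cannot cross a ']', so the match ends at the first closing bracket.
first = two_groups[lookbehind]

# For comparison, a greedy .* between brackets runs on to the last ']'.
greedy = two_groups[/\[(.*)\]/, 1]

# Without a '[' the lookbehind can never succeed, so String#[] returns nil.
none = no_brackets[lookbehind]

p first, greedy, none
```

On well-formed data such as the array in the question both patterns give the same result; the difference only shows up if a string contains more than one bracketed group.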
To confirm the correctness of the formatting as well, use
arr.map { |s| s[/\A[1-9]\d{3} \[\K[A-Z]{2}\d{3}(?=]\z)/] }
#=> ["EC006", "ED009", "AX009"]
Note that at the link I replaced \A and \z with ^ and $, respectively, in order to test the regex against multiple strings.
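As a rough illustration of what "confirming the formatting" means, the sketch below runs the same pattern over a hypothetical array in which two entries are deliberately malformed. Entries that do not fit the expected layout come back as nil, which can then be dropped or reported.

```ruby
# Hypothetical sample data: the last two entries are deliberately malformed.
mixed = ["8619 [EC006]", "0123 [EC006]", "9876 [ed009]"]

strict = /\A[1-9]\d{3} \[\K[A-Z]{2}\d{3}(?=\]\z)/

# String#[] returns nil when the regex does not match.
codes = mixed.map { |s| s[strict] }

# Keep only the entries that passed validation.
valid = codes.compact

p codes
p valid
```

The second entry fails because the number starts with zero, and the third because the letters are lowercase.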
This regular expression can be broken down as follows.
\A # match beginning of string
[1-9] # match a digit other than zero
\d{3} # match 3 digits
[ ] # match one space
\[ # match '['
\K # reset start of match to current string location and discard
# all characters previously matched from match that is returned
[A-Z]{2} # match 2 uppercase letters
\d{3} # match 3 digits
(?=]\z) # positive lookahead asserts following character is
# ']' and that character is at the end of the string
In the above I placed a space character in a character class ([ ]) merely to make it visible to the reader.
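The breakdown can also be written directly in Ruby using free-spacing mode (the x flag), which allows whitespace and comments inside the regex. This is a sketch of the same pattern, with the variable name strict_x chosen only for illustration. In this mode a bare space is ignored, so the [ ] character class is needed for the space to be matched at all.

```ruby
arr = ["8619 [EC006]", "9876 [ED009]", "1034 [AX009]"]

strict_x = /
  \A          # beginning of string
  [1-9]       # a digit other than zero
  \d{3}       # three digits
  [ ]         # one space, written as a class because x mode skips bare spaces
  \[          # literal opening bracket
  \K          # drop everything matched so far from the returned match
  [A-Z]{2}    # two uppercase letters
  \d{3}       # three digits
  (?=\]\z)    # lookahead: closing bracket at the very end of the string
/x

result = arr.map { |s| s[strict_x] }
p result
```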