In my scenario I have a string like this:
Lorem ipsum dolor sit amet, <?php echo Json::$product->MILK;?> consectetur adipiscing elit. Maecenas at dolor mauris. Sed malesuada, lectus sit amet molestie fermentum, arcu quam maximus sapien, id sollicitudin mauris purus ac dui. Cras sit amet sapien urna. Maecenas dolor est, ultricies non imperdiet sit amet, sollicitudin id est. Duis ac mi non enim dictum bibendum. Ut et mauris quis nisi sollicitudin hendrerit. <?php MessageBox('suitable_prefix' . Json::$product->PETIBOR_BISCUIT);?> Duis ultrices, velit vel fringilla laoreet, metus neque aliquam neque, vel dapibus lectus quam ut ipsum. <script type="text/javascript"> \r\n// <![CDATA[ \r\n$(document).ready(function() { MessageBox: "<?php echo Json::$product->WHIPPED_CREAM_500_MG . '& liquid form';?>" });\r\n// ]]>\r\n</script>Vivamus porttitor libero non venenatis porta. Sed venenatis ac purus id ultricies. Nunc volutpat, lacus faucibus aliquam feugiat, turpis lacus malesuada metus, ut aliquet sapien est nec sapien. Phasellus feugiat nisl vel eleifend varius.
In this string, I need to extract and process the words (or more accurately, strings) MILK, PETIBOR_BISCUIT and WHIPPED_CREAM_500_MG. The string Json::$product-> comes directly before all of them, but some are followed by a semicolon, some by a parenthesis, some by a dot and some by a space. How can I extract and process the strings I specified in the most error-free way?
I tried things like:
var matches = Regex.Matches(str, @"(?:\S+\s)?\S*Json::\\$product->\S*(?:\s\S+)?", RegexOptions.IgnoreCase);
for (var i = 0; i < matches.Count; i++)
{
MessageBox.Show(matches[i].Value);
}
The result is frustration.
CodePudding user response:
You can try this regex instead:
(?<=Json::\$product->)\w+
Explaining:
(?<= ... ) This is a "lookbehind", that is, match only if preceded by what is inside this group
Json::\$product-> That specific string (the $ is escaped)
\w+ A sequence of one or more "word" characters (i.e. [a-zA-Z0-9_]; in .NET, \w also matches Unicode letters and digits unless RegexOptions.ECMAScript is set)
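As a minimal sketch of how the lookbehind pattern could be used from C#: the class name LookbehindDemo, the helper ExtractNames and the shortened sample string are made up for illustration and are not part of the question.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class LookbehindDemo
{
    // Hypothetical helper: returns every identifier that directly follows "Json::$product->".
    public static List<string> ExtractNames(string input)
    {
        var names = new List<string>();
        // Verbatim string (@"..."), so \$ reaches the regex engine as an escaped dollar sign.
        foreach (Match m in Regex.Matches(input, @"(?<=Json::\$product->)\w+"))
        {
            // The lookbehind is not part of the match, so m.Value is just the name.
            names.Add(m.Value);
        }
        return names;
    }

    public static void Main()
    {
        // Shortened sample, assumed to be representative of the string in the question.
        var str = "amet, <?php echo Json::$product->MILK;?> elit. "
                + "<?php MessageBox('suitable_prefix' . Json::$product->PETIBOR_BISCUIT);?> "
                + "\"<?php echo Json::$product->WHIPPED_CREAM_500_MG . '& liquid form';?>\"";

        foreach (var name in ExtractNames(str))
        {
            Console.WriteLine(name);
        }
    }
}
```

With this sample, the loop should print MILK, PETIBOR_BISCUIT and WHIPPED_CREAM_500_MG, one per line, regardless of which character follows each name.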
CodePudding user response:
Your pattern matches too much, as \S can match any non-whitespace character.
You can also omit the first part of your pattern before Json, as all of it is optional and you don't need it.
If you don't want to match consecutive underscores, and the strings cannot start or end with an underscore, you can also use a capture group.
\bJson::\$product->([^\W_]+(?:_[^\W_]+)*)\b
Explanation
\b A word boundary to prevent a partial word match
Json::\$product-> Match literally (the $ is escaped)
( Capture group 1
[^\W_]+ Match 1+ word chars other than _
(?:_[^\W_]+)* Optionally repeat the same with a leading _
) Close group 1
\b A word boundary
If you only want to match uppercase chars, then you can change [^\W_] to [A-Z] (or to [A-Z0-9] if digits must still be allowed, as in WHIPPED_CREAM_500_MG).
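A minimal C# sketch of the capture-group variant, reading the name from group 1 instead of the whole match. The class name CaptureGroupDemo, the helper ExtractNames and the shortened sample string are assumptions made for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class CaptureGroupDemo
{
    // Pattern from the answer: the name is captured in group 1.
    private const string Pattern = @"\bJson::\$product->([^\W_]+(?:_[^\W_]+)*)\b";

    // Hypothetical helper: collects group 1 of every match.
    public static List<string> ExtractNames(string input)
    {
        var names = new List<string>();
        foreach (Match m in Regex.Matches(input, Pattern))
        {
            // m.Value would include the "Json::$product->" prefix; Groups[1] is the name only.
            names.Add(m.Groups[1].Value);
        }
        return names;
    }

    public static void Main()
    {
        // Shortened sample, assumed to be representative of the string in the question.
        var str = "amet, <?php echo Json::$product->MILK;?> elit. "
                + "<?php MessageBox('suitable_prefix' . Json::$product->PETIBOR_BISCUIT);?> "
                + "\"<?php echo Json::$product->WHIPPED_CREAM_500_MG . '& liquid form';?>\"";

        foreach (var name in ExtractNames(str))
        {
            Console.WriteLine(name);
        }
    }
}
```

Because of the trailing \b, a name such as A__B (two consecutive underscores) is rejected entirely rather than matched partially, which is the stricter behaviour this pattern is aiming for.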