I hope you're fine.
For example, I have a text with tags of the form @tag, and I want to get the text between the tags as separate parts, not as one match (in the picture at the end, the result at the top is 4 matches and not one). The problem is that the text inside a tag may itself contain the symbol @, and this cuts my matches short.
I tried several times with different regexes, but always without success.
The final regex I use is:
((^(@))|[^\"]@)[^(@)]+\}\n
The example content text (that I use for testing):
@main.xml
{"adSize":"","adUnitId":"","alpha":1.0,"checked":0,"choiceMode":0,"clickable":1,"convert":"","customView":"","dividerHeight":1,"enabled":1,"firstDayOfWeek":1,"id":"_fab","image":{"rotate":0,"scaleType":"CENTER"},"indeterminate":"false","index":0,"inject":"","layout":{"backgroundColor":16777215,"borderColor":-16740915,"gravity"}
{"adSize":"","adUnitId":"","alpha":1.0,"checked":0,"choiceMode":0,"clickable":1,"convert":"","customView":"","dividerHeight":1,"enabled":1,"firstDayOfWeek":1,"id":"_fab","image":{"rotate":0,"scaleType":"CENTER"},"indeterminate":"false","index":0,"inject":"","layout":{"backgroundColor":16777215,"borderColor":-16740915,"gravity"}
@main.xml_fab
{"adSize":"","adUnitId":"","alpha":1.0,"checked":0,"choiceMode":0,"clickable":1,"convert":"","customView":"","dividerHeight":1,"enabled":1,"firstDayOfWeek":1,"id":"_fab","image":{"rotate":0,"scaleType":"CENTER"},"indeterminate":"false","index":0,"inject":"","layout":{"backgroundColor":16777215,"borderColor":-16740915,"gravity"}
{"adSize":"","adUnitId":"","alpha":1.0,"checked":0,"choiceMode":0,"clickable":1,"convert":"","customView":"","dividerHeight":1,"enabled":1,"firstDayOfWeek":1,"id":"_fab","image":{"rotate":0,"scaleType":"CENTER"},"indeterminate":"false","index":0,"inject":"","layout":{"backgroundColor":16777215,"borderColor":-16740915,"gravity"}
{"adSize":"","adUnitId":"","alpha":1.0,"checked":0,"choiceMode":0,"clickable":1,"convert":"","customView":"","dividerHeight":1,"enabled":1,"@firstDayOfWeek":1,"id":"_fab","image":{"rotate":0,"scaleType":"CENTER"},"indeterminate":"false","index":0,"inject":"","layout":{"backgroundColor":16777215,"borderColor":-16740915,"gravity"}
@main.xml
{"adSize":"","adUnitId":"","alpha":1.0,"checked":0,"choiceMode":0,"clickable":1,"convert":"","customView":"","dividerHeight":1,"enabled":1,"firstDayOfWeek":1,"id":"_fab","image":{"rotate":0,"scaleType":"CENTER"},"indeterminate":"false","index":0,"inject":"","layout":{"backgroundColor":16777215,"borderColor":-16740915,"gravity"}
{"adSize":"","adUnitId":"","alpha":1.0,"checked":0,"choiceMode":0,"clickable":1,"convert":"","customView":"","dividerHeight":1,"enabled":1,"firstDayOfWeek":1,"id":"_fab","image":{"rotate":0,"scaleType":"CENTER"},"indeterminate":"false","index":0,"inject":"","layout":{"backgroundColor":16777215,"borderColor":-16740915,"gravity"}
j
@main.xml_fab
{"adSize":"","adUnitId":"","alpha":1.0,"checked":0,"choiceMode":0,"clickable":1,"convert":"","customView":"","dividerHeight":1,"enabled":1,"firstDayOfWeek":1,"id":"_fab","image":{"rotate":0,"scaleType":"CENTER"},"indeterminate":"false","index":0,"inject":"","layout":{"backgroundColor":16777215,"borderColor":-16740915,"gravity"}
{"adSize":"","adUnitId":"","alpha":1.0,"checked":0,"choiceMode":0,"clickable":1,"convert":"","customView":"","dividerHeight":1,"enabled":1,"firstDayOfWeek":1,"id":"_fab","image":{"rotate":0,"scaleType":"center"},"indeterminate":"false","index":0,"inject":"","layout":{"backgroundColor":16777215,"borderColor":-16740915,"gravity"}
{"adSize":"","adUnitId":"","alpha":1.0,"checked":0,"choiceMode":0,"clickable":1,"convert":"","customView":"","dividerHeight":1,"enabled":1,"firstDayOfWeek":1,"id":"_fab","image":{"rotate":0,"scaleType":"CENTER"},"indeterminate":"false","index":0,"inject":"","layout":{"backgroundColor":16777215,"borderColor":-16740915,"gravity"}
And this picture shows my problem.
I want to get each whole part between @tag1 and @tag2, @tag3, @tag4, etc. How can I deal with the symbol @ when it appears inside the content of a tag?
CodePudding user response:
I am not quite sure of the use of [^\"]@ in your expression... What is its purpose? Could you provide an example?
Anyway, you have to remove @ from the list of excluded characters and change the way you look at the end of your section. Try:
^@[^(]+?\}\n(?=\@|$)
^@ matches @ at the beginning of a line
[^(]+? matches any character except ( one or more times, until (? makes it lazy) the next occurrence of the next element...
\}\n your segments always end with a curly bracket followed by a new line
(?=\@|$) the critical part: a lookahead that ensures the following element is either a @ (new segment) or the end of the file, without capturing it. That way you don't cut a segment before its end and you can still capture the beginning of the next segment.
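To see the pieces working together, here is a minimal Python sketch of the expression above. The sample text, the variable names and the use of Python's re module are my own assumptions for illustration; the question does not say which regex engine is used. The multiline flag is needed so that ^ matches at the start of every line.

```python
import re

# Hypothetical, shortened sample: two tagged sections, the first one
# containing an "@" inside its content. The trailing newline matters,
# because the pattern expects every section to end with "}\n".
sample = (
    '@main.xml\n'
    '{"id":"a","@firstDayOfWeek":1}\n'
    '{"id":"b"}\n'
    '@main.xml_fab\n'
    '{"id":"c"}\n'
)

# The expression from the answer; re.MULTILINE lets ^ and $ match per line.
section_re = re.compile(r'^@[^(]+?\}\n(?=\@|$)', re.MULTILINE)

sections = section_re.findall(sample)

for number, section in enumerate(sections, start=1):
    tag = section.splitlines()[0]
    print(number, tag, len(section.splitlines()) - 1, "content line(s)")
```

Note that this relies on each section ending with } followed by a newline and then either a new tag or the end of the input. A stray line between sections, or a missing final newline, would change what is matched.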
CodePudding user response:
Use
(?m)^@.+(?:\n.+)*
EXPLANATION
--------------------------------------------------------------------------------
  ^                        the beginning of the line due to (?m)
--------------------------------------------------------------------------------
  @                        '@'
--------------------------------------------------------------------------------
  .+                       any character except \n (1 or more times
                           (matching the most amount possible))
--------------------------------------------------------------------------------
  (?:                      group, but do not capture (0 or more times
                           (matching the most amount possible)):
--------------------------------------------------------------------------------
    \n                       '\n' (newline)
--------------------------------------------------------------------------------
    .+                       any character except \n (1 or more times
                             (matching the most amount possible))
--------------------------------------------------------------------------------
  )*                       end of grouping
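A short Python sketch of how this expression behaves, under some assumptions of mine: the sample data and names are hypothetical, and the expression as written only splits the text when the sections are separated by an empty line, since .+ cannot match an empty line. For input where the sections follow each other directly, the second pattern below adds a negative lookahead (?!@) after the newline; that variant is my own addition, not part of the answer.

```python
import re

# Hypothetical sample where the sections are separated by an empty line.
spaced = (
    '@main.xml\n'
    '{"id":"a","@firstDayOfWeek":1}\n'
    '{"id":"b"}\n'
    '\n'
    '@main.xml_fab\n'
    '{"id":"c"}\n'
)

# The same sample without the empty line between the sections.
contiguous = spaced.replace('\n\n', '\n')

# Expression from the answer: a tag line, then every following non-empty line.
answer_re = re.compile(r'(?m)^@.+(?:\n.+)*')

# Assumed variant: stop before any line that starts with "@".
# An "@" further inside a line does not end the section.
lookahead_re = re.compile(r'(?m)^@.+(?:\n(?!@).+)*')

spaced_parts = answer_re.findall(spaced)
merged_parts = answer_re.findall(contiguous)
split_parts = lookahead_re.findall(contiguous)

print(len(spaced_parts), len(merged_parts), len(split_parts))
```

The lookahead only inspects the first character of each line, so it still assumes that content lines never start with @ themselves.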