I have the following EDI file and need to match on the LOC 11 segment but not on the LOC 7 segment. Each match should contain all segments from one LOC 11 up to the next LOC 11, so the LOC segment itself repeats across matches but the segments in between do not.
At the moment my regex looks like LOC[^L]*(?:L(?!OC)[^L]*)*
but with that I get 4 results because it matches at the LOC 7 elements too.
I only need the 2 results. Could you help me?
> NAD ST 14::92 Test' LOC 11 KOD23277::92' LOC 7 D77::92:Test' LIN 1
> test AP:IN'IMD F 12::272:K
> RIPPsadasdRIEM'RFF ON:EN10514492'RFF AAN:501'
> DTM 171:20220309:102'RFF AIF:500'DTM 171:20220305:102'CTA SC 12414:test,
> test'COM [email protected]:EM'
> COM ? 49-561-490-4173:TE'COM ? 49-561-490-84173:FX' QTY 83:1000:PCE'
> QTY 70:66850:PCE'DTM 51:20080101:102'
> QTY 72:0:PCE'DTM 52:20080101:102'
> QTY 194:1000:PCE'DTM 50:20220224:102'
> RFF AAU:2143276'DTM 171:20220218:102'
> QTY 194:1000:PCE'DTM 50:20220202:102'
> RFF AAU:2138944'DTM 171:20220131:102'
> QTY 194:1000:PCE'DTM 50:20220105:102'
> RFF AAU:2138943'DTM 171:20220103:102' SCC 24'
> QTY 113:1000:PCE'DTM 2:20220412:102'
> QTY 113:1000:PCE'DTM 2:20220503:102'
> QTY 113:1000:PCE'DTM 64:20220530:102'DTM 63:20220605:102'
> QTY 113:1000:PCE'DTM 64:20220620:102'DTM 63:20220626:102'
> QTY 113:1000:PCE'DTM 64:20220711:102'DTM 63:20220717:102'
> QTY 113:1000:PCE'DTM 64:20220801:102'DTM 63:20220807:102' GEI 3 37'
>
> NAD ST 14::92 test' LOC 11 KOD823226::92' LOC 7 D86::92:Test' LIN 2
> test H:IN'IMD F 12::272:K
> RIPPRIEM'RFF ON:EN10662318'RFF AAN:266'DTM 171:20220309:102'
> RFF AIF:265'DTM 171:20220305:102'CTA SC 12414:test,
> test'COM [email protected]:EM'
> COM ? 49-561-490-4173:TE'COM ? 49-561-490-84173:FX' QTY 83:200:PCE'
> QTY 70:14319:PCE'DTM 51:20100101:102'
> QTY 72:0:PCE'DTM 52:20100101:102' QTY 194:200:PCE'DTM 50:20220126:102'
> RFF AAU:2146871'DTM 171:20220121:102'
> QTY 194:200:PCE'DTM 50:20211210:102'RFF AAU:2146914'DTM 171:20211209:102' QTY 194:200:PCE'DTM 50:20211129:102'RFF AAU:2139927'DTM 171:20211124:102'SCC 24'
> QTY 113:200:PCE'DTM 2:20220503:102'
> QTY 113:200:PCE'DTM 64:20220606:102'DTM 63:20220612:102'
> QTY 113:200:PCE'DTM 64:20220718:102'DTM 63:20220724:102'
> QTY 113:200:PCE'DTM 64:20220829:102'DTM 63:20220904:102'
> QTY 113:200:PCE'DTM 64:20221010:102'DTM 63:20221016:102'
>
> UNT 142 1'UNZ 1 2756'
CodePudding user response:
You can use either of these two patterns:
LOC\ 11[^L]*(?:L(?!OC\ 11)[^L]*)*
LOC\ 11[\w\W]*?(?=LOC\ 11|$)
Details:
LOC\ 11 - the literal string LOC 11
[^L]*(?:L(?!OC\ 11)[^L]*)* - any text up to the first occurrence of the LOC 11 substring (uses the unroll-the-loop principle).
Although the two patterns above produce identical results, the first one is much faster, provided there are not too many L characters that are not followed by OC 11.
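As a quick check, here is a minimal Python sketch that runs both patterns against a shortened sample. The sample string is a hypothetical abbreviation of the EDI data in the question, not the full file, and Python's re module is only an assumption about the regex engine in use; other flavors may treat the escaped space differently.

```python
import re

# Hypothetical, shortened sample modelled on the EDI data in the question.
sample = (
    "NAD ST 14::92 Test' LOC 11 KOD23277::92' LOC 7 D77::92:Test' LIN 1 "
    "QTY 83:1000:PCE' "
    "NAD ST 14::92 test' LOC 11 KOD823226::92' LOC 7 D86::92:Test' LIN 2 "
    "QTY 83:200:PCE' UNT 142 1'UNZ 1 2756'"
)

# Unrolled-loop version: consume non-L characters, and allow an L only
# when it does not begin another "LOC 11".
unrolled = re.compile(r"LOC\ 11[^L]*(?:L(?!OC\ 11)[^L]*)*")

# Lazy version: expand one character at a time until the next "LOC 11"
# or the end of the input is ahead.
lazy = re.compile(r"LOC\ 11[\w\W]*?(?=LOC\ 11|$)")

unrolled_matches = unrolled.findall(sample)
lazy_matches = lazy.findall(sample)

for number, match in enumerate(unrolled_matches, start=1):
    print(number, match)
```

On this sample both patterns return the same two matches, each starting at LOC 11 and keeping the LOC 7 segment inside the match. Note that each match runs up to the next LOC 11, so it also includes the NAD segment that precedes it.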