Home > Blockchain >  Counting if part of string is within interval
Counting if part of string is within interval

Time:08-31

I am currently trying to check if a number in a comma-separated string is within a number interval. What I am trying to do is to check if an area code (from the comma-separated string) is within the interval of an area.

The data: AREAS

Area interval Name Number of locations
1000-1499 Area 1 ?
1500-1799 Area 2 ?
1800-1999 Area 3 ?

GEOLOCATIONS

Name Areas List
Location A 1200, 1400
Location B 1020, 1720
Location C 1700, 1920
Location D 1940, 1950, 1730

The result I want here is the number of unique locations in the "Areas list" within the area interval. So Location D should only count ONCE in the 1800-1999 "area", and the Location A the same in the 1000-1499 location. But location B should count as one in both 1000-1499 and one in 1500-1799 (because a number from each interval is in the comma-separated string in "Areas list"):

Area interval Name Number of locations
1000-1499 Area 1 2
1500-1799 Area 2 3
1800-1999 Area 3 2

How is this possible?

I have tried with a COUNTIFS, but it doesnt seem to do the job.

CodePudding user response:

Here is one option using FILTERXML():

enter image description here

Formula in C2:

=SUM(FILTERXML("<x><t>"&TEXTJOIN("</s></t><t>",,"1<s>"&SUBSTITUTE(B$7:B$10,", ","</s><s>"))&"</s></t></x>","//t[count(.//*[.>="&SUBSTITUTE(A2,"-","][.<=")&"])>0]"))

Where //t[count(.//*[.>="&SUBSTITUTE(A2,"-","][.<=")&"])>0]")) basically means that we collect all t-nodes where the count of child s-nodes that are >= to the leftmost number and <= to the rightmost number is larger than zero. By default the returned locations should be unique based on your data.


Or, wrap the above in BYROW():

enter image description here

Formula in C2:

=BYROW(A2:A4,LAMBDA(a,SUM(FILTERXML("<x><t>"&TEXTJOIN("</s></t><t>",,"1<s>"&SUBSTITUTE(B$7:B$10,", ","</s><s>"))&"</s></t></x>","//t[count(.//*[.>="&SUBSTITUTE(a,"-","][.<=")&"])>0]"))))

CodePudding user response:

Using MMULT and TEXTSPLIT:

=LET(rng,TEXTSPLIT(D2,"-"),
    tarr,IFERROR(--TRIM(TEXTSPLIT(TEXTJOIN(";",,$B$2:$B$5),",",";")),0),
    SUM(--(MMULT((tarr>=--TAKE(rng,,1))*(tarr<=--TAKE(rng,,-1)),SEQUENCE(COLUMNS(tarr),,1,0))>0)))

enter image description here

CodePudding user response:

I am in very distinguished company but will add my version anyway as byrow probably is a slightly different approach

=LET(range,B$2:B$5,
lowerLimit,--@TEXTSPLIT(E2,"-"),
upperLimit,--INDEX(TEXTSPLIT(E2,"-"),2),
counts,BYROW(range,LAMBDA(r,SUM((--TEXTSPLIT(r,",")>=lowerLimit)*(--TEXTSPLIT(r,",")<=upperLimit)))),
SUM(--(counts>0))
)

enter image description here

CodePudding user response:

Here the ugly way to do it, with A LOT of helper columns. But not so complicated

  • Related