Home > front end >  Get the shortest distance between two coordinates and return the adjacent cell
Get the shortest distance between two coordinates and return the adjacent cell

Time:01-17

My spreadsheet contains data from hospital patients, based on different criteria, and one of them are household's coordinates. The question is which household is the nearest neighbour to the household that contains the most people, based on straight-line distances and return the hserial value for the household. I have the households identified by their serial number:

hserial hhcoords persnum
101051 346350 1
101151 347312 1
101201 433616 1
101271 609464 1
101271 609464 2
101351 228562 1
101351 228562 2
101351 228562 3
101371 556408 1
101371 556408 2

hhcoords: (x,y) coordinates of the household location. First three digits are x, last three digits are y. persnum: number of persons within the household hserial: is the serial number of the household

Once you have their coords you can calculate the straightline distance between all houses, using Pythagorean Theorem.

For example, if we have the coords for a house 1 (x1,y1), and we want to know how far away it is from house 2 (x2,y2) then we can calculate a straight line distance using:

√((x1 – x2)2 (y1 – y2)2) The nearest neighbour is the one with the shortest distance (but careful with zeros).

I can't alter or change any of the data, even if I identify an error.

I think using the MAXIF formula would be useful as well as filtering it with FILTER, but I don't know how to aplly it to this case.

EDIT: This is my code but I have the #VALUE error:

=LET(hserial,UNIQUE(Table2[[#Headers],[hserial]]),hhcoords,UNIQUE(Table2[[#Headers],[hhcoords]]),hserial1,hserial,hserial2,TRANSPOSE(hserial),getXλ,LAMBDA(h,LEFT(INDEX(hhcoords,MATCH(h,hserial,0)),3)),getYλ,LAMBDA(h,MID(INDEX(hhcoords,MATCH(h,hserial,0)),4,3)),m,SQRT((getXλ(hserial1)-getXλ(hserial2))^2 (getYλ(hserial1)-getYλ(hserial2))^2),posMaxValue,MATCH(MAX(m),TOCOL(m),0),firstHH, CHOOSEROWS(hserial1,INT(posMaxValue/COUNT(hserial1) 1)),secondHH,INDEX(hserial1,MOD(posMaxValue,COUNT(hserial1))),HSTACK(firstHH,secondHH,MAX(m)))

CodePudding user response:

You can use this formula:

=LET(
hserial,UNIQUE(data[hserial]),
hhcoords,UNIQUE(data[hhcoords]),

hserial1,hserial,
hserial2,TRANSPOSE(hserial),

getXλ,LAMBDA(h,LEFT(INDEX(hhcoords,MATCH(h,hserial,0)),3)),
getYλ,LAMBDA(h,MID(INDEX(hhcoords,MATCH(h,hserial,0)),4,3)),
m,SQRT((getXλ(hserial1)-getXλ(hserial2))^2 (getYλ(hserial1)-getYλ(hserial2))^2),

posMaxValue,MATCH(MAX(m),TOCOL(m),0),
firstHH, CHOOSEROWS(hserial1,INT(posMaxValue/COUNT(hserial1) 1)),
secondHH,INDEX(hserial1,MOD(posMaxValue,COUNT(hserial1))),

HSTACK(firstHH,secondHH,MAX(m)))

Maybe you don't have HSTACK - then you would have to use three formulas.

enter image description here

CodePudding user response:

Assuming the DATA is located at [B2:D12].
The serial and coordinates of the household with the most people are: 101351 and 228562.

-. To obtain the shortest distance enter this formula in [I7](Excel 2021):

 = LET(
 lSerial, $B$3:$B$12, lCoords, $C$3:$C$12, lPers, $D$3:$D$12,
 lMaxPpl, XLOOKUP( MAX(lPers), lPers, lCoords ),
 MIN( IF( lCoords = lMaxPpl, ""N/A"",
 SQRT( ( LEFT(lMaxPpl,3) -LEFT(lCoords,3) )^2   ( RIGHT(lMaxPpl,3) -RIGHT(lCoords,3) )^2 ) ) ) )

enter image description here

-. To obtain the serial and coordinates enter this formula in [G7] (Excel 2021):

= LET( lData, $B$3:$D$12,
 lSerial, INDEX( lData, 0, 1 ), lCoords, INDEX( lData, 0, 2 ), lPers, INDEX( lData, 0, 3 ),
 lMaxPpl, XLOOKUP( MAX(lPers), lPers, lCoords ),
 lDist, IF( lCoords = lMaxPpl, ""N/A"",
 SQRT( ( LEFT(lMaxPpl,3) -LEFT(lCoords,3) )^2   ( RIGHT(lMaxPpl,3) -RIGHT(lCoords,3) )^2 ) ),
 lRow, MATCH( MIN(lDist), lDist, 0 ),
 INDEX( lData, lRow, 1 ): INDEX( lData, lRow, 2 ) )

enter image description here

-. To obtain the serial, coordinates & distance enter this formula in [G8] (Excel 365):

= LET( lData, $B$3:$D$12,
 lSerial, INDEX( lData, 0, 1 ), lCoords, INDEX( lData, 0, 2 ), lPers, INDEX( lData, 0, 3 ),
 lMaxPpl, XLOOKUP( MAX(lPers), lPers, lCoords ),
 lDist, IF( lCoords = lMaxPpl, ""N/A"",
 SQRT( ( LEFT(lMaxPpl,3) -LEFT(lCoords,3) )^2   ( RIGHT(lMaxPpl,3) -RIGHT(lCoords,3) )^2 ) ),
 CHOOSEROWS( HSTACK( DROP( lData, , -1 ), lDist ),
 MATCH( MIN(lDist), lDist, 0 ) ) )

enter image description here

Evaluation of the LET function:

lData, $B$3:$D$12: Sets the range to work with as lData, having the entire data range as one name makes easier the maintenance of the formula, i.e. if the range size changes only this name needs to be adjusted. However, I would suggest the use of Excel Tables as formulas are automatically updated when the size of the table changes.

lSerial, INDEX( lData, 0, 1 ), lCoords, INDEX( lData, 0, 2 ) and lPers, INDEX( lData, 0, 3 ) the columns to be used in the calculations.

lMaxPpl, XLOOKUP( MAX(lPers), lPers, lCoords ): Gives the coordinates of the household with most people.

lDist, IF( lCoords = lMaxPpl, ""N/A"", SQRT( ( LEFT(lMaxPpl,3) -LEFT(lCoords,3) )^2 ( RIGHT(lMaxPpl,3) -RIGHT(lCoords,3) )^2 ) ): Calculates the distances of all other households from the one with most people.

lRow, MATCH( MIN(lDist), lDist, 0 ): Gives the row of the shortest distance.

INDEX( lData, lRow, 1 ): INDEX( lData, lRow, 2 ): Use to return the first two columns from the data (not sure if HSTACK or any other alternative is available in Excel 2021).

DROP( lData, , -1 ): Excludes the last column from the data range.

HSTACK( DROP( lData, , -1 ), lDist ): Combines the adjusted data range and the calculated distances as one array.

CHOOSEROWS( HSTACK( DROP( lData, , -1 ), lDist ), MATCH( MIN(lDist), lDist, 0 ) ): Returns the entire row with the shortest distance from the combined array.

For documentation on the functions used see the following pages: CHOOSECOLS function, DROP function, HSTACK function, SQRT function, XLOOKUP function.

  • Related