Why is the Canberra distance doubled in R?

Time:01-31

df <- data.frame(name = c('A', 'B', 'C'),
                 value = c(6, 3, 4))

dist <- dist(df, method = 'canberra')

dist

          1         2
2 0.6666667          
3 0.4000000 0.2857143

Shouldn't the results be:

          1         2
2 0.3333334          
3 0.2000000 0.1428571

Because |6 - 3| / (6 + 3) = 1/3?

CodePudding user response:

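We need to exclude the "name" column. dist() converts the data frame with as.matrix(), so the character column ends up as NA. As documented in ?dist, columns with missing values are excluded and the sum is scaled up in proportion to the number of columns used. Here 2 columns are supplied and only 1 is used, so every distance is multiplied by 2. With only the numeric column: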

dist(df[, 2], method = 'canberra')
#           1         2
# 2 0.3333333          
# 3 0.2000000 0.1428571
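
To see where the factor of two comes from, here is a minimal sketch that redoes the coercion by hand and compares the two results. It assumes the same df as in the question; the names m, x, d_with_na and d_value are only illustrative, and the explicit as.numeric() step is my approximation of what dist() does internally, not a copy of its source.

```r
df <- data.frame(name = c("A", "B", "C"),
                 value = c(6, 3, 4))

# as.matrix() on a mixed data frame yields a character matrix; forcing it to
# numeric turns the "name" column into NA (R warns about the coercion).
m <- as.matrix(df)
x <- suppressWarnings(matrix(as.numeric(m), nrow = nrow(m),
                             dimnames = dimnames(m)))

# One of the two columns is all NA, so dist() excludes it and rescales the
# sum by (columns supplied) / (columns used) = 2 / 1.
d_with_na <- dist(x, method = "canberra")
d_value   <- dist(df["value"], method = "canberra")

# Compare the NA-affected distances with twice the clean ones.
all.equal(as.vector(d_with_na), 2 * as.vector(d_value))
```

Selecting the column as df["value"] keeps a one-column data frame, which behaves the same as df[, 2] here.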

CodePudding user response:

Did you mean to set the row names instead of a dataframe column?

df <- data.frame(
  row.names = c("A", "B", "C"),
  value = c(6, 3, 4)
)

dist(df, method = 'canberra')

Appears to give the result you want.
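
As a quick check, the sketch below compares that output with the formula from the question, |x - y| / (|x| + |y|), computed by hand. The canberra() helper and the by_hand vector are hypothetical names introduced only for this illustration.

```r
df <- data.frame(
  row.names = c("A", "B", "C"),
  value = c(6, 3, 4)
)
d <- dist(df, method = "canberra")

# Hypothetical helper: Canberra distance for a single numeric column.
canberra <- function(x, y) abs(x - y) / (abs(x) + abs(y))

# dist() stores the lower triangle column by column: A-B, A-C, B-C.
by_hand <- c(canberra(6, 3), canberra(6, 4), canberra(3, 4))

# Compare dist() with the hand computation, then show the labels it kept.
all.equal(as.vector(d), by_hand)
attr(d, "Labels")
```

With row names set, the distance matrix is labelled by A, B, C instead of 1, 2, 3, and no non-numeric column takes part in the calculation.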
