[R] Canberra distance
Frédéric Chiroleu
frederic.chiroleu at cirad.fr
Tue Oct 16 09:47:51 CEST 2007
Hi,
I do not understand the definition of the Canberra distance in R.
On the Internet and in the help pages of dist() from stats and
Dist() from amap, the Canberra distance between vectors x and y, d(x,y), is:
d(x,y) = sum(abs(x-y)/(x+y))
But in practice, on simple examples, we find that the formula is:
d(x,y) = (NZ + 1)/NZ * sum(abs(x-y)/(x+y))
with NZ = number of coordinate pairs that differ from (0,0) (non-zero
pairs).
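A check of this kind can be reproduced along the following lines. This is only a sketch: the vectors x and y are illustrative values chosen here (they are not from the examples mentioned above), and the value returned by dist() may depend on the R version, so no expected result is stated for it.

```r
# Illustrative vectors sharing one (0,0) coordinate pair (assumed values)
x <- c(1, 2, 0, 4)
y <- c(2, 2, 0, 1)

# Documented formula, leaving out the undefined 0/0 term
terms <- abs(x - y) / (x + y)
documented <- sum(terms[is.finite(terms)])

# Number of coordinate pairs that are not (0,0)
NZ <- sum(!(x == 0 & y == 0))

# Value actually returned by dist() from stats
observed <- as.numeric(dist(rbind(x, y), method = "canberra"))

# Compare the observed ratio with the candidate factor (NZ + 1)/NZ
print(c(documented = documented,
        observed   = observed,
        ratio      = observed / documented,
        candidate  = (NZ + 1) / NZ))
```

If the ratio and the candidate factor agree on several such examples, the scaled formula describes what dist() computes for that R version.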
The functions vegdist() from vegan and gdist() from mvpart, like the
documentation of the ADE4 software, use (for positive variables):
d(x,y) = 1/NZ * sum(abs(x-y)/(x+y))
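To make the three formulas easier to compare, here is a minimal sketch that evaluates them side by side in base R. The helper name canberra_variants and the test vectors are hypothetical, introduced only for illustration; the code does not call vegdist(), gdist() or dist(), it just applies the formulas as written above to non-negative vectors.

```r
# Hypothetical helper: the three Canberra variants discussed above,
# for non-negative vectors x and y of equal length
canberra_variants <- function(x, y) {
  keep <- !(x == 0 & y == 0)   # drop (0,0) pairs, whose term would be 0/0
  NZ <- sum(keep)              # number of non-(0,0) coordinate pairs
  s <- sum(abs(x[keep] - y[keep]) / (x[keep] + y[keep]))
  c(plain_sum  = s,                   # sum(abs(x-y)/(x+y))
    scaled_sum = (NZ + 1) / NZ * s,   # (NZ + 1)/NZ * sum(...)
    mean_term  = s / NZ)              # 1/NZ * sum(...)
}

# Illustrative call with assumed values
canberra_variants(c(1, 2, 0, 4), c(2, 2, 0, 1))
```

The first and third variants differ only by the factor 1/NZ, so the third is bounded by 1 for positive variables while the first grows with the number of coordinates.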
Can someone help me understand why these formulas differ, and why the
value computed by dist() does not match its documentation?
Thank you for your help.
Best regards,
Fred
PS: Be careful with the function dudi.pca() from ade4: in the returned
value, "norm" does not contain what the help page says. "norm" returns the
vector of standard deviations of the initial variables when you choose a
"normed" PCA, and the vector of standard deviations of the normed variables,
i.e. 1, when you choose a non-"normed" PCA. We contacted the authors of the
package to have the information corrected, without success.
--
Dr. Frédéric Chiroleu
Biometrician
CIRAD-Systèmes Biologiques (Cirad-Bios)
UMR 53 PVBMT (Peuplements Végétaux et Bio-agresseurs en Milieu Tropical)
Laboratoire d'Ecologie Terrestre et de Lutte Intégrée (LETLI)
Pôle de Protection des Plantes (3P)
7, chemin de l'IRAT
Ligne Paradis
97410 Saint-Pierre
Île de la Réunion - France
Tel.: +262 (0)262 499 230
Switchboard: +262 (0)262 499 200
Fax: +262 (0)262 499 293
Email: frederic.chiroleu at cirad.fr