Urgent! R k-means clustering: optimizing the choice of initial cluster centers

Time:09-15

I am looking for help optimizing an existing program, implemented in R. The specific requirements are as follows:
Modify lines 34-41, i.e. the part of the mykmeans function that determines the initial cluster centers, replacing the current approach (random selection) with a defined strategy, so that the clustering results for matrix m are more reliable.
1. Design the strategy yourself, and explain the method in comments in the program.
2. "Reliable" is judged against the clustering results of R's built-in kmeans function.
rm(list = ls())
# function computing the Euclidean distance between two points
dis = function(c1, c2) {
  sqrt(sum((c1 - c2)^2))
}
# given the current centers, compute the new centers cennew and the class labels
calccenlab = function(m, cen) {
  nrow = nrow(m)
  ncol = ncol(m)
  labels = c()
  k = nrow(cen)
  for (i in 1:nrow) {
    point = m[i, ]
    short = c()
    for (j in 1:k) {
      short[j] = dis(point, cen[j, ])
    }
    labels[i] = which.min(short)
  }
  # calc the new center
  cennew = matrix(0, k, ncol)
  for (i in 1:k) {
    if (is.vector(m[labels == i, ])) { # only 1 row
      cennew[i, ] = m[labels == i, ]
    } else { # more than one row
      cennew[i, ] = apply(m[labels == i, ], 2, mean)
    }
  }
  list(cennew, labels)
}
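
The assignment and update step above can also be written without the single-row special case. The following is only a sketch, not the original poster's code: `assign_and_update` is a hypothetical name, and it assumes `m` is a numeric matrix with one point per row and `cen` a matrix with one center per row. Indexing with `drop = FALSE` keeps a one-row selection as a matrix, so `colMeans` works for clusters of any size.

```r
# Hypothetical helper (not in the original post): one assignment + update step.
assign_and_update <- function(m, cen) {
  k <- nrow(cen)
  # n x k matrix of squared Euclidean distances from every point to every center
  d2 <- sapply(seq_len(k), function(j) rowSums(sweep(m, 2, cen[j, ])^2))
  d2 <- matrix(d2, nrow = nrow(m))  # keep the matrix shape even for a single point
  labels <- apply(d2, 1, which.min) # index of the nearest center for each row
  cennew <- cen
  for (i in seq_len(k)) {
    members <- m[labels == i, , drop = FALSE]  # stays a matrix even with one row
    # an empty cluster keeps its previous center (an assumption of this sketch)
    if (nrow(members) > 0) cennew[i, ] <- colMeans(members)
  }
  list(cennew, labels)
}

# the five points from the post, with rows 1, 3 and 5 as starting centers
m <- rbind(c(0.8, 0.9), c(1, 1), c(3, 3), c(2.9, 3.1), c(3.5, 3.4))
res <- assign_and_update(m, m[c(1, 3, 5), ])
print(res[[2]])
print(res[[1]])
```

The return value has the same shape as in the post (a list of new centers and labels), so it could stand in for the loop-based version if that is preferred.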

# mykmeans function
mykmeans = function(m, k) {
  # find k suitable cluster centers
  sel = sample(1:nrow(m), k)
  print(sel)
  center = m[sel, ]
  tmp = calccenlab(m, center)
  center = tmp[[1]]
  labels = tmp[[2]]
  # iterative relocation
  while (1) {
    rst = calccenlab(m, center)
    if (sum(rst[[2]] == labels) == nrow(m)) {
      break
    }
    center = rst[[1]]
    labels = rst[[2]]
  }
  rst
}
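
One possible replacement for the random `sample` call is a deterministic farthest-point ("maximin") rule. It is offered here as a sketch under stated assumptions, not as the required answer. The function name `init_centers_maximin` is hypothetical, and the choice of the first center (the point closest to the overall mean) is an assumption of this sketch. Each further center is the point farthest from all centers chosen so far, which tends to spread the starting centers across the data.

```r
# Hypothetical helper (not in the original post): deterministic
# farthest-point ("maximin") choice of k initial centers.
# Returns the row indices of m to use as starting centers.
init_centers_maximin <- function(m, k) {
  stopifnot(k >= 1, k <= nrow(m))
  # first center: the point closest to the overall mean of the data
  d_mean <- rowSums(sweep(m, 2, colMeans(m))^2)
  sel <- which.min(d_mean)
  # squared distance from every point to its nearest chosen center so far
  d_near <- rowSums(sweep(m, 2, m[sel[1], ])^2)
  while (length(sel) < k) {
    nxt <- which.max(d_near)  # the point farthest from all chosen centers
    sel <- c(sel, nxt)
    d_near <- pmin(d_near, rowSums(sweep(m, 2, m[nxt, ])^2))
  }
  unname(sel)
}

# the five points from the post
m <- rbind(p1 = c(0.8, 0.9), p2 = c(1, 1), p3 = c(3, 3),
           p4 = c(2.9, 3.1), p5 = c(3.5, 3.4))
sel <- init_centers_maximin(m, 3)
print(sel)  # 3 1 5
```

In mykmeans, the random selection line would then become `sel = init_centers_maximin(m, k)`. Because the rule contains no randomness, repeated runs on the same matrix start from the same centers.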

# construct the data m
p1 = c(0.8, 0.9)
p2 = c(1, 1)
p3 = c(3, 3)
p4 = c(2.9, 3.1)
p5 = c(3.5, 3.4)
m = rbind(p1, p2, p3, p4, p5)
# clustering
rst = mykmeans(m, 3)
print(rst[[2]])
print(rst[[1]])
# plot
plot(m[, 1], m[, 2], pch = 19, col = factor(rst[[2]]))
text(m[, 1] + 0.07, m[, 2] + 0.07, rownames(m), font = 2)
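
Since the requirement takes the built-in kmeans as the standard, a comparison along the following lines may help. This is a sketch only. `same_partition` is a hypothetical helper, and `my_labels` is a stand-in for the label vector returned by mykmeans, not an actual result. Cluster numbers are arbitrary, so the check compares the groupings rather than the raw label values.

```r
# Hypothetical helper (not in the original post): TRUE when two label vectors
# describe the same grouping, regardless of how the clusters are numbered.
same_partition <- function(a, b) {
  tab <- table(a, b)
  # identical groupings give exactly one non-zero cell in every row and column
  all(rowSums(tab > 0) == 1) && all(colSums(tab > 0) == 1)
}

# the five points from the post
m <- rbind(p1 = c(0.8, 0.9), p2 = c(1, 1), p3 = c(3, 3),
           p4 = c(2.9, 3.1), p5 = c(3.5, 3.4))

set.seed(1)                                  # make the reference run repeatable
ref <- kmeans(m, centers = 3, nstart = 20)   # built-in k-means, best of 20 starts

my_labels <- c(1, 1, 2, 2, 3)  # stand-in for the labels returned by mykmeans(m, 3)
print(same_partition(my_labels, ref$cluster))
print(ref$tot.withinss)
```

Using `nstart` greater than 1 makes the built-in result less dependent on its own random start, which matters when it serves as the reference.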