White Box Cluster Algorithm Design
This is a framework for representative-based, component-based cluster algorithm design. Representative-based cluster algorithms are one of the largest classes of clustering algorithms (K-Means is a typical example), and they can be decomposed into RCs (Reusable Components). Identified RCs are:
Each RC has several implementations, which appeared as representative-based clustering algorithms evolved. Each improvement tried to solve a major issue of the K-Means algorithm by improving a single RC while leaving the others unchanged. This R package aims to implement all of those improvements in such a way that one can extend it by implementing a new solution for a specific RC and use it together with the already developed solutions for the other RCs.
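To make the decomposition described above concrete, here is a minimal base R sketch of a representative-based algorithm whose steps are passed in as functions. All names (`cluster_sketch`, `normalize_z`, `init_random`, `assign_euclidean`, `update_mean`, `update_median`) are hypothetical and chosen for illustration only; this is not the whiboclustering API or its internal design.

```r
# Hypothetical sketch, NOT the whiboclustering implementation.
# Each reusable component is an ordinary function with a fixed signature.

# Normalization: z-score every column.
normalize_z <- function(x) scale(x)

# Initialization: pick k random rows as the first representatives.
init_random <- function(x, k) x[sample(nrow(x), k), , drop = FALSE]

# Distance + assignment: squared Euclidean distance, nearest representative.
assign_euclidean <- function(x, centers) {
  d <- vapply(seq_len(nrow(centers)), function(j) {
    rowSums(sweep(x, 2, centers[j, ])^2)
  }, numeric(nrow(x)))
  max.col(-d, ties.method = "first")  # index of the smallest distance per row
}

# Update: representative = column means of the cluster members.
update_mean <- function(x, labels, centers) {
  for (j in seq_len(nrow(centers))) {
    members <- x[labels == j, , drop = FALSE]
    if (nrow(members) > 0) centers[j, ] <- colMeans(members)
  }
  centers
}

# Alternative update: column medians; all other components stay unchanged.
update_median <- function(x, labels, centers) {
  for (j in seq_len(nrow(centers))) {
    members <- x[labels == j, , drop = FALSE]
    if (nrow(members) > 0) centers[j, ] <- apply(members, 2, median)
  }
  centers
}

# Generic loop: works with any functions that follow the signatures above.
cluster_sketch <- function(x, k, normalize = normalize_z, init = init_random,
                           assign = assign_euclidean, update = update_mean,
                           max_iter = 20) {
  x <- as.matrix(normalize(as.matrix(x)))
  centers <- init(x, k)
  labels <- assign(x, centers)
  for (i in seq_len(max_iter)) {
    centers <- update(x, labels, centers)
    new_labels <- assign(x, centers)
    if (all(new_labels == labels)) break  # assignments are stable, stop
    labels <- new_labels
  }
  list(centers = centers, assignment = labels)
}

set.seed(1)
fit_mean   <- cluster_sketch(iris[, 1:4], k = 3)
fit_median <- cluster_sketch(iris[, 1:4], k = 3, update = update_median)
table(fit_mean$assignment)
```

Swapping `update_mean` for `update_median` changes only one component, which is the kind of extension the framework is meant to support.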
Hopefully, you will find here all the instructions needed for running the whiboclustering R package and for developing improvements to it.
Installation of the R package should be as easy as for any other package.
install.packages("whiboclustering")
After that you can use it by running the command:
library("whiboclustering")
Once you have installed and loaded the library, you can use it for your clustering experiments.
Simple example:
data <- iris[, 1:4]  # Select only the numeric columns
model <- whiboclustering(data = data, k = 3)
This step creates a WhiBo Cluster model, which holds information about the clustering. You can get brief information about the cluster model by running:
print(model)
Here you can find the implemented solutions for each RC. Formulas and more detailed explanations will be available on the GitHub page (soon to be). If you know of any other RC solution that is available but not implemented, please contact us.
Normalization:
Cluster Initialization:
Measuring Distance and Cluster Assignment:
Update (Recalculate) Cluster Representatives:
Cluster evaluation:
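As an illustration of what a cluster evaluation component can look like, the sketch below computes the total within-cluster sum of squares in base R. The helper `within_ss` is hypothetical, and the choice of this measure is an assumption; the source does not say which evaluation measures whiboclustering implements. The clustering itself comes from base R's `stats::kmeans`, used here only as a reference.

```r
# Hypothetical evaluation component, NOT part of whiboclustering.
# Total within-cluster sum of squares: the sum of squared differences
# between every observation and the representative of its cluster.
within_ss <- function(x, centers, assignment) {
  x <- as.matrix(x)
  sum((x - centers[assignment, , drop = FALSE])^2)
}

set.seed(42)
data <- iris[, 1:4]

# Reference clustering from base R, used only to have centers and labels.
km <- stats::kmeans(data, centers = 3, nstart = 10)

wss <- within_ss(data, km$centers, km$cluster)

# Compare against the value kmeans reports for the same measure.
all.equal(wss, km$tot.withinss)
```

Because the function only needs the data, the representatives, and the assignment vector, the same signature could serve other evaluation measures as well.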
This part is explained in more detail on the GitHub page (soon to be), or you can contact us.
We had help from other members: