Project author: kisungyou

Project description:
Dimension Reduction and Estimation Methods
Language: R
Repository: git://github.com/kisungyou/Rdimtools.git
Created: 2017-12-03T00:10:16Z
Project community: https://github.com/kisungyou/Rdimtools

License: Other

Rdimtools


Rdimtools is an R package for dimension reduction (DR), including
feature selection and manifold learning, and for intrinsic dimension
estimation (IDE). We aim to build one of the most comprehensive
toolboxes available online; the current version delivers
145 DR algorithms and 17 IDE methods.

The philosophy is simple: the more we have at hand, the better we can
play.


Our logo reflects the fundamental nature of multivariate data
analysis: we may be like the blind men examining an elephant, trying to
grasp what the data looks like from the partial information that
each algorithm provides.

Installation

You can install a release version from CRAN:

    install.packages("Rdimtools")

or the development version from GitHub:

    ## install.packages("devtools")
    devtools::install_github("kisungyou/Rdimtools")

Minimal Example : Dimension Reduction

Here is an example of dimension reduction on the famous iris dataset.
We compare Principal Component Analysis (do.pca), Laplacian Score
(do.lscore), and Diffusion Maps (do.dm), which represent the families of
linear reduction, feature selection, and nonlinear reduction algorithms,
respectively.

    # load the library
    library(Rdimtools)

    # load the data
    X = as.matrix(iris[,1:4])
    lab = as.factor(iris[,5])

    # run 3 algorithms mentioned above
    mypca = do.pca(X, ndim=2)
    mylap = do.lscore(X, ndim=2)
    mydfm = do.dm(X, ndim=2, bandwidth=10)

    # visualize
    par(mfrow=c(1,3))
    plot(mypca$Y, pch=19, col=lab, xlab="axis 1", ylab="axis 2", main="PCA")
    plot(mylap$Y, pch=19, col=lab, xlab="axis 1", ylab="axis 2", main="Laplacian Score")
    plot(mydfm$Y, pch=19, col=lab, xlab="axis 1", ylab="axis 2", main="Diffusion Maps")
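As a point of reference for the PCA panel, the following is a minimal sketch of a two-dimensional PCA embedding of the same data using only base R's prcomp. It is not how do.pca is implemented; whether the two results agree (up to the sign of each axis) depends on the preprocessing options of do.pca, which are not described here. The names Y_base and var_explained are chosen for illustration only.

```r
# Base-R sketch of a 2-dimensional PCA embedding of iris.
# Assumption: data are centered but not scaled before PCA.
X <- as.matrix(iris[, 1:4])
lab <- as.factor(iris[, 5])

pc <- prcomp(X, center = TRUE, scale. = FALSE)

# scores on the first two principal components (150 x 2 matrix)
Y_base <- pc$x[, 1:2]

# proportion of total variance captured by each component
var_explained <- pc$sdev^2 / sum(pc$sdev^2)

plot(Y_base, pch = 19, col = lab,
     xlab = "axis 1", ylab = "axis 2", main = "PCA (base R prcomp)")
```

Inspecting var_explained shows how much of the variance the two plotted axes retain, which is a useful check before comparing the embedding against nonlinear methods.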

Minimal Example : Dimension Estimation

The Swiss Roll is a classic example of a 2-dimensional manifold embedded in
$\mathbb{R}^3$ and one of 11 famous model-based samples available from
the aux.gensamples() function. Given the ground truth that $d=2$, let’s
apply several methods for intrinsic dimension estimation.

    # generate sample data
    set.seed(100)
    roll = aux.gensamples(dname="swiss")

    # we will compare 5 methods (out of 17 methods from version 1.0.0)
    vecd = rep(0,5)
    vecd[1] = est.Ustat(roll)$estdim       # convergence rate of U-statistic on manifold
    vecd[2] = est.correlation(roll)$estdim # correlation dimension
    vecd[3] = est.made(roll)$estdim        # manifold-adaptive dimension estimation
    vecd[4] = est.mle1(roll)$estdim        # MLE with Poisson process
    vecd[5] = est.twonn(roll)$estdim       # minimal neighborhood information

    # let's visualize
    plot(1:5, vecd, type="b", ylim=c(1.5,2.5),
         main="true dimension is d=2",
         xaxt="n", xlab="", ylab="estimated dimension")
    xtick = seq(1,5,by=1)
    axis(side=1, at=xtick, labels = FALSE)
    text(x=xtick, par("usr")[3],
         labels = c("Ustat","correlation","made","mle1","twonn"), pos=1, xpd = TRUE)

We can observe that all 5 methods we tested estimate the intrinsic
dimension to be around $d=2$. Note that the estimated dimension
need not be integer-valued, owing to the characteristics of each method.
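To illustrate why such estimates come out as real numbers, here is a rough base-R sketch of the maximum-likelihood form of the two-nearest-neighbor idea: for each point, the ratio of the distances to its second and first nearest neighbors is treated as a Pareto variable whose shape parameter is the dimension. This is not the code behind est.twonn, and the function twonn_mle and the Swiss Roll generator below are hypothetical helpers written for this example rather than part of Rdimtools.

```r
# Hypothetical helper (not part of Rdimtools): maximum-likelihood form of
# the two-nearest-neighbor idea for intrinsic dimension estimation.
twonn_mle <- function(X) {
  D <- as.matrix(dist(X))   # pairwise Euclidean distances
  diag(D) <- Inf            # exclude each point's distance to itself
  # distances to the first and second nearest neighbors of every point
  r <- t(apply(D, 1, function(row) sort(row)[1:2]))
  mu <- r[, 2] / r[, 1]     # ratio of second to first neighbor distance
  nrow(X) / sum(log(mu))    # MLE of the Pareto shape, i.e. the dimension
}

# Assumed Swiss Roll parametrization for illustration only;
# aux.gensamples() may generate its sample differently.
set.seed(100)
n <- 1000
theta <- 1.5 * pi * (1 + 2 * runif(n))   # angle along the roll
height <- 21 * runif(n)                  # position along the roll's axis
roll_demo <- cbind(theta * cos(theta), height, theta * sin(theta))

d_hat <- twonn_mle(roll_demo)
print(d_hat)
```

Because the estimate is a ratio of a sample size to a sum of logarithms, nothing constrains it to be an integer; rounding to the nearest integer is a separate decision left to the user.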

Acknowledgements

The logo icon is made by Freepik from www.flaticon.com. The rotating
Swiss Roll image is taken from Dinoj Surendran’s website.