Project author: dppalomar

Project description: Covariance Matrix Estimation via Factor Models
Language: R
Repository: git://github.com/dppalomar/covFactorModel.git
Created: 2018-05-17T05:08:45Z
Project page: https://github.com/dppalomar/covFactorModel

License: GNU General Public License v3.0

covFactorModel

Estimation of covariance matrix via factor models with application
to financial data. Factor models decompose the asset returns into an
exposure term to some factors and a residual idiosyncratic component. The
resulting covariance matrix contains a low-rank term corresponding to the
factors and another full-rank term corresponding to the residual component.
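
In symbols, the decomposition described above can be written as follows. The notation is an assumption, chosen to match the `alpha`, `beta`, `factors`, and `residual` outputs used later in this README; it also assumes that the factors and the residuals are uncorrelated.

```latex
\begin{aligned}
\mathbf{x}_t &= \boldsymbol{\alpha} + \mathbf{B}\,\mathbf{f}_t + \boldsymbol{\epsilon}_t,
  \qquad t = 1, \dots, T, \\
\boldsymbol{\Sigma} &= \mathbf{B}\,\boldsymbol{\Sigma}_f\,\mathbf{B}^{\mathsf{T}} + \boldsymbol{\Psi}
\end{aligned}
```

Here x_t is the N-vector of asset returns at time t, B is the N x K matrix of factor exposures (`beta`), f_t holds the K factors, and epsilon_t is the residual. The term B Sigma_f B^T has rank at most K, and Psi is the full-rank residual covariance.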

This package provides functions to separate the data into a factor
component and a residual component, and to estimate the corresponding
covariance matrix. Different kinds of factor models are considered, namely
macroeconomic, BARRA industry, and statistical factor models. The estimation
of the covariance matrix accepts different kinds of structure on the
residual term, such as a diagonal structure (implying that the residual
component is uncorrelated across assets) and a block diagonal structure
(allowing correlation within sectors). The package includes a built-in
database containing stock symbols and their sectors.
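
The two residual structures can be illustrated in base R. This is only a sketch: `resid_cov` stands in for an estimated residual covariance and `sector_of_asset` for sector labels, and both names are hypothetical rather than part of the package.

```r
# Base-R sketch of the diagonal and block-diagonal residual structures.
# Assumption: resid_cov and sector_of_asset are illustrative stand-ins.
set.seed(5)
resid_mat <- matrix(rnorm(100 * 4), 100, 4)   # fake residuals, 4 assets
resid_cov <- cov(resid_mat)
sector_of_asset <- c(1, 1, 2, 2)              # assets 1-2 and 3-4 share a sector

# Diagonal structure: residuals treated as uncorrelated across assets.
psi_diag <- diag(diag(resid_cov))

# Block-diagonal structure: keep covariances only within the same sector.
same_sector <- outer(sector_of_asset, sector_of_asset, "==")
psi_block <- resid_cov * same_sector
```

Both structures keep the residual variances on the diagonal; they differ only in which off-diagonal entries are forced to zero.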

The package is based on the book:
R. S. Tsay, Analysis of Financial Time Series. John Wiley & Sons, 2005.

Installation

    # Installation from CRAN (not available yet)
    # install.packages("covFactorModel")
    # Installation from GitHub
    # install.packages("devtools")
    devtools::install_github("dppalomar/covFactorModel")
    # Getting help
    library(covFactorModel)
    help(package = "covFactorModel")
    package?covFactorModel
    ?factorModel
    ?covFactorModel
    ?getSectorInfo

Vignette

For more detailed information, please check the vignette: GitHub-html-vignette and GitHub-pdf-vignette.

Usage of factorModel()

The function factorModel() builds a factor model for the data, i.e., it decomposes the asset returns into a factor component and a residual component. The user can choose different types of factor models, namely, macroeconomic, BARRA, or statistical. We start by generating some synthetic data:

    library(covFactorModel)
    library(xts)
    library(MASS)
    # generate synthetic data
    set.seed(234)
    N <- 3 # number of stocks
    T <- 5 # number of samples
    mu <- rep(0, N)
    Sigma <- diag(N)/1000
    # generate asset returns TxN data matrix
    X <- xts(mvrnorm(T, mu, Sigma), order.by = as.Date('2017-04-15') + 1:T)
    colnames(X) <- c("A", "B", "C")
    # generate K=2 macroeconomic factors
    econ_fact <- xts(mvrnorm(T, c(0, 0), diag(2)/1000), order.by = index(X))
    colnames(econ_fact) <- c("factor1", "factor2")

We first build a macroeconomic factor model, which fits the data to the given macroeconomic factors:

    macro_econ_model <- factorModel(X, type = "Macro", econ_fact = econ_fact)
    # sanity check
    X_ <- with(macro_econ_model,
               matrix(alpha, T, N, byrow = TRUE) + factors %*% t(beta) + residual)
    norm(X - X_, "F")
    #> [1] 2.091133e-18
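
To make "fits the data to the given factors" concrete, here is a base-R sketch that regresses each asset on an intercept and the factors by ordinary least squares. Using plain OLS is an assumption; factorModel() may differ in its details, and all variable names below are illustrative.

```r
# Base-R sketch of a macroeconomic factor model fitted by least squares.
# Assumption: plain OLS per asset; this is not the package's own code.
set.seed(1)
n_obs <- 50; n_assets <- 3; n_factors <- 2
fac <- matrix(rnorm(n_obs * n_factors, sd = 0.03), n_obs, n_factors)
true_beta <- matrix(c(1, 0.5, -0.3, 0.2, 0.8, 1.1), n_assets, n_factors)
ret <- fac %*% t(true_beta) +
  matrix(rnorm(n_obs * n_assets, sd = 0.01), n_obs, n_assets)

# Regress every asset on an intercept plus the factors in one solve.
design <- cbind(1, fac)
coefs <- solve(crossprod(design), crossprod(design, ret))  # (K+1) x N
alpha_hat <- coefs[1, ]                      # intercepts, length N
beta_hat <- t(coefs[-1, , drop = FALSE])     # loadings, N x K
resid_hat <- ret - design %*% coefs          # T x N residuals

# Same sanity check as above: the three parts add back up to the data.
ret_rebuilt <- matrix(alpha_hat, n_obs, n_assets, byrow = TRUE) +
  fac %*% t(beta_hat) + resid_hat
recon_err <- norm(ret - ret_rebuilt, "F")
```

The reconstruction error is zero up to floating-point rounding, since the residual is defined as whatever the fitted intercept and factor terms leave over.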

Next, we build a BARRA industry factor model (assuming assets A and C belong to sector 1 and asset B to sector 2):

    stock_sector_info <- c(1, 2, 1)
    barra_model <- factorModel(X, type = "Barra", stock_sector_info = stock_sector_info)
    # sanity check
    X_ <- with(barra_model,
               matrix(alpha, T, N, byrow = TRUE) + factors %*% t(beta) + residual)
    norm(X - X_, "F")
    #> [1] 1.45461e-18
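
The idea behind an industry factor model can be sketched in base R. The sketch assumes that the loadings are 0/1 sector indicators and that the factors are obtained by a cross-sectional least-squares fit at each date, with no intercept. The package's actual estimator may be more refined, and the names below are illustrative.

```r
# Base-R sketch of a BARRA-style industry factor model.
# Assumption: indicator loadings + cross-sectional OLS, no intercept term.
set.seed(2)
n_obs <- 5
ret <- matrix(rnorm(n_obs * 3, sd = 0.03), n_obs, 3,
              dimnames = list(NULL, c("A", "B", "C")))
sector_of_asset <- c(1, 2, 1)   # A and C in sector 1, B in sector 2

# N x K indicator matrix: beta_ind[i, k] = 1 if asset i is in sector k.
beta_ind <- outer(sector_of_asset, sort(unique(sector_of_asset)), "==") * 1

# Cross-sectional OLS at every date: f_t = (B'B)^{-1} B' x_t.
fac_hat <- ret %*% beta_ind %*% solve(crossprod(beta_ind))   # T x K
resid_hat <- ret - fac_hat %*% t(beta_ind)

# With indicator loadings, each factor is just the sector's average return.
sector1_avg <- rowMeans(ret[, c("A", "C")])
```

Under these assumptions the first factor equals the average return of A and C, and the second equals the return of B, so the residual of B is zero.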

Finally, we build a statistical factor model, which is based on principal component analysis (PCA):

    # set factor dimension as K=2
    stat_model <- factorModel(X, K = 2)
    # sanity check
    X_ <- with(stat_model,
               matrix(alpha, T, N, byrow = TRUE) + factors %*% t(beta) + residual)
    norm(X - X_, "F")
    #> [1] 1.414126e-17
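
For intuition, a one-shot PCA decomposition can be written in a few lines of base R. This sketch assumes the loadings are the top-K eigenvectors of the sample covariance matrix; it is an illustration rather than the code inside factorModel(), which may refine the estimate further.

```r
# Base-R sketch of a one-shot PCA statistical factor model.
# Assumption: loadings = top-K eigenvectors of the sample covariance.
set.seed(3)
n_obs <- 200; n_assets <- 6; n_factors <- 2
ret <- matrix(rnorm(n_obs * n_assets), n_obs, n_assets) %*%
  matrix(runif(n_assets^2), n_assets, n_assets)

alpha_hat <- colMeans(ret)
ret_centered <- sweep(ret, 2, alpha_hat)                # remove the mean
eig <- eigen(cov(ret), symmetric = TRUE)
beta_hat <- eig$vectors[, 1:n_factors, drop = FALSE]    # N x K loadings
fac_hat <- ret_centered %*% beta_hat                    # T x K factor scores
resid_hat <- ret_centered - fac_hat %*% t(beta_hat)     # part PCA leaves out

# The decomposition is exact by construction.
ret_rebuilt <- matrix(alpha_hat, n_obs, n_assets, byrow = TRUE) +
  fac_hat %*% t(beta_hat) + resid_hat
recon_err <- norm(ret - ret_rebuilt, "F")
```

In this construction the residual is orthogonal to the loadings, and the variances of the factor scores equal the K largest eigenvalues of the sample covariance.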

Usage of covFactorModel()

The function covFactorModel() estimates the covariance matrix of the data based on factor models. The user can choose not only the type of factor model (i.e., macroeconomic, BARRA, or statistical) but also the structure of the residual covariance matrix (i.e., scaled identity, diagonal, block diagonal, and full).
We start by preparing some synthetic data:

    library(covFactorModel)
    library(xts)
    library(MASS)
    # generate synthetic data
    set.seed(234)
    K <- 1 # number of factors
    N <- 400 # number of stocks
    mu <- rep(0, N)
    beta <- mvrnorm(N, rep(1, K), diag(K)/10)
    Sigma <- beta %*% t(beta) + diag(N)
    print(eigen(Sigma)$values[1:10])
    #>  [1] 438.757   1.000   1.000   1.000   1.000   1.000   1.000   1.000
    #>  [9]   1.000   1.000

Then, we simply call covFactorModel(), which by default uses a statistical factor model and a diagonal structure for the residual covariance matrix. We report the estimation error (squared Frobenius norm) for an increasing number of observations T:

    # estimate error by loop
    err_scm_vs_T <- err_statPCA_diag_vs_T <- c()
    index_T <- N*seq(5)
    for (T in index_T) {
      X <- xts(mvrnorm(T, mu, Sigma), order.by = as.Date('1995-03-15') + 1:T)
      # use statistical factor model
      cov_statPCA_diag <- covFactorModel(X, K = K, max_iter = 10)
      err_statPCA_diag_vs_T <- c(err_statPCA_diag_vs_T, norm(Sigma - cov_statPCA_diag, "F")^2)
      # use sample covariance matrix
      err_scm_vs_T <- c(err_scm_vs_T, norm(Sigma - cov(X), "F")^2)
    }
    res <- rbind(err_scm_vs_T, err_statPCA_diag_vs_T)
    rownames(res) <- c("SCM", "stat + diag")
    colnames(res) <- paste0("T/N=", index_T/N)
    print(res)
    #>                 T/N=1    T/N=2    T/N=3    T/N=4    T/N=5
    #> SCM         1378.3156 689.3066 515.7518 322.9559 309.4131
    #> stat + diag  967.7577 478.5742 368.6348 221.7183 215.2621
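
To show where the structure in the estimate comes from, the following base-R sketch assembles a covariance estimate as a low-rank factor term plus a diagonal residual term. It assumes one-shot PCA loadings with no iterative refinement, so it only approximates what covFactorModel() does, and the variable names are illustrative.

```r
# Base-R sketch: covariance estimate = low-rank factor part + diagonal residual.
# Assumption: one-shot PCA, no iterations; not the package's own code.
set.seed(4)
n_assets <- 20; n_obs <- 60; n_factors <- 1
true_beta <- matrix(rnorm(n_assets, mean = 1, sd = 0.3), n_assets, n_factors)
true_cov <- true_beta %*% t(true_beta) + diag(n_assets)
ret <- matrix(rnorm(n_obs * n_assets), n_obs, n_assets) %*% chol(true_cov)

sample_cov <- cov(ret)
eig <- eigen(sample_cov, symmetric = TRUE)
top <- seq_len(n_factors)
# Low-rank term: top-K eigenvectors scaled by their eigenvalues.
low_rank <- eig$vectors[, top, drop = FALSE] %*%
  diag(eig$values[top], n_factors) %*%
  t(eig$vectors[, top, drop = FALSE])
# Diagonal residual term: keep only the variances the factors do not explain.
resid_diag <- diag(diag(sample_cov - low_rank))
cov_hat <- low_rank + resid_diag

# Squared Frobenius errors, comparable to the table above.
err_scm <- norm(true_cov - sample_cov, "F")^2
err_factor <- norm(true_cov - cov_hat, "F")^2
```

By construction, `cov_hat` keeps the sample variances on its diagonal and replaces the noisy off-diagonal entries with those implied by the K factors. Comparing `err_factor` with `err_scm` over several seeds and sample sizes gives a small-scale version of the experiment above.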

Usage of getSectorInfo()

The function getSectorInfo() provides sector information for a given set of stock symbols. The usage is rather simple:

    library(covFactorModel)
    mystocks <- c("AAPL", "ABBV", "AET", "AMD", "APD", "AA", "CF", "A", "ADI", "IBM")
    getSectorInfo(mystocks)
    #> $stock_sector_info
    #> AAPL ABBV  AET  AMD  APD   AA   CF    A  ADI  IBM
    #>    1    2    2    1    3    3    3    2    1    1
    #>
    #> $sectors
    #>                        1                        2                        3
    #> "Information Technology"            "Health Care"              "Materials"

The built-in sector database can be overridden by providing a stock-sector pairing:

    my_stock_sector_database <- cbind(mystocks, c(rep("sector1", 3),
                                                  rep("sector2", 4),
                                                  rep("sector3", 3)))
    getSectorInfo(mystocks, my_stock_sector_database)
    #> $stock_sector_info
    #> AAPL ABBV  AET  AMD  APD   AA   CF    A  ADI  IBM
    #>    1    1    1    2    2    2    2    3    3    3
    #>
    #> $sectors
    #>         1         2         3
    #> "sector1" "sector2" "sector3"
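
The mapping from symbols to integer sector indices can be mimicked in base R. The sketch below assumes a two-column (symbol, sector) table like the one above and numbers the sectors in order of first appearance. It reproduces the shape of the output, not the internals of getSectorInfo(), and the variable names are illustrative.

```r
# Base-R sketch of mapping stock symbols to integer sector indices.
# Assumption: sectors are numbered in order of first appearance.
symbols <- c("AAPL", "ABBV", "AET", "AMD", "APD", "AA", "CF", "A", "ADI", "IBM")
pairing <- cbind(symbols, c(rep("sector1", 3),
                            rep("sector2", 4),
                            rep("sector3", 3)))

# Look up each symbol's sector name in the pairing table.
sector_names <- pairing[match(symbols, pairing[, 1]), 2]
sector_levels <- unique(sector_names)                # order of first appearance
sector_index <- match(sector_names, sector_levels)   # integer code per stock
names(sector_index) <- symbols
names(sector_levels) <- seq_along(sector_levels)
```

Here `sector_index` plays the role of `$stock_sector_info` and `sector_levels` the role of `$sectors`. A vector like `sector_index` is also the kind of input passed as `stock_sector_info` in the BARRA example earlier.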

Package: GitHub.

README file: GitHub-readme.

Vignette: GitHub-html-vignette and GitHub-pdf-vignette.