Project author: easystats

Project description:
Computation and processing of models' parameters
Language: R
Repository: git://github.com/easystats/parameters.git
Created: 2019-02-09T04:01:40Z
Project community: https://github.com/easystats/parameters

License: GNU General Public License v3.0

parameters

Describe and understand your model’s parameters!

The primary goal of parameters is to provide utilities for processing the
parameters of various statistical models (the package documentation lists
the supported models). Beyond computing p-values, CIs, Bayesian indices and
other measures for a wide variety of models, this package implements
features like bootstrapping of parameters and models, feature reduction
(feature extraction and variable selection), and tools for data reduction,
such as functions to perform cluster, factor or principal component
analysis.

Another important goal of the parameters package is to facilitate and
streamline the reporting of results from statistical models, which
includes the easy and intuitive calculation of standardized estimates or
robust standard errors and p-values. parameters therefore offers a simple
and unified syntax to process a large variety of (model) objects from many
different packages.

Installation

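Type        | Source     | Command
-------------------------------------------------------------------------------------------------------------
Release     | CRAN       | install.packages("parameters")
Development | r-universe | install.packages("parameters", repos = "https://easystats.r-universe.dev")
Development | GitHub     | remotes::install_github("easystats/parameters")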

Tip

Instead of library(parameters), use library(easystats). This will
make all features of the easystats-ecosystem available.

To stay updated, use easystats::install_latest().

Documentation

See the package documentation and the easystats blog, and check out the
package vignettes.

Contributing and Support

If you want to file an issue or contribute to the package in another way,
please follow this guide. For questions about the functionality, you may
either contact us via email or file an issue.

Features

Model’s parameters description

The model_parameters() function (which can also be accessed via the
parameters() shortcut) allows you to extract the parameters and their
characteristics from various models in a consistent way. It can be
considered a lightweight alternative to broom::tidy(), with some notable
differences:

  • The column names of the returned data frame are specific to their
    content. For instance, the column containing the statistic is named
    following the statistic name, i.e., t, z, etc., instead of a
    generic name such as statistic (however, you can get standardized
    (generic) column names using
    standardize_names()).
  • It is able to compute or extract indices not available by default,
    such as p-values, CIs, etc.
  • It includes feature engineering capabilities, including parameters
    bootstrapping.
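
As a small illustration of the first point, the sketch below compares the content-specific column names with the generic ones returned by standardize_names(). The model formula on mtcars is only an assumption chosen for demonstration, and the style = "broom" argument is assumed to be available in the installed version.

```r
library(parameters)

# Toy model, chosen only for illustration (not from the original text)
model <- lm(mpg ~ wt + hp, data = mtcars)

mp <- model_parameters(model)

# Column names reflect their content, e.g. "t" for the test statistic
colnames(mp)

# Generic, broom-like column names such as "statistic"
mp_generic <- standardize_names(mp, style = "broom")
colnames(mp_generic)
```

Keeping the specific names by default makes it harder to confuse a t- with a z-statistic, while the generic names are convenient when results from different model types are stacked into one table.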

Classical Regression Models

    library(parameters)

    model <- lm(Sepal.Width ~ Petal.Length * Species + Petal.Width, data = iris)

    # regular model parameters
    model_parameters(model)
    #> Parameter | Coefficient | SE | 95% CI | t(143) | p
    #> -------------------------------------------------------------------------------------------
    #> (Intercept) | 2.89 | 0.36 | [ 2.18, 3.60] | 8.01 | < .001
    #> Petal Length | 0.26 | 0.25 | [-0.22, 0.75] | 1.07 | 0.287
    #> Species [versicolor] | -1.66 | 0.53 | [-2.71, -0.62] | -3.14 | 0.002
    #> Species [virginica] | -1.92 | 0.59 | [-3.08, -0.76] | -3.28 | 0.001
    #> Petal Width | 0.62 | 0.14 | [ 0.34, 0.89] | 4.41 | < .001
    #> Petal Length × Species [versicolor] | -0.09 | 0.26 | [-0.61, 0.42] | -0.36 | 0.721
    #> Petal Length × Species [virginica] | -0.13 | 0.26 | [-0.64, 0.38] | -0.50 | 0.618

    # standardized parameters
    model_parameters(model, standardize = "refit")
    #> Parameter | Coefficient | SE | 95% CI | t(143) | p
    #> -------------------------------------------------------------------------------------------
    #> (Intercept) | 3.59 | 1.30 | [ 1.01, 6.17] | 2.75 | 0.007
    #> Petal Length | 1.07 | 1.00 | [-0.91, 3.04] | 1.07 | 0.287
    #> Species [versicolor] | -4.62 | 1.31 | [-7.21, -2.03] | -3.53 | < .001
    #> Species [virginica] | -5.51 | 1.38 | [-8.23, -2.79] | -4.00 | < .001
    #> Petal Width | 1.08 | 0.24 | [ 0.59, 1.56] | 4.41 | < .001
    #> Petal Length × Species [versicolor] | -0.38 | 1.06 | [-2.48, 1.72] | -0.36 | 0.721
    #> Petal Length × Species [virginica] | -0.52 | 1.04 | [-2.58, 1.54] | -0.50 | 0.618

    # heteroscedasticity-consistent SE and CI
    model_parameters(model, vcov = "HC3")
    #> Parameter | Coefficient | SE | 95% CI | t(143) | p
    #> -------------------------------------------------------------------------------------------
    #> (Intercept) | 2.89 | 0.43 | [ 2.03, 3.75] | 6.66 | < .001
    #> Petal Length | 0.26 | 0.29 | [-0.30, 0.83] | 0.92 | 0.357
    #> Species [versicolor] | -1.66 | 0.53 | [-2.70, -0.62] | -3.16 | 0.002
    #> Species [virginica] | -1.92 | 0.77 | [-3.43, -0.41] | -2.51 | 0.013
    #> Petal Width | 0.62 | 0.12 | [ 0.38, 0.85] | 5.23 | < .001
    #> Petal Length × Species [versicolor] | -0.09 | 0.29 | [-0.67, 0.48] | -0.32 | 0.748
    #> Petal Length × Species [virginica] | -0.13 | 0.31 | [-0.73, 0.48] | -0.42 | 0.675
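
To make the robust standard errors in the last call less of a black box, the following sketch computes HC3 standard errors by hand. It assumes that the sandwich package is installed and that vcov = "HC3" corresponds to sandwich's HC3 estimator; the variable names are illustrative only.

```r
# Same model as above; base R plus the sandwich package (assumed to be installed)
model <- lm(Sepal.Width ~ Petal.Length * Species + Petal.Width, data = iris)

# Classical standard errors from the usual variance-covariance matrix
se_classical <- sqrt(diag(vcov(model)))

# HC3 heteroscedasticity-consistent variance-covariance matrix
vc_hc3 <- sandwich::vcovHC(model, type = "HC3")
se_hc3 <- sqrt(diag(vc_hc3))

# Robust t-statistics: coefficients are unchanged, only the SEs differ
t_hc3 <- coef(model) / se_hc3

round(cbind(se_classical, se_hc3, t_hc3), 2)
```

Only the uncertainty estimates change between the regular and the robust table; the coefficients themselves are identical.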

Mixed Models

    library(lme4)
    model <- lmer(Sepal.Width ~ Petal.Length + (1 | Species), data = iris)

    # model parameters with CI, df and p-values based on Wald approximation
    model_parameters(model)
    #> # Fixed Effects
    #>
    #> Parameter | Coefficient | SE | 95% CI | t(146) | p
    #> ------------------------------------------------------------------
    #> (Intercept) | 2.00 | 0.56 | [0.89, 3.11] | 3.56 | < .001
    #> Petal Length | 0.28 | 0.06 | [0.16, 0.40] | 4.75 | < .001
    #>
    #> # Random Effects
    #>
    #> Parameter | Coefficient | SE | 95% CI
    #> -----------------------------------------------------------
    #> SD (Intercept: Species) | 0.89 | 0.46 | [0.33, 2.43]
    #> SD (Residual) | 0.32 | 0.02 | [0.28, 0.35]

    # model parameters with CI, df and p-values based on Kenward-Roger approximation
    model_parameters(model, ci_method = "kenward", effects = "fixed")
    #> # Fixed Effects
    #>
    #> Parameter | Coefficient | SE | 95% CI | t | df | p
    #> -------------------------------------------------------------------------
    #> (Intercept) | 2.00 | 0.57 | [0.07, 3.93] | 3.53 | 2.67 | 0.046
    #> Petal Length | 0.28 | 0.06 | [0.16, 0.40] | 4.58 | 140.98 | < .001

Structural Models

Besides many types of regression models and packages, it also works for
other types of models, such as structural models (EFA, CFA, SEM…).

    library(psych)
    model <- psych::fa(attitude, nfactors = 3)
    model_parameters(model)
    #> # Rotated loadings from Factor Analysis (oblimin-rotation)
    #>
    #> Variable | MR1 | MR2 | MR3 | Complexity | Uniqueness
    #> ------------------------------------------------------------
    #> rating | 0.90 | -0.07 | -0.05 | 1.02 | 0.23
    #> complaints | 0.97 | -0.06 | 0.04 | 1.01 | 0.10
    #> privileges | 0.44 | 0.25 | -0.05 | 1.64 | 0.65
    #> learning | 0.47 | 0.54 | -0.28 | 2.51 | 0.24
    #> raises | 0.55 | 0.43 | 0.25 | 2.35 | 0.23
    #> critical | 0.16 | 0.17 | 0.48 | 1.46 | 0.67
    #> advance | -0.11 | 0.91 | 0.07 | 1.04 | 0.22
    #>
    #> The 3 latent factors (oblimin rotation) accounted for 66.60% of the total variance of the original data (MR1 = 38.19%, MR2 = 22.69%, MR3 = 5.72%).

Variable and parameters selection

select_parameters() can help you quickly select and retain the most
relevant predictors using methods tailored for the model type.

    lm(disp ~ ., data = mtcars) |>
      select_parameters() |>
      model_parameters()
    #> Parameter | Coefficient | SE | 95% CI | t(26) | p
    #> -----------------------------------------------------------------------
    #> (Intercept) | 141.70 | 125.67 | [-116.62, 400.02] | 1.13 | 0.270
    #> cyl | 13.14 | 7.90 | [ -3.10, 29.38] | 1.66 | 0.108
    #> hp | 0.63 | 0.20 | [ 0.22, 1.03] | 3.18 | 0.004
    #> wt | 80.45 | 12.22 | [ 55.33, 105.57] | 6.58 | < .001
    #> qsec | -14.68 | 6.14 | [ -27.31, -2.05] | -2.39 | 0.024
    #> carb | -28.75 | 5.60 | [ -40.28, -17.23] | -5.13 | < .001
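
For a linear model, this kind of selection can be approximated with an AIC-based stepwise search. The sketch below uses stats::step() from base R to illustrate the idea; it is an assumption that this mirrors what select_parameters() does internally for lm objects, so the retained terms may differ.

```r
# Full model with all predictors
full <- lm(disp ~ ., data = mtcars)

# AIC-based stepwise selection in both directions, starting from the full model
reduced <- step(full, direction = "both", trace = 0)

# Predictors retained after selection
retained <- setdiff(names(coef(reduced)), "(Intercept)")
retained

# The reduced model should not have a worse (higher) AIC than the full model
c(full = AIC(full), reduced = AIC(reduced))
```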

Statistical inference - how to quantify evidence

There is no standardized approach to drawing conclusions based on the
available data and statistical models. A frequently chosen but also much
criticized approach is to evaluate results based on their statistical
significance (Amrhein, Korner-Nievergelt, & Roth, 2017).

A more sophisticated way would be to test whether estimated effects
exceed the “smallest effect size of interest”. This avoids treating
effects as relevant simply because they are statistically significant,
even when they are clinically or practically irrelevant
(Lakens, 2024; Lakens, Scheel, & Isager, 2018). A rather unconventional
approach, which is nevertheless advocated by various authors, is to
interpret results from classical regression models in terms of
probabilities, similar to the usual approach in Bayesian statistics
(Greenland, Rafi, Matthews, & Higgs, 2022; Rafi & Greenland, 2020;
Schweder, 2018; Schweder & Hjort, 2003; Vos & Holbert, 2022).

The parameters package provides several options or functions to aid
statistical inference. These are, for example:

  • equivalence_test(),
    to compute the (conditional) equivalence test for frequentist models
  • p_significance(),
    to compute the probability of practical significance, which can be
    conceptualized as a unidirectional equivalence test
  • p_function(),
    or consonance function, to compute p-values and compatibility
    (confidence) intervals for statistical models
  • the pd argument (setting pd = TRUE) in model_parameters()
    includes a column with the probability of direction, i.e. the
    probability that a parameter is strictly positive or negative. See
    bayestestR::p_direction()
    for details.
  • the s_value argument (setting s_value = TRUE) in
    model_parameters() replaces the p-values with their related
    S-values (Rafi & Greenland, 2020)
  • finally, it is possible to generate distributions of model
    coefficients by generating bootstrap-samples (setting
    bootstrap = TRUE) or simulating draws from model coefficients using
    simulate_model().
    These samples can then be treated as “posterior samples” and used in
    many functions from the bayestestR package.
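
A few of these options can be tried on a toy model, as sketched below. The model formula, the number of iterations and the helper s_from_p() are assumptions made for illustration; the S-value is taken to be the negative base-2 logarithm of the p-value, as described by Rafi & Greenland (2020).

```r
library(parameters)

# Toy model, chosen only for illustration
model <- lm(mpg ~ wt + hp, data = mtcars)

# Add a column with the probability of direction
model_parameters(model, pd = TRUE)

# Report S-values instead of p-values
model_parameters(model, s_value = TRUE)

# Hypothetical helper: S-value as -log2(p), in bits of information
s_from_p <- function(p) -log2(p)
s_from_p(0.05)

# Simulate draws from the model coefficients and treat them like posterior samples
set.seed(123)
draws <- simulate_model(model, iterations = 500)
bayestestR::p_direction(draws)
```

A p-value of 0.05 corresponds to roughly 4.3 bits, which is about as surprising as seeing four heads in a row from a fair coin.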

Most of the options and functions shown above derive from methods
originally implemented for Bayesian models (Makowski, Ben-Shachar, Chen,
& Lüdecke, 2019). However, assuming that model assumptions are met
(that is, the model fits the data well and the chosen model, including
its distributional family, reflects the data-generating process), it
seems appropriate to interpret results from classical frequentist models
in a “Bayesian way” (for more details, see the documentation of
p_function()).
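
The idea behind a consonance function can be illustrated by computing compatibility intervals for one coefficient at several levels. The sketch below does this with confint() from base R and then calls p_function(); the toy model and the chosen levels are assumptions for illustration, and the ci_levels argument is assumed to match the installed version.

```r
library(parameters)

# Toy model, chosen only for illustration
model <- lm(mpg ~ wt + hp, data = mtcars)

# Compatibility (confidence) intervals for "wt" at several levels, base R only
ci_levels <- c(0.25, 0.5, 0.75, 0.95)
intervals <- t(sapply(ci_levels, function(l) confint(model, "wt", level = l)))
dimnames(intervals) <- list(paste0(ci_levels * 100, "%"), c("low", "high"))
intervals

# Interval width grows with the compatibility level
widths <- intervals[, "high"] - intervals[, "low"]
widths

# The same idea for all coefficients via p_function()
pf <- p_function(model, ci_levels = ci_levels)
pf
```

Reading several nested intervals together, rather than a single 95% interval, shows which parameter values are more or less compatible with the data under the model.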

Citation

In order to cite this package, please use the following command:

    citation("parameters")

    To cite package 'parameters' in publications use:

      Lüdecke D, Ben-Shachar M, Patil I, Makowski D (2020). "Extracting,
      Computing and Exploring the Parameters of Statistical Models using
      R." _Journal of Open Source Software_, *5*(53), 2445.
      doi:10.21105/joss.02445 <https://doi.org/10.21105/joss.02445>.

    A BibTeX entry for LaTeX users is

      @Article{,
        title = {Extracting, Computing and Exploring the Parameters of Statistical Models using {R}.},
        volume = {5},
        doi = {10.21105/joss.02445},
        number = {53},
        journal = {Journal of Open Source Software},
        author = {Daniel Lüdecke and Mattan S. Ben-Shachar and Indrajeet Patil and Dominique Makowski},
        year = {2020},
        pages = {2445},
      }

Code of Conduct

Please note that the parameters project is released with a Contributor
Code of Conduct. By contributing to this project, you agree to abide by
its terms.

References

Amrhein, V., Korner-Nievergelt, F., & Roth, T. (2017). The earth is flat
( p > 0.05): Significance thresholds and the crisis of unreplicable
research. PeerJ, 5, e3544. https://doi.org/10.7717/peerj.3544



Greenland, S., Rafi, Z., Matthews, R., & Higgs, M. (2022). To Aid
Scientific Inference, Emphasize Unconditional Compatibility Descriptions
of Statistics. Retrieved from http://arxiv.org/abs/1909.08583



Lakens, D. (2024). Improving Your Statistical Inferences.
https://doi.org/10.5281/ZENODO.6409077



Lakens, D., Scheel, A. M., & Isager, P. M. (2018). Equivalence testing
for psychological research: A tutorial. Advances in Methods and
Practices in Psychological Science, 1(2), 259–269.
https://doi.org/10.1177/2515245918770963



Makowski, D., Ben-Shachar, M. S., Chen, S. H. A., & Lüdecke, D. (2019).
Indices of Effect Existence and Significance in the Bayesian Framework.
Frontiers in Psychology, 10, 2767.
https://doi.org/10.3389/fpsyg.2019.02767



Rafi, Z., & Greenland, S. (2020). Semantic and cognitive tools to aid
statistical science: Replace confidence and significance by
compatibility and surprise. BMC Medical Research Methodology, 20(1),
244. https://doi.org/10.1186/s12874-020-01105-9



Schweder, T. (2018). Confidence is epistemic probability for empirical
science. Journal of Statistical Planning and Inference, 195,
116–125. https://doi.org/10.1016/j.jspi.2017.09.016



Schweder, T., & Hjort, N. L. (2003). Frequentist Analogues of Priors and
Posteriors. In B. Stigum (Ed.), Econometrics and the Philosophy of
Economics: Theory-Data Confrontations in Economics (pp. 285–317).
Retrieved from https://www.duo.uio.no/handle/10852/10425



Vos, P., & Holbert, D. (2022). Frequentist statistical inference without
repeated sampling. Synthese, 200(2), 89.
https://doi.org/10.1007/s11229-022-03560-x