项目作者: tidyverse

项目描述 :
Database (DBI) backend for dplyr
高级语言: R
项目地址: git://github.com/tidyverse/dbplyr.git
创建时间: 2017-03-28T20:29:16Z
项目社区:https://github.com/tidyverse/dbplyr

开源协议:Other

下载


" class="reference-link">dbplyr

CRAN
status
R-CMD-check
Codecov test
coverage

Overview

dbplyr is the database backend for dplyr.
It allows you to use remote database tables as if they are in-memory
data frames by automatically converting dplyr code into SQL.

To learn more about why you might use dbplyr instead of writing SQL, see
vignette("sql"). To learn more about the details of the SQL
translation, see vignette("translation-verb") and
vignette("translation-function").

Installation

  1. # The easiest way to get dbplyr is to install the whole tidyverse:
  2. install.packages("tidyverse")
  3. # Alternatively, install just dbplyr:
  4. install.packages("dbplyr")
  5. # Or the development version from GitHub:
  6. # install.packages("pak")
  7. pak::pak("tidyverse/dbplyr")

Usage

dbplyr is designed to work with database tables as if they were local
data frames. To demonstrate this I’ll first create an in-memory SQLite
database and copy over a dataset:

  1. library(dplyr, warn.conflicts = FALSE)
  2. con <- DBI::dbConnect(RSQLite::SQLite(), ":memory:")
  3. copy_to(con, mtcars)

Note that you don’t actually need to load dbplyr with library(dbplyr);
dplyr automatically loads it for you when it sees you working with a
database. Database connections are coordinated by the DBI package. Learn
more at https://dbi.r-dbi.org

Now you can retrieve a table using tbl() (see ?tbl_dbi for more
details). Printing it just retrieves the first few rows:

  1. mtcars2 <- tbl(con, "mtcars")
  2. mtcars2
  3. #> # Source: table<`mtcars`> [?? x 11]
  4. #> # Database: sqlite 3.45.0 [:memory:]
  5. #> mpg cyl disp hp drat wt qsec vs am gear carb
  6. #> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
  7. #> 1 21 6 160 110 3.9 2.62 16.5 0 1 4 4
  8. #> 2 21 6 160 110 3.9 2.88 17.0 0 1 4 4
  9. #> 3 22.8 4 108 93 3.85 2.32 18.6 1 1 4 1
  10. #> 4 21.4 6 258 110 3.08 3.22 19.4 1 0 3 1
  11. #> 5 18.7 8 360 175 3.15 3.44 17.0 0 0 3 2
  12. #> 6 18.1 6 225 105 2.76 3.46 20.2 1 0 3 1
  13. #> 7 14.3 8 360 245 3.21 3.57 15.8 0 0 3 4
  14. #> 8 24.4 4 147. 62 3.69 3.19 20 1 0 4 2
  15. #> 9 22.8 4 141. 95 3.92 3.15 22.9 1 0 4 2
  16. #> 10 19.2 6 168. 123 3.92 3.44 18.3 1 0 4 4
  17. #> # ℹ more rows

All dplyr calls are evaluated lazily, generating SQL that is only sent
to the database when you request the data.

  1. # lazily generates query
  2. summary <- mtcars2 %>%
  3. group_by(cyl) %>%
  4. summarise(mpg = mean(mpg, na.rm = TRUE)) %>%
  5. arrange(desc(mpg))
  6. # see query
  7. summary %>% show_query()
  8. #> <SQL>
  9. #> SELECT `cyl`, AVG(`mpg`) AS `mpg`
  10. #> FROM `mtcars`
  11. #> GROUP BY `cyl`
  12. #> ORDER BY `mpg` DESC
  13. # execute query and retrieve results
  14. summary %>% collect()
  15. #> # A tibble: 3 × 2
  16. #> cyl mpg
  17. #> <dbl> <dbl>
  18. #> 1 4 26.7
  19. #> 2 6 19.7
  20. #> 3 8 15.1

Code of Conduct

Please note that the dbplyr project is released with a Contributor Code
of Conduct
. By
contributing to this project, you agree to abide by its terms.