Project author: epix-project

Project description:
Monthly meteorological data from the Vietnamese IMHEN
Language: R
Repository: git://github.com/epix-project/imhen.git
Created: 2017-04-18T08:06:35Z
Project page: https://github.com/epix-project/imhen

License: Other


imhen

This package contains meteorological data for Vietnam from the Vietnamese Institute of Meteorology, Hydrology and Environment (IMHEN). The data are monthly, come from 67 climatic stations, and span January 1960 to December 2015. The climatic variables are minimum, maximum and average temperatures, absolute and relative humidities, rainfall and hours of sunshine.

Installation and loading

You can install imhen from GitHub:

    # install.packages("devtools")
    devtools::install_github("epix-project/imhen", build_vignettes = TRUE)

Once installed, you can load the package:

    library(imhen)

Usage examples

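The package contains two data frames. The first one is meteo, which
contains the climatic variables Tx, Ta and Tm (maximum, average and
minimum temperatures), aH and rH (absolute and relative humidities),
Rf (rainfall) and Sh (hours of sunshine), plus time (year and month)
and space (station) information: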

    head(meteo)
    #>   year    month station   Ta   Tx   Tm    Rf   aH rH Sh
    #> 1 1961  January Bac Kan 13.9 19.1 10.5   5.3 13.1 82 NA
    #> 2 1961 February Bac Kan 15.1 18.3 13.2  21.5 14.7 85 NA
    #> 3 1961    March Bac Kan 19.6 23.2 17.5  85.4 20.1 87 NA
    #> 4 1961    April Bac Kan 23.5 28.1 20.5 185.8 24.8 87 NA
    #> 5 1961      May Bac Kan 25.8 31.2 22.1  34.9 27.1 83 NA
    #> 6 1961     June Bac Kan 26.9 32.6 23.1 314.7 29.3 83 NA

Note that the data frame is not “complete”, with some combinations of
the year, month and station being missing:

    table(with(meteo, table(year, month, station)))
    #>
    #>     0     1
    #>  7980 37848
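
The count above says how many combinations are missing, but not where. A minimal sketch of a per-station breakdown, run on a made-up toy data frame (the `toy` object and its values are hypothetical and only mimic the key columns of meteo):

```r
# Hypothetical toy data with the same key columns as meteo (values are made up).
toy <- data.frame(
  year    = c(2000, 2000, 2000, 2001, 2001),
  month   = factor(c("January", "February", "January", "January", "February"),
                   levels = c("January", "February")),
  station = c("A", "A", "B", "A", "B")
)

# Cross-tabulate the three keys: a cell equal to 0 is a missing combination.
counts <- with(toy, table(year, month, station))

# Sum the empty cells over the third dimension (station) to get the number
# of missing year-month combinations for each station.
missing_per_station <- apply(counts == 0, 3, sum)
print(missing_per_station)
```

Replacing `toy` with meteo should give the same breakdown for the real data, assuming the column names shown above.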

The second one is stations, which contains the coordinates (longitude
and latitude) and the elevation of each station:

    head(stations)
    #>     station elevation  latitude             geometry
    #> 1   Bac Kan       174 22.133333  105.81667, 22.13333
    #> 2 Bac Giang         7 21.283333  106.20000, 21.28333
    #> 3  Bac Lieu         2  9.283333 105.716667, 9.283333
    #> 4  Bac Ninh         5 21.200000        106.05, 21.20
    #> 5    Ba Tri        12 10.033333  106.60000, 10.03333
    #> 6     Ba Vi        20 21.083333  106.40000, 21.08333

Mapping the climatic stations

We can transform the coordinates of the climatic stations into a spatial
object:

    library(sp)      # provides coordinates() and proj4string()
    library(gadmVN)
    vietnam <- gadm(level = "country")
    coordinates(stations) <- ~ longitude + latitude
    proj4string(stations) <- vietnam@proj4string

And plot the stations on the map:

    plot(vietnam, col = "grey")
    points(stations, col = "blue", pch = 3)

Visualizing the climatic stations elevations

We can also look at the elevations of the climatic stations:

    plot(sort(stations$elevation, TRUE), type = "o",
         xlab = "stations ranked by decreasing elevation", ylab = "elevation (m)")

Exploring the climatic variables

Let’s look at the temperatures:

    val <- c("Tm", "Ta", "Tx")
    T_range <- range(meteo[, val], na.rm = TRUE)
    breaks <- seq(floor(T_range[1]), ceiling(T_range[2]), 2)
    par(mfrow = c(1, 3))
    for(i in val)
      hist(meteo[[i]], breaks, ann = FALSE, col = "lightgrey", ylim = c(0, 10500))

Looks good. Let’s check the consistency of the values:

    # ranges of Tm, Ta and Tx, in that order
    for(i in val) print(range(meteo[[i]], na.rm = TRUE))
    #> [1] -9.256667 29.900000
    #> [1]  0.0 35.8
    #> [1]  5.7 39.3
    # is there any row where Tm <= Ta <= Tx does not hold?
    with(meteo, any(!((Tm <= Ta) & (Ta <= Tx)), na.rm = TRUE))
    #> [1] FALSE
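
The test returns FALSE here, so no row with known values violates the ordering. Had it returned TRUE, the next step would be to locate the offending rows. A minimal sketch on a made-up toy data frame (the values are hypothetical, and row 2 deliberately breaks the ordering):

```r
# Hypothetical toy data (made-up values); row 2 deliberately breaks Tm <= Ta.
toy <- data.frame(
  Tm = c(10.5, 16.0, NA),
  Ta = c(13.9, 15.1, 19.6),
  Tx = c(19.1, 18.3, 23.2)
)

# TRUE where min <= average <= max holds, NA where the comparison cannot be
# evaluated because a value is missing.
ok <- with(toy, (Tm <= Ta) & (Ta <= Tx))

# Same test as in the text: is there at least one violation among known rows?
has_violation <- any(!ok, na.rm = TRUE)

# which() ignores NA, so only rows with a definite violation are returned.
bad_rows <- which(!ok)
print(toy[bad_rows, ])
```

Using `which()` rather than logical indexing avoids the all-NA rows that `toy[!ok, ]` would otherwise produce for incomplete records.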

Let’s look at the other variables:

    val <- c("aH", "rH", "Rf", "Sh")
    par(mfrow = c(2, 2))
    for(i in val) hist(meteo[[i]], col = "lightgrey", ann = FALSE)

Looks good too.

    # ranges of aH, rH, Rf and Sh, in that order
    for(i in val) print(range(meteo[[i]], na.rm = TRUE))
    #> [1]  2.9 39.9
    #> [1]  49 100
    #> [1]    0.0 2451.7
    #> [1]   0 674

Visualizing the data spatio-temporally

Let’s first make a year, month, station template for a full design
of the data:

    y <- sort(unique(meteo$year))
    m <- factor(levels(meteo$month), levels(meteo$month), ordered = TRUE)
    # stations ordered from south to north
    s <- stations$station[order(coordinates(stations)[, "latitude"])]
    s <- factor(s, s, ordered = TRUE)
    template <- setNames(expand.grid(y, m, s), c("year", "month", "station"))
    attr(template, "out.attrs") <- NULL # removing useless attributes

The full version of the data:

    meteo_full <- merge(template, meteo, all.x = TRUE)
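
To illustrate what the template and the left join do, here is a minimal sketch on made-up data (the `obs` data frame and its values are hypothetical): every combination in the template is kept, and those absent from the observations get NA.

```r
# Hypothetical toy observations (made-up values): station "B" has no record
# for February.
obs <- data.frame(
  month   = c("January", "February", "January"),
  station = c("A", "A", "B"),
  Ta      = c(13.9, 15.1, 20.2)
)

# Full design: every month crossed with every station.
template <- expand.grid(
  month   = c("January", "February"),
  station = c("A", "B"),
  stringsAsFactors = FALSE
)

# Left join on the shared columns: all.x = TRUE keeps every template row and
# fills the unmatched ones with NA.
full <- merge(template, obs, all.x = TRUE)

print(full)
print(sum(is.na(full$Ta))) # number of combinations absent from obs
```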

Let’s visualize it:

    # one date per (year, month), set to the 15th of the month
    x <- as.Date(with(unique(meteo_full[, c("year", "month")]),
                      paste0(year, "-", as.numeric(month), "-15")))
    y <- seq_along(stations)
    nb <- length(y)
    col <- rev(heat.colors(12))
    # time on the x axis, stations on the y axis; missing values are left blank
    show_data <- function(var) {
      image(x, y, t(matrix(meteo_full[[var]], nb)), col = col,
            xlab = NA, ylab = "climatic stations")
      box(bty = "o")
    }

Missing values for all the temperature variables:

    opar <- par(mfrow = c(2, 2))
    for(i in c("Tx", "Ta", "Tm")) show_data(i)
    par(opar)

These plots clearly show that seasonality is stronger in the north than
in the south. Missing values for the absolute and relative humidities,
as well as for rainfall and hours of sunshine:

    opar <- par(mfrow = c(2, 2))
    for(i in c("aH", "rH", "Rf", "Sh")) show_data(i)
    par(opar)

These plots show a strong seasonality of absolute humidity in the north
of the country, an interesting pattern of relative humidity in the
center, high rainfall in the fall in the center, and out-of-phase
oscillations of the number of hours of sunshine between the north and
the south. There seem, though, to be strange outliers in sunshine in the
north around 2008. Let’s now combine the missing values from all the
climatic variables:

    library(magrittr)
    library(dplyr)
    # the sum is NA as soon as one of the variables is missing
    meteo_full %<>% mutate(combined = is.na(Tx + Ta + Tm + aH + rH + Rf + Sh))
    show_data("combined")
    abline(v = as.Date("1995-01-01"))
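
The combined flag relies on NA propagating through arithmetic: the sum is NA as soon as one term is. A minimal sketch on made-up values (the `toy` data frame is hypothetical) compares this with an equivalent base R formulation using `complete.cases()`:

```r
# Hypothetical toy data (made-up values) with one NA in rows 2 and 3.
toy <- data.frame(
  Ta = c(13.9, NA, 19.6),
  Rf = c(5.3, 21.5, NA),
  Sh = c(120, 95, 80)
)

# Arithmetic propagates NA, so the sum is NA as soon as one variable is missing.
combined_sum <- with(toy, is.na(Ta + Rf + Sh))

# Equivalent formulation that does not rely on arithmetic.
combined_cc <- !complete.cases(toy[, c("Ta", "Rf", "Sh")])

print(identical(combined_sum, combined_cc))
```

The `complete.cases()` form also works when some of the columns are not numeric, which the sum-based trick does not.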

The 6 stations with missing values in recent years (after 1994) are:

    subset(meteo_full, year > 1994 & combined, station, TRUE) %>% unique

Left to do

  • pairwise distances
  • time series (trends, seasonalities)
  • time seasonal variation
  • PCA?
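
For the first item, one possible starting point is great-circle distances between stations. The sketch below is only an illustration, not part of the package: the `haversine` helper and the spherical Earth radius of 6371 km are assumptions introduced here, and the three coordinates are copied from the head(stations) output shown earlier.

```r
# Coordinates of three stations, copied from the head(stations) output.
st <- data.frame(
  station   = c("Bac Kan", "Bac Giang", "Bac Lieu"),
  longitude = c(105.81667, 106.20000, 105.716667),
  latitude  = c(22.13333, 21.28333, 9.283333)
)

# Hypothetical helper: haversine distance in km, assuming a spherical Earth
# of radius 6371 km. All arguments are in decimal degrees.
haversine <- function(lon1, lat1, lon2, lat2, radius = 6371) {
  to_rad <- pi / 180
  dlat <- (lat2 - lat1) * to_rad
  dlon <- (lon2 - lon1) * to_rad
  a <- sin(dlat / 2)^2 +
    cos(lat1 * to_rad) * cos(lat2 * to_rad) * sin(dlon / 2)^2
  2 * radius * asin(pmin(1, sqrt(a)))
}

# outer() evaluates the helper for every pair of station indices.
idx <- seq_len(nrow(st))
d <- outer(idx, idx, function(i, j)
  haversine(st$longitude[i], st$latitude[i], st$longitude[j], st$latitude[j]))
dimnames(d) <- list(st$station, st$station)
print(round(d))
```

For the full set of stations, a projected coordinate system or a dedicated spatial package may be preferable; the spherical approximation is used here only to keep the example self-contained.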