sf
packageInstall the package using: {r, echo=TRUE, eval=FALSE} ) install.packages(sf)
The installation should work out of the box, but one might get compatibility issues with the gdal
library. Depending on the error, googling it should do the job.
After the installation, we simply load the package, and also set the R working directory appropriately.
library(sf)
setwd(file.path(path.expand('~'), 'workplace', 'gis-slovakia'))
In order to be able to work with maps, we need obtain the files that have the geometrical representation of the maps, with the corresponding coordinate system. When working with global data, it is useful to use a coordinate system which takes into account the curvature of the Earth, but for analysis that involves small area (like Slovakia), it is practical to worth with planar coordinates where we can use Euclidean geometry that we are familiar with. In our analysis we will use the krovak coordinate system which covers Czechia and Slovak Republic. The legally-binding system that the cadastre uses is coded as EPSG:5513
.
This website is useful for navigating the map and reading off the coordinates for a particular point. To change the coordinate system, change the coordinates system code.
The map files come in different formats. One of the most common ones are shapefiles (.shp
), geodatabase (.gdb
), geopackage (.gpk
), geoJSON (.geojson
).
This page in the geoportal.sk has the necessary files in different formats. We stick to the shapefiles (.shp
). The portal allows downloading the maps at different resolutions. In this guide, we will be working with the most precise one, which is referred to as ‘Základná úroveň/ ZBGIS - Administratívne hranice’.
sf
package provides st_read()
function with two parameters: 1. dsn
: this stands for data source name, but it is effectively a file name. 2. layer
: certain file formats allow saving multiple layers, we will work with single layer files (it does not mean we can’t have multiple indicators/statistic for each location).
We start off with the maps at the municipality level. We simply run:
mapa.obec.sf <- st_read(dsn = file.path(getwd(), 'maps', 'ah_shp_0', 'obec_0.shp'))
## Reading layer `obec_0' from data source `/Users/jp2011/workplace/gis-slovakia/maps/ah_shp_0/obec_0.shp' using driver `ESRI Shapefile'
## Simple feature collection with 2927 features and 14 fields
## geometry type: MULTIPOLYGON
## dimension: XY
## bbox: xmin: -591311.3 ymin: -1334762 xmax: -165436.9 ymax: -1132697
## epsg (SRID): NA
## proj4string: +proj=krovak +lat_0=49.5 +lon_0=24.83333333333333 +alpha=30.28813975277778 +k=0.9999 +x_0=0 +y_0=0 +ellps=bessel +units=m +no_defs
Now that, we loaded the object, we familiarise ourselves with it.
To see what coordinate system it uses, we run:
st_crs(mapa.obec.sf)
## Coordinate Reference System:
## No EPSG code
## proj4string: "+proj=krovak +lat_0=49.5 +lon_0=24.83333333333333 +alpha=30.28813975277778 +k=0.9999 +x_0=0 +y_0=0 +ellps=bessel +units=m +no_defs"
We see that the CRS used is krovak
, but we do not get the code, e.g. 5513. Since, the krovak CRS comes in different flavours, we do a transformation to the one that the cadastre office in Slovakia uses by using st_transform()
function. We also save that code into a constant for future use.
CRS.CODE.KU.SK <- 5513 # save the cadastre CRS to a constant
mapa.obec.sf <- st_transform(x = mapa.obec.sf, crs = CRS.CODE.KU.SK)
st_crs(mapa.obec.sf) #
## Coordinate Reference System:
## EPSG: 5513
## proj4string: "+proj=krovak +lat_0=49.5 +lon_0=24.83333333333333 +alpha=30.28813972222222 +k=0.9999 +x_0=0 +y_0=0 +ellps=bessel +towgs84=589,76,480,0,0,0,0 +units=m +no_defs"
Now, we see that the EPSG code for our coordinates system is set correctly.
Now, we explore the data that the maps comes with. First we check the column names:
names(mapa.obec.sf)
## [1] "DOW" "AUT" "ACH" "SOI" "FACC"
## [6] "IDN4" "NM4" "IDN3" "NM3" "IDN2"
## [11] "NM2" "VYMERA" "Shape_Leng" "Shape_Area" "geometry"
Some of these names are intuitive, but have a one more peek at the data to understand more. R’s built-in command summary()
is a useful start.
summary(mapa.obec.sf)
## DOW AUT ACH SOI FACC
## Min. :2019-04-30 Min. :1 Min. :1 Min. :9 FA004:2927
## 1st Qu.:2019-04-30 1st Qu.:1 1st Qu.:1 1st Qu.:9
## Median :2019-04-30 Median :1 Median :1 Median :9
## Mean :2019-04-30 Mean :1 Mean :1 Mean :9
## 3rd Qu.:2019-04-30 3rd Qu.:1 3rd Qu.:1 3rd Qu.:9
## Max. :2019-04-30 Max. :1 Max. :1 Max. :9
##
## IDN4 NM4 IDN3 NM3
## Min. :500011 Lúčka : 4 Min. :101.0 Košice - okolie: 114
## 1st Qu.:508989 Porúbka : 4 1st Qu.:402.0 Rimavská Sobota: 107
## Median :518204 Dúbrava : 3 Median :606.0 Prešov : 91
## Mean :521206 Lúčky : 3 Mean :547.7 Levice : 89
## 3rd Qu.:526138 Petrovce: 3 3rd Qu.:708.0 Bardejov : 86
## Max. :599972 Píla : 3 Max. :811.0 Trebišov : 82
## (Other) :2907 (Other) :2358
## IDN2 NM2 VYMERA
## Min. :1.000 Prešovský :665 Min. : 361255
## 1st Qu.:4.000 Banskobystrický:516 1st Qu.: 7207245
## Median :6.000 Košický :461 Median : 11655767
## Mean :5.415 Nitriansky :354 Mean : 16752332
## 3rd Qu.:7.000 Žilinský :315 3rd Qu.: 19769322
## Max. :8.000 Trenčiansky :276 Max. :359784254
## (Other) :340
## Shape_Leng Shape_Area geometry
## Min. : 3211 Min. : 354208 MULTIPOLYGON :2927
## 1st Qu.: 14031 1st Qu.: 7207035 epsg:5513 : 0
## Median : 18582 Median : 11647728 +proj=krov...: 0
## Mean : 21367 Mean : 16747701
## 3rd Qu.: 25503 3rd Qu.: 19794868
## Max. :204284 Max. :359936992
##
From this, it is obvious that: 1. IDN4
is a unique identifier for each municipality. 2. NM4
is the name of the municipality. 3. IDN3
is the identifier for okres. 4. NM3
is the name for okres. 5. IDN2
is the identifier for kraj. 5. NM2
is the name for kraj. 7. Shape_Leng
and Shape_Area
are the perimeter and the area of the municipality. 8. It’s not clear to me what VYMERA
is given that the values slightly deviate from the Shape_Area
.
The other columns are not relevant for this tutorial.
In the visualisation section, we will stick to these values even though they are not particularly interesting. In later sections we will add more interesting data.
Before, we proceed, we rename them to more meaningful names:
library(dplyr)
mapa.obec.sf <- mapa.obec.sf %>%
rename(obec = NM4) %>%
rename(obec_id = IDN4) %>%
rename(okres = NM3) %>%
rename(okres_id = IDN3) %>%
rename(kraj = NM2) %>%
rename(kraj_id = IDN2) %>%
rename(obvod_uzemia = Shape_Leng) %>%
rename(plocha_uzemia = Shape_Area)
For convenience, we also load other geometries: okres, kraj. We also transform them into the correct CRS.
mapa.okres.sf <- st_read(dsn = file.path(getwd(), 'maps', 'ah_shp_0', 'okres_0.shp'))
## Reading layer `okres_0' from data source `/Users/jp2011/workplace/gis-slovakia/maps/ah_shp_0/okres_0.shp' using driver `ESRI Shapefile'
## Simple feature collection with 79 features and 12 fields
## geometry type: POLYGON
## dimension: XY
## bbox: xmin: -591311.3 ymin: -1334762 xmax: -165436.9 ymax: -1132697
## epsg (SRID): NA
## proj4string: +proj=krovak +lat_0=49.5 +lon_0=24.83333333333333 +alpha=30.28813975277778 +k=0.9999 +x_0=0 +y_0=0 +ellps=bessel +units=m +no_defs
mapa.okres.sf <- st_transform(x = mapa.okres.sf, crs = CRS.CODE.KU.SK)
mapa.kraj.sf <- st_read(dsn = file.path(getwd(), 'maps', 'ah_shp_0', 'kraj_0.shp'))
## Reading layer `kraj_0' from data source `/Users/jp2011/workplace/gis-slovakia/maps/ah_shp_0/kraj_0.shp' using driver `ESRI Shapefile'
## Simple feature collection with 8 features and 10 fields
## geometry type: POLYGON
## dimension: XY
## bbox: xmin: -591311.3 ymin: -1334762 xmax: -165436.9 ymax: -1132697
## epsg (SRID): NA
## proj4string: +proj=krovak +lat_0=49.5 +lon_0=24.83333333333333 +alpha=30.28813975277778 +k=0.9999 +x_0=0 +y_0=0 +ellps=bessel +units=m +no_defs
mapa.kraj.sf <- st_transform(x = mapa.kraj.sf, crs = CRS.CODE.KU.SK)
And we do the renaming as before (not shown here).
One of the mosti important aspects of GIS is data visualisation. There are different way of going about it, here is a selection of what I’ve used personally.
Instinctively, the first thing one does when exploring an unknown object in R is call the plot()
method. Indeed, there is a default method that does surprisingly good job with little effort.
plot(mapa.okres.sf)
## Warning: plotting the first 9 out of 12 attributes; use max.plot = 12 to
## plot all
The command tried to plot all the columns, but by default it takes the first 9. Usually, we want to have more control over what we plot, so we choose what we are interested in. In this case, we plot the area and the perimeter.
plot(x = mapa.okres.sf[c('obvod_uzemia', 'plocha_uzemia')],
key.pos = 1) # this add the legend to the bottom (1)
This is a good place to start when doing exploratory analysis, but it’s better to use more sophisticated functions to produce better-looking plots.
ggplot is the tool of choice for general X-Y plots. It is possible to
library(ggplot2) # Load the ggplot library
library(viridis) # Load the library for a nice colour palette
ggplot(mapa.okres.sf) +
geom_sf(aes(fill = plocha_uzemia)) +
scale_fill_viridis("Plocha m\u00B2") +
ggtitle("Plocha okresov") +
theme_bw()
Whatever applied to ggplot craft, applies here as well.
This is a very versatile library that allows integration with Open Street Map data and interactive maps on top of regular plotting. What we show here is only scratching the surface, for more examples, please visit the tmap’s github page.
Before we proceed, we install the libraries and load them.
install.packages('tmap')
library(tmap)
tmap_mode("plot")
## tmap mode set to plotting
tm_shape(mapa.okres.sf) +
tm_fill(col = "plocha_uzemia", style = "cont", title="Plocha m\u00B2") +
tm_style('gray', title = 'Plocha okresov m\u00B2')
For example, we can do a breakdown by kraj.
tmap_mode("plot")
## tmap mode set to plotting
tm_shape(mapa.okres.sf) +
tm_fill(col = "plocha_uzemia", style = "cont", title="Plocha m\u00B2") +
tm_style('gray', title = 'Okresy SR') +
tm_facets('kraj')
For plots with fewer components (e.g. kraj), we might want to add labels. Be careful with the overlaps of the labels and overflows outside the frame. This post gives a good solution for the latter.
tmap_mode("plot")
## tmap mode set to plotting
tm_shape(mapa.kraj.sf) +
tm_fill(col = "plocha_uzemia", style = "cont", title="Plocha m\u00B2") +
tm_style('gray', title = 'Okresy SR') +
tm_text('kraj', auto.placement = TRUE)
To save the map, use tmap_save()
function.
mapa.okres.sf.gps <- st_transform(x = mapa.okres.sf, crs=4326)
osm_tiles = tmaptools::read_osm(st_bbox(mapa.okres.sf.gps), type = "osm")
tmap.object <- qtm(osm_tiles, raster.alpha = 0.6) +
tm_shape(mapa.okres.sf.gps) +
tm_fill(col = "Shape_Area", style = "cont", title="Plocha m\u00B2") +
tm_style('gray', title = 'Okresy SR')
tmap_mode("view")
## tmap mode set to interactive viewing
tm_shape(mapa.okres.sf) +
tm_fill(col = "plocha_uzemia", style = "cont", title="Plocha m\u00B2", scale = 0.8, alpha = 0.7) +
tm_style('gray', title = 'Okresy SR')