SUNGEO R Package

For users working with original (unpublished, private, proprietary) data, SUNGEO offers a statistical software package with embedded methodological tools for sub-national data processing and integration. The SUNGEO Geoprocessing Toolkit R package provides researchers with routines, documentation and source code to implement multiple (dis-)aggregation techniques with their own data. This includes tools for interpolating and integrating spatially-misaligned GIS datasets, tools for batch geocoding of addresses, and tools for selecting, downloading and loading spatial data directly into R from SUNGEO’s servers.

Some downloads take longer than others, depending on the scope of the data requested. If a download fails or times out, consider splitting the query into smaller requests by:

  • Country
  • Topic

Alternatively, you can download these data through the R package.  See Tools for Researchers.

Useful Links

Download R: cran.r-project.org
Current stable package release: cran.r-project.org/package=SUNGEO
Most recent beta version: github.com/zhukovyuri/SUNGEO
To report a bug or view the list of known issues with the R package, see our Issue Tracker.


Vignettes and illustrative examples

1. Installing and loading the package

To install in R (from CRAN):

install.packages("SUNGEO", dependencies = TRUE)

To install in R (from GitHub):

library(devtools)
devtools::install_github("zhukovyuri/SUNGEO", dependencies = TRUE)

Load package:

library(SUNGEO)

Read help files:

?get_data
?poly2poly_ap
?utm_select

2. Download data through SUNGEO API

Single country, single topic:

data_1 <- get_data(
    country_name="Afghanistan", 
    topics="Demographics:Population:GHS")
data_1

Multiple countries, multiple topics:

data_2 <- get_data(
	country_name=c("Albania","Moldova"),
	topics=c("Demographics:Ethnicity:GREG","Demographics:Population:GHS"))
data_2
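
If a combined request like the one above fails or times out, one workaround is to send one request per country-topic pair and collect the pieces. The sketch below uses a hypothetical helper, get_data_chunked(), which is not part of SUNGEO. It assumes only that the download function accepts country_name and topics arguments, as get_data() does in the examples above, and that failures surface as R errors.

```r
# Hypothetical helper (not part of SUNGEO): one request per country-topic pair.
# `fetch_fun` is assumed to take `country_name` and `topics`, like get_data() above.
get_data_chunked <- function(countries, topics, fetch_fun) {
  # Enumerate every country-topic combination
  combos <- expand.grid(country = countries, topic = topics,
                        stringsAsFactors = FALSE)
  labels <- paste(combos$country, combos$topic, sep = " | ")
  # lapply() keeps NULL entries, so failed requests remain visible in the output
  results <- lapply(seq_len(nrow(combos)), function(i) {
    tryCatch(
      fetch_fun(country_name = combos$country[i], topics = combos$topic[i]),
      error = function(e) {
        message("Request failed for ", labels[i], ": ", conditionMessage(e))
        NULL
      }
    )
  })
  names(results) <- labels
  results
}

# Usage (requires the SUNGEO package and an internet connection):
# parts <- get_data_chunked(
#     countries = c("Albania","Moldova"),
#     topics = c("Demographics:Ethnicity:GREG","Demographics:Population:GHS"),
#     fetch_fun = get_data)
# names(parts)[sapply(parts, is.null)]   # pairs that failed and may need a retry
```

Passing the download function as an argument keeps the helper usable with any function that follows the same calling convention. Pairs that fail come back as NULL and can be retried on their own.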

Other boundary sets, spatial and time units:

data_3 <- get_data(
	country_name="Albania",
	topics="Weather:AirTemperatureAndPrecipitation:NOAA",
	geoset="GAUL",geoset_yr=1990,space_unit="adm2",time_unit="month",
	year_min=1990,year_max=1991)
data_3

Get list of available data through SUNGEO API

Get list of all available data:

info_1 <- get_info()
info_1["summary"]
info_1["topics"]
info_1["geosets"]

Get list of available data for a single country:

info_2 <- get_info(country_names="Afghanistan")
info_2["summary"]
info_2["topics"]
info_2["geosets"]

Get list of available data for a single topic:

info_3 <- get_info(topics="Elections:LowerHouse:CLEA")
info_3["summary"]
info_3["topics"]

Get list of available data for multiple countries and topics:

info_4 <- get_info(
           country_names=c("Afghanistan","Zambia"),
           topics=c("Elections:LowerHouse:CLEA","Events:PoliticalViolence:GED"))
info_4["summary"]

3. Geocoding addresses

Get geographic coordinates for a single address (top match only):

geocode_osm("Michigan Stadium")

Return detailed results for top match:

geocode_osm("Michigan Stadium", details=TRUE)

Return detailed results for all matches:

geocode_osm("Michigan Stadium", details=TRUE, return_all = TRUE)

Batch geocode multiple addresses (top matches only):

geocode_osm_batch(c("Ann Arbor","East Lansing","Columbus"))

... with progress reports:

geocode_osm_batch(c("Ann Arbor","East Lansing","Columbus"), 
                  verbose = TRUE)

Return detailed results for all matches:

geocode_osm_batch(c("Ann Arbor","East Lansing","Columbus"),
                  details = TRUE, return_all = TRUE)

4. Changes of geographic support

Area-weighted polygon-to-polygon interpolation

Load legislative election results (from CLEA):

data(clea_deu2009)

Visualize voter turnout at constituency level:

plot(clea_deu2009["to1"])

Load 0.5 degree hexagonal grid:

data(hex_05_deu)

Interpolate:

out_1 <- poly2poly_ap(poly_from = clea_deu2009,
                      poly_to = hex_05_deu,
                      poly_to_id = "HEX_ID",
                      varz = "to1"
                      )

Visualize voter turnout at grid cell level:

plot(out_1["to1_aw"])

Population-weighted polygon-to-polygon interpolation

Load population raster (from GPW v4):

data(gpw4_deu2010)

Interpolate:

out_2 <- poly2poly_ap(poly_from = clea_deu2009,
                      poly_to = hex_05_deu,
                      poly_to_id = "HEX_ID",
                      varz = "to1",
                      methodz = "pw",
                      pop_raster = gpw4_deu2010)

Visualize voter turnout at grid cell level:

plot(out_2["to1_pw"])
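
To make the difference between area weights and population weights concrete, here is a small arithmetic sketch for a single grid cell that overlaps two constituencies. All numbers are hypothetical, and the calculation treats turnout as a rate to be averaged. It illustrates the general idea only and is not SUNGEO's implementation of poly2poly_ap().

```r
# Hypothetical numbers: one grid cell overlapping two constituencies, A and B
turnout      <- c(A = 0.70, B = 0.80)   # turnout rate in each constituency
overlap_area <- c(A = 30,   B = 10)     # overlap with the cell, in km^2
overlap_pop  <- c(A = 1000, B = 9000)   # population living in each overlap

# Area weights: A dominates because it covers more of the cell
to1_aw <- weighted.mean(turnout, w = overlap_area)

# Population weights: B dominates because most residents live there
to1_pw <- weighted.mean(turnout, w = overlap_pop)

c(area_weighted = to1_aw, population_weighted = to1_pw)
```

With these inputs the area-weighted estimate is 0.725 and the population-weighted estimate is 0.79. The two methods diverge most where population is unevenly distributed within the source polygons.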

Point-to-polygon interpolation using tessellation method and area weights

Load point-level election results:

data(clea_deu2009_pt)

Interpolate:

out_4 <- point2poly_tess(pointz = clea_deu2009_pt,
                         polyz = hex_05_deu,
                         poly_id = "HEX_ID",
                         varz = "to1")

Visualize voter turnout at grid cell level:

plot(out_4["to1_aw"])

Point-to-polygon interpolation using ordinary Kriging

Ordinary Kriging with one outcome variable:

out_5 <- point2poly_krige(pointz = clea_deu2009_pt,
                          polyz = clea_deu2009,
                          yvarz = "to1")

Compare observed values to predictions:

par(mfrow=c(1,2))
plot(clea_deu2009["to1"], key.pos = NULL, reset = FALSE)
plot(out_5["to1.pred"], key.pos = NULL, reset = FALSE)

Ordinary Kriging with multiple outcome variables:

out_6 <- point2poly_krige(pointz = clea_deu2009_pt,
                          polyz = clea_deu2009,
                          yvarz = c("to1","pvs1_margin"))

Compare observed values to predictions:

par(mfrow=c(1,2))
plot(clea_deu2009["pvs1_margin"], key.pos = NULL, reset = FALSE)
plot(out_6["pvs1_margin.pred"], key.pos = NULL, reset = FALSE)

Point-to-polygon interpolation using universal Kriging

Universal Kriging with one outcome variable and one covariate:

out_7 <- point2poly_krige(pointz = clea_deu2009_pt,
                        polyz = clea_deu2009,
                        yvarz = "to1",
                        rasterz = gpw4_deu2010)

Compare observed values to predictions:

par(mfrow=c(1,2))
plot(clea_deu2009["to1"], key.pos = NULL, reset = FALSE)
plot(out_7["to1.pred"], key.pos = NULL, reset = FALSE)

Line-in-polygon analysis

Load highways data (from Digital Chart of the World):

data(highways_deu1992)

Basic map overlay:

plot(hex_05_deu["geometry"])
plot(highways_deu1992$geometry, add=TRUE, col = "blue", lwd=2)

Calculate road lengths, densities and distances from each polygon to nearest highway:

out_8 <- line2poly(linez = highways_deu1992,
                   polyz = hex_05_deu,
                   poly_id = "HEX_ID")

Visualize results:

plot(out_8["line_length"])
plot(out_8["line_density"])
plot(out_8["line_distance"])

Replace missing road lengths and densities with zeros and rename the output variables:

out_9 <- line2poly(linez = highways_deu1992,
                   polyz = hex_05_deu,
                   poly_id = "HEX_ID",
                   outvar_name = "road",
                   na_val = 0)

Visualize results:

plot(out_9["road_length"])
plot(out_9["road_density"])
plot(out_9["road_distance"])
dev.off()

5. Projections and coordinate reference systems

Automatically find a planar CRS for an unprojected dataset.

Visualize original geometries (WGS1984, degrees):

plot(clea_deu2009["geometry"], axes=TRUE)

Find a suitable CRS and re-project:

out_10 <- utm_select(clea_deu2009)

Visualize transformed geometries (UTM 32N, meters):

plot(out_10["geometry"], axes=TRUE)

Extract proj4string of transformed data:

utm_select(clea_deu2009, return_list=TRUE)$proj_out
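
For intuition about where "UTM 32N" comes from, the standard UTM zone formula can be applied to a point near the centre of Germany. This is a sketch of the textbook rule only: utm_select() may use different logic (for instance, when a dataset spans several zones). The helper name utm_epsg() and the example coordinates are assumptions made here, and the formula ignores the special zones around Norway and Svalbard.

```r
# Hypothetical helper (not part of SUNGEO): EPSG code of the WGS84 UTM zone
# containing a longitude/latitude point, using the standard 6-degree zone rule
utm_epsg <- function(lon, lat) {
  zone <- floor((lon + 180) / 6) + 1      # zones 1-60, each 6 degrees wide
  zone <- min(max(zone, 1), 60)           # keep lon = 180 inside zone 60
  base <- if (lat >= 0) 32600 else 32700  # northern vs southern hemisphere
  base + zone
}

# Approximate centre of Germany (assumed coordinates)
utm_epsg(lon = 10.4, lat = 51.2)   # 32632, i.e. WGS84 / UTM zone 32N
```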

6. Rasterization of polygons

Transform sf polygon layer into 1km-by-1km RasterLayer (requires planar CRS):

out_11 <- sf2raster(polyz_from = utm_select(clea_deu2009),
                   input_variable = "to1")

Visualize raster:

# Visualize raster
raster::plot(out_11)

Transform sf polygon layer into 25km-by-25km RasterLayer (requires planar CRS):

# 25km-by-25km RasterLayer (requires planar CRS)
out_12 <- sf2raster(polyz_from = utm_select(clea_deu2009),
                   input_variable = "to1",
                   grid_res = c(25000, 25000))

Visualize raster:

# Visualize raster
raster::plot(out_12)
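
Because grid_res is expressed in the units of the planar CRS (meters in the examples above), the cell size directly determines how large the raster is. A rough back-of-the-envelope sketch, assuming a hypothetical bounding box of 640 km by 870 km (not measured from clea_deu2009):

```r
# Hypothetical bounding box in meters (roughly country-sized; an assumption)
extent_m <- c(width = 640000, height = 870000)

# Number of cells needed to cover the box at a given resolution (meters)
n_cells <- function(grid_res) prod(ceiling(extent_m / grid_res))

n_1km  <- n_cells(c(1000, 1000))     # 1 km cells
n_25km <- n_cells(c(25000, 25000))   # coarser 25 km cells
c(n_1km = n_1km, n_25km = n_25km)
```

Under these assumptions the 25 km grid needs roughly 600 times fewer cells than the 1 km grid, which is why coarser resolutions are much faster to compute and plot.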

Create cartogram

Cartogram of turnout scaled by number of valid votes:

out_13 <- sf2raster(polyz_from = utm_select(clea_deu2009),
                  input_variable = "to1",
                  cartogram = TRUE,
                  carto_var = "vv1")
raster::plot(out_13)

Reverse rasterization

Polygonization of cartogram raster:

out_14a <- sf2raster(polyz_from = utm_select(clea_deu2009),
                    input_variable = "to1",
                    cartogram = TRUE,
                    carto_var = "vv1",
                    return_list = TRUE)
out_14 <- sf2raster(reverse = TRUE,
                   poly_to = out_14a$poly_to,
                   return_output = out_14a$return_output,
                   return_field = out_14a$return_field)
plot(out_14["to1"])