Title: | Disaggregation Modelling |
---|---|
Description: | Fits disaggregation regression models using 'TMB' ('Template Model Builder'). When the response data are aggregated to polygon level but the predictor variables are at a higher resolution, these models can be useful. Regression models with spatial random fields. The package is described in detail in Nandi et al. (2023) <doi:10.18637/jss.v106.i11>. |
Authors: | Anita Nandi [aut] |
Maintainer: | Tim Lucas <[email protected]> |
License: | MIT + file LICENSE |
Version: | 0.4.0 |
Built: | 2025-02-11 06:52:54 UTC |
Source: | https://github.com/timcdlucas/disaggregation |
Function to fit the disaggregation model
as.disag_data( polygon_shapefile, shapefile_names, covariate_rasters, polygon_data, covariate_data, aggregation_pixels, coords_for_fit, coords_for_prediction, start_end_index, mesh = NULL, coordsForFit = NULL, coordsForPrediction = NULL, startendindex = NULL )
as.disag_data( polygon_shapefile, shapefile_names, covariate_rasters, polygon_data, covariate_data, aggregation_pixels, coords_for_fit, coords_for_prediction, start_end_index, mesh = NULL, coordsForFit = NULL, coordsForPrediction = NULL, startendindex = NULL )
polygon_shapefile |
sf object containing the response data |
shapefile_names |
List of 2: polygon id variable name and response variable name from x |
covariate_rasters |
SpatRaster of covariates |
polygon_data |
data.frame with two columns: polygon id and response |
covariate_data |
data.frame with cell id, polygon id and covariate columns |
aggregation_pixels |
vector with value of aggregation raster at each pixel |
coords_for_fit |
coordinates of the covariate data points within the polygons in x |
coords_for_prediction |
coordinates of the covariate data points in the whole raster extent |
start_end_index |
matrix containing the start and end index for each polygon |
mesh |
inla.mesh object to use in the fit |
coordsForFit |
Deprecated. |
coordsForPrediction |
Deprecated. |
startendindex |
Deprecated. |
A list is returned of class disag_data
.
The functions summary, print and plot can be used on disag_data
.
The list of class disag_data
contains:
x |
The sf object used as an input. |
covariate_rasters |
The SpatRaster used as an input. |
polygon_data |
A data frame with columns of area_id, response and N (sample size: all NAs unless using binomial data). Each row represents a polygon. |
covariate_data |
A data frame with columns of area_id, cell_id and one for each covariate in covariate_rasters. Each row represents a pixel in a polygon. |
aggregation_pixels |
An array with the value of the aggregation raster for each pixel in the same order as the rows of covariate_data. |
coords_for_fit |
A matrix with two columns of x, y coordinates of pixels within the polygons. Used to make the spatial field. |
coords_for_prediction |
A matrix with two columns of x, y coordinates of pixels in the whole Raster. Used to make predictions. |
start_end_index |
A matrix with two columns containing the start and end index of the pixels within each polygon. |
mesh |
A INLA mesh to be used for the spatial field of the disaggregation model. |
build_mesh function takes a sf object and mesh arguments to build an appropriate mesh for the spatial field.
build_mesh(shapes, mesh_args = NULL, mesh.args = NULL)
build_mesh(shapes, mesh_args = NULL, mesh.args = NULL)
shapes |
sf covering the region under investigation. |
mesh_args |
list of parameters that control the mesh structure. convex, concave and resolution, to control the boundary of the inner mesh, and max.edge, cutoff and offset, to control the mesh itself, with the parameters having the same meaning as in the INLA functions inla.convex.hull and inla.mesh.2d. cut has been deprecated - use cutoff instead. |
mesh.args |
Deprecated. |
The mesh is created by finding a tight boundary around the polygon data, and creating a fine mesh within the boundary and a coarser mesh outside. This speeds up computation time by only having a very fine mesh within the area of interest and having a small region outside with a coarser mesh to avoid edge effects.
Six mesh parameters can be specified as arguments: convex, concave and resolution, to control the boundary of the inner mesh, and max.edge, cutoff and offset, to control the mesh itself, with the names meaning the same as used by the fmesher functions fm_nonconvex_hull_inla and fm_mesh_2d.
Defaults are: pars <- list(convex = -0.01, concave = -0.5, resolution = 300, max.edge = c(3.0, 8), cutoff = 0.4, offset = c(1, 15)).
An inla.mesh object
## Not run: polygons <- list() for(i in 1:14) { row <- ceiling(i/10) col <- ifelse(i %% 10 != 0, i %% 10, 10) xmin = 2*(col - 1); xmax = 2*col; ymin = 2*(row - 1); ymax = 2*row polygons[[i]] <- list(cbind(c(xmin, xmax, xmax, xmin, xmin), c(ymax, ymax, ymin, ymin, ymax))) } polys <- lapply(polygons, sf::st_polygon) response_df <- data.frame(area_id = 1:100, response = runif(100, min = 0, max = 10)) spdf <- sf::st_sf(polys, response_df) my_mesh <- build_mesh(spdf) ## End(Not run)
## Not run: polygons <- list() for(i in 1:14) { row <- ceiling(i/10) col <- ifelse(i %% 10 != 0, i %% 10, 10) xmin = 2*(col - 1); xmax = 2*col; ymin = 2*(row - 1); ymax = 2*row polygons[[i]] <- list(cbind(c(xmin, xmax, xmax, xmin, xmin), c(ymax, ymax, ymin, ymin, ymax))) } polys <- lapply(polygons, sf::st_polygon) response_df <- data.frame(area_id = 1:100, response = runif(100, min = 0, max = 10)) spdf <- sf::st_sf(polys, response_df) my_mesh <- build_mesh(spdf) ## End(Not run)
disag_model function takes a disag_data object created by
prepare_data
and performs a Bayesian disaggregation fit.
disag_model( data, priors = NULL, family = "gaussian", link = "identity", iterations = 100, field = TRUE, iid = TRUE, hess_control_parscale = NULL, hess_control_ndeps = 1e-04, silent = TRUE )
disag_model( data, priors = NULL, family = "gaussian", link = "identity", iterations = 100, field = TRUE, iid = TRUE, hess_control_parscale = NULL, hess_control_ndeps = 1e-04, silent = TRUE )
data |
disag_data object returned by |
priors |
list of prior values:
|
family |
likelihood function: gaussian, binomial or poisson |
link |
link function: logit, log or identity |
iterations |
number of iterations to run the optimisation for |
field |
logical. Flag the spatial field on or off |
iid |
logical. Flag the iid effect on or off |
hess_control_parscale |
Argument to scale parameters during the calculation of the Hessian.
Must be the same length as the number of parameters. See |
hess_control_ndeps |
Argument to control step sizes during the calculation of the Hessian.
Either length 1 (same step size applied to all parameters) or the same length as the number of parameters.
Default is 1e-3, try setting a smaller value if you get NaNs in the standard error of the parameters.
See |
silent |
logical. Suppress verbose output. |
The model definition
The disaggregation model makes predictions at the pixel level:
And then aggregates these predictions to the polygon level using the weighted sum (via the aggregation raster, ):
The different likelihood correspond to slightly different models ( is the response count data):
Gaussian:
If is the dispersion of the pixel data,
is the dispersion of the polygon data, where
- predicts incidence rate.
Binomial:
For a survey in polygon j, is the number positive and
is the number tested.
- predicts prevalence rate.
Poisson:
- predicts incidence count.
Specify priors for the regression parameters, field and iid effect as a single list. Hyperpriors for the field
are given as penalised complexity priors you specify and
for the range of the field
where
, and
and
for the variation of the field
where
. Also, specify pc priors for the iid effect
The family and link arguments are used to specify the likelihood and link function respectively. The likelihood function can be one of gaussian, poisson or binomial. The link function can be one of logit, log or identity. These are specified as strings.
The field and iid effect can be turned on or off via the field and iid logical flags. Both are default TRUE.
The iterations argument specifies the maximum number of iterations the model can run for to find an optimal point.
The silent argument can be used to publish/suppress verbose output. Default TRUE.
A list is returned of class disag_model
.
The functions summary, print and plot can be used on disag_model
.
The list of class disag_model
contains:
obj |
The TMB model object returned by |
opt |
The optimized model object returned by |
sd_out |
The TMB object returned by |
data |
The disag_data object used as an input to the model. |
model_setup |
A list of information on the model setup. Likelihood function (family), link function(link), logical: whether a field was used (field) and logical: whether an iid effect was used (iid). |
Nanda et al. (2023) disaggregation: An R Package for Bayesian Spatial Disaggregation Modeling. <doi:10.18637/jss.v106.i11>
## Not run: polygons <- list() n_polygon_per_side <- 10 n_polygons <- n_polygon_per_side * n_polygon_per_side n_pixels_per_side <- n_polygon_per_side * 2 for(i in 1:n_polygons) { row <- ceiling(i/n_polygon_per_side) col <- ifelse(i %% n_polygon_per_side != 0, i %% n_polygon_per_side, n_polygon_per_side) xmin = 2*(col - 1); xmax = 2*col; ymin = 2*(row - 1); ymax = 2*row polygons[[i]] <- list(cbind(c(xmin, xmax, xmax, xmin, xmin), c(ymax, ymax, ymin, ymin, ymax))) } polys <- lapply(polygons,sf::st_polygon) N <- floor(runif(n_polygons, min = 1, max = 100)) response_df <- data.frame(area_id = 1:n_polygons, response = runif(n_polygons, min = 0, max = 1000)) spdf <- sf::st_sf(response_df, geometry = polys) # Create raster stack r <- terra::rast(ncol=n_pixels_per_side, nrow=n_pixels_per_side) terra::ext(r) <- terra::ext(spdf) r[] <- sapply(1:terra::ncell(r), function(x){ rnorm(1, ifelse(x %% n_pixels_per_side != 0, x %% n_pixels_per_side, n_pixels_per_side), 3))} r2 <- terra::rast(ncol=n_pixels_per_side, nrow=n_pixels_per_side) terra::ext(r2) <- terra::ext(spdf) r2[] <- sapply(1:terra::ncell(r), function(x) rnorm(1, ceiling(x/n_pixels_per_side), 3)) cov_stack <- c(r, r2) names(cov_stack) <- c('layer1', 'layer2') test_data <- prepare_data(polygon_shapefile = spdf, covariate_rasters = cov_stack) result <- disag_model(test_data, iterations = 2) ## End(Not run)
## Not run: polygons <- list() n_polygon_per_side <- 10 n_polygons <- n_polygon_per_side * n_polygon_per_side n_pixels_per_side <- n_polygon_per_side * 2 for(i in 1:n_polygons) { row <- ceiling(i/n_polygon_per_side) col <- ifelse(i %% n_polygon_per_side != 0, i %% n_polygon_per_side, n_polygon_per_side) xmin = 2*(col - 1); xmax = 2*col; ymin = 2*(row - 1); ymax = 2*row polygons[[i]] <- list(cbind(c(xmin, xmax, xmax, xmin, xmin), c(ymax, ymax, ymin, ymin, ymax))) } polys <- lapply(polygons,sf::st_polygon) N <- floor(runif(n_polygons, min = 1, max = 100)) response_df <- data.frame(area_id = 1:n_polygons, response = runif(n_polygons, min = 0, max = 1000)) spdf <- sf::st_sf(response_df, geometry = polys) # Create raster stack r <- terra::rast(ncol=n_pixels_per_side, nrow=n_pixels_per_side) terra::ext(r) <- terra::ext(spdf) r[] <- sapply(1:terra::ncell(r), function(x){ rnorm(1, ifelse(x %% n_pixels_per_side != 0, x %% n_pixels_per_side, n_pixels_per_side), 3))} r2 <- terra::rast(ncol=n_pixels_per_side, nrow=n_pixels_per_side) terra::ext(r2) <- terra::ext(spdf) r2[] <- sapply(1:terra::ncell(r), function(x) rnorm(1, ceiling(x/n_pixels_per_side), 3)) cov_stack <- c(r, r2) names(cov_stack) <- c('layer1', 'layer2') test_data <- prepare_data(polygon_shapefile = spdf, covariate_rasters = cov_stack) result <- disag_model(test_data, iterations = 2) ## End(Not run)
Looks in a specified folder for raster files. Returns a multi-layered SpatRaster of the rasters cropped to the extent specified by the shape parameter.
getCovariateRasters(directory, file_pattern = ".tif$", shape)
getCovariateRasters(directory, file_pattern = ".tif$", shape)
directory |
Filepath to the directory containing the rasters. |
file_pattern |
Pattern the filenames must match. Default is all files ending in .tif . |
shape |
An object with an extent that the rasters will be cropped to. |
A multi-layered SpatRaster of the raster files in the directory
## Not run: getCovariateRasters('/home/rasters', '.tif$', shape) ## End(Not run)
## Not run: getCovariateRasters('/home/rasters', '.tif$', shape) ## End(Not run)
Returns a data.frame with a row for each polygon in the sf object and columns: area_id, response and N, containing the id of the polygon, the values of the response for that polygon, and the sample size respectively. If the data is not survey data (the sample size does not exist), this column will contain NAs.
getPolygonData( shape, id_var = "area_id", response_var = "response", sample_size_var = NULL )
getPolygonData( shape, id_var = "area_id", response_var = "response", sample_size_var = NULL )
shape |
A sf object containing response data. |
id_var |
Name of column in shape object with the polygon id. Default 'area_id'. |
response_var |
Name of column in shape object with the response data. Default 'response'. |
sample_size_var |
For survey data, name of column in sf object (if it exists) with the sample size data. Default NULL. |
A data.frame with a row for each polygon in the sf object and columns: area_id, response and N, containing the id of the polygon, the values of the response for that polygon, and the sample size respectively. If the data is not survey data (the sample size does not exist), this column will contain NAs.
{ polygons <- list() for(i in 1:100) { row <- ceiling(i/10) col <- ifelse(i %% 10 != 0, i %% 10, 10) xmin = 2*(col - 1); xmax = 2*col; ymin = 2*(row - 1); ymax = 2*row polygons[[i]] <- list(cbind(c(xmin, xmax, xmax, xmin, xmin), c(ymax, ymax, ymin, ymin, ymax))) } polys <- lapply(polygons,sf::st_polygon) response_df <- data.frame(area_id = 1:100, response = runif(100, min = 0, max = 10)) spdf <- sf::st_sf(response_df, geometry = polys) getPolygonData(spdf, id_var = 'area_id', response_var = 'response') }
{ polygons <- list() for(i in 1:100) { row <- ceiling(i/10) col <- ifelse(i %% 10 != 0, i %% 10, 10) xmin = 2*(col - 1); xmax = 2*col; ymin = 2*(row - 1); ymax = 2*row polygons[[i]] <- list(cbind(c(xmin, xmax, xmax, xmin, xmin), c(ymax, ymax, ymin, ymin, ymax))) } polys <- lapply(polygons,sf::st_polygon) response_df <- data.frame(area_id = 1:100, response = runif(100, min = 0, max = 10)) spdf <- sf::st_sf(response_df, geometry = polys) getPolygonData(spdf, id_var = 'area_id', response_var = 'response') }
From the covariate data and polygon data, the function matches the polygon id between the two to find which pixels from the covariate data are contained in each of the polygons.
getStartendindex(covariates, polygon_data, id_var = "area_id")
getStartendindex(covariates, polygon_data, id_var = "area_id")
covariates |
data.frame with each covariate as a column an and id column. |
polygon_data |
data.frame with polygon id and response data. |
id_var |
string with the name of the column in the covariate data.frame containing the polygon id. |
Takes a data.frame containing the covariate data with a polygon id column and one column for each covariate,
and another data.frame containing polygon data with a polygon id, response and sample size column (as returned
by getPolygonData
function).
Returns a matrix with two columns and one row for each polygon. The first column is the index of the first row in covariate data that corresponds to that polygon, the second column is the index of the last row in covariate data that corresponds to that polygon.
A matrix with two columns and one row for each polygon. The first column is the index of the first row in covariate data that corresponds to that polygon, the second column is the index of the last row in covariate data that corresponds to that polygon.
{ covs <- data.frame(area_id = c(1, 1, 1, 2, 2, 3, 3, 3, 3), response = c(3, 9, 5, 2, 3, 6, 7, 3, 5)) response <- data.frame(area_id = c(1, 2, 3), response = c(4, 7, 2), N = c(NA, NA, NA)) getStartendindex(covs, response, 'area_id') }
{ covs <- data.frame(area_id = c(1, 1, 1, 2, 2, 3, 3, 3, 3), response = c(3, 9, 5, 2, 3, 6, 7, 3, 5)) response <- data.frame(area_id = c(1, 2, 3), response = c(4, 7, 2), N = c(NA, NA, NA)) getStartendindex(covs, response, 'area_id') }
make_model_object function takes a disag_data object created by prepare_data
and creates a TMB model object to be used in fitting.
make_model_object( data, priors = NULL, family = "gaussian", link = "identity", field = TRUE, iid = TRUE, silent = TRUE )
make_model_object( data, priors = NULL, family = "gaussian", link = "identity", field = TRUE, iid = TRUE, silent = TRUE )
data |
disag_data object returned by |
priors |
list of prior values |
family |
likelihood function: gaussian, binomial or poisson |
link |
link function: logit, log or identity |
field |
logical. Flag the spatial field on or off |
iid |
logical. Flag the iid effect on or off |
silent |
logical. Suppress verbose output. |
The model definition
The disaggregation model make predictions at the pixel level:
And then aggregates these predictions to the polygon level using the weighted sum (via the aggregation raster, ):
The different likelihood correspond to slightly different models ( is the response count data):
Gaussian:
If is the dispersion of the pixel data,
is the dispersion of the polygon data, where
- predicts incidence rate.
Binomial:
For a survey in polygon j, is the number positive and
is the number tested.
- predicts prevalence rate.
Poisson:
- predicts incidence count.
Specify priors for the regression parameters, field and iid effect as a single named list. Hyperpriors for the field
are given as penalised complexity priors you specify and
for the range of the field
where
, and
and
for the variation of the field
where
. Also, specify pc priors for the iid effect.
The precise names and default values for these priors are:
priormean_intercept: 0
priorsd_intercept: 10.0
priormean_slope: 0.0
priorsd_slope: 0.5
prior_rho_min: A third the length of the diagonal of the bounding box.
prior_rho_prob: 0.1
prior_sigma_max: sd(response/mean(response))
prior_sigma_prob: 0.1
prior_iideffect_sd_max: 0.1
prior_iideffect_sd_prob: 0.01
The family and link arguments are used to specify the likelihood and link function respectively. The likelihood function can be one of gaussian, poisson or binomial. The link function can be one of logit, log or identity. These are specified as strings.
The field and iid effect can be turned on or off via the field and iid logical flags. Both are default TRUE.
The iterations argument specifies the maximum number of iterations the model can run for to find an optimal point.
The silent argument can be used to publish/supress verbose output. Default TRUE.
The TMB model object returned by MakeADFun
.
## Not run: polygons <- list() n_polygon_per_side <- 10 n_polygons <- n_polygon_per_side * n_polygon_per_side n_pixels_per_side <- n_polygon_per_side * 2 for(i in 1:n_polygons) { row <- ceiling(i/n_polygon_per_side) col <- ifelse(i %% n_polygon_per_side != 0, i %% n_polygon_per_side, n_polygon_per_side) xmin = 2*(col - 1); xmax = 2*col; ymin = 2*(row - 1); ymax = 2*row polygons[[i]] <- list(cbind(c(xmin, xmax, xmax, xmin, xmin), c(ymax, ymax, ymin, ymin, ymax))) } polys <- lapply(polygons,sf::st_polygon) N <- floor(runif(n_polygons, min = 1, max = 100)) response_df <- data.frame(area_id = 1:n_polygons, response = runif(n_polygons, min = 0, max = 1000)) spdf <- sf::st_sf(response_df, geometry = polys) # Create raster stack r <- terra::rast(ncol=n_pixels_per_side, nrow=n_pixels_per_side) terra::ext(r) <- terra::ext(spdf) r[] <- sapply(1:terra::ncell(r), function(x){ rnorm(1, ifelse(x %% n_pixels_per_side != 0, x %% n_pixels_per_side, n_pixels_per_side), 3))} r2 <- terra::rast(ncol=n_pixels_per_side, nrow=n_pixels_per_side) terra::ext(r2) <- terra::ext(spdf) r2[] <- sapply(1:terra::ncell(r), function(x) rnorm(1, ceiling(x/n_pixels_per_side), 3)) cov_stack <- c(r, r2) names(cov_stack) <- c('layer1', 'layer2') test_data <- prepare_data(polygon_shapefile = spdf, covariate_rasters = cov_stack) result <- make_model_object(test_data) ## End(Not run)
## Not run: polygons <- list() n_polygon_per_side <- 10 n_polygons <- n_polygon_per_side * n_polygon_per_side n_pixels_per_side <- n_polygon_per_side * 2 for(i in 1:n_polygons) { row <- ceiling(i/n_polygon_per_side) col <- ifelse(i %% n_polygon_per_side != 0, i %% n_polygon_per_side, n_polygon_per_side) xmin = 2*(col - 1); xmax = 2*col; ymin = 2*(row - 1); ymax = 2*row polygons[[i]] <- list(cbind(c(xmin, xmax, xmax, xmin, xmin), c(ymax, ymax, ymin, ymin, ymax))) } polys <- lapply(polygons,sf::st_polygon) N <- floor(runif(n_polygons, min = 1, max = 100)) response_df <- data.frame(area_id = 1:n_polygons, response = runif(n_polygons, min = 0, max = 1000)) spdf <- sf::st_sf(response_df, geometry = polys) # Create raster stack r <- terra::rast(ncol=n_pixels_per_side, nrow=n_pixels_per_side) terra::ext(r) <- terra::ext(spdf) r[] <- sapply(1:terra::ncell(r), function(x){ rnorm(1, ifelse(x %% n_pixels_per_side != 0, x %% n_pixels_per_side, n_pixels_per_side), 3))} r2 <- terra::rast(ncol=n_pixels_per_side, nrow=n_pixels_per_side) terra::ext(r2) <- terra::ext(spdf) r2[] <- sapply(1:terra::ncell(r), function(x) rnorm(1, ceiling(x/n_pixels_per_side), 3)) cov_stack <- c(r, r2) names(cov_stack) <- c('layer1', 'layer2') test_data <- prepare_data(polygon_shapefile = spdf, covariate_rasters = cov_stack) result <- make_model_object(test_data) ## End(Not run)
Convert results of the model ready for plotting
plot_disag_model_data(x)
plot_disag_model_data(x)
x |
Object of class disag_model to be plotted. |
A list that contains:
posteriors |
A data.frame of posteriors |
data |
A data.frame of observed and predicted data |
title |
The title of the observed vs. predicted plot |
Plotting function for class disag_data (the input data for disaggregation).
## S3 method for class 'disag_data' plot(x, which = c(1, 2, 3), ...)
## S3 method for class 'disag_data' plot(x, which = c(1, 2, 3), ...)
x |
Object of class disag_data to be plotted. |
which |
If a subset of plots is required, specify a subset of the numbers 1:3 |
... |
Further arguments to plot function. |
Produces three plots: polygon response data, covariate rasters and INLA mesh.
A list of three plots: the polygon plot (ggplot), covariate plot (spplot) and INLA mesh plot (ggplot)
Plotting function for class disag_model (the result of the disaggregation fitting).
## S3 method for class 'disag_model' plot(x, include_iid = FALSE, ...)
## S3 method for class 'disag_model' plot(x, include_iid = FALSE, ...)
x |
Object of class disag_model to be plotted. |
include_iid |
logical. Whether or not to include predictions that include the IID effect. |
... |
Further arguments to plot function. |
Produces two plots: results of the fixed effects and in-sample observed vs predicted plot.
A list of two ggplot plots: results of the fixed effects and an in-sample observed vs predicted plot
Plotting function for class disag_prediction (the mean and uncertainty predictions of the disaggregation fitting).
## S3 method for class 'disag_prediction' plot(x, ...)
## S3 method for class 'disag_prediction' plot(x, ...)
x |
Object of class disag_prediction to be plotted. |
... |
Further arguments to plot function. |
Produces raster plots of the mean prediction, and the lower and upper confidence intervals.
A list of plots of rasters from the prediction: mean prediction, lower CI and upper CI.
predict_model function takes a disag_model object created by disaggregation::disag_model and predicts mean maps.
predict_model( model_output, new_data = NULL, predict_iid = FALSE, newdata = NULL )
predict_model( model_output, new_data = NULL, predict_iid = FALSE, newdata = NULL )
model_output |
disag_model object returned by disag_model function |
new_data |
If NULL, predictions are made using the data in model_output. If this is a SpatRaster, predictions will be made over this data. Default NULL. |
predict_iid |
If TRUE, any polygon iid effect from the model will be used in the prediction. Default FALSE. |
newdata |
Deprecated. |
Function returns rasters of the mean predictions as well as the covariate and field contributions to the linear predictor.
To predict over a different spatial extent to that used in the model, a SpatRaster covering the region to make predictions over is passed to the argument new_data. If this is not given predictions are made over the data used in the fit.
The predict_iid logical flag should be set to TRUE if the results of the iid effect from the model are to be used in the prediction.
The mean prediction, which is a list of:
prediction Raster of mean predictions based.
field Raster of the field component of the linear predictor.
iid Raster of the iid component of the linear predictor.
covariates Raster of the covariate component of the linear predictor.
## Not run: predict_model(result) ## End(Not run)
## Not run: predict_model(result) ## End(Not run)
predict_uncertainty function takes a disag_model object created by disaggregation::disag_model and predicts upper and lower credible interval maps.
predict_uncertainty( model_output, new_data = NULL, predict_iid = FALSE, N = 100, CI = 0.95, newdata = NULL )
predict_uncertainty( model_output, new_data = NULL, predict_iid = FALSE, N = 100, CI = 0.95, newdata = NULL )
model_output |
disag_model object returned by disag_model function. |
new_data |
If NULL, predictions are made using the data in model_output. If this is a raster stack or brick, predictions will be made over this data. Default NULL. |
predict_iid |
If TRUE, any polygon iid effect from the model will be used in the prediction. Default FALSE. |
N |
number of realisations. Default: 100. |
CI |
credible interval. Default: 0.95. |
newdata |
Deprecated. |
Function returns a SpatRaster of the realisations as well as the upper and lower credible interval rasters.
To predict over a different spatial extent to that used in the model, a SpatRaster covering the region to make predictions over is passed to the argument new_data. If this is not given predictions are made over the data used in the fit.
The predict_iid logical flag should be set to TRUE if the results of the iid effect from the model are to be used in the prediction.
The number of the realisations and the size of the credible interval to be calculated. are given by the arguments N and CI respectively.
The uncertainty prediction, which is a list of:
realisations SpatRaster of realisations of predictions. Number of realisations defined by argument N.
predictions_ci SpatRaster of the upper and lower credible intervals. Defined by argument CI.
## Not run: predict_uncertainty(result) ## End(Not run)
## Not run: predict_uncertainty(result) ## End(Not run)
predict.disag_model function takes a disag_model object created by disaggregation::disag_model and predicts mean and uncertainty maps.
## S3 method for class 'disag_model' predict( object, new_data = NULL, predict_iid = FALSE, N = 100, CI = 0.95, newdata = NULL, ... )
## S3 method for class 'disag_model' predict( object, new_data = NULL, predict_iid = FALSE, N = 100, CI = 0.95, newdata = NULL, ... )
object |
disag_model object returned by disag_model function. |
new_data |
If NULL, predictions are made using the data in model_output. If this is a raster stack or brick, predictions will be made over this data. |
predict_iid |
logical. If TRUE, any polygon iid effect from the model will be used in the prediction. Default FALSE. |
N |
Number of realisations. Default: 100. |
CI |
Credible interval to be calculated from the realisations. Default: 0.95. |
newdata |
Deprecated. |
... |
Further arguments passed to or from other methods. |
To predict over a different spatial extent to that used in the model, a SpatRaster covering the region to make predictions over is passed to the argument new_data. If this is not given predictions are made over the data used in the fit.
The predict_iid logical flag should be set to TRUE if the results of the iid effect from the model are to be used in the prediction.
For the uncertainty calculations, the number of the realisations and the size of the credible interval to be calculated are given by the arguments N and CI respectively.
An object of class disag_prediction which consists of a list of two objects:
mean_prediction |
List of:
|
uncertainty_prediction: |
List of:
|
## Not run: predict(fit_result) ## End(Not run)
## Not run: predict(fit_result) ## End(Not run)
prepare_data function is used to extract all the data required for fitting a disaggregation model. Designed to be used in the disaggregation::disag_model function.
prepare_data( polygon_shapefile, covariate_rasters, aggregation_raster = NULL, id_var = "area_id", response_var = "response", sample_size_var = NULL, mesh_args = NULL, na_action = FALSE, make_mesh = TRUE, mesh.args = NULL, na.action = NULL, makeMesh = NULL, ncores = NULL )
prepare_data( polygon_shapefile, covariate_rasters, aggregation_raster = NULL, id_var = "area_id", response_var = "response", sample_size_var = NULL, mesh_args = NULL, na_action = FALSE, make_mesh = TRUE, mesh.args = NULL, na.action = NULL, makeMesh = NULL, ncores = NULL )
polygon_shapefile |
sf object containing at least three columns: one with the geometried, one with the id for the polygons (id_var) and one with the response count data (response_var); for binomial data, i.e survey data, it can also contain a sample size column (sample_size_var). |
covariate_rasters |
SpatRaster of covariate rasters to be used in the model. |
aggregation_raster |
SpatRaster to aggregate pixel level predictions to polygon level e.g. population to aggregate prevalence. If this is not supplied a uniform raster will be used. |
id_var |
Name of column in sf object with the polygon id. |
response_var |
Name of column in sf object with the response data. |
sample_size_var |
For survey data, name of column in sf object (if it exists) with the sample size data. |
mesh_args |
list of parameters that control the mesh structure with the same names as used by INLA. |
na_action |
logical. If TRUE, NAs in response will be removed, covariate NAs will be given the median value, aggregation NAs will be set to zero. Default FALSE (NAs in response or covariate data within the polygons will give errors). |
make_mesh |
logical. If TRUE, build INLA mesh, takes some time. Default TRUE. |
mesh.args |
Deprecated. |
na.action |
Deprecated. |
makeMesh |
Deprecated. |
ncores |
Deprecated. |
Takes a sf object with the response data and a SpatRaster of covariates.
Extract the values of the covariates (as well as the aggregation raster, if given) at each pixel within the polygons (parallelExtract function). This is done in parallel and n.cores argument is used to set the number of cores to use for covariate extraction. This can be the number of covariates used in the model.
The aggregation raster defines how the pixels within each polygon are aggregated. The disaggregation model performs a weighted sum of the pixel prediction, weighted by the pixel values in the aggregation raster. For disease incidence rate you use the population raster to aggregate pixel incidence rate by summing the number of cases (rate weighted by population). If no aggregation raster is provided a uniform distribution is assumed, i.e. the pixel predictions are aggregated to polygon level by summing the pixel values.
Makes a matrix that contains the start and end pixel index for each polygon. Builds an INLA mesh to use for the spatial field (getStartendindex function).
The mesh.args argument allows you to supply a list of INLA mesh parameters to control the mesh used for the spatial field (build_mesh function).
The na.action flag is automatically off. If there are any NAs in the response or covariate data within the polygons the prepare_data method will error. Ideally the NAs in the data would be dealt with beforehand, however, setting na.action = TRUE will automatically deal with NAs. It removes any polygons that have NAs as a response, sets any aggregation pixels with NA to zero and sets covariate NAs pixels to the median value for the that covariate.
A list is returned of class disag_data
.
The functions summary, print and plot can be used on disag_data
.
The list of class disag_data
contains:
x |
The sf object used as an input. |
covariate_rasters |
The SpatRaster used as an input. |
polygon_data |
A data frame with columns of area_id, response and N (sample size: all NAs unless using binomial data). Each row represents a polygon. |
covariate_data |
A data frame with columns of area_id, cell_id and one for each covariate in covariate_rasters. Each row represents a pixel in a polygon. |
aggregation_pixels |
An array with the value of the aggregation raster for each pixel in the same order as the rows of covariate_data. |
coords_for_fit |
A matrix with two columns of x, y coordinates of pixels within the polygons. Used to make the spatial field. |
coords_for_prediction |
A matrix with two columns of x, y coordinates of pixels in the whole Raster. Used to make predictions. |
start_end_index |
A matrix with two columns containing the start and end index of the pixels within each polygon. |
mesh |
A INLA mesh to be used for the spatial field of the disaggregation model. |
polygons <- list() for(i in 1:100) { row <- ceiling(i/10) col <- ifelse(i %% 10 != 0, i %% 10, 10) xmin = 2*(col - 1); xmax = 2*col; ymin = 2*(row - 1); ymax = 2*row polygons[[i]] <- list(cbind(c(xmin, xmax, xmax, xmin, xmin), c(ymax, ymax, ymin, ymin, ymax))) } polys <- lapply(polygons,sf::st_polygon) response_df <- data.frame(area_id = 1:100, response = runif(100, min = 0, max = 10)) spdf <- sf::st_sf(response_df,geometry=polys) r <- terra::rast(nrow=20,ncol=20) terra::ext(r) <- terra::ext(spdf) r[] <- sapply(1:terra::ncell(r), function(x) rnorm(1, ifelse(x %% 20 != 0, x %% 20, 20), 3)) r2 <- terra::rast(nrow=20,ncol=20) terra::ext(r2) <- terra::ext(spdf) r2[] <- sapply(1:terra::ncell(r), function(x) rnorm(1, ceiling(x/10), 3)) cov_rasters <- c(r, r2) test_data <- prepare_data(polygon_shapefile = spdf, covariate_rasters = cov_rasters)
polygons <- list() for(i in 1:100) { row <- ceiling(i/10) col <- ifelse(i %% 10 != 0, i %% 10, 10) xmin = 2*(col - 1); xmax = 2*col; ymin = 2*(row - 1); ymax = 2*row polygons[[i]] <- list(cbind(c(xmin, xmax, xmax, xmin, xmin), c(ymax, ymax, ymin, ymin, ymax))) } polys <- lapply(polygons,sf::st_polygon) response_df <- data.frame(area_id = 1:100, response = runif(100, min = 0, max = 10)) spdf <- sf::st_sf(response_df,geometry=polys) r <- terra::rast(nrow=20,ncol=20) terra::ext(r) <- terra::ext(spdf) r[] <- sapply(1:terra::ncell(r), function(x) rnorm(1, ifelse(x %% 20 != 0, x %% 20, 20), 3)) r2 <- terra::rast(nrow=20,ncol=20) terra::ext(r2) <- terra::ext(spdf) r2[] <- sapply(1:terra::ncell(r), function(x) rnorm(1, ceiling(x/10), 3)) cov_rasters <- c(r, r2) test_data <- prepare_data(polygon_shapefile = spdf, covariate_rasters = cov_rasters)
Function that prints the input data from the disaggregation model.
## S3 method for class 'disag_data' print(x, ...)
## S3 method for class 'disag_data' print(x, ...)
x |
Object returned from prepare_data. |
... |
Further arguments to print function. |
Prints the number of polyons and pixels, the number of pixels in the largest and smallest polygons and summaries of the covariates.
Function that prints the result of the fit from the disaggregation model.
## S3 method for class 'disag_model' print(x, ...)
## S3 method for class 'disag_model' print(x, ...)
x |
Object returned from disag_model. |
... |
Further arguments to print function. |
Prints the negative log likelihood, model parameters and calculates metrics from in-sample performance.
Function that prints the prediction from the disaggregation model.
## S3 method for class 'disag_prediction' print(x, ...)
## S3 method for class 'disag_prediction' print(x, ...)
x |
Object returned from predict.disag_model. |
... |
Further arguments to print function. |
Prints the number of polyons and pixels, the number of pixels in the largest and smallest polygons and summaries of the covariates.
Function that summarizes the input data from the disaggregation model.
## S3 method for class 'disag_data' summary(object, ...)
## S3 method for class 'disag_data' summary(object, ...)
object |
Object returned from prepare_data. |
... |
Further arguments to summary function. |
Prints the number of polyons and pixels, the number of pixels in the largest and smallest polygons and summaries of the covariates.
A list of the number of polyons, the number of covariates and summaries of the covariates.
Function that summarises the result of the fit from the disaggregation model.
## S3 method for class 'disag_model' summary(object, ...)
## S3 method for class 'disag_model' summary(object, ...)
object |
Object returned from disag_model. |
... |
Further arguments to summary function. |
Prints the negative log likelihood, model parameters and calculates metrics from in-sample performance.
A list of the model parameters, negative log likelihood and metrics from in-sample performance.
Function that summarizes the prediction from the disaggregation model.
## S3 method for class 'disag_prediction' summary(object, ...)
## S3 method for class 'disag_prediction' summary(object, ...)
object |
Object returned from predict.disag_model |
... |
Further arguments to summary function. |
Prints the number of polyons and pixels, the number of pixels in the largest and smallest polygons and summaries of the covariates.
A list of the number of polyons, the number of covariates and summaries of the covariates.