Package 'disaggregation' reference manual

Title:	Disaggregation Modelling
Description:	Fits disaggregation regression models using 'TMB' ('Template Model Builder'). When the response data are aggregated to polygon level but the predictor variables are at a higher resolution, these models can be useful. Regression models with spatial random fields. The package is described in detail in Nandi et al. (2023) <doi:10.18637/jss.v106.i11>.
Authors:	Anita Nandi [aut] , Tim Lucas [aut, cre] , Rohan Arambepola [aut], Andre Python [aut] , Simon Smart [ctb]
Maintainer:	Tim Lucas <[email protected]>
License:	MIT + file LICENSE
Version:	0.4.0
Built:	2025-03-13 06:05:32 UTC
Source:	https://github.com/timcdlucas/disaggregation

Function to fit the disaggregation model

Description

Function to fit the disaggregation model

Usage

as.disag_data(
  polygon_shapefile,
  shapefile_names,
  covariate_rasters,
  polygon_data,
  covariate_data,
  aggregation_pixels,
  coords_for_fit,
  coords_for_prediction,
  start_end_index,
  mesh = NULL,
  coordsForFit = NULL,
  coordsForPrediction = NULL,
  startendindex = NULL
)
as.disag_data(
  polygon_shapefile,
  shapefile_names,
  covariate_rasters,
  polygon_data,
  covariate_data,
  aggregation_pixels,
  coords_for_fit,
  coords_for_prediction,
  start_end_index,
  mesh = NULL,
  coordsForFit = NULL,
  coordsForPrediction = NULL,
  startendindex = NULL
)

Arguments

`polygon_shapefile`	sf object containing the response data
`shapefile_names`	List of 2: polygon id variable name and response variable name from x
`covariate_rasters`	SpatRaster of covariates
`polygon_data`	data.frame with two columns: polygon id and response
`covariate_data`	data.frame with cell id, polygon id and covariate columns
`aggregation_pixels`	vector with value of aggregation raster at each pixel
`coords_for_fit`	coordinates of the covariate data points within the polygons in x
`coords_for_prediction`	coordinates of the covariate data points in the whole raster extent
`start_end_index`	matrix containing the start and end index for each polygon
`mesh`	inla.mesh object to use in the fit
`coordsForFit`	Deprecated.
`coordsForPrediction`	Deprecated.
`startendindex`	Deprecated.

Value

A list is returned of class disag_data. The functions summary, print and plot can be used on disag_data. The list of class disag_data contains:

`x`	The sf object used as an input.
`covariate_rasters`	The SpatRaster used as an input.
`polygon_data`	A data frame with columns of area_id, response and N (sample size: all NAs unless using binomial data). Each row represents a polygon.
`covariate_data`	A data frame with columns of area_id, cell_id and one for each covariate in covariate_rasters. Each row represents a pixel in a polygon.
`aggregation_pixels`	An array with the value of the aggregation raster for each pixel in the same order as the rows of covariate_data.
`coords_for_fit`	A matrix with two columns of x, y coordinates of pixels within the polygons. Used to make the spatial field.
`coords_for_prediction`	A matrix with two columns of x, y coordinates of pixels in the whole Raster. Used to make predictions.
`start_end_index`	A matrix with two columns containing the start and end index of the pixels within each polygon.
`mesh`	A INLA mesh to be used for the spatial field of the disaggregation model.

Build mesh for disaggregaton model

Description

build_mesh function takes a sf object and mesh arguments to build an appropriate mesh for the spatial field.

Usage

build_mesh(shapes, mesh_args = NULL, mesh.args = NULL)
build_mesh(shapes, mesh_args = NULL, mesh.args = NULL)

Arguments

`shapes`	sf covering the region under investigation.
`mesh_args`	list of parameters that control the mesh structure. convex, concave and resolution, to control the boundary of the inner mesh, and max.edge, cutoff and offset, to control the mesh itself, with the parameters having the same meaning as in the INLA functions inla.convex.hull and inla.mesh.2d. cut has been deprecated - use cutoff instead.
`mesh.args`	Deprecated.

Details

The mesh is created by finding a tight boundary around the polygon data, and creating a fine mesh within the boundary and a coarser mesh outside. This speeds up computation time by only having a very fine mesh within the area of interest and having a small region outside with a coarser mesh to avoid edge effects.

Six mesh parameters can be specified as arguments: convex, concave and resolution, to control the boundary of the inner mesh, and max.edge, cutoff and offset, to control the mesh itself, with the names meaning the same as used by the fmesher functions fm_nonconvex_hull_inla and fm_mesh_2d.

Defaults are: pars <- list(convex = -0.01, concave = -0.5, resolution = 300, max.edge = c(3.0, 8), cutoff = 0.4, offset = c(1, 15)).

Value

An inla.mesh object

Examples

## Not run: 
polygons <- list()
for(i in 1:14) {
  row <- ceiling(i/10)
  col <- ifelse(i %% 10 != 0, i %% 10, 10)
  xmin = 2*(col - 1); xmax = 2*col; ymin = 2*(row - 1); ymax = 2*row
polygons[[i]] <- list(cbind(c(xmin, xmax, xmax, xmin, xmin),
                            c(ymax, ymax, ymin, ymin, ymax)))
}

polys <- lapply(polygons, sf::st_polygon)
response_df <- data.frame(area_id = 1:100,
                          response = runif(100, min = 0, max = 10))
spdf <- sf::st_sf(polys, response_df)

my_mesh <- build_mesh(spdf)

## End(Not run)

## Not run: 
polygons <- list()
for(i in 1:14) {
  row <- ceiling(i/10)
  col <- ifelse(i %% 10 != 0, i %% 10, 10)
  xmin = 2*(col - 1); xmax = 2*col; ymin = 2*(row - 1); ymax = 2*row
polygons[[i]] <- list(cbind(c(xmin, xmax, xmax, xmin, xmin),
                            c(ymax, ymax, ymin, ymin, ymax)))
}

polys <- lapply(polygons, sf::st_polygon)
response_df <- data.frame(area_id = 1:100,
                          response = runif(100, min = 0, max = 10))
spdf <- sf::st_sf(polys, response_df)

my_mesh <- build_mesh(spdf)

## End(Not run)

Fit the disaggregation model

Description

disag_model function takes a disag_data object created by prepare_data and performs a Bayesian disaggregation fit.

Usage

disag_model(
  data,
  priors = NULL,
  family = "gaussian",
  link = "identity",
  iterations = 100,
  field = TRUE,
  iid = TRUE,
  hess_control_parscale = NULL,
  hess_control_ndeps = 1e-04,
  silent = TRUE
)
disag_model(
  data,
  priors = NULL,
  family = "gaussian",
  link = "identity",
  iterations = 100,
  field = TRUE,
  iid = TRUE,
  hess_control_parscale = NULL,
  hess_control_ndeps = 1e-04,
  silent = TRUE
)

Arguments

`data`	disag_data object returned by `prepare_data` function that contains all the necessary objects for the model fitting
`priors`	list of prior values: `priormean_intercept` `priorsd_intercept` `priormean_slope` `priorsd_slope` `prior_rho_min` `prior_rho_prob` `prior_sigma_max` `prior_sigma_prob` `prior_iideffect_sd_max` `prior_iideffect_sd_prob`
`family`	likelihood function: gaussian, binomial or poisson
`link`	link function: logit, log or identity
`iterations`	number of iterations to run the optimisation for
`field`	logical. Flag the spatial field on or off
`iid`	logical. Flag the iid effect on or off
`hess_control_parscale`	Argument to scale parameters during the calculation of the Hessian. Must be the same length as the number of parameters. See `optimHess` for details.
`hess_control_ndeps`	Argument to control step sizes during the calculation of the Hessian. Either length 1 (same step size applied to all parameters) or the same length as the number of parameters. Default is 1e-3, try setting a smaller value if you get NaNs in the standard error of the parameters. See `optimHess` for details.
`silent`	logical. Suppress verbose output.

Details

The model definition

The disaggregation model makes predictions at the pixel level:

$link(pred_i) = \beta_0 + \beta X + GP(s_i) + u_i$

And then aggregates these predictions to the polygon level using the weighted sum (via the aggregation raster, $agg_i$ ):

$cases_j = \sum_{i \epsilon j} pred_i \times agg_i$

$rate_j = \frac{\sum_{i \epsilon j} pred_i \times agg_i}{\sum_{i \epsilon j} agg_i}$

The different likelihood correspond to slightly different models ( $y_j$ is the response count data):

Gaussian: If $\sigma$ is the dispersion of the pixel data, $\sigma_j$ is the dispersion of the polygon data, where $\sigma_j = \sigma \sqrt{\sum agg_i^2} / \sum agg_i$

$dnorm(y_j/\sum agg_i, rate_j, \sigma_j)$

- predicts incidence rate.
Binomial: For a survey in polygon j, $y_j$ is the number positive and $N_j$ is the number tested.

$dbinom(y_j, N_j, rate_j)$

- predicts prevalence rate.
Poisson:

$dpois(y_j, cases_j)$

- predicts incidence count.

Specify priors for the regression parameters, field and iid effect as a single list. Hyperpriors for the field are given as penalised complexity priors you specify $\rho_{min}$ and $\rho_{prob}$ for the range of the field where $P(\rho < \rho_{min}) = \rho_{prob}$ , and $\sigma_{min}$ and $\sigma_{prob}$ for the variation of the field where $P(\sigma > \sigma_{min}) = \sigma_{prob}$ . Also, specify pc priors for the iid effect

The family and link arguments are used to specify the likelihood and link function respectively. The likelihood function can be one of gaussian, poisson or binomial. The link function can be one of logit, log or identity. These are specified as strings.

The field and iid effect can be turned on or off via the field and iid logical flags. Both are default TRUE.

The iterations argument specifies the maximum number of iterations the model can run for to find an optimal point.

The silent argument can be used to publish/suppress verbose output. Default TRUE.

Value

A list is returned of class disag_model. The functions summary, print and plot can be used on disag_model. The list of class disag_model contains:

`obj`	The TMB model object returned by `MakeADFun`.
`opt`	The optimized model object returned by `nlminb`.
`sd_out`	The TMB object returned by `sdreport`.
`data`	The disag_data object used as an input to the model.
`model_setup`	A list of information on the model setup. Likelihood function (family), link function(link), logical: whether a field was used (field) and logical: whether an iid effect was used (iid).

References

Nanda et al. (2023) disaggregation: An R Package for Bayesian Spatial Disaggregation Modeling. <doi:10.18637/jss.v106.i11>

Examples

## Not run: 
polygons <- list()
n_polygon_per_side <- 10
n_polygons <- n_polygon_per_side * n_polygon_per_side
n_pixels_per_side <- n_polygon_per_side * 2

for(i in 1:n_polygons) {
  row <- ceiling(i/n_polygon_per_side)
  col <- ifelse(i %% n_polygon_per_side != 0, i %% n_polygon_per_side, n_polygon_per_side)
  xmin = 2*(col - 1); xmax = 2*col; ymin = 2*(row - 1); ymax = 2*row
  polygons[[i]] <- list(cbind(c(xmin, xmax, xmax, xmin, xmin),
                              c(ymax, ymax, ymin, ymin, ymax)))
}

polys <- lapply(polygons,sf::st_polygon)
N <- floor(runif(n_polygons, min = 1, max = 100))
response_df <- data.frame(area_id = 1:n_polygons, response = runif(n_polygons, min = 0, max = 1000))

spdf <- sf::st_sf(response_df, geometry = polys)

# Create raster stack
r <- terra::rast(ncol=n_pixels_per_side, nrow=n_pixels_per_side)
terra::ext(r) <- terra::ext(spdf)
r[] <- sapply(1:terra::ncell(r), function(x){
rnorm(1, ifelse(x %% n_pixels_per_side != 0, x %% n_pixels_per_side, n_pixels_per_side), 3))}
r2 <- terra::rast(ncol=n_pixels_per_side, nrow=n_pixels_per_side)
terra::ext(r2) <- terra::ext(spdf)
r2[] <- sapply(1:terra::ncell(r), function(x) rnorm(1, ceiling(x/n_pixels_per_side), 3))
cov_stack <- c(r, r2)
names(cov_stack) <- c('layer1', 'layer2')

test_data <- prepare_data(polygon_shapefile = spdf,
                          covariate_rasters = cov_stack)

 result <- disag_model(test_data, iterations = 2)
 
## End(Not run)

## Not run: 
polygons <- list()
n_polygon_per_side <- 10
n_polygons <- n_polygon_per_side * n_polygon_per_side
n_pixels_per_side <- n_polygon_per_side * 2

for(i in 1:n_polygons) {
  row <- ceiling(i/n_polygon_per_side)
  col <- ifelse(i %% n_polygon_per_side != 0, i %% n_polygon_per_side, n_polygon_per_side)
  xmin = 2*(col - 1); xmax = 2*col; ymin = 2*(row - 1); ymax = 2*row
  polygons[[i]] <- list(cbind(c(xmin, xmax, xmax, xmin, xmin),
                              c(ymax, ymax, ymin, ymin, ymax)))
}

polys <- lapply(polygons,sf::st_polygon)
N <- floor(runif(n_polygons, min = 1, max = 100))
response_df <- data.frame(area_id = 1:n_polygons, response = runif(n_polygons, min = 0, max = 1000))

spdf <- sf::st_sf(response_df, geometry = polys)

# Create raster stack
r <- terra::rast(ncol=n_pixels_per_side, nrow=n_pixels_per_side)
terra::ext(r) <- terra::ext(spdf)
r[] <- sapply(1:terra::ncell(r), function(x){
rnorm(1, ifelse(x %% n_pixels_per_side != 0, x %% n_pixels_per_side, n_pixels_per_side), 3))}
r2 <- terra::rast(ncol=n_pixels_per_side, nrow=n_pixels_per_side)
terra::ext(r2) <- terra::ext(spdf)
r2[] <- sapply(1:terra::ncell(r), function(x) rnorm(1, ceiling(x/n_pixels_per_side), 3))
cov_stack <- c(r, r2)
names(cov_stack) <- c('layer1', 'layer2')

test_data <- prepare_data(polygon_shapefile = spdf,
                          covariate_rasters = cov_stack)

 result <- disag_model(test_data, iterations = 2)
 
## End(Not run)

Roxygen commands

Description

Roxygen commands

Usage

dummy()
dummy()

Get a SpatRaster of covariates from a folder containing .tif files

Description

Looks in a specified folder for raster files. Returns a multi-layered SpatRaster of the rasters cropped to the extent specified by the shape parameter.

Usage

getCovariateRasters(directory, file_pattern = ".tif$", shape)
getCovariateRasters(directory, file_pattern = ".tif$", shape)

Arguments

`directory`	Filepath to the directory containing the rasters.
`file_pattern`	Pattern the filenames must match. Default is all files ending in .tif .
`shape`	An object with an extent that the rasters will be cropped to.

Value

A multi-layered SpatRaster of the raster files in the directory

Examples

## Not run: 
  getCovariateRasters('/home/rasters', '.tif$', shape)
 
## End(Not run)

## Not run: 
  getCovariateRasters('/home/rasters', '.tif$', shape)
 
## End(Not run)

Extract polygon id and response data into a data.frame from a sf object

Description

Returns a data.frame with a row for each polygon in the sf object and columns: area_id, response and N, containing the id of the polygon, the values of the response for that polygon, and the sample size respectively. If the data is not survey data (the sample size does not exist), this column will contain NAs.

Usage

getPolygonData(
  shape,
  id_var = "area_id",
  response_var = "response",
  sample_size_var = NULL
)
getPolygonData(
  shape,
  id_var = "area_id",
  response_var = "response",
  sample_size_var = NULL
)

Arguments

`shape`	A sf object containing response data.
`id_var`	Name of column in shape object with the polygon id. Default 'area_id'.
`response_var`	Name of column in shape object with the response data. Default 'response'.
`sample_size_var`	For survey data, name of column in sf object (if it exists) with the sample size data. Default NULL.

Value

A data.frame with a row for each polygon in the sf object and columns: area_id, response and N, containing the id of the polygon, the values of the response for that polygon, and the sample size respectively. If the data is not survey data (the sample size does not exist), this column will contain NAs.

Examples

{
polygons <- list()
for(i in 1:100) {
  row <- ceiling(i/10)
  col <- ifelse(i %% 10 != 0, i %% 10, 10)
  xmin = 2*(col - 1); xmax = 2*col; ymin = 2*(row - 1); ymax = 2*row
  polygons[[i]] <- list(cbind(c(xmin, xmax, xmax, xmin, xmin),
                              c(ymax, ymax, ymin, ymin, ymax)))
}

polys <- lapply(polygons,sf::st_polygon)
response_df <- data.frame(area_id = 1:100, response = runif(100, min = 0, max = 10))
spdf <- sf::st_sf(response_df, geometry = polys)

 getPolygonData(spdf, id_var = 'area_id', response_var = 'response')
}


{
polygons <- list()
for(i in 1:100) {
  row <- ceiling(i/10)
  col <- ifelse(i %% 10 != 0, i %% 10, 10)
  xmin = 2*(col - 1); xmax = 2*col; ymin = 2*(row - 1); ymax = 2*row
  polygons[[i]] <- list(cbind(c(xmin, xmax, xmax, xmin, xmin),
                              c(ymax, ymax, ymin, ymin, ymax)))
}

polys <- lapply(polygons,sf::st_polygon)
response_df <- data.frame(area_id = 1:100, response = runif(100, min = 0, max = 10))
spdf <- sf::st_sf(response_df, geometry = polys)

 getPolygonData(spdf, id_var = 'area_id', response_var = 'response')
}

Function to match pixels to their corresponding polygon

Description

From the covariate data and polygon data, the function matches the polygon id between the two to find which pixels from the covariate data are contained in each of the polygons.

Usage

getStartendindex(covariates, polygon_data, id_var = "area_id")
getStartendindex(covariates, polygon_data, id_var = "area_id")

Arguments

`covariates`	data.frame with each covariate as a column an and id column.
`polygon_data`	data.frame with polygon id and response data.
`id_var`	string with the name of the column in the covariate data.frame containing the polygon id.

Details

Takes a data.frame containing the covariate data with a polygon id column and one column for each covariate, and another data.frame containing polygon data with a polygon id, response and sample size column (as returned by getPolygonData function).

Returns a matrix with two columns and one row for each polygon. The first column is the index of the first row in covariate data that corresponds to that polygon, the second column is the index of the last row in covariate data that corresponds to that polygon.

Value

A matrix with two columns and one row for each polygon. The first column is the index of the first row in covariate data that corresponds to that polygon, the second column is the index of the last row in covariate data that corresponds to that polygon.

Examples

{
 covs <- data.frame(area_id = c(1, 1, 1, 2, 2, 3, 3, 3, 3), response = c(3, 9, 5, 2, 3, 6, 7, 3, 5))
 response <- data.frame(area_id = c(1, 2, 3), response = c(4, 7, 2), N = c(NA, NA, NA))
 getStartendindex(covs, response, 'area_id')
}


{
 covs <- data.frame(area_id = c(1, 1, 1, 2, 2, 3, 3, 3, 3), response = c(3, 9, 5, 2, 3, 6, 7, 3, 5))
 response <- data.frame(area_id = c(1, 2, 3), response = c(4, 7, 2), N = c(NA, NA, NA))
 getStartendindex(covs, response, 'area_id')
}

Create the TMB model object for the disaggregation model

Description

make_model_object function takes a disag_data object created by prepare_data and creates a TMB model object to be used in fitting.

Usage

make_model_object(
  data,
  priors = NULL,
  family = "gaussian",
  link = "identity",
  field = TRUE,
  iid = TRUE,
  silent = TRUE
)
make_model_object(
  data,
  priors = NULL,
  family = "gaussian",
  link = "identity",
  field = TRUE,
  iid = TRUE,
  silent = TRUE
)

Arguments

`data`	disag_data object returned by `prepare_data` function that contains all the necessary objects for the model fitting
`priors`	list of prior values
`family`	likelihood function: gaussian, binomial or poisson
`link`	link function: logit, log or identity
`field`	logical. Flag the spatial field on or off
`iid`	logical. Flag the iid effect on or off
`silent`	logical. Suppress verbose output.

Details

The model definition

The disaggregation model make predictions at the pixel level:

$link(pred_i) = \beta_0 + \beta X + GP(s_i) + u_i$

And then aggregates these predictions to the polygon level using the weighted sum (via the aggregation raster, $agg_i$ ):

$cases_j = \sum_{i \epsilon j} pred_i \times agg_i$

$rate_j = \frac{\sum_{i \epsilon j} pred_i \times agg_i}{\sum_{i \epsilon j} agg_i}$

The different likelihood correspond to slightly different models ( $y_j$ is the response count data):

Gaussian: If $\sigma$ is the dispersion of the pixel data, $\sigma_j$ is the dispersion of the polygon data, where $\sigma_j = \sigma \sqrt{\sum agg_i^2} / \sum agg_i$

$dnorm(y_j/\sum agg_i, rate_j, \sigma_j)$

- predicts incidence rate.
Binomial: For a survey in polygon j, $y_j$ is the number positive and $N_j$ is the number tested.

$dbinom(y_j, N_j, rate_j)$

- predicts prevalence rate.
Poisson:

$dpois(y_j, cases_j)$

- predicts incidence count.

Specify priors for the regression parameters, field and iid effect as a single named list. Hyperpriors for the field are given as penalised complexity priors you specify $\rho_{min}$ and $\rho_{prob}$ for the range of the field where $P(\rho < \rho_{min}) = \rho_{prob}$ , and $\sigma_{min}$ and $\sigma_{prob}$ for the variation of the field where $P(\sigma > \sigma_{min}) = \sigma_{prob}$ . Also, specify pc priors for the iid effect.

The precise names and default values for these priors are:

priormean_intercept: 0
priorsd_intercept: 10.0
priormean_slope: 0.0
priorsd_slope: 0.5
prior_rho_min: A third the length of the diagonal of the bounding box.
prior_rho_prob: 0.1
prior_sigma_max: sd(response/mean(response))
prior_sigma_prob: 0.1
prior_iideffect_sd_max: 0.1
prior_iideffect_sd_prob: 0.01

The field and iid effect can be turned on or off via the field and iid logical flags. Both are default TRUE.

The iterations argument specifies the maximum number of iterations the model can run for to find an optimal point.

The silent argument can be used to publish/supress verbose output. Default TRUE.

Value

The TMB model object returned by MakeADFun.

Examples

## Not run: 
polygons <- list()
n_polygon_per_side <- 10
n_polygons <- n_polygon_per_side * n_polygon_per_side
n_pixels_per_side <- n_polygon_per_side * 2

for(i in 1:n_polygons) {
  row <- ceiling(i/n_polygon_per_side)
  col <- ifelse(i %% n_polygon_per_side != 0, i %% n_polygon_per_side, n_polygon_per_side)
  xmin = 2*(col - 1); xmax = 2*col; ymin = 2*(row - 1); ymax = 2*row
  polygons[[i]] <- list(cbind(c(xmin, xmax, xmax, xmin, xmin),
                              c(ymax, ymax, ymin, ymin, ymax)))
}

polys <- lapply(polygons,sf::st_polygon)
N <- floor(runif(n_polygons, min = 1, max = 100))
response_df <- data.frame(area_id = 1:n_polygons, response = runif(n_polygons, min = 0, max = 1000))

spdf <- sf::st_sf(response_df, geometry = polys)

# Create raster stack
r <- terra::rast(ncol=n_pixels_per_side, nrow=n_pixels_per_side)
terra::ext(r) <- terra::ext(spdf)
r[] <- sapply(1:terra::ncell(r), function(x){
rnorm(1, ifelse(x %% n_pixels_per_side != 0, x %% n_pixels_per_side, n_pixels_per_side), 3))}
r2 <- terra::rast(ncol=n_pixels_per_side, nrow=n_pixels_per_side)
terra::ext(r2) <- terra::ext(spdf)
r2[] <- sapply(1:terra::ncell(r), function(x) rnorm(1, ceiling(x/n_pixels_per_side), 3))
cov_stack <- c(r, r2)
names(cov_stack) <- c('layer1', 'layer2')

test_data <- prepare_data(polygon_shapefile = spdf,
                          covariate_rasters = cov_stack)

 result <- make_model_object(test_data)
 
## End(Not run)

## Not run: 
polygons <- list()
n_polygon_per_side <- 10
n_polygons <- n_polygon_per_side * n_polygon_per_side
n_pixels_per_side <- n_polygon_per_side * 2

for(i in 1:n_polygons) {
  row <- ceiling(i/n_polygon_per_side)
  col <- ifelse(i %% n_polygon_per_side != 0, i %% n_polygon_per_side, n_polygon_per_side)
  xmin = 2*(col - 1); xmax = 2*col; ymin = 2*(row - 1); ymax = 2*row
  polygons[[i]] <- list(cbind(c(xmin, xmax, xmax, xmin, xmin),
                              c(ymax, ymax, ymin, ymin, ymax)))
}

polys <- lapply(polygons,sf::st_polygon)
N <- floor(runif(n_polygons, min = 1, max = 100))
response_df <- data.frame(area_id = 1:n_polygons, response = runif(n_polygons, min = 0, max = 1000))

spdf <- sf::st_sf(response_df, geometry = polys)

# Create raster stack
r <- terra::rast(ncol=n_pixels_per_side, nrow=n_pixels_per_side)
terra::ext(r) <- terra::ext(spdf)
r[] <- sapply(1:terra::ncell(r), function(x){
rnorm(1, ifelse(x %% n_pixels_per_side != 0, x %% n_pixels_per_side, n_pixels_per_side), 3))}
r2 <- terra::rast(ncol=n_pixels_per_side, nrow=n_pixels_per_side)
terra::ext(r2) <- terra::ext(spdf)
r2[] <- sapply(1:terra::ncell(r), function(x) rnorm(1, ceiling(x/n_pixels_per_side), 3))
cov_stack <- c(r, r2)
names(cov_stack) <- c('layer1', 'layer2')

test_data <- prepare_data(polygon_shapefile = spdf,
                          covariate_rasters = cov_stack)

 result <- make_model_object(test_data)
 
## End(Not run)

Convert results of the model ready for plotting

Description

Convert results of the model ready for plotting

Usage

plot_disag_model_data(x)
plot_disag_model_data(x)

Arguments

`x`	Object of class disag_model to be plotted.

Value

A list that contains:

`posteriors`	A data.frame of posteriors
`data`	A data.frame of observed and predicted data
`title`	The title of the observed vs. predicted plot

Plot input data for disaggregation

Description

Plotting function for class disag_data (the input data for disaggregation).

Usage

## S3 method for class 'disag_data'
plot(x, which = c(1, 2, 3), ...)
## S3 method for class 'disag_data'
plot(x, which = c(1, 2, 3), ...)

Arguments

`x`	Object of class disag_data to be plotted.
`which`	If a subset of plots is required, specify a subset of the numbers 1:3
`...`	Further arguments to plot function.

Details

Produces three plots: polygon response data, covariate rasters and INLA mesh.

Value

A list of three plots: the polygon plot (ggplot), covariate plot (spplot) and INLA mesh plot (ggplot)

Plot results of fitted model

Description

Plotting function for class disag_model (the result of the disaggregation fitting).

Usage

## S3 method for class 'disag_model'
plot(x, include_iid = FALSE, ...)
## S3 method for class 'disag_model'
plot(x, include_iid = FALSE, ...)

Arguments

`x`	Object of class disag_model to be plotted.
`include_iid`	logical. Whether or not to include predictions that include the IID effect.
`...`	Further arguments to plot function.

Details

Produces two plots: results of the fixed effects and in-sample observed vs predicted plot.

Value

A list of two ggplot plots: results of the fixed effects and an in-sample observed vs predicted plot

Plot mean and uncertainty predictions from the disaggregation model results

Description

Plotting function for class disag_prediction (the mean and uncertainty predictions of the disaggregation fitting).

Usage

## S3 method for class 'disag_prediction'
plot(x, ...)
## S3 method for class 'disag_prediction'
plot(x, ...)

Arguments

`x`	Object of class disag_prediction to be plotted.
`...`	Further arguments to plot function.

Details

Produces raster plots of the mean prediction, and the lower and upper confidence intervals.

Value

A list of plots of rasters from the prediction: mean prediction, lower CI and upper CI.

Function to predict mean from the model result

Description

predict_model function takes a disag_model object created by disaggregation::disag_model and predicts mean maps.

Usage

predict_model(
  model_output,
  new_data = NULL,
  predict_iid = FALSE,
  newdata = NULL
)
predict_model(
  model_output,
  new_data = NULL,
  predict_iid = FALSE,
  newdata = NULL
)

Arguments

`model_output`	disag_model object returned by disag_model function
`new_data`	If NULL, predictions are made using the data in model_output. If this is a SpatRaster, predictions will be made over this data. Default NULL.
`predict_iid`	If TRUE, any polygon iid effect from the model will be used in the prediction. Default FALSE.
`newdata`	Deprecated.

Details

Function returns rasters of the mean predictions as well as the covariate and field contributions to the linear predictor.

To predict over a different spatial extent to that used in the model, a SpatRaster covering the region to make predictions over is passed to the argument new_data. If this is not given predictions are made over the data used in the fit.

The predict_iid logical flag should be set to TRUE if the results of the iid effect from the model are to be used in the prediction.

Value

The mean prediction, which is a list of:

prediction Raster of mean predictions based.
field Raster of the field component of the linear predictor.
iid Raster of the iid component of the linear predictor.
covariates Raster of the covariate component of the linear predictor.

Examples

## Not run: 
predict_model(result)

## End(Not run)

## Not run: 
predict_model(result)

## End(Not run)

Function to predict uncertainty from the model result

Description

predict_uncertainty function takes a disag_model object created by disaggregation::disag_model and predicts upper and lower credible interval maps.

Usage

predict_uncertainty(
  model_output,
  new_data = NULL,
  predict_iid = FALSE,
  N = 100,
  CI = 0.95,
  newdata = NULL
)
predict_uncertainty(
  model_output,
  new_data = NULL,
  predict_iid = FALSE,
  N = 100,
  CI = 0.95,
  newdata = NULL
)

Arguments

`model_output`	disag_model object returned by disag_model function.
`new_data`	If NULL, predictions are made using the data in model_output. If this is a raster stack or brick, predictions will be made over this data. Default NULL.
`predict_iid`	If TRUE, any polygon iid effect from the model will be used in the prediction. Default FALSE.
`N`	number of realisations. Default: 100.
`CI`	credible interval. Default: 0.95.
`newdata`	Deprecated.

Details

Function returns a SpatRaster of the realisations as well as the upper and lower credible interval rasters.

The predict_iid logical flag should be set to TRUE if the results of the iid effect from the model are to be used in the prediction.

The number of the realisations and the size of the credible interval to be calculated. are given by the arguments N and CI respectively.

Value

The uncertainty prediction, which is a list of:

realisations SpatRaster of realisations of predictions. Number of realisations defined by argument N.
predictions_ci SpatRaster of the upper and lower credible intervals. Defined by argument CI.

Examples

## Not run: 
predict_uncertainty(result)

## End(Not run)

## Not run: 
predict_uncertainty(result)

## End(Not run)

Predict mean and uncertainty from the disaggregation model result

Description

predict.disag_model function takes a disag_model object created by disaggregation::disag_model and predicts mean and uncertainty maps.

Usage

## S3 method for class 'disag_model'
predict(
  object,
  new_data = NULL,
  predict_iid = FALSE,
  N = 100,
  CI = 0.95,
  newdata = NULL,
  ...
)
## S3 method for class 'disag_model'
predict(
  object,
  new_data = NULL,
  predict_iid = FALSE,
  N = 100,
  CI = 0.95,
  newdata = NULL,
  ...
)

Arguments

`object`	disag_model object returned by disag_model function.
`new_data`	If NULL, predictions are made using the data in model_output. If this is a raster stack or brick, predictions will be made over this data.
`predict_iid`	logical. If TRUE, any polygon iid effect from the model will be used in the prediction. Default FALSE.
`N`	Number of realisations. Default: 100.
`CI`	Credible interval to be calculated from the realisations. Default: 0.95.
`newdata`	Deprecated.
`...`	Further arguments passed to or from other methods.

Details

The predict_iid logical flag should be set to TRUE if the results of the iid effect from the model are to be used in the prediction.

For the uncertainty calculations, the number of the realisations and the size of the credible interval to be calculated are given by the arguments N and CI respectively.

Value

An object of class disag_prediction which consists of a list of two objects:

mean_prediction

List of:

prediction Raster of mean predictions based.
field Raster of the field component of the linear predictor.
iid Raster of the iid component of the linear predictor.
covariates Raster of the covariate component of the linear predictor.

uncertainty_prediction:

List of:

realisations SpatRaster of realisations of predictions. Number of realisations defined by argument N.
predictions_ci SpatRaster of the upper and lower credible intervals. Defined by argument CI.

Examples

## Not run: 
predict(fit_result)

## End(Not run)

## Not run: 
predict(fit_result)

## End(Not run)

Prepare data for disaggregation modelling

Description

prepare_data function is used to extract all the data required for fitting a disaggregation model. Designed to be used in the disaggregation::disag_model function.

Usage

prepare_data(
  polygon_shapefile,
  covariate_rasters,
  aggregation_raster = NULL,
  id_var = "area_id",
  response_var = "response",
  sample_size_var = NULL,
  mesh_args = NULL,
  na_action = FALSE,
  make_mesh = TRUE,
  mesh.args = NULL,
  na.action = NULL,
  makeMesh = NULL,
  ncores = NULL
)
prepare_data(
  polygon_shapefile,
  covariate_rasters,
  aggregation_raster = NULL,
  id_var = "area_id",
  response_var = "response",
  sample_size_var = NULL,
  mesh_args = NULL,
  na_action = FALSE,
  make_mesh = TRUE,
  mesh.args = NULL,
  na.action = NULL,
  makeMesh = NULL,
  ncores = NULL
)

Arguments

`polygon_shapefile`	sf object containing at least three columns: one with the geometried, one with the id for the polygons (id_var) and one with the response count data (response_var); for binomial data, i.e survey data, it can also contain a sample size column (sample_size_var).
`covariate_rasters`	SpatRaster of covariate rasters to be used in the model.
`aggregation_raster`	SpatRaster to aggregate pixel level predictions to polygon level e.g. population to aggregate prevalence. If this is not supplied a uniform raster will be used.
`id_var`	Name of column in sf object with the polygon id.
`response_var`	Name of column in sf object with the response data.
`sample_size_var`	For survey data, name of column in sf object (if it exists) with the sample size data.
`mesh_args`	list of parameters that control the mesh structure with the same names as used by INLA.
`na_action`	logical. If TRUE, NAs in response will be removed, covariate NAs will be given the median value, aggregation NAs will be set to zero. Default FALSE (NAs in response or covariate data within the polygons will give errors).
`make_mesh`	logical. If TRUE, build INLA mesh, takes some time. Default TRUE.
`mesh.args`	Deprecated.
`na.action`	Deprecated.
`makeMesh`	Deprecated.
`ncores`	Deprecated.

Details

Takes a sf object with the response data and a SpatRaster of covariates.

Extract the values of the covariates (as well as the aggregation raster, if given) at each pixel within the polygons (parallelExtract function). This is done in parallel and n.cores argument is used to set the number of cores to use for covariate extraction. This can be the number of covariates used in the model.

The aggregation raster defines how the pixels within each polygon are aggregated. The disaggregation model performs a weighted sum of the pixel prediction, weighted by the pixel values in the aggregation raster. For disease incidence rate you use the population raster to aggregate pixel incidence rate by summing the number of cases (rate weighted by population). If no aggregation raster is provided a uniform distribution is assumed, i.e. the pixel predictions are aggregated to polygon level by summing the pixel values.

Makes a matrix that contains the start and end pixel index for each polygon. Builds an INLA mesh to use for the spatial field (getStartendindex function).

The mesh.args argument allows you to supply a list of INLA mesh parameters to control the mesh used for the spatial field (build_mesh function).

The na.action flag is automatically off. If there are any NAs in the response or covariate data within the polygons the prepare_data method will error. Ideally the NAs in the data would be dealt with beforehand, however, setting na.action = TRUE will automatically deal with NAs. It removes any polygons that have NAs as a response, sets any aggregation pixels with NA to zero and sets covariate NAs pixels to the median value for the that covariate.

Value

A list is returned of class disag_data. The functions summary, print and plot can be used on disag_data. The list of class disag_data contains:

`x`	The sf object used as an input.
`covariate_rasters`	The SpatRaster used as an input.
`polygon_data`	A data frame with columns of area_id, response and N (sample size: all NAs unless using binomial data). Each row represents a polygon.
`covariate_data`	A data frame with columns of area_id, cell_id and one for each covariate in covariate_rasters. Each row represents a pixel in a polygon.
`aggregation_pixels`	An array with the value of the aggregation raster for each pixel in the same order as the rows of covariate_data.
`coords_for_fit`	A matrix with two columns of x, y coordinates of pixels within the polygons. Used to make the spatial field.
`coords_for_prediction`	A matrix with two columns of x, y coordinates of pixels in the whole Raster. Used to make predictions.
`start_end_index`	A matrix with two columns containing the start and end index of the pixels within each polygon.
`mesh`	A INLA mesh to be used for the spatial field of the disaggregation model.

Examples


polygons <- list()
for(i in 1:100) {
  row <- ceiling(i/10)
  col <- ifelse(i %% 10 != 0, i %% 10, 10)
  xmin = 2*(col - 1); xmax = 2*col; ymin = 2*(row - 1); ymax = 2*row
  polygons[[i]] <- list(cbind(c(xmin, xmax, xmax, xmin, xmin),
                              c(ymax, ymax, ymin, ymin, ymax)))
}

polys <- lapply(polygons,sf::st_polygon)
response_df <- data.frame(area_id = 1:100, response = runif(100, min = 0, max = 10))
spdf <- sf::st_sf(response_df,geometry=polys)

r <- terra::rast(nrow=20,ncol=20)
terra::ext(r) <- terra::ext(spdf)
r[] <- sapply(1:terra::ncell(r), function(x) rnorm(1, ifelse(x %% 20 != 0, x %% 20, 20), 3))

r2 <- terra::rast(nrow=20,ncol=20)
terra::ext(r2) <- terra::ext(spdf)
r2[] <- sapply(1:terra::ncell(r), function(x) rnorm(1, ceiling(x/10), 3))
cov_rasters <- c(r, r2)

test_data <- prepare_data(polygon_shapefile = spdf,
                          covariate_rasters = cov_rasters)


polygons <- list()
for(i in 1:100) {
  row <- ceiling(i/10)
  col <- ifelse(i %% 10 != 0, i %% 10, 10)
  xmin = 2*(col - 1); xmax = 2*col; ymin = 2*(row - 1); ymax = 2*row
  polygons[[i]] <- list(cbind(c(xmin, xmax, xmax, xmin, xmin),
                              c(ymax, ymax, ymin, ymin, ymax)))
}

polys <- lapply(polygons,sf::st_polygon)
response_df <- data.frame(area_id = 1:100, response = runif(100, min = 0, max = 10))
spdf <- sf::st_sf(response_df,geometry=polys)

r <- terra::rast(nrow=20,ncol=20)
terra::ext(r) <- terra::ext(spdf)
r[] <- sapply(1:terra::ncell(r), function(x) rnorm(1, ifelse(x %% 20 != 0, x %% 20, 20), 3))

r2 <- terra::rast(nrow=20,ncol=20)
terra::ext(r2) <- terra::ext(spdf)
r2[] <- sapply(1:terra::ncell(r), function(x) rnorm(1, ceiling(x/10), 3))
cov_rasters <- c(r, r2)

test_data <- prepare_data(polygon_shapefile = spdf,
                          covariate_rasters = cov_rasters)

Print function for disaggregation input data

Description

Function that prints the input data from the disaggregation model.

Usage

## S3 method for class 'disag_data'
print(x, ...)
## S3 method for class 'disag_data'
print(x, ...)

Arguments

`x`	Object returned from prepare_data.
`...`	Further arguments to print function.

Details

Prints the number of polyons and pixels, the number of pixels in the largest and smallest polygons and summaries of the covariates.

Print function for disaggregation fit result.

Description

Function that prints the result of the fit from the disaggregation model.

Usage

## S3 method for class 'disag_model'
print(x, ...)
## S3 method for class 'disag_model'
print(x, ...)

Arguments

`x`	Object returned from disag_model.
`...`	Further arguments to print function.

Details

Prints the negative log likelihood, model parameters and calculates metrics from in-sample performance.

Print function for disaggregation prediction

Description

Function that prints the prediction from the disaggregation model.

Usage

## S3 method for class 'disag_prediction'
print(x, ...)
## S3 method for class 'disag_prediction'
print(x, ...)

Arguments

`x`	Object returned from predict.disag_model.
`...`	Further arguments to print function.

Details

Prints the number of polyons and pixels, the number of pixels in the largest and smallest polygons and summaries of the covariates.

Summary function for disaggregation input data

Description

Function that summarizes the input data from the disaggregation model.

Usage

## S3 method for class 'disag_data'
summary(object, ...)
## S3 method for class 'disag_data'
summary(object, ...)

Arguments

`object`	Object returned from prepare_data.
`...`	Further arguments to summary function.

Details

Prints the number of polyons and pixels, the number of pixels in the largest and smallest polygons and summaries of the covariates.

Value

A list of the number of polyons, the number of covariates and summaries of the covariates.

Summary function for disaggregation fit result

Description

Function that summarises the result of the fit from the disaggregation model.

Usage

## S3 method for class 'disag_model'
summary(object, ...)
## S3 method for class 'disag_model'
summary(object, ...)

Arguments

`object`	Object returned from disag_model.
`...`	Further arguments to summary function.

Details

Prints the negative log likelihood, model parameters and calculates metrics from in-sample performance.

Value

A list of the model parameters, negative log likelihood and metrics from in-sample performance.

Summary function for disaggregation prediction

Description

Function that summarizes the prediction from the disaggregation model.

Usage

## S3 method for class 'disag_prediction'
summary(object, ...)
## S3 method for class 'disag_prediction'
summary(object, ...)

Arguments

`object`	Object returned from predict.disag_model
`...`	Further arguments to summary function.

Details

Prints the number of polyons and pixels, the number of pixels in the largest and smallest polygons and summaries of the covariates.

Value

A list of the number of polyons, the number of covariates and summaries of the covariates.

Package 'disaggregation'

Help Index

Function to fit the disaggregation model

Description

Usage

Arguments

Value

Build mesh for disaggregaton model

Description

Usage

Arguments

Details

Value

Examples

Fit the disaggregation model

Description

Usage

Arguments

Details

Value

References

Examples

Roxygen commands

Description

Usage

Get a SpatRaster of covariates from a folder containing .tif files

Description

Usage

Arguments

Value

Examples

Extract polygon id and response data into a data.frame from a sf object

Description

Usage

Arguments

Value

Examples

Function to match pixels to their corresponding polygon

Description

Usage

Arguments

Details

Value

Examples

Create the TMB model object for the disaggregation model

Description

Usage

Arguments

Details

Value

Examples

Convert results of the model ready for plotting

Description

Usage

Arguments

Value

Plot input data for disaggregation

Description

Usage

Arguments

Details

Value

Plot results of fitted model

Description

Usage

Arguments

Details

Value

Plot mean and uncertainty predictions from the disaggregation model results

Description

Usage

Arguments

Details

Value

Function to predict mean from the model result

Description

Usage

Arguments

Details

Value