Welcome to the fdaPDE 2.0’s alpha-testing!

You can check the status of all open and closed issues, as well as known bugs, here

Note

In the following, we will refer with fdaPDE-CRAN to the version 1.1-16 of the library currently available on CRAN (link). fdaPDE-2.0 refers to the new version which will replace the currently official one.

This page reports the guidelines to follow during the alpha-testing phase of fdaPDE-2.0. It also contains a not official draft of the R package documentation. Moreover, this document will be kept updated for the whole alpha-testing phase. Check the Changelog for the latest updates.

Important

fdaPDE-2.0 is currently in active development. You should frequently reinstall the package. Moreover, ignore eventual warnings raised at compile and installation time. They will be fixed, if possible, before the release on CRAN. Interface changes can happen for the whole testing period, up to the official release on CRAN.

Installation

fdaPDE-2.0 needs the following dependencies:

  • a C++17 compiler

  • Rcpp and RcppEigen (we plan to drop this dependency before the official release on CRAN)

Then, to install the library on your system, under the name fdaPDE2, you can either:

  • use the devtools package. From the R console, execute

    devtools::install_github("fdaPDE/fdaPDE-R", ref="stable")
    
  • clone the fdaPDE-R repository and install. From a (linux) terminal, execute

    git clone --recurse-submodules -b stable git@github.com:fdaPDE/fdaPDE-R.git
    cd path/to/fdaPDE-R
    

    then, install the package from the R console with

    install.packages(".", type="source", repos=NULL)
    

Changelog

30/11/23: release of the alpha-testing version.
06/12/23: code tested against the following compilers: gcc, clang, apple-clang (solves fdaPDE-R#3, package should install on macos systems).

Bug reports

If you find a bug, please notify it by opening a draft pull request here. A bug report must contain a MWE (Minimal Working Example) which reproduces the error. Select also a severity level for the error.

Library interface

This is just an informal presentation of the R interface. In particular, it is not an API specification.

Functional Space

A functional space represents the concept of a functional basis over a given domain. Currently only Lagrangian basis are supported (aka finite element basis functions).

library(fdaPDE2)
data("unit_square", package = "fdaPDE2")
unit_square <- Mesh(unit_square)

## the functional space of finite element functions of order 1, over the unit square
Vh <- FunctionSpace(unit_square, fe_order = 1)
basis <- Vh$get_basis() ## recover the (lagrangian) basis system of Vh

## evaluate the basis system on a given set of locations (compute Psi matrix)
locations <- unit_square$nodes
Psi <- basis$eval(type = "pointwise", locations)

## integrate a function (expressed as basis expansion on Vh) over the domain
f <- function(p) { p[,1]^2 + p[,2]^2 } ## x^2 + y^2
Vh$integrate(f)

PDEs

The new version of the library let you explicitly write differential operators in strong form. If you are going to define a statistical model, what reported below is not of any interest if you decide to default to a standard laplacian penalty.

library(fdaPDE2)
data("unit_square", package = "fdaPDE2")
unit_square <- Mesh(unit_square)

## the functional space of finite element functions of order 1, over the unit square
Vh <- FunctionSpace(unit_square, fe_order = 1)
f  <- Function(Vh) ## a generic element of the space Vh

## compose the differential operator (in strong form)
Lf <- -laplace(f) + dot(c(1,1), grad(f)) ## a costant coefficients advection-diffusion problem
## define the forcing term
u <- function(points) { return(rep(1, times = nrow(points))) }

## create your penalty
penalty <- PDE(Lf, u)

Supported operators are

operator

code

note

laplacian

laplacian(f)

observe that expressions are signed, laplacian(f) is differnt from -laplacian(f)

divergence

div(K*grad(f))

the diffusion tensor K can be either a constant matrix, or a function returning a matrix (space-varying coefficient)

transport

dot(b,grad(f))

the transport coefficient b can be either a constant vector or a vector field (space-varying coefficient)

reaction

c*f

the reaction coefficient c can be either a constant or a scalar field (space-varying coefficient)

dt(f)

specifies that the problem is time dependent (parabolic PDE)

## general linear second order parabolic operator
Lf <- dt(f) - div(K * grad(f)) + dot(b, grad(f)) + c * f

More examples can be found on the femR documentation or on the test scripts.

SRPDE : Spatial Regression models

Here the interface for a general spatial regression model:

library(fdaPDE2)
data("unit_square", package = "fdaPDE2")
unit_square <- Mesh(unit_square)

data_frame <- ## obtain your data in some way...
## currently, the only requirement is data_frame to be a data.frame object, e.g.,

##            y         x1         x2
## 1 -0.04044303  0.1402058 0.00000000
## 2  0.15079619  1.1989599 0.03447593
## 3  0.02391597 -2.3299685 0.06891086
## 4  0.38927632  0.5709451 0.10326387
## 5  0.39417457  2.7482761 0.13749409
## 6  0.33297548  1.7080400 0.17156085

## a model is first described in an "abstract" way, think for instance to the continuous
## functional J(f, \beta) we minimize when solving a smoothing problem

## a nonparametric spatial regression model
model <- SRPDE(y ~ f, domain = unit_square, data = data_frame, lambda = 1e-6)

## this will inject in the current environment a Function object named f, representing the
## unknown spatial field and defined on a FunctionSpace(unit_square, fe_order = 1)
## the name of the spatial field supplied in formula can be any

model$fit() ## fits the model

## we can describe a semi-parametric spatial regression model as
model <- SRPDE(y ~ x1 + f, domain = unit_square, data = data_frame, lambda = 1e-6)
model$fit()

The following is an equivalent, but more general, way to describe the above model

## define the differential penalty
f <- Function(FunctionSpace(unit_square, fe_order = 1))
Lf <- -laplace(f) ## simple laplacian penalty
u <- function(points) { return(rep(0, times = nrow(points))) }

## supply the penalty to the model and fit
model <- SRPDE(y ~ f, penalty = PDE(Lf, u), data = data_frame, lambda = 1e-6)
model$fit()

SRPDE exposes a family parameter whose default is set to gaussian, and which makes the model to behave as a linear spatial regression model, as described, for instance in Sangalli, L.M. (2021), Spatial regression with partial differential equation regularization, International Statistical Review. Possible values for the family parameters are

family

note

poisson, gamma, bernulli, exponential

generalized spatial regression model, solved via FPIRLS, as detailed in Wilhelm, M., Sangalli, L.M. (2016), Generalized Spatial Regression with Differential Regularization, Journal of Statistical Computation and Simulation

quantile

quantile spatial regression model, solved via FPIRLS, as described in De Sanctis, M., Di Battista, I., Spatial Quantile regression with Partial Differential Equation Regularization, PACS report

STRPDE : Spatio-Temporal Regression models

Coming soon

Model fit customization

When we write something like

model <- SRPDE(y ~ f, penalty = PDE(Lf, u), data = data_frame, lambda = 1e-6)

we are describing a model without fixing any computational detail regarding its resolution. Specifically, we are defining the estimation functional

\[J(\beta, f) = \sum_{i=1}^n (y_i - f(\boldsymbol{p}_i))^2 + \lambda_{\mathcal{D}} \int_{\mathcal{D}} (Lf - u)^2 d\mathcal{D}.\]

We can customize the way a model is fit using the fit method. The code below shows how to select the smoothing parameter via GCV minimization, using a grid approach.

library(fdaPDE2)
data("unit_square", package = "fdaPDE2")
unit_square <- Mesh(unit_square)

data_frame <- ## obtain your data in some way...
model <- SRPDE(y ~ f, domain = unit_square, data = data_frame) ## do not fix any lambda...

## ... because you might want to select it via GCV minimization!
lambda_grid = 10^seq(-4, -3, by = 0.1)
model$fit(
   lambda = gcv(optimizer = "grid", lambda = lambda_grid)
)

See GCV for more details on the use of the gcv() method. The idea is extended to any other type of models. The snippet below shows how to customize the behaviour of the Functional Penalized Iterative Reweighted Least Square (FPIRLS) method used inside a generalized regression model

library(fdaPDE2)
data("unit_square", package = "fdaPDE2")
unit_square <- Mesh(unit_square)

data_frame <- ## obtain your data in some way...
model <- SRPDE(y ~ f, domain = unit_square, data = data_frame, family = "poisson")

lambda_grid = 10^seq(-4, -3, by = 0.1)
model$fit(
   lambda = gcv(optimizer = "grid", lambda = lambda_grid),
   fprils_params = list( ## customize FPIRLS execution
      max_iterations = 100,
      tolerance = 1e-6
   )
)

## you are not forced to set all the parameters, if you do not set some (or all) of them,
## there will always be a default

Important

More on this must be explored and tested. Moreover any parameter should always have a default, so that model$fit() is always well defined. Check the changelog to see updates in this direction.

GCV

This sections shows the API provided by the gcv() method, which can be used to tune the smoothing parameter in a regression model.

## stochastic approximation of degrees of freedom, fast but approximate
model$fit(
   lambda = gcv(
      edf_computation = "stochastic",
      seed = 143547654,    ## set seed in internal stochastic engine (for reproducibility)
      n_mc_samples = 500,  ## number of mc samples in stochastic algorithm
      optimizer = "...",
      ... ## additional arguments forwarded to the optimizer
)

## exact approximation of degrees of freedom, slow but exact
model$fit(
   lambda = gcv(
      edf_computation = "exact",
      optimizer = "...",
      ... ## additional arguments forwarded to the optimizer
)

Info

In case edf_computation = "stochastic", the following defaults are available: * seed = NULL forces a random initialization for the stochastic algorithm * n_mc_samples = NULL sets the number of monte carlo realizations to 100

The GCV objective can be optimizer using different methods, reported in the table below

optimizer

note

grid

optimization on a fixed and user-defined grid of values. Currently the only possibility for space-time models. Requires

  • lambda: vector of smoothing parameters to explore

newton

Newton optimization. This is an iterative optimization method. Requires

  • lambda: starting point of the iterative method

  • exact_derivative: a boolean flag which, if set to false, enables the approximate computation of the GCV derivative via central finite differences (exact_derivatives = TRUE currently not supported)

  • tolerance : tolearance on the error between consecutive iterations to stop the optimization

  • step: step value used in the Newton update (adaptve steps backtracking, wolfe currently not supported)

gradient_descent

Gradient descent optimization. This is an iterative optimization method.

Warning

This optimization method has never been explored in the context of GCV minimization.

Requires

  • lambda: starting point of the iterative method

  • exact_derivative: a boolean flag which, if set to false, enables the approximate computation of the GCV derivative via central finite differences (exact_derivatives = TRUE currently not supported)

  • tolerance : tolearance on the error between consecutive iterations to stop the optimization

  • step: step value used in the Newton update (adaptve steps backtracking, wolfe currently not supported)

bfgs

The method of Broyden–Fletcher–Goldfarb–Shanno for unconstrained nonlinear optimization. This is an iterative optimization method.

Warning

This optimization method has never been explored in the context of GCV minimization.

Requires

  • lambda: starting point of the iterative method

  • exact_derivative: a boolean flag which, if set to false, enables the approximate computation of the GCV derivative via central finite differences (exact_derivatives = TRUE currently not supported)

  • tolerance : tolearance on the error between consecutive iterations to stop the optimization

  • step: step value used in the Newton update (adaptve steps backtracking, wolfe currently not supported)