Welcome to the fdaPDE 2.0’s alpha-testing!¶
You can check the status of all open and closed issues, as well as known bugs, here
Note
In the following, we will refer with fdaPDE-CRAN to the version 1.1-16 of the library currently available on CRAN (link). fdaPDE-2.0 refers to the new version which will replace the currently official one.
This page reports the guidelines to follow during the alpha-testing phase of fdaPDE-2.0. It also contains a not official draft of the R package documentation. Moreover, this document will be kept updated for the whole alpha-testing phase. Check the Changelog for the latest updates.
Important
fdaPDE-2.0 is currently in active development. You should frequently reinstall the package. Moreover, ignore eventual warnings raised at compile and installation time. They will be fixed, if possible, before the release on CRAN. Interface changes can happen for the whole testing period, up to the official release on CRAN.
Installation¶
fdaPDE-2.0 needs the following dependencies:
a C++17 compiler
Rcpp
andRcppEigen
(we plan to drop this dependency before the official release on CRAN)
Then, to install the library on your system, under the name fdaPDE2
, you can either:
use the devtools package. From the R console, execute
devtools::install_github("fdaPDE/fdaPDE-R", ref="stable")
clone the fdaPDE-R repository and install. From a (linux) terminal, execute
git clone --recurse-submodules -b stable git@github.com:fdaPDE/fdaPDE-R.git cd path/to/fdaPDE-R
then, install the package from the R console with
install.packages(".", type="source", repos=NULL)
Changelog¶
Bug reports¶
If you find a bug, please notify it by opening a draft pull request here. A bug report must contain a MWE (Minimal Working Example) which reproduces the error. Select also a severity level for the error.
Library interface¶
This is just an informal presentation of the R interface. In particular, it is not an API specification.
Functional Space¶
A functional space represents the concept of a functional basis over a given domain. Currently only Lagrangian basis are supported (aka finite element basis functions).
library(fdaPDE2)
data("unit_square", package = "fdaPDE2")
unit_square <- Mesh(unit_square)
## the functional space of finite element functions of order 1, over the unit square
Vh <- FunctionSpace(unit_square, fe_order = 1)
basis <- Vh$get_basis() ## recover the (lagrangian) basis system of Vh
## evaluate the basis system on a given set of locations (compute Psi matrix)
locations <- unit_square$nodes
Psi <- basis$eval(type = "pointwise", locations)
## integrate a function (expressed as basis expansion on Vh) over the domain
f <- function(p) { p[,1]^2 + p[,2]^2 } ## x^2 + y^2
Vh$integrate(f)
PDEs¶
The new version of the library let you explicitly write differential operators in strong form. If you are going to define a statistical model, what reported below is not of any interest if you decide to default to a standard laplacian penalty.
library(fdaPDE2)
data("unit_square", package = "fdaPDE2")
unit_square <- Mesh(unit_square)
## the functional space of finite element functions of order 1, over the unit square
Vh <- FunctionSpace(unit_square, fe_order = 1)
f <- Function(Vh) ## a generic element of the space Vh
## compose the differential operator (in strong form)
Lf <- -laplace(f) + dot(c(1,1), grad(f)) ## a costant coefficients advection-diffusion problem
## define the forcing term
u <- function(points) { return(rep(1, times = nrow(points))) }
## create your penalty
penalty <- PDE(Lf, u)
Supported operators are
operator |
code |
note |
---|---|---|
laplacian |
|
observe that expressions are signed, |
divergence |
|
the diffusion tensor K can be either a constant matrix, or a function returning a matrix (space-varying coefficient) |
transport |
|
the transport coefficient b can be either a constant vector or a vector field (space-varying coefficient) |
reaction |
|
the reaction coefficient c can be either a constant or a scalar field (space-varying coefficient) |
|
specifies that the problem is time dependent (parabolic PDE) |
## general linear second order parabolic operator
Lf <- dt(f) - div(K * grad(f)) + dot(b, grad(f)) + c * f
More examples can be found on the femR documentation or on the test scripts.
SRPDE : Spatial Regression models¶
Here the interface for a general spatial regression model:
library(fdaPDE2)
data("unit_square", package = "fdaPDE2")
unit_square <- Mesh(unit_square)
data_frame <- ## obtain your data in some way...
## currently, the only requirement is data_frame to be a data.frame object, e.g.,
## y x1 x2
## 1 -0.04044303 0.1402058 0.00000000
## 2 0.15079619 1.1989599 0.03447593
## 3 0.02391597 -2.3299685 0.06891086
## 4 0.38927632 0.5709451 0.10326387
## 5 0.39417457 2.7482761 0.13749409
## 6 0.33297548 1.7080400 0.17156085
## a model is first described in an "abstract" way, think for instance to the continuous
## functional J(f, \beta) we minimize when solving a smoothing problem
## a nonparametric spatial regression model
model <- SRPDE(y ~ f, domain = unit_square, data = data_frame, lambda = 1e-6)
## this will inject in the current environment a Function object named f, representing the
## unknown spatial field and defined on a FunctionSpace(unit_square, fe_order = 1)
## the name of the spatial field supplied in formula can be any
model$fit() ## fits the model
## we can describe a semi-parametric spatial regression model as
model <- SRPDE(y ~ x1 + f, domain = unit_square, data = data_frame, lambda = 1e-6)
model$fit()
The following is an equivalent, but more general, way to describe the above model
## define the differential penalty
f <- Function(FunctionSpace(unit_square, fe_order = 1))
Lf <- -laplace(f) ## simple laplacian penalty
u <- function(points) { return(rep(0, times = nrow(points))) }
## supply the penalty to the model and fit
model <- SRPDE(y ~ f, penalty = PDE(Lf, u), data = data_frame, lambda = 1e-6)
model$fit()
SRPDE
exposes a family
parameter whose default is set to gaussian
, and which makes the model to behave as a linear spatial regression model, as described, for instance in Sangalli, L.M. (2021), Spatial regression with partial differential equation regularization, International Statistical Review. Possible values for the family
parameters are
family |
note |
---|---|
|
generalized spatial regression model, solved via FPIRLS, as detailed in Wilhelm, M., Sangalli, L.M. (2016), Generalized Spatial Regression with Differential Regularization, Journal of Statistical Computation and Simulation |
|
quantile spatial regression model, solved via FPIRLS, as described in De Sanctis, M., Di Battista, I., Spatial Quantile regression with Partial Differential Equation Regularization, PACS report |
STRPDE : Spatio-Temporal Regression models¶
Coming soon
Model fit customization¶
When we write something like
model <- SRPDE(y ~ f, penalty = PDE(Lf, u), data = data_frame, lambda = 1e-6)
we are describing a model without fixing any computational detail regarding its resolution. Specifically, we are defining the estimation functional
We can customize the way a model is fit using the fit
method. The code below shows how to select the smoothing parameter via GCV minimization, using a grid approach.
library(fdaPDE2)
data("unit_square", package = "fdaPDE2")
unit_square <- Mesh(unit_square)
data_frame <- ## obtain your data in some way...
model <- SRPDE(y ~ f, domain = unit_square, data = data_frame) ## do not fix any lambda...
## ... because you might want to select it via GCV minimization!
lambda_grid = 10^seq(-4, -3, by = 0.1)
model$fit(
lambda = gcv(optimizer = "grid", lambda = lambda_grid)
)
See GCV for more details on the use of the gcv()
method. The idea is extended to any other type of models. The snippet below shows how to customize the behaviour of the Functional Penalized Iterative Reweighted Least Square (FPIRLS) method used inside a generalized regression model
library(fdaPDE2)
data("unit_square", package = "fdaPDE2")
unit_square <- Mesh(unit_square)
data_frame <- ## obtain your data in some way...
model <- SRPDE(y ~ f, domain = unit_square, data = data_frame, family = "poisson")
lambda_grid = 10^seq(-4, -3, by = 0.1)
model$fit(
lambda = gcv(optimizer = "grid", lambda = lambda_grid),
fprils_params = list( ## customize FPIRLS execution
max_iterations = 100,
tolerance = 1e-6
)
)
## you are not forced to set all the parameters, if you do not set some (or all) of them,
## there will always be a default
Important
More on this must be explored and tested. Moreover any parameter should always have a default, so that model$fit()
is always well defined. Check the changelog to see updates in this direction.
GCV¶
This sections shows the API provided by the gcv()
method, which can be used to tune the smoothing parameter in a regression model.
## stochastic approximation of degrees of freedom, fast but approximate
model$fit(
lambda = gcv(
edf_computation = "stochastic",
seed = 143547654, ## set seed in internal stochastic engine (for reproducibility)
n_mc_samples = 500, ## number of mc samples in stochastic algorithm
optimizer = "...",
... ## additional arguments forwarded to the optimizer
)
## exact approximation of degrees of freedom, slow but exact
model$fit(
lambda = gcv(
edf_computation = "exact",
optimizer = "...",
... ## additional arguments forwarded to the optimizer
)
Info
In case edf_computation = "stochastic"
, the following defaults are available:
* seed = NULL
forces a random initialization for the stochastic algorithm
* n_mc_samples = NULL
sets the number of monte carlo realizations to 100
The GCV objective can be optimizer using different methods, reported in the table below
optimizer |
note |
---|---|
|
optimization on a fixed and user-defined grid of values. Currently the only possibility for space-time models. Requires
|
|
Newton optimization. This is an iterative optimization method. Requires
|
|
Gradient descent optimization. This is an iterative optimization method. Warning This optimization method has never been explored in the context of GCV minimization. Requires
|
|
The method of Broyden–Fletcher–Goldfarb–Shanno for unconstrained nonlinear optimization. This is an iterative optimization method. Warning This optimization method has never been explored in the context of GCV minimization. Requires
|