Latest - September 2024
Introduction of several core API changes, which significantly alter the usage of the library at the C++ level. From this update on, support for PDEs in strong form is dropped.
core
constexpr linear algebra: (this is something you won’t see outside fdaPDE eheheh) Introduction of several built-in types to enable compile-time dense linear algebra, together with the possibility to solve dense linear systems \(Ax = b\) at compile-time using LU factorization with partial pivoting.
Under the `cexpr` namespace, you find the templates `Matrix<Scalar, Rows, Cols>` and `Vector<Scalar, Rows>`, which are the backbone for compile-time dense linear algebra. Here is a brief explanation of what this means:

```cpp
#include <fdaPDE/linear_algebra.h>
using namespace fdapde::cexpr;

int main() {
    // a 2x2 matrix of int
    static constexpr Matrix<int, 2, 2> M({1, 2, 3, 4});
    // using M, you can perform computations at compile time. Suppose you want to sum M with itself
    constexpr auto A = M + M;
    // A is an expression template encoding the action of summing M with itself. The point is that
    // the matrix addition can be performed by the compiler during compilation. Don't you believe it?
    // Let's query the compiler by writing a static_assert which for sure is falsified
    static_assert(A(0, 0) == 0);
    return 0;
}
```
We know `A(0, 0) == 2`, and indeed, if we compile the program above, we expect the compilation to fail:

```
> g++ -o test test.cpp -I.. -O2 -std=c++20
test.cpp: In function ‘int main()’:
test.cpp:57:27: error: static assertion failed
   57 |     static_assert(A(0, 0) == 0);
      |                   ~~~~~~~~^~~~
test.cpp:57:27: note: the comparison reduces to ‘(2 == 0)’
                                                  ^ here the value of A(0, 0) produced by the compiler
```
If this still seems useless to you, check the produced assembly code (aka machine code) to see how much the compiler is able to optimize the operation.
```cpp
#include <fdaPDE/linear_algebra.h>
using namespace fdapde::cexpr;

int main() {
    constexpr Matrix<int, 2, 2> M({1, 2, 3, 4});
    constexpr auto A = M + M;
    return A(0, 0);
}
```
```
Dump of assembler code for function main():
   0x00000000000010b0 <+0>:  mov    $0x2,%eax   // return statement (which indeed returns 0x2 = 2)
   0x00000000000010b5 <+5>:  ret
End of assembler dump.
```
Just one move instruction! The compiler was able to completely optimize out the matrix addition and to directly return the result (0x2, which is `A(0, 0) == 2`).

Note
It must be pointed out that Eigen is also able to reach the same assembly code with the `-DNDEBUG` flag activated. For instance, this Eigen code

```cpp
#include <Eigen/Dense>

int main() {
    Eigen::Matrix<double, 2, 2> M;
    M << 1, 2, 3, 4;
    auto A = M + M;
    return A(0, 0);
}
```
produces the same assembly code as above (the compiler is smart enough to completely optimize the matrix addition). This is true for any arithmetic operation between static-sized Eigen types. Nonetheless, Eigen types cannot be used in a constexpr context, and this poses several limitations, for instance, when evaluating mathematical expressions at compile-time (see next).
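To make the limitation concrete, here is a minimal sketch (the commented line is the failing one; this assumes the Eigen 3.4 release series, whose dense types do not provide constexpr constructors):

```cpp
#include <Eigen/Dense>
#include <fdaPDE/linear_algebra.h>

int main() {
    // fine: fdaPDE cexpr types can live in a constexpr context
    constexpr fdapde::cexpr::Matrix<int, 2, 2> M({1, 2, 3, 4});
    static_assert(M(1, 1) == 4);   // evaluated during compilation

    // does not compile: Eigen dense types are not literal types,
    // hence they cannot be bound to a constexpr variable
    // constexpr Eigen::Matrix<int, 2, 2> E {};
    return 0;
}
```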
Note
Another limitation of using Eigen (for compile-time linear algebra) is the impossibility to store the result of a computation performed at compile-time, and to later use it at compile-time.
```cpp
#include <fdaPDE/linear_algebra.h>
using namespace fdapde;

// since C++20, lambdas have a constexpr call operator. We can define a linear system Ax = b inside a
// lambda and get it solved at compile time. The solution can be bound to a (static) constexpr variable
constexpr cexpr::Matrix<double, 3, 1> x = []() {
    cexpr::Matrix<double, 3, 3> M;
    for (int i = 0; i < 3; ++i) {
        for (int j = 0; j < 3; ++j) { M(i, j) = 3 * i + j + 1; }
    }
    cexpr::PartialPivLU<cexpr::Matrix<double, 3, 3>> invM;
    invM.compute(M);
    cexpr::Matrix<double, 3, 1> b = invM.solve(cexpr::Matrix<double, 3, 1>(1, 1, 1));
    return b;   // solution is [-3 5 -2]
}();

int main() { return x[0] + x[0]; }
```

```
// generated assembly
Dump of assembler code for function main():
   0x00000000000010d0 <+0>:  mov    $0xfffffffa,%eax   // returns -6
   0x00000000000010d5 <+5>:  ret
End of assembler dump.
```
Again, the compiler was able to completely optimize out all the computations and just return what was asked. This is not possible with Eigen, as the nearest thing we can do without constexpr is the following:
```cpp
#include <Eigen/Dense>

// solve the linear system and bind the solution to a (static) variable
Eigen::Matrix<double, 3, 1> x = []() {
    Eigen::Matrix<double, 3, 3> M;
    for (int i = 0; i < 3; ++i) {
        for (int j = 0; j < 3; ++j) { M(i, j) = 3 * i + j + 1; }
    }
    Eigen::PartialPivLU<Eigen::Matrix<double, 3, 3>> invM;
    invM.compute(M);
    Eigen::Matrix<double, 3, 1> b = invM.solve(Eigen::Matrix<double, 3, 1>(1, 1, 1));
    return b;
}();

int main() { return x[0] + x[0]; }
```

```
// generated assembly
Dump of assembler code for function main():
   0x00000000000010d0 <+0>:   movsd  0x30a8(%rip),%xmm0   // load data from global variable x
   0x00000000000010d8 <+8>:   addsd  %xmm0,%xmm0          // execute addition x[0] + x[0]
   0x00000000000010dc <+12>:  cvttsd2si %xmm0,%eax        // truncate result from double to integer and return
   0x00000000000010e0 <+16>:  ret
End of assembler dump.
```
While the compiler was still able to solve the linear system at compile-time (since no function call is executed; this is lambda expression black magic), it is not able to directly return the result, as it is not capable of inspecting an Eigen type during compilation. In my opinion, there is no point in having code executed at run-time for a computation which is exactly known at compile-time. Therefore, fdaPDE was able to produce better optimized code in this situation.
It is not difficult to start noticing the advantages of constexpr computations (if you are in the position to exploit them).
```cpp
// assume you just want to compute the sum of the elements of x
int main() {
    double sum = 0;
    for (int i = 0; i < x.rows(); ++i) { sum += x[i]; }   // sum is 0
    return sum;
}
```

```
// assembly generated by fdaPDE
Dump of assembler code for function main():
   0x0000000000001110 <+0>:  xor    %eax,%eax   // returns 0 (a nice machine trick: xor-ing a register with itself zeroes it)
   0x0000000000001112 <+2>:  ret
End of assembler dump.

// assembly generated by Eigen
Dump of assembler code for function main():
   0x0000000000001110 <+0>:   pxor   %xmm0,%xmm0          // zero xmm0 register
   0x0000000000001114 <+4>:   addsd  0x3064(%rip),%xmm0   // load x[0] in xmm0
   0x000000000000111c <+12>:  lea    0x3085(%rip),%rdx
   0x0000000000001123 <+19>:  lea    -0x20(%rdx),%rax
   0x0000000000001127 <+23>:  addsd  (%rax),%xmm0         // execute for loop <---------------------
   0x000000000000112b <+27>:  add    $0x10,%rax           //                                       |
   0x000000000000112f <+31>:  addsd  -0x8(%rax),%xmm0     // sum += x[i]                           |
   0x0000000000001134 <+36>:  cmp    %rdx,%rax            //                                       |
   0x0000000000001137 <+39>:  jne    0x1127 <main()+23>   // ---------------------------------------
   0x0000000000001139 <+41>:  cvttsd2si %xmm0,%eax        // return computed sum
   0x000000000000113d <+45>:  ret
End of assembler dump.
```
While fdaPDE was able to completely avoid any for loop at run-time (as it perfectly knows how to compute the loop at compile-time), Eigen cannot, and must execute code (and therefore waste time) to produce the result. You will further notice the advantages of such data types when they appear in much more involved mathematical expressions.
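For instance, expressions can be composed and still be folded entirely during compilation. A small sketch, using only the operations shown above and assuming (as is standard for expression-template libraries) that sub-expressions compose:

```cpp
#include <fdaPDE/linear_algebra.h>
using namespace fdapde::cexpr;

int main() {
    constexpr Matrix<int, 2, 2> M({1, 2, 3, 4});
    // the whole composed expression is evaluated by the compiler, no run-time work is left
    constexpr auto B = M + M + M;
    static_assert(B(1, 1) == 12);   // 3 * M(1, 1) = 3 * 4 = 12
    return 0;
}
```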
The examples above give a taste of the API exposed for constexpr dense linear algebra at the time of this update.
simplex API: introduction of a simplex abstraction of order `Order` embedded in a space of dimension `EmbedDim`, which exposes the methods below (`NodeType` denotes the type used to represent a point; a usage sketch follows the table).

| Method | Description |
| ------ | ----------- |
| `measure()` | Measure of the simplex. |
| `bounding_box()` | Returns the smallest rectangle containing the simplex. |
| `barycenter()` | Returns the midpoint (center of mass) of the simplex. |
| `barycentric_coords(const NodeType& p)` | Given a point \(p\), rewrites it in the barycentric coordinate system of the simplex. |
| `circumcenter()` | Returns the center of the (hyper)-sphere passing through the vertices of the simplex. |
| `circumradius()` | Returns the radius of the (hyper)-sphere passing through the vertices of the simplex. |
| `contains(const NodeType& p)` | Given a point \(p\), determines whether the point is inside, on a boundary cell of, on a vertex of, or outside the simplex. |
| `supporting_plane()` | Returns the `HyperPlane<Order, EmbedDim>` passing through the simplex. |
| `normal()` | Returns the normal direction to the simplex (meaningful only if `Order != EmbedDim`, otherwise returns zero). |
| `nearest(const NodeType& p)` | Given a free point \(p\), finds the best approximation of \(p\) in the simplex (i.e., the point in the simplex nearest to \(p\)). |
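As a usage sketch (hypothetical: the header name and the construction of the simplex from its vertex coordinates are assumed for illustration and may differ from the actual API):

```cpp
#include <fdaPDE/geometry.h>   // assumed header name, for illustration only
#include <Eigen/Dense>
using namespace fdapde;

int main() {
    // hypothetical: build a 2-simplex (a triangle) embedded in 3D space from the 3x3 matrix
    // of its vertex coordinates, one vertex per column; the actual constructor may differ
    Eigen::Matrix<double, 3, 3> V;
    V << 0, 1, 0,   // x coordinates
         0, 0, 1,   // y coordinates
         0, 0, 0;   // z coordinates
    Simplex<2, 3> t(V);

    double area = t.measure();                                  // area of the triangle
    auto normal = t.normal();                                   // meaningful, since Order (2) != EmbedDim (3)
    auto close  = t.nearest(Eigen::Vector3d(0.25, 0.25, 1.0));  // best approximation in the triangle of a free point
    return area > 0 ? 0 : 1;
}
```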
field expressions revised: since writing mathematical expressions is now a core part of the library API, the whole expression template mechanism has been reimplemented from scratch.
Tip
One notable addition to the fields API is constexpr-enabled expressions, i.e., under some circumstances, expressions can be evaluated at compile-time.
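To give an idea of the mechanism, here is a self-contained toy sketch of a constexpr-enabled expression template for scalar fields (this is not fdaPDE's actual fields API, just an illustration of the underlying technique): an expression like `f + g` computes nothing by itself, it only builds a lightweight object encoding the operation, which is then evaluated, possibly at compile-time, when invoked at a point.

```cpp
#include <concepts>

struct ExprBase { };   // tag identifying field expressions
template <typename T> concept FieldExpr = std::derived_from<T, ExprBase>;

// leaf node: wraps a callable representing a scalar field f : R -> R
template <typename F> struct Field : ExprBase {
    F f;
    constexpr explicit Field(F f_) : f(f_) { }
    constexpr double operator()(double x) const { return f(x); }
};

// internal node: encodes the pointwise sum of two sub-expressions. Nothing is computed at
// construction: evaluation is triggered only by invoking the expression at a point
template <FieldExpr L, FieldExpr R> struct AddExpr : ExprBase {
    L lhs; R rhs;
    constexpr AddExpr(L l, R r) : lhs(l), rhs(r) { }
    constexpr double operator()(double x) const { return lhs(x) + rhs(x); }
};
template <FieldExpr L, FieldExpr R> constexpr AddExpr<L, R> operator+(L l, R r) { return {l, r}; }

int main() {
    constexpr Field f([](double x) { return 2 * x; });
    constexpr Field g([](double x) { return x * x; });
    constexpr auto h = f + g;        // h encodes "f + g", no computation performed yet
    static_assert(h(3.0) == 15.0);   // the whole expression is evaluated at compile-time
    return 0;
}
```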
new finite element module:
shapefile reader: