Weighted Residual Methods

Gypsophila

There is a significant difference among the various spectral methods, which lies in the “error distribution” principle they employ. Specifically, this refers to how each method minimizes the residual function $R(x; a_0, a_1, \dots, a_N)$, where

$$R(x; a_0, a_1, \dots, a_N) = \mathcal{L}u_N(x) - f(x),$$

with $\mathcal{L}$ being the differential or integral operator of the equation $\mathcal{L}u = f$, and $u_N$ representing the approximation to the solution $u$, given by

$$u_N(x) = \sum_{n=0}^{N} a_n \phi_n(x),$$

where $\{\phi_n(x)\}_{n=0}^{N}$ are some chosen basis functions.

The “Method of Mean Weighted Residuals (MWR)” refers to a class of methods that require the residual to be zero in some averaged sense. If the chosen basis functions already satisfy the boundary conditions, MWR determines the unknown coefficients $a_0, a_1, \dots, a_N$ by imposing the following conditions:

$$\left(\psi_i,\, R(x; a_0, a_1, \dots, a_N)\right) = 0, \qquad i = 0, 1, \dots, N,$$

where $\{\psi_i\}_{i=0}^{N}$ is a set of appropriate test functions and $\{x_i\}_{i=0}^{N}$ are the nodes. The inner product $(\cdot,\cdot)$ is defined by some non-negative weight function $\omega(x)$:

$$(u, v) = \int_a^b u(x)\, v(x)\, \omega(x)\, \mathrm{d}x.$$

Different choices of test functions lead to different methods. Below are several common weighted residual methods along with their corresponding test functions:

  1. Pseudospectral Methods: $\psi_i(x) = \delta(x - x_i)$, where $\delta(\cdot)$ is the Dirac delta function and $\{x_i\}$ are the nodes;
  2. Method of Moments: $\psi_i(x) = x^i$;
  3. Least Squares: $\psi_i(x) = \mathcal{L}\phi_i(x)$;
  4. Galerkin Methods: $\psi_i(x) = \phi_i(x)$.

These methods each have distinct characteristics, with the most important and commonly used being the pseudospectral methods and Galerkin methods. The following sections provide a brief overview of these approaches. This article restricts the discussion to the one-dimensional case, but the ideas can be easily extended to higher dimensions.

Pseudospectral Methods

Pseudospectral methods, also known as collocation spectral methods or the method of selected points, choose $\psi_i(x) = \delta(x - x_i)$ as the test functions. In this case, the corresponding constraint equations that the residual must satisfy are:

$$R(x_i; a_0, a_1, \dots, a_N) = 0, \qquad i = 0, 1, \dots, N,$$

where $\{x_i\}_{i=0}^{N}$ is referred to as the collocation point set of the method. In fact, the constraints above amount to requiring that

$$\mathcal{L}u_N(x_i) = f(x_i), \qquad i = 0, 1, \dots, N.$$

This is why pseudospectral methods are also called the method of selected points: they force the approximate equation to hold exactly at the collocation points. In particular, if the basis functions satisfy $\phi_j(x_i) = \delta_{ij}$, such as the Lagrange basis functions or cardinal functions, then the constraints involve the nodal values directly, since $a_i = u_N(x_i)$.

The most important idea behind pseudospectral methods is that the series coefficients $\{a_n\}_{n=0}^{N}$ and the values of the function at the collocation points $\{u_N(x_i)\}_{i=0}^{N}$ are equivalent and co-equal representations of the pseudospectral approximation to $u$: given the coefficients of the approximating series, we can certainly evaluate the function at the collocation points to get the values $u_N(x_i)$. It is equally true, however, that given the values of the function at the collocation points, we have all the information needed to determine the coefficients of the approximating series by applying the discrete inner product, i.e., by interpolation.
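To make this equivalence concrete, here is a minimal numpy sketch; the Chebyshev basis, the Chebyshev–Gauss–Lobatto nodes, and the sample function are illustrative choices, not part of the discussion above. The nodal values determine the coefficients through an interpolation (Vandermonde-type) system, and the coefficients reproduce the nodal values by evaluation.

```python
import numpy as np
from numpy.polynomial import chebyshev as C

n = 16
x = np.cos(np.pi * np.arange(n + 1) / n)      # Chebyshev-Gauss-Lobatto nodes
u = np.exp(x) * np.sin(5 * x)                 # nodal values of a sample function

# grid values -> series coefficients (interpolation)
a = np.linalg.solve(C.chebvander(x, n), u)

# series coefficients -> grid values (evaluation)
u_back = C.chebval(x, a)
print(np.max(np.abs(u_back - u)))             # agrees to round-off
```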

Because of this equivalence, pseudospectral methods may use either the series coefficients or the grid values as the unknowns. For example, consider the differential equation

$$\mathcal{L}u(x) = f(x)$$

with some boundary conditions. There are two ways to solve the problem:

  1. Construct the value equation $L\mathbf{u} = \mathbf{f}$, where $\mathbf{u}$ and $\mathbf{f}$ are the vectors of the values of $u_N$ and $f$ at the nodes $\{x_i\}_{i=0}^{N}$.
  2. Construct the coefficient equation $A\mathbf{a} = \mathbf{b}$, where $\mathbf{a} = (a_0, a_1, \dots, a_N)^T$ are the coefficients of the approximating series, and $\mathbf{b} = (b_0, b_1, \dots, b_N)^T$ are the coefficients of $f$ expanded in the same basis functions $\{\phi_n\}$, that is, $f(x) \approx \sum_{n=0}^{N} b_n \phi_n(x)$.

The first approach is usually called the value approach, in which the basis functions are chosen to be the Lagrange basis functions, implemented by invoking the barycentric Lagrange interpolation introduced in the following section. Cardinal functions can also be used in this approach; they are not as numerically stable as the Lagrange basis functions, but are useful for theoretical analysis. The second approach is called the coefficient approach, in which the basis functions are chosen to be Chebyshev polynomials, Legendre polynomials, or more general families such as ultraspherical [4] and Jacobi polynomials (see this note for more details).
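As an illustration of the coefficient approach (combined with boundary bordering for the boundary conditions), the following sketch solves a model problem with Chebyshev coefficients as the unknowns. The model problem $u'' = f$ on $[-1, 1]$ with $u(\pm 1) = 0$, the right-hand side, and the node choice are assumptions made only for this example.

```python
import numpy as np
from numpy.polynomial import chebyshev as C

# Coefficient approach for u'' = f on [-1, 1], u(+-1) = 0 (model problem chosen for illustration):
# the unknowns are the Chebyshev coefficients a_0, ..., a_n of u_N.
n = 32
x = np.cos(np.pi * np.arange(n + 1) / n)       # collocation nodes (x[0] = 1, x[-1] = -1)
f = lambda t: np.exp(4 * t)                    # sample right-hand side

# Matrix mapping Chebyshev coefficients of u_N to values of u_N'' at the nodes
A = np.zeros((n + 1, n + 1))
for k in range(n + 1):
    e = np.zeros(n + 1)
    e[k] = 1.0
    A[:, k] = C.chebval(x, C.chebder(e, 2))

rhs = f(x)
# boundary bordering: replace the two boundary rows by the boundary conditions
A[0, :] = 1.0                                  # T_k(+1) = 1       -> u_N(+1) = 0
A[-1, :] = (-1.0) ** np.arange(n + 1)          # T_k(-1) = (-1)^k  -> u_N(-1) = 0
rhs[0] = rhs[-1] = 0.0

a = np.linalg.solve(A, rhs)
exact = (np.exp(4 * x) - x * np.sinh(4) - np.cosh(4)) / 16
print(np.max(np.abs(C.chebval(x, a) - exact)))  # spectrally small
```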

Barycentric Lagrange Interpolation

The barycentric Lagrange interpolation in [3] serves as a powerful tool for implementing pseudospectral methods (in the value approach), especially for constructing the differentiation matrices.

Brief Derivation

First, recall that given a set of $n+1$ distinct nodes $\{x_j\}_{j=0}^{n}$ and the corresponding function values $\{f_j\}_{j=0}^{n}$, the Lagrange interpolation is given by

$$p(x) = \sum_{j=0}^{n} f_j\, \ell_j(x),$$

where $\ell_j(x) = \dfrac{\prod_{k \neq j}(x - x_k)}{\prod_{k \neq j}(x_j - x_k)}$ is the $j$-th Lagrange polynomial. Denote by $\ell(x)$ the product of all the node factors, that is,

$$\ell(x) = \prod_{k=0}^{n} (x - x_k).$$

If we define the barycentric weights by

$$w_j = \frac{1}{\prod_{k \neq j}(x_j - x_k)}, \qquad j = 0, 1, \dots, n,$$

that is, $w_j = 1/\ell'(x_j)$, we can thus rewrite $\ell_j(x)$ as

$$\ell_j(x) = \ell(x)\, \frac{w_j}{x - x_j}.$$

Therefore the Lagrange interpolation becomes

$$p(x) = \ell(x) \sum_{j=0}^{n} \frac{w_j}{x - x_j}\, f_j.$$

If we interpolate the constant function $g(x) \equiv 1$ by Lagrange interpolation and the above formula, then

$$1 = \ell(x) \sum_{j=0}^{n} \frac{w_j}{x - x_j},$$

which implies

$$\ell(x) = \frac{1}{\sum_{j=0}^{n} \dfrac{w_j}{x - x_j}},$$

and thus we obtain

$$p(x) = \frac{\displaystyle\sum_{j=0}^{n} \frac{w_j}{x - x_j}\, f_j}{\displaystyle\sum_{j=0}^{n} \frac{w_j}{x - x_j}},$$

which is the barycentric Lagrange interpolation of $f$ at the given nodes $\{x_j\}_{j=0}^{n}$.
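Below is a minimal numpy sketch of the barycentric formula. The $O(n^2)$ weight computation, the Chebyshev test nodes, and the sample function are illustrative choices; for Chebyshev or equispaced nodes the weights have simple closed forms, and scipy's BarycentricInterpolator offers a library implementation.

```python
import numpy as np

def bary_weights(x):
    """Barycentric weights w_j = 1 / prod_{k != j} (x_j - x_k) (simple O(n^2) version)."""
    return np.array([1.0 / np.prod(np.delete(x[j] - x, j)) for j in range(len(x))])

def bary_eval(x, f, w, xe):
    """Evaluate the barycentric interpolant of the data (x, f) at the points xe."""
    xe = np.asarray(xe, dtype=float)
    num = np.zeros_like(xe)
    den = np.zeros_like(xe)
    hit = np.full(xe.shape, -1)                 # mark points that coincide with a node
    for j in range(len(x)):
        d = xe - x[j]
        mask = d == 0
        hit[mask] = j
        d[mask] = 1.0                           # dummy value, overwritten below
        num += w[j] * f[j] / d
        den += w[j] / d
    p = num / den
    p[hit >= 0] = f[hit[hit >= 0]]              # exact values at the nodes
    return p

# usage: interpolate a sample function at Chebyshev points and print the error on a fine grid
n = 20
x = np.cos(np.pi * np.arange(n + 1) / n)
fx = 1.0 / (1.0 + 16 * x**2)
w = bary_weights(x)
xe = np.linspace(-1, 1, 1001)
print(np.max(np.abs(bary_eval(x, fx, w, xe) - 1.0 / (1.0 + 16 * xe**2))))
```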

Differentiation of Polynomial Interpolants

Suppose $f$ is represented by a polynomial interpolant in Lagrange form, that is,

$$p(x) = \sum_{j=0}^{n} f_j\, \ell_j(x),$$

then the first and second derivatives of $p$ at the nodes are

$$p'(x_i) = \sum_{j=0}^{n} \ell_j'(x_i)\, f_j, \qquad p''(x_i) = \sum_{j=0}^{n} \ell_j''(x_i)\, f_j.$$

Since the barycentric representation of $p$ is

$$p(x) = \frac{\displaystyle\sum_{j=0}^{n} \frac{w_j}{x - x_j}\, f_j}{\displaystyle\sum_{j=0}^{n} \frac{w_j}{x - x_j}},$$

multiplying through by the denominator, and then multiplying both sides by $(x - x_i)$ to render them differentiable at $x = x_i$, we have

$$p(x)\left[w_i + (x - x_i)\sum_{j \neq i}\frac{w_j}{x - x_j}\right] = w_i f_i + (x - x_i)\sum_{j \neq i}\frac{w_j}{x - x_j}\, f_j,$$

from which it follows, by differentiating once and using $p(x_i) = f_i$, that

$$p'(x_i) = \sum_{j \neq i}\frac{w_j/w_i}{x_i - x_j}\,(f_j - f_i),$$

and, by differentiating twice,

$$p''(x_i) = -2\sum_{j \neq i}\frac{w_j/w_i}{x_i - x_j}\left(\sum_{k \neq i}\frac{w_k/w_i}{x_i - x_k} + \frac{1}{x_i - x_j}\right)(f_j - f_i).$$

Now the first and second differentiation matrices $D^{(1)}$ and $D^{(2)}$ can be constructed entrywise by

$$D^{(1)}_{ij} = \frac{w_j/w_i}{x_i - x_j}, \qquad D^{(2)}_{ij} = -2\,D^{(1)}_{ij}\left(\sum_{k \neq i}\frac{w_k/w_i}{x_i - x_k} + \frac{1}{x_i - x_j}\right) \qquad (i \neq j),$$

with the diagonal entries fixed by the “negative sum trick” $D^{(1)}_{ii} = -\sum_{j \neq i} D^{(1)}_{ij}$ and $D^{(2)}_{ii} = -\sum_{j \neq i} D^{(2)}_{ij}$ (the interpolant of a constant is that constant, so each row must sum to zero); thus we have

$$\mathbf{p}' = D^{(1)}\mathbf{f}, \qquad \mathbf{p}'' = D^{(2)}\mathbf{f},$$

where $\mathbf{f}$, $\mathbf{p}'$, and $\mathbf{p}''$ are the vectors of the values of $p$, $p'$, and $p''$ at the nodes $\{x_i\}_{i=0}^{n}$.
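A small numpy sketch of these entrywise formulas; the test on $\sin x$ at Chebyshev nodes is an illustrative choice.

```python
import numpy as np

def diff_matrices(x, w):
    """First and second differentiation matrices from nodes x and barycentric weights w."""
    n = len(x)
    D1 = np.zeros((n, n))
    D2 = np.zeros((n, n))
    for i in range(n):
        for j in range(n):
            if j != i:
                D1[i, j] = (w[j] / w[i]) / (x[i] - x[j])
        D1[i, i] = -np.sum(D1[i, :])            # negative sum trick
    for i in range(n):
        for j in range(n):
            if j != i:
                # equivalent to the text's formula, since D1[i, i] = -sum_k (w_k/w_i)/(x_i - x_k)
                D2[i, j] = 2.0 * D1[i, j] * (D1[i, i] - 1.0 / (x[i] - x[j]))
        D2[i, i] = -np.sum(D2[i, :])
    return D1, D2

# usage: differentiate sin(x) on a Chebyshev grid (bary_weights as in the previous sketch)
n = 24
x = np.cos(np.pi * np.arange(n + 1) / n)
w = np.array([1.0 / np.prod(np.delete(x[j] - x, j)) for j in range(len(x))])
D1, D2 = diff_matrices(x, w)
print(np.max(np.abs(D1 @ np.sin(x) - np.cos(x))),
      np.max(np.abs(D2 @ np.sin(x) + np.sin(x))))
```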

Application to Differential Equations

Differentiation of interpolants is the basis of one of the most important applications of polynomial interpolation: spectral collocation methods for the numerical solution of differential equations. Suppose $f$ is a function defined on $[-1, 1]$, and we want to solve numerically the boundary value problem

$$u''(x) = f(x), \quad x \in (-1, 1), \qquad u(\pm 1) = 0.$$

One way to do this is to set up a Chebyshev grid $\{x_i\}_{i=0}^{n}$ and consider the matrix equation

$$\tilde{D}^{(2)} \mathbf{u} = \mathbf{f},$$

where $\tilde{D}^{(2)}$ is the matrix consisting of the interior rows and columns of the second differentiation matrix $D^{(2)}$, $\mathbf{u}$ is an $(n-1)$-vector of unknown approximations to $u$ at the interior grid points, and $\mathbf{f}$ is now the $(n-1)$-vector of values of $f$ sampled at the same grid points. If a continuous function $\tilde{u}$ is constructed by barycentric interpolation from the solution vector $\mathbf{u}$ together with the boundary values $\tilde{u}(\pm 1) = 0$, then $\tilde{u}$ may be an extremely accurate approximation to the exact solution $u$. For example, if $f$ is analytic in a neighborhood of $[-1, 1]$, then the accuracy of this approximation will improve geometrically as $n$ increases.
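A compact numpy sketch of this procedure, reusing the bary_weights and diff_matrices helpers from the sketches above; the right-hand side $f(x) = e^{4x}$ is a sample choice with a closed-form solution used only to check the error.

```python
import numpy as np
# assumes bary_weights and diff_matrices from the earlier sketches are in scope

n = 32
x = np.cos(np.pi * np.arange(n + 1) / n)          # Chebyshev grid on [-1, 1]
w = bary_weights(x)
_, D2 = diff_matrices(x, w)

f = lambda t: np.exp(4 * t)                        # sample right-hand side
A = D2[1:-1, 1:-1]                                 # interior rows and columns of D2
u = np.zeros(n + 1)                                # boundary values u(+-1) = 0 stay zero
u[1:-1] = np.linalg.solve(A, f(x[1:-1]))

exact = (np.exp(4 * x) - x * np.sinh(4) - np.cosh(4)) / 16
print(np.max(np.abs(u - exact)))                   # decays geometrically with n
```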

Boundary Conditions

To solve a boundary value problem, it is necessary to enforce that the numerical solution satisfies the boundary conditions. In general, boundary conditions can be classified into two types: behavioral and numerical. Behavioral boundary conditions include those that do not impose any specific numerical value upon $u$ or its derivatives at any particular point, such as periodicity, or being bounded and infinitely differentiable at a point where the equation is singular. In contrast, conditions that prescribe specific values, such as $u(-1) = \alpha$ and $u'(1) = \beta$, are numerical boundary conditions.

The significance of these categories is that it is almost always necessary to explicitly enforce numerical boundary conditions, while behavioral boundary conditions may be satisfied implicitly by choosing basis functions each of which has the required property or behavior.

For numerical boundary conditions which must be explicitly imposed, there are two common strategies:

  1. Boundary bordering: reducing the number of collocation conditions on the residual of the differential equation, and using the freed rows of the pseudospectral matrix to explicitly enforce the boundary constraints;
  2. Basis recombination: modifying the problem (if need be) so that the boundary conditions of the modified problem are homogeneous, and then altering the basis functions individually so that each satisfies the boundary conditions (a small sketch follows this list).
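As a small illustration of basis recombination (the specific combination below is one standard choice, assumed here only for the example): the compact combinations $\phi_k = T_k - T_{k+2}$ of Chebyshev polynomials vanish at $x = \pm 1$, so each basis function individually satisfies homogeneous Dirichlet conditions.

```python
import numpy as np
from numpy.polynomial import chebyshev as C

# phi_k = T_k - T_{k+2}: since T_k(1) = 1 and T_k(-1) = (-1)^k, each phi_k vanishes at +-1.
def phi_coeffs(k):
    c = np.zeros(k + 3)
    c[k], c[k + 2] = 1.0, -1.0
    return c

for k in range(4):
    print(k, C.chebval(np.array([-1.0, 1.0]), phi_coeffs(k)))   # both entries are 0
```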

Galerkin Methods

Galerkin methods are a class of methods that choose the basis functions as the test functions, that is, assume $\psi_i = \phi_i$. The corresponding constraint equations are:

$$\left(\phi_i,\, R(x; a_0, a_1, \dots, a_N)\right) = 0, \qquad i = 0, 1, \dots, N.$$

This choice of test function is successful because any well-chosen set of basis functions will have all the properties desired of weighting functions, too, including linear independence and completeness.

Note that the residual function can be expanded in the basis functions like any other function of $x$,

$$R(x; a_0, a_1, \dots, a_N) = \sum_{n=0}^{\infty} r_n\, \phi_n(x),$$

where the coefficients are given by the usual inner product in the case that $\{\phi_n\}$ are complete orthonormal basis functions:

$$r_n = \left(\phi_n,\, R(x; a_0, a_1, \dots, a_N)\right).$$

Hence, in this case, the Galerkin method, as defined above, enforces that the residual is minimized in the sense that the first $N+1$ terms of its spectral expansion are zero. Because the basis functions are a complete set, we have $R \to 0$ as $N \to \infty$. Furthermore, the Galerkin method is a very robust algorithm if the test functions are orthogonal, and therefore highly linearly independent, which makes the method numerically stable.

For simplicity, we assume the operator $\mathcal{L}$ is linear. In this case the Galerkin method leads to a linear system

$$L\mathbf{a} = \mathbf{f},$$

where $L_{ij} = (\phi_i, \mathcal{L}\phi_j)$ for $i, j = 0, 1, \dots, N$ and $f_i = (\phi_i, f)$ for $i = 0, 1, \dots, N$. The $i$-th row of the above equation is the $i$-th constraint equation, that is, the inner product of $\phi_i$ with the residual function:

$$\left(\phi_i, R\right) = \sum_{j=0}^{N} (\phi_i, \mathcal{L}\phi_j)\, a_j - (\phi_i, f) = 0.$$

If the basis functions do not individually satisfy the boundary conditions, then it is necessary to replace some rows of the above linear system by equations that express the boundary conditions. In the end, we need to construct and then solve the final linear system to compute the coefficients $\mathbf{a}$. The efficiency of the whole procedure rests on making full use of the good properties of the basis functions $\{\phi_n\}$, so it is crucial to choose the basis functions such that:

  1. the right-hand side $\mathbf{f}$ can be computed efficiently;
  2. the linear system $L\mathbf{a} = \mathbf{f}$ can be solved efficiently.

To achieve these goals, the common idea is to use compact combinations of orthogonal polynomials or orthonormal functions to construct the basis functions.
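To make the construction concrete, here is a small Galerkin sketch under assumptions chosen only for illustration: the model problem $-u'' = f$ on $(-1, 1)$ with $u(\pm 1) = 0$, the unit weight $\omega \equiv 1$, and the compact Legendre combination $\phi_k = L_k - L_{k+2}$ (which vanishes at $\pm 1$). After integrating $(\phi_i, -u_N'')$ by parts (the boundary terms vanish), the Galerkin conditions become $\sum_k a_k\,(\phi_i', \phi_k') = (\phi_i, f)$, which is assembled below by Gauss-Legendre quadrature.

```python
import numpy as np
from numpy.polynomial import legendre as L

def phi_c(k):
    """Legendre coefficient vector of phi_k = L_k - L_{k+2}."""
    c = np.zeros(k + 3)
    c[k], c[k + 2] = 1.0, -1.0
    return c

m = 20                                       # basis functions phi_0, ..., phi_{m-1}
xq, wq = L.leggauss(2 * m + 4)               # quadrature nodes/weights on [-1, 1]
f = lambda t: np.exp(4 * t)                  # sample right-hand side

P  = np.array([L.legval(xq, phi_c(k)) for k in range(m)])            # phi_k at the nodes
dP = np.array([L.legval(xq, L.legder(phi_c(k))) for k in range(m)])  # phi_k' at the nodes

A = (dP * wq) @ dP.T                         # stiffness matrix (phi_i', phi_k')
b = (P * wq) @ f(xq)                         # load vector (phi_i, f)
a = np.linalg.solve(A, b)

# check against the exact solution of -u'' = exp(4x), u(+-1) = 0
xs = np.linspace(-1, 1, 401)
uN = sum(a[k] * L.legval(xs, phi_c(k)) for k in range(m))
exact = -(np.exp(4 * xs) - xs * np.sinh(4) - np.cosh(4)) / 16
print(np.max(np.abs(uN - exact)))
```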

Petrov-Galerkin

The Petrov-Galerkin method is a generalization of the Galerkin method, where the test functions are different from the basis functions (sometimes called the trial functions). The boundary conditions determine the trial basis, and the boundary conditions associated to the dual differential equation can determine the test basis.

For example, consider the following linear differential equation

$$\mathcal{L}u(x) = f(x), \qquad x \in (-1, 1),$$

with homogeneous linear boundary conditions (if they are not homogeneous, the lifting technique can be used to convert the problem to a homogeneous one by subtracting from the solution a low-degree polynomial that satisfies the boundary conditions). Let $\mathcal{P}_N$ be the space of polynomials of degree at most $N$, and denote the trial space by $V_N$ and the test space by $W_N$. $V_N$ is usually a subspace of $\mathcal{P}_N$ whose elements satisfy certain given constraints, such as the boundary conditions. For odd-order equations, a preferable choice is $W_N \neq V_N$, where elements of $W_N$ are the “dual” of elements in $V_N$; for even-order equations, a common practice is $W_N = V_N$. Given a weight function $\omega$, the standard Petrov-Galerkin method is to find $u_N \in V_N$ s.t.

$$(\mathcal{L}u_N, v)_\omega = (f, v)_\omega \qquad \text{for all } v \in W_N,$$

where $(\cdot, \cdot)_\omega$ denotes the inner product with respect to $\omega$. Let

$$V_N = \operatorname{span}\{\phi_0, \phi_1, \dots, \phi_M\}, \qquad W_N = \operatorname{span}\{\psi_0, \psi_1, \dots, \psi_M\},$$

where $\{\phi_k\}$ and $\{\psi_k\}$ are the basis functions of $V_N$ and $W_N$, respectively. If we assume that the approximation has the form

$$u_N(x) = \sum_{k=0}^{M} a_k \phi_k(x),$$

then the Petrov-Galerkin method leads to the following linear system

$$A\mathbf{a} = \mathbf{f},$$

where $A_{jk} = (\mathcal{L}\phi_k, \psi_j)_\omega$ and $f_j = (f, \psi_j)_\omega$, and $\mathbf{a}$ is the vector of the coefficients of the approximating series. If the problem has variable coefficients, the entries of the matrix $A$ can often be obtained only by numerical integration, which makes the Petrov-Galerkin method computationally expensive in general. For this reason, there is a good deal of research on how to choose appropriate trial and test functions that generate a banded matrix $A$ which can be solved efficiently.

For more information about efficient Petrov-Galerkin methods, see [5] as well as this note.

Tau-method

Method of Moments

Generally, the term $(x^i, f)$ refers to the $i$-th moment of the function $f$ with respect to the given inner product. The method of moments selects $\psi_i(x) = x^i$ as the test functions, hence the corresponding constraint equations are:

$$\left(x^i,\, R(x; a_0, a_1, \dots, a_N)\right) = 0, \qquad i = 0, 1, \dots, N.$$

This method is relatively easy to implement and exhibits good numerical stability when $N$ is small, making it commonly used in engineering applications. However, when the degree $N$ is large, rounding errors cause the high-degree monomials to become nearly linearly dependent, leading to numerical instability. Thus, the set $\{1, x, x^2, \dots\}$ is not an ideal choice of test or basis functions in numerical methods. A more suitable alternative would be to use Chebyshev or Legendre polynomials.

The method of moments works well only if (1) $N$ is small or (2) the calculations are performed in exact rational arithmetic using symbolic computation.
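The ill-conditioning can be seen directly from the Gram matrix of the monomials: on $[0, 1]$ with unit weight (an interval and weight chosen here just for illustration), $(x^i, x^j) = 1/(i + j + 1)$ is the Hilbert matrix, whose condition number grows exponentially with its size.

```python
import numpy as np

# Gram matrix of the monomials 1, x, ..., x^{n-1} on [0, 1]: the Hilbert matrix.
for n in (5, 10, 15):
    H = 1.0 / (np.arange(n)[:, None] + np.arange(n)[None, :] + 1)
    print(n, np.linalg.cond(H))               # grows rapidly with n
```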

Least Squares

The least squares method chooses $\psi_i = \mathcal{L}\phi_i$ as the test functions, provided that $\mathcal{L}$ is a linear operator. The corresponding constraint equations are:

$$\left(\mathcal{L}\phi_i,\, R(x; a_0, a_1, \dots, a_N)\right) = 0, \qquad i = 0, 1, \dots, N.$$

The reason for the name is that one can show that this minimizes the “norm” of the residual, where the norm is defined by

$$\|R\| = \sqrt{(R, R)}.$$

In other words, the least squares method determines the coefficients in such a way that $\|R\|$ is made as small as possible for the class of functions defined by an $(N+1)$-term spectral series. The mathematical statement (in the language of optimization theory) of this is

$$\min_{a_0, a_1, \dots, a_N} \; (R, R).$$

If the basis functions have the property of completeness, i.e., if we can approximate the exact solution to arbitrary high accuracy by using a sufficiently large number of terms in the spectral series, then the least squares method is guaranteed to work.

Remark. One great advantage of the least squares method is that it always generates symmetric matrices even if the operator $\mathcal{L}$ is not self-adjoint. This symmetry property is helpful in the following three ways (a small sketch illustrating the symmetry follows the list):

  1. It greatly simplifies the theoretical numerical analysis;
  2. Symmetric matrices can be solved by a variety of special tricks which are faster and need less storage than the algorithms for non-symmetric matrices;
  3. Symmetric matrices are more stable in time marching schemes.
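A tiny sketch of the symmetry claim in the remark above; the operator $\mathcal{L}u = u' + u$, the Chebyshev basis, and the unit-weight inner product are illustrative assumptions. The least squares system matrix $G_{ij} = (\mathcal{L}\phi_i, \mathcal{L}\phi_j)$ is a Gram matrix and hence symmetric even though $\mathcal{L}$ is not self-adjoint.

```python
import numpy as np
from numpy.polynomial import chebyshev as C

# L u = u' + u is not self-adjoint, yet G_ij = (L phi_i, L phi_j) is symmetric by construction.
m = 8
xq, wq = np.polynomial.legendre.leggauss(4 * m)     # quadrature for the unit-weight inner product

def Lphi(k):
    c = np.zeros(m + 1)
    c[k] = 1.0                                      # phi_k = T_k
    return C.chebval(xq, C.chebder(c)) + C.chebval(xq, c)

B = np.array([Lphi(k) for k in range(m + 1)])
G = (B * wq) @ B.T                                  # G_ij = (L phi_i, L phi_j)
print(np.max(np.abs(G - G.T)))                      # zero up to round-off
```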

For nonlinear problems, it is possible to generalize the least squares method by defining the test functions as

$$\psi_i = \frac{\partial R}{\partial a_i}, \qquad i = 0, 1, \dots, N,$$

which includes the linear case as a special case.

Despite its many virtues, the least squares method is not as popular as the Galerkin method in practice, for reasons that depend on the specific problem:

  1. For self-adjoint problems, the Galerkin method is simpler and has the same virtues: minimum norm of the residual and a symmetric matrix.
  2. For nonlinear problems, collocation/pseudospectral methods are preferable choices, since the least squares method gives a very unwieldy set of algebraic equations.

However, the least squares method is still the safest way of generating approximations that depend nonlinearly on the unknowns.

References

  1. John P. Boyd, Chebyshev and Fourier Spectral Methods, 2nd Edition, Dover Publications, 2001.
  2. Jie Shen, Tao Tang, and Li-Lian Wang, Spectral Methods: Algorithms, Analysis and Applications, Springer, 2011.
  3. Jean-Paul Berrut and Lloyd N. Trefethen, Barycentric Lagrange Interpolation, SIAM Review, 46(3), 501-517, 2004.
  4. Sheehan Olver and Alex Townsend, A Fast and Well-Conditioned Spectral Method, SIAM Review, 55(3), 462-489, 2013.
  5. Mikael Mortensen, A Generic and Strict Banded Spectral Petrov-Galerkin Method for Differential Equations with Polynomial Coefficients, SIAM J. Sci. Comput., 45(3), A1741-A1766, 2023.