## Abstract

In this paper, we propose a new method for computing a distance between two shapes embedded in three-dimensional space. Instead of comparing directly the geometric properties of the two shapes, we measure the cost of deforming one of the two shapes into the other. The deformation is computed as the geodesic between the two shapes in the space of shapes. The geodesic is found as a minimizer of the Onsager–Machlup action, based on an elastic energy for shapes that we define. Its length is set to be the integral of the action along that path; it defines an intrinsic quasi-metric on the space of shapes. We illustrate applications of our method to geometric morphometrics using three datasets representing bones and teeth of primates. Experiments on these datasets show that the variational quasi-metric we have introduced performs remarkably well both in shape recognition and in identifying evolutionary patterns, with success rates similar to, and in some cases better than, those obtained by expert observers.

## 1. Introduction

The functions of all life forms depend on organization in space and time. Spatial organization involves the interactions of shapes with their surroundings, while organization in time concerns changes in shape. Geometric methods therefore play an essential role in the analyses and simulations of biological systems. This idea has a long history in biology, as illustrated in the work of D'Arcy Thompson [1]. Geometry and topology are now used regularly for representing, searching, simulating, analysing and comparing biological systems. In structural molecular biology, for example, the notion that the structure (or shape) of a protein is a major determinant of its function has led to the development of methods for representing, measuring and comparing protein structures [2–4]. Similar methods have been developed for larger biological systems. Brain morphometry, for example, is concerned with the measurement of the brain geometric structures and the changes they undergo during development, ageing, learning, disease and evolution [5–9].

In a chapter titled ‘The comparison of related forms’, Thompson explored how differences in the forms of related animals can be described by means of simple mathematical transformations [1]. This inspired the development of several shape comparison techniques, whose aim is to define a map between two shapes that can be used to measure their similarity. This is a challenging problem, as the space of possible maps is extremely large and difficult to characterize mathematically. Morphometricians usually attempt to reduce the dimension of this space by enforcing correspondences between specific landmarks. Landmark-based methods that find maps between two shapes work on the premise that knowledge of a mapping on a small number of correspondences can be extended to give the full map between the two surfaces of interest [10–12]. By contrast, landmark-free methods skip the search for landmarks altogether. They rely on a geometric representation of the shape in which some automatically selected points are assigned a signature, under the premise that points with similar signatures are more likely to correspond. Spectral techniques fit in this category. Rustamov, for example, introduces the global point signature of a point on the shape, which encodes both the eigenvalues and the eigenfunctions of the Laplace–Beltrami operator evaluated at that point [13]. Guibas and co-workers [14] introduced the heat kernel signature, a similar robust and multi-scale invariant defined on points on the surface of the shape. Fischl *et al*. [15,16] introduced an average convexity that they used to drive the alignment of cortical surfaces. McCane [17] developed a variational method for matching curves in two or three dimensions by optimizing their parametrizations. Laga *et al.* [18] developed statistical models of shapes based on the squared root velocity function. They recently implemented this procedure to study the shapes of plant leaves.

All the methods described above focus on the differences in the geometry of the two shapes to be compared. An alternative approach is to attempt to capture the cost of deforming one shape into the other. This is implemented, for example, in the concept of the earth mover's distance, or Wasserstein metric, which has been used for image retrieval [19] as well as in different contexts for shape comparisons [20–22]. The large deformation diffeomorphic metric mapping method follows a similar concept: it captures a flow of transformations between the two shapes to be compared by introducing a concept of time, with the constraint that those transformations remain diffeomorphic. It has been adapted to matching points [23], curves [24], as well as surfaces [25,26]. Plotkin [27] also considered time when comparing two curves, introducing a variational distance that generalizes the concept of Euclidean distance to high-dimensional objects. This distance is computed as the length of a geodesic on the space of curves, where the geodesic is derived as a solution to a differential equation [27]. It has been used for computing folding pathways for proteins [28], as well as for aligning the curves representing protein structures [29]. None of those methods, however, consider physical constraints in the deformation process between the two objects of study. This can lead to non-physical transformations, such as the appearance of knots when comparing two curves [30].

The problem of non-physical transformations mentioned above highlights in fact the actual challenges that need to be solved: generating a physically sound geodesic flow between two shapes that leads to an actual metric in the space of such shapes. Addressing the former challenge, namely the generation of a physically sound trajectory, usually involves designing energy functions that yield deformation paths with a specified behaviour [31]. Examples of such energies include functions that imitate a physical energy dissipation induced by the deformation of a viscous material [32]. Other measures capture stretching and bending variations of the surface (or thin layers around this surface) of the shape [33–35]. Once the energy function is derived, the best deformation path is chosen as the path that requires the least total energy. Finding such a path usually relies on variational mechanical approaches [34,36,37]. Our motivation in this work is to expand on this concept of deriving physically sound paths between two shapes, focusing mostly on designing an efficient variational approach for generating such paths that is based on the concept of minimum action paths.

We propose a new distance between shapes that captures the complexities of deforming one of the two shapes into the other. Specifically, we consider the generalization of the conventional notion of distance to polymeric objects introduced by Plotkin [27], with two major differences. First, we extend its applications from curves to surfaces. Second, we modify the functional for computing the geodesic distance with the introduction of an energy term to enforce physically sound elastic deformations. We use for this purpose the Onsager–Machlup action functional [38] that is defined to study finite temperature transitions over finite time intervals. In addition to defining the action functional for shape deformation, we propose a method for finding the corresponding minimum action path. The action along this path defines a distance that depends only on the geometry of the initial and final configurations of the shape considered. This distance is a special case of an intrinsic quasi-metric from differential geometry with the Onsager–Machlup action corresponding to a Finsler function [39]. We note that the present paper develops on our own work on computing transition paths between different conformations of biomolecules [40,41].

While three-dimensional data representing a shape come in many forms, we concentrate on the important case where the surface of the shape is available and described with a discrete triangular mesh. Mathematically, these objects are two-dimensional Riemannian manifolds in the smooth case, and piecewise-flat surfaces in the discrete setting. We work with both representations. The method we propose to compute the distance between two such shapes proceeds in three steps. Given two shapes represented by two meshes with the same combinatorics, but different geometry, we construct first an energy surface that has those two shapes as minima. Second, we find the path with minimum action along this energy surface, where the action functional is the Onsager–Machlup action. Finally, we set the distance between the two shapes to be the integral of the action along this path.

Working with shapes described by discrete triangular meshes introduces two issues that we address in this paper. First, there are no reasons to assume that the combinatorics of the meshes representing the two surfaces of interest are identical. As such, the problem of constructing a deformation path between the two surfaces includes two sub-problems: parametrization and deformation. Parametrization corresponds to finding a mapping of the mesh structure of one surface onto the other, while deformation is concerned with finding a sequence of geometric realizations of this mesh as it evolves from its geometry in the start surface to its geometry on the target surface. While there have been recent developments of frameworks that solve those two sub-problems simultaneously [42–44], the approach we propose solves those problems sequentially. For the parametrization problem, we rely on a recent conformal diffeomorphic mapping technique that we have developed for the specific case of surfaces of genus zero [45–47]. The main focus of this paper is on the second sub-problem, which is solved using the minimum action method briefly discussed above and that will be described in detail in the following section. The second issue relates to the definition of an energy that captures the property of a triangular mesh. Many such energies have been defined, such as the as-rigid-as-possible energy and extensions [48–50] that measure the deviation from rigidity, and the shell models that capture stretching and bending [34,51]. In this work, we rely on an ‘elastic energy’ that is inspired from the concept of elastic networks in computational structural biology. Those networks are coarse-grained representations of biomolecules that significantly simplify the modelling of their dynamics [52]. The elastic energy of such a network measures the changes in the lengths of its springs, which we map in our method to the edges of the triangular mesh that is considered. This energy is similar in design to the Riemannian metric introduced by Kilian *et al.* [53].

The paper is organized as follows. Section 2 provides the mathematical background for our method. In particular, we cover the definition of the variational distance between shapes, the Onsager–Machlup action functional, the definition of the energy surface in the space of shapes, and the method used to find a minimum action path on this energy surface. The following section covers the details of the implementation for shapes represented by discrete surface meshes. In particular, we include details on the pre-processing step of mapping a mesh from one surface to another, as this is a necessary step if the two input meshes do not have the same combinatorics. We also introduce the elastic energy of a discrete surface. In §4, we discuss the applications of the deformation part of our method on generic, high-resolution non-rigid shapes from the TOSCA dataset [54]. Section 5 presents and discusses the results obtained by our method on three test cases, representing three regions of the skeletal anatomy of a collection of primates [22]. We conclude the paper with a brief discussion on future developments of the method.

## 2. A generalized variational distance between multidimensional objects

We follow the formalism introduced by Plotkin [27]. For illustration, let us start with the simple case of a distance between two shapes of dimension 0. The distance between the corresponding two points *A* and *B* at position **X**_{A} and **X**_{B} in a Euclidean space is the minimal arclength travelled in the ‘deformation’ from *A* to *B*. This deformation can be captured by a transformation **X**(*t*), a vector function parametrized with a time variable *t*, such that **X**(0) = **X**_{A} and **X**(*T*) = **X**_{B}, for a given large time *T*. The infinitesimal distance travelled is a function of time, . Using the variation of this functional , the minimal distance between *A* and *B* is then defined by [55]
2.1where **X**(*t*)* is the minimal transformation between *A* and *B*. It is easy to see that this distance maps to the traditional Euclidean distance between two points. It can be easily expanded to multidimensional objects [27].

This definition of distance, however, does not account directly for the shape of the space on which the two points *A* and *B* reside. The shape of that space can be captured by a more general functional of the deformation **X**(*t*). Following the concept of Finsler manifold [39], we equip the space of shapes *M* with an induced quasi-metric that is an extension of the distance defined for points:
2.2We apply this definition to compute distances between two conformations of a shape in . In the following, we define our choice for the functional for such shapes and describe the method we propose for finding the corresponding minimum transformation **X**(*t*)*. For generality, we do not state the exact energy function within the functional , and leave this for the following section in which we describe the implementation of this method to discrete shapes. We note that this formalism is inspired by our own work on computing transition paths between two conformations of a biomolecule [40,41]. We refer the reader to the corresponding papers for extended descriptions of the methods.

### 2.1. An action for elastic deformation of shapes

Let us consider a shape *M* described by a position vector **X**. Assuming that we are only interested in effects over long time scales, the dynamic behaviour of this shape under a potential energy *U* can be described by the overdamped Langevin equation:
2.3where *γ* is a friction coefficient, *U*(**X**) is the potential energy of the system, ∇(*U*(**X**)) is the gradient of *U*, and **B** is an uncorrelated random force with zero mean. Note that by rescaling the units of time, it is possible to set *γ* = 1.

Equation (2.3) is a stochastic differential equation; as such, it does not have an analytical solution. However, given a trajectory **X**(*t*) over the time interval [0, *T*], it is possible to evaluate the probability that equation (2.3) would have generated that trajectory. Following the formalism introduced by Onsager & Machlup [38], this probability is given by
2.4where *α* is a normalizing factor, *k*_{B} is the Boltzmann constant and *T*_{e} the temperature. The action is given by
2.5where the Lagrangian is defined by
2.6

Given two possible configurations **X**_{A} and **X**_{B} for the shape *M*, the path of maximum likelihood by which the solution of equation (2.3) connects these two configurations is the one that maximizes the probability given by equation (2.4), i.e. the path that minimizes the action with the boundary conditions **X**(0) = **X**_{A} and **X**(*T*) = **X**_{B}. In the following section, we define a general form for the energy function *U* that enables the implementation of fast solutions for equation (2.3).

### 2.2. A quadratic elastic energy for shapes

Let *E*_{A}(**X**_{A}) and *E*_{B}(**X**_{B}) be the energy values for the shape *M* in configurations **X**_{A} and **X**_{B}, respectively. Let us assume that those energies are values of two analytical functions, *E*_{A}(**X**) and *E*_{B}(**X**), defined such that both have a unique minimum, at **X**_{A} and **X**_{B}, respectively. Assuming small displacements around those minima, these energy functions can be approximated as *U*_{A}(**X**) and *U*_{B}(**X**) using a second-order Taylor expansion:
2.7and
2.8where ∇(*E*_{A})(**X**_{A}) and *H*_{A}(**X**_{A}) = ∇∇(*E*_{A})(**X**_{A}) are the derivatives and Hessian of *E*_{A} at **X**_{A}, respectively, with similar definitions for ∇(*E*_{B})(**X**_{B}) and *H*_{B}(**X**_{B}). As we assume that *E*_{A} is minimum at **X**_{A}, its first derivatives are set to 0 at its minimum **X**_{A}. We can also assume that *E*_{A}(**X**_{A}) = 0. The same applies to *E*_{B}. Therefore,
2.9and
2.10

The energy function *U* for any conformation ** X** is then defined as
2.11This is the form of the energy function that will be used for solving equation (2.3). We note that

*U*is the combination of two quadratic functions; this significantly simplifies the process of solving this equation. This is discussed in the next section. We do note that the introduction of the min function leads to an energy function

*U*and consequently of a functional that are only

*C*

^{0}. As such, this is only an approximation.

### 2.3. A minimum action path between two conformations of a shape

For a given *T*, the path **X** that minimizes the action defined in equation (2.5) satisfies the Euler–Lagrange equation:
2.12subject to **X**(0) = **X**_{A} and **X**(*T*) = **X**_{B}, where ∇∇(*U*) is the Hessian of the potential energy *U* defined above.

Let us consider a possible trajectory solution to the Euler–Lagrange equations. Because of the format of *U* defined above, this trajectory can be seen as the concatenation of two parts, one in a quadratic well *W*_{A} whose minimum is **X**_{A}, and one in the quadratic well *W*_{B} whose minimum is **X**_{B}. Those two trajectories meet at the transition state, **X**_{ts}, at time *t*_{s} with 0 < *t*_{s} < *T*. In well *W*_{A},
2.13and
2.14The equation of motion is therefore
2.15with the two boundary conditions **X**(0) = **X**_{A} and **X**(*t*_{s}) = **X**_{ts}. As *H*_{A}(*X*_{A}) is a constant, the solution of equation (2.15) is then
2.16where *f*_{A}(*H*_{A}, *t*) is a matrix function with the real function *f*_{A}(*x*, *t*) given by
2.17Note that when , .

Similarly, in well *W*_{B}, the equation of motion becomes
2.18with the two boundary conditions **X**(*t*_{s}) = **X**_{ts} and **X**(*T*) = **X**_{B}. As *H*_{B}(**X**_{B}) is a constant, the solution of equation (2.18) is then
2.19where *f*_{B}(*H*_{B}, *t*) is a matrix function with the real function *f*_{B}(*x*, *t*) given by
2.20Note again that when , .

Equations (2.16) and (2.19) describe the trajectories, i.e. the minimum action path for the shape, before and after the transition state **X**_{ts}, respectively. We need to find both **X**_{ts} and *t*_{s}. We note first that the two trajectories are continuous, and meet at **X**_{ts}: **X**_{A}(*t*_{s}) = **X**_{B}(*t*_{s}) = **X**_{ts}. For a given *t*_{s}, we can compute *X*_{ts} by imposing velocity continuity at this point for the two trajectories. Finally, we find *t*_{s} by imposing continuity for the energy at **X**_{ts}.

## 3. Implementation

While there are many ways to describe a shape in our three-dimensional world, we concentrate on the important case where the shape is defined from its surface. In the discrete case, this surface, *M*, is described with a discrete triangular mesh, , with , and denoting the vertices, edges and triangles of , respectively. The geometry of this mesh is defined by a configuration vector **X** that contains the coordinates of all the vertices in . The method described above applies directly to computing the distance between two configurations **X**_{A} and **X**_{B} of the same mesh. It can be easily adapted however to computing the distance between two surfaces *M*_{A} and *M*_{B}, if the meshes and that represent them have the same number of vertices with the same connectivities between them. When the two meshes have different combinatorics, we need to generate first a parametrization between the two surfaces, i.e. a mapping such that the combinatorics of one mesh can be transported from one surface onto the other [56,57]. We have recently developed such a method for the specific case of surfaces of genus zero [45–47]. In the following subsection, we briefly describe this method. We then define the elastic energy *E* for such a mesh. Finally, we briefly describe MeshPath, our program for computing the distance between two configurations of a mesh. MeshPath is available under OpenSource upon request.

### 3.1. Surface parametrization

Let *M*_{A} and *M*_{B} be two surfaces of genus zero, represented by the meshes and , respectively. Both meshes are taken to be triangular. The goal is to define a discrete map , where *f*(*M*_{A}) is a geometric realization of onto *M*_{B}, that is as close as possible to an isometry, i.e. that minimizes the distortion of pairwise distances between vertices. The key observation [58] that makes the problem tractable is that any such function *f* can be understood as the composition of three discrete conformal mapping functions, *C*_{A}, *m* and *C*^{−1}_{B}, where *C*_{A} and *C*_{B} are discrete conformal maps from *M*_{A} and *M*_{B} to the sphere, respectively, and *m* a conformal map of the sphere to itself. Once the discrete maps *C*_{A} and *C*_{B} are known (see [45,59] for details), *f* is fully parametrized by *m*. The group of such conformal self-mappings of the sphere is well understood and is called the Möbius group. Any transformation *m* in this group is defined by specifying the image of three points, and thus has six degrees of freedom. The mapping *f* is then constructed by optimizing *m* so that the composition *C*^{−1}_{B} ° *m* ° *C*_{A} is as near to an isometry as possible, i.e. that it minimizes the functional:
3.1Here and denote the sets of edges in the meshes and , respectively. *A*_{ij} is the sum of the areas of the two triangles adjacent to the edge (*v*_{i}, *v*_{j}) and *l*(*v*_{i}, *v*_{j}) is the distance between the two vertices *v*_{i} and *v*_{j}, namely the length of the edge (*v*_{i}, *v*_{j}). This method is implemented into the program MatchSurface [45,47]. We have used this program for mesh parametrization for all cases presented in the Result section.

We note that the infimum of *D*_{sd}(*f*) as *f* varies over all conformal diffeomorphisms from *M*_{A} to *M*_{B} defines a metric on the space of surfaces of genus zero [60]. We refer to the corresponding distance as *d*_{sd} [47].

### 3.2. An elastic energy for a discrete triangular mesh

Let *M* be a surface of a shape *A* represented by the triangular mesh . We define **X**(0) to be the ground state conformation of this mesh, and **X**(*t*) to be its conformation at a time *t*, namely a new geometric realization of the mesh with the same combinatorics. We note that the mesh is basically a set of vertices, with edges between them. As such, it is similar to an elastic network model (ENM) that is used extensively in molecular structural biology to study the dynamics of biomolecules [52,61,62]. We have therefore defined the energy *E*_{A} of shape *A* to be the elastic energy *E* of the mesh , as defined by the formalism of ENM, as described below.

Let vertex *v*_{i} be characterized by its position *X*_{i} = (*x*_{i}, *y*_{i}, *z*_{i}). For two vertices *v*_{i} and *v*_{j} that are connected with an edge in , we set *r*_{ij} and *r*^{0}_{ij} to be the length of this edge in any conformation **X**(*t*) and in the ground-state conformation **X**(0), respectively. The elastic energy *E*_{A} of the mesh is then set to
3.2

where *k*_{ij} is the force constant of the ‘spring’ formed by the pair of vertices *v*_{i} and *v*_{j}. In the original ENM introduced by Tirion, the elastic constants *k*_{ij} are set to be the same for all pairs of vertices [52]. We follow this setting and choose *k*_{ij} = 1.

Using the elastic energy *E*_{A} defined in equation (3.2), the quadratic energy (see equation (2.10)) of the mesh is then given by
3.3

where the Hessian *H*(**X**(0)) is the matrix containing the second derivatives of *E*_{A} computed for the conformation **X**(0). From equation (3.2), it is easy to find that the 3 × 3 submatrix *H*_{ij} of this Hessian corresponding to the two vertices *i* and *j* of an edge (*v*_{i}, *v*_{j}) in the mesh is given by
3.4

and that the 3 × 3 submatrix *H*_{ii} on the diagonal of *H* is given by
3.5

### 3.3. MeshPath: a program to compute the distance between two shapes

Let *A* and *B* be two shapes defined by their surfaces *M*_{A} and *M*_{B}, themselves represented by the meshes and , respectively. We assume that the two meshes have the same combinatorics; if not, the procedure described above is applied to map onto *M*_{B}. The corresponding image is used in place of . To find the distance between those two shapes, we compute the minimum action path between the conformations of and using the following algorithm.

This algorithm mimics the algorithm we developed for computing the minimum action path between two conformations of a biomolecule [41]. The reader is referred to that paper for details. The constant *TOL* in step (4) is chosen to be 1 × 10^{−6}. Finally, to compute the action of the (discrete) converged trajectory, we use the generalized trapezoidal rule:
3.6where defines the variational distance between the two shapes, which we refer to as *d*_{act}.

We note that for most applications, the meshes will have their own combinatorics. For comparing any two shapes *M*_{1} and *M*_{2} represented with the meshes and , we parametrize first onto and compute both the geometric and variational distances between and its image . We then repeat the process by parametrizing onto . The final distances *d*_{sd} and *d*_{act} between the two shapes are the minima of the corresponding distances for the two parametrizations.

## 4. Distances between generic, discrete surfaces with exact correspondence

The method described in this paper is designed to compute the distance between two shapes by building a geodesic between those two shapes whose action is minimal. As those shapes are described by triangular meshes, it involves two steps, a parametrization step and a deformation step. As the novelty of the method relates to the latter, we tested it first on objects that have the same triangulation, such that the parametrization step is trivial. Figure 1 illustrates with three examples the minimum action trajectories that are generated between two such objects with PathMesh. We have considered centaur0 and centaur1 (figure 1*a*), dog0 and dog1 (figure 1*b*), and cat0 and cat2 (figure 1*c*) from the TOSCA high-resolution dataset [54]. All three trajectories were computed with a total time *T* set to 50. In all three cases, the transition along the path occurred at around *T* = 25, namely in the middle of the trajectories. While the transitions look very natural, we do notice that some deformations of the shapes occur in the vicinity of the transition state, at the level of the left front leg of the centaur, at the right back leg of the dog and even more pronounced at the level of the tail of the cat.

The total effort needed to deform one shape into another is expected to be related to how different those two shapes are. The difficulty is to define what ‘different’ means, in a way that a measure of such differences defines a metric in the space of shapes. PathMesh was designed to provide one such measure of differences by computing *d*_{act}, a quasi-metric for shapes (see above). We compared this distance measure to a more traditional measure of similarities between shapes, namely the coordinate root mean square deviation (RMSD). We chose for that purpose the TOSCA high-resolution dataset [54], available at http://tosca.cs.technion.ac.il. This dataset contains a total of 80 objects regrouped into nine classes, including 11 cats, nine dogs, three wolves, eight horses, six centaurs, four gorillas, 12 female figures (Victoria), and two different male figures, David and Michael, containing 7 and 20 poses, respectively. Objects within one class have the same triangulation and an equal number of vertices that are in direct correspondence. For each class, one mesh (usually the one with index 0) was chosen as a reference, and the distances between the other members of the class and this reference were computed in two different ways. First, the RMSD over the vertices of the triangulation is computed after removal of rigid-body transformations (rotation and translation). The meshes are scaled such that their average edge lengths are set to 1. Second, a path is generated between one object and its reference conformation using PathMesh, and the total action along that path defines the distance *d*_{act}. In all those computations, the elastic constants for the mesh energies are set to 1, and the trajectories are computed with *T* = 50. We note that RMSD, energies, actions and time are all given in arbitrary units. Comparisons of the action distances and RMSDs are given in figure 2.

As expected, the larger the differences between two shapes (as measured by the RMSD), the larger the action-based distance between the two shapes. This behaviour is observed over a broad range of geometric differences between the shapes, up to 100 (with the unit being the average edge length of the meshes considered). The relationship between the two measures of shape differences however is not linear. Interestingly, this relationship is unchanging over all classes included in the TOSCA high-resolution dataset. This is likely due to the fact that the deformations that were applied on the different objects are consistent from one class to another.

Using a single thread on a 4 GHz Intel Core I7, the average CPU time to compute a trajectory between two conformations of a mesh with 15 000 vertices is 470 s, and 2200 s for meshes with 50 000 vertices. We note that most of this cost comes from the computation of a function of a matrix times a vector, which can be performed efficiently, in parallel, using a Krylov-based method (see [41] for details). The parallel version of PathMesh leads to a gain of a factor of 3 when it is run over four threads.

## 5. Distances between biological shapes

To test the effectiveness of our full procedure, which includes parametrization and generation of a minimum action path for computing a variational, action-based distance between shapes, *d*_{act}, we applied it on three independent datasets, representing three regions of the skeletal anatomy of a collection of primates. The first dataset, MT1, contains 61 proximal first metatarsals of prosimian primates, New and Old World monkeys, the second dataset, RADIUS, contains 45 distal radii of apes and humans, and the third dataset, TEETH, includes 116 second mandibular molars of prosimian primates and non-primate close relatives. They were originally assembled by Boyer *et al.* who used them in a similar study with different shape comparison measures based on an ‘earth mover’ metric, and on a ‘continuous Procrustes’ (cP) distance [22]. Using these datasets allows us to evaluate the performances of the variational method proposed above. The evaluation is done by comparing the distances *d*_{act} between surfaces included in the three datasets to the geometric distance *d*_{sd}, and to the distances based on landmarks identified by trained morphologists. There are two such sets of experimental distances for the metatarsal dataset, referred to as ‘observer1’ and ‘observer2’, and one set for the RADIUS dataset and the TEETH dataset, named ‘observer1’. We note that the variational distance, *d*_{act}, in common with *d*_{sd}, does not require preliminary selection of landmarks on the surfaces. Geometric morphometricians on the other hand have determined landmarks on each surface, choosing them to reflect correspondences considered biologically and evolutionarily meaningful (see SI appendix, Materials of ref. [22]). These landmarks determine a ‘discrete’ Procrustes distance between any two surfaces, which we refer to as *d*_{obs}. Finally, we note that we had previously compared *d*_{sd} with the continuous Procrustes distance *d*_{cP} on the same datasets and found that the former performs best [47].

### 5.1. A simple shape recognition application

Shape recognition is a natural application of the variational distance described here. As a simple illustration, we considered four surfaces of proximal metatarsal bones, two from baboons, Pu80771 and PuF in MT1, and two from lemurs, Lf170750 and Lf170755 in MT1. The two surfaces PuF (baboon A) and Lf170755 (lemur A) form a mini library of shapes, while the surfaces Pu80771 (baboon B) and Lf170755 (lemur B) are two test surfaces, which we refer to as ‘surface A’ and ‘surface B’. We studied the deformations of all combinations of test versus library surfaces by computing the minimum action paths between them. In figure 3, we show the evolution of the energy (as defined in equation (2.11)) along the corresponding paths. When the surface A is deformed onto the surfaces for baboon A and lemur A, the corresponding minimum actions *d*_{act} are 0.08 and 0.71, respectively, identifying correctly surface A as corresponding to a baboon. By contrast, surface B is closer to lemur B, and therefore corresponds to a lemur.

Each of the three distances (*d*_{act}, *d*_{sd} and *d*_{obs}) defines a matrix for each dataset, containing all pairwise distances between the surfaces included in that dataset. To measure the effectiveness of each distance, we compare those matrices in three different ways.

### 5.2. Correlations between the different distance measures

The distances considered here rank pairs of surfaces according to their similarity. The relative performances of *d*_{act} and *d*_{sd} with respect to the observer distance *d*_{obs} can then be computed using a correlation analysis. As correlation is searching for a linear relationship and the distances may have different dynamic ranges, we first coded those distances with a simple scheme. The distributions of distances are broken into 10 intervals of equal size (in terms of percent of the total number of distances), and the distance belonging to interval *i* is coded with the value *i*. Table 1 gives the corresponding correlation coefficients between the discretely coded distances.

The distances based on manual assignments of landmarks by morphometricians may be considered as a reference, since they are based on extensive expert knowledge, though they are not deemed perfect. Note that there is variability between morphometricians, though the correlations between the results of two such observers are high. On all three datasets, the geometric distance *d*_{sd} and the variational distance *d*_{act} correlate well with those reference distances.

In figure 4, we compare four (coded) distance matrices for the MT1 dataset.

In all four distance matrices, we clearly identify the separation between the bones of the simians (Primates) and the bones of the prosimians (Euprimates). The structures of the matrices within those two groups, however, differ for the four distance measures. Note that this difference also exists between the two observer distance matrices. The Primates group for example contain four species, with one of them only including one sample. The observer 2 distance matrix clearly highlights the presence of the three main groups, as well as identifies the last bone (corresponding to the unique sample) as being different. This separation is less clear in the observer 1 distance matrix. In the distance matrix based on *d*_{sd}, the primates are mostly organized into two groups, while the action-based distance matrix shows the three dominant subgroups of primates, but regroups the unique sample with the third subgroup. Overall however, the graphical representations of the four matrices confirm the high level of similarity observed in the correlation analysis (table 1).

### 5.3. Receiver operating characteristic analyses of the distance measures

For a second comparison of the three distance measures, we evaluated their performances using a receiver operating characteristic (ROC) analysis [63]. In this approach a ‘gold standard’ is defined, based on a choice of level in the phylogeny of the specimens, species, genus, family or superfamily. A pair of surfaces is then defined as similar, or ‘positive’, if the corresponding specimens belong to the same taxonomic group considered, and ‘negative’ otherwise. For varying thresholds of the distance measure under study, all pairs of surfaces whose distances fall below the threshold are then assumed positive, while those above it are deemed negative. The pairs that agree with the standard are called true positives (TP), while those that do not are false positives (FP). An ROC analysis compares the rate of TP (also called sensitivity) to the rate of FP (which corresponds to one minus the specificity). It is scored with the proportion of area under the corresponding curve (AUC). An AUC of 1 indicates that all TP are detected first. This corresponds to the ideal distance measure. On the other hand, the diagonal curve leads to an AUC of 0.5. In this case, TP and FP appear at the same rate, and the distance measure contains no information.

The results of ROC analyses based on the three distances *d*_{sd}, *d*_{act} and *d*_{obs} are given in table 2 and illustrated in figure 5.

Species or genera identification based on landmarks manually defined by experts is expected to perform best. Indeed, the ROC curves derived from the observer distances illustrate excellent classification results, with AUC values above 0.85 in all cases, above 0.9 for the metatarsal dataset, and above 0.95 for the teeth dataset. Note that even with human expertise included, the classification is not perfect. In addition, we observe differences between the results obtained by two distinct experts on the MT1 dataset. The two distance measures *d*_{sd} and *d*_{act} alleviate the difficulty of defining landmarks on the two surfaces to be compared. The distances they compute reflect either different geometric properties (as measured by *d*_{sd}), or differences in their elastic flexibilities (as measured by *d*_{act}). We find that the distance *d*_{act} introduced in this study performs as well as *d*_{sd} on all three datasets, at all phylogenic classification levels. Those two distances perform as well as the observer, landmark-based, distances, with differences that are of similar magnitude to the differences measured between distinct observers.

In such an ROC analysis, it is worth focusing on the small distances, as those are often the most reliable ones and the most relevant for applications. For each distance measure, we therefore computed a ROC50 value, computed as follows. For each of the first 50 false positives that are found, the number of true positives with a smaller distance is recorded. The sum of all of these numbers is then divided by the number of false positives (50), and finally divided by the total number of possible true positives in the dataset considered, thereby defining the ROC50. Results are given in table 3 for all three datasets. Like the AUC values, we note differences between the ROC50 values for the two observers on the MT1 dataset. The geometric distance and the variational distance perform similarly. The latter do seem to outperform both the observer and *d*_{sd} distances on the RADIUS dataset; this dataset however is the smallest, with only 45 samples, and the differences may not be statistically significant.

### 5.4. Classification of shapes

The ROC analysis ranks distances between specimens and assesses if this ranking is compatible with an existing classification; it does not perform the classification itself. We extend the ROC analysis to the actual problem of classification by performing a third set of computational experiments. Each experiment involves a dataset of surfaces/specimens, *D*, a taxon, *T*, and a distance measure, *d*. We begin by randomly dividing the sets of surfaces in *D* into two groups of approximately equal size. The first group serves as a training set to define the taxa, while the second group serves as a test set. A test surface is classified by assigning it to the taxon of its nearest neighbour in the training set. This is much akin to the fold recognition experiment illustrated in figure 3 and used in the protein structure prediction community [64]. The results are stored in a confusion matrix, *C*, whose element *C*(*i*, *j*) reports the number of test surfaces corresponding to specimens that belong to taxon *i* that have been classified as belonging to taxon *j*. The accuracy of the classifier *d* is then defined to be the ratio of the trace of the confusion matrix over the sum of all its elements (i.e. the percentage of correctly classified specimens). To remove the influence of the initial division of the dataset into test and training sets, the procedure is repeated 10 000 times. We performed these experiments for the three datasets, for the three distance measures and for different levels in the taxonomy of the specimens. The results are reported in table 4.

Classifications based on the three distances perform similarly on the three anatomical datasets. We note again a difference between the two observers on the first metatarsal dataset. These differences indicate the difficulties in defining consistent landmarks on anatomical surfaces even for experienced morphometricians.

A distance matrix that contains the results of an all-against-all comparison of all specimens included in a dataset can be turned into a tree using a variety of clustering techniques. We used the unweighted pair group method with arithmetic mean (UPGMA) method to build trees based on the two distance measures (*d*_{obs} and *d*_{act}) for the RADIUS dataset, as implemented in the software package PHYLIP [65]. UPMGA constructs a tree by minimizing the net disagreement between the matrix pairwise distances and the distances measured on the tree. Results are shown in figure 6. We note that those are phenetic trees, in opposition to phylogenetic trees that are built from molecular sequencing data.

The UPMGA tree based on *d*_{act} shows a high level of agreement with the actual phylogeny of the specimens considered. *Homo sapiens* and *Gorilla* are clearly well separated from the other species, forming pure clades. In addition, *Pongo* and *Pan* are well separated. Note that even the bonobos (pygmy chimpanzees) are regrouped within the clade for chimpanzees. The tree based on the observer distances shows more misassociations, with one chimpanzee showing within *Pongo*, and reversely one *Pongo* showing within the chimpanzees, and two gorillas separated from the main gorilla clade.

To quantify the differences between the trees generated from the observer (*d*_{obs}), action (*d*_{act}) and geometric (*d*_{sd}) distance matrices computed for the RADIUS dataset, we first rescaled those distance matrices so that all distances ranged between 0 and 1, and regenerated the UPMGA trees. The four trees are then compared using the TreeDist program from the software package PHYLIP. TreeDist computes the symmetric distance of Robinson & Foulds [67] to evaluate the similarity between two trees. We find that the distances between the observers' tree and the action-based and geometry-based trees are 72 and 78, respectively. While we cannot assess the meaning of the absolute values of these distances, and the significance of the differences between those values, we do notice that the observers' tree resembles most the tree computed with the action distance measure introduced here.

## 6. Discussion

We have developed a new method for measuring the similarities between two shapes. This method is designed to capture the complexities of deforming one of the shapes into the other. Specifically, we consider the generalization of the conventional notion of distance. We set the distance between two shapes to be the length of the geodesic that joins those two shapes, where the geodesic is computed as the minimizer of an Onsager–Machlup action functional. We have introduced an elastic energy for shapes represented by surface meshes, as well as a method for computing the minimizer of the corresponding action. The integral of the action along this path defines a distance that depends only on the geometry of the initial and final configurations of the shape considered. We have illustrated its use both on generic shapes with exact correspondences, and on sets of specific biological shapes on which we performed shape recognition and classification.

While data representing a shape come in many forms, we have focused on the specific case in which a shape is represented by its surface, which in turn is described with a discrete triangular mesh. With such a representation, computing a deformation between two conformations of a shape requires solving two steps, namely finding a parametrization (computing representations of the two conformations with exact correspondence between their vertices and mesh structures), and computing the deformation that leads to minimal effort, in our case minimal action. We note first that solving these two steps sequentially is not a necessity: the elastic shape analysis techniques developed by Laga and co-workers solve those steps simultaneously [42–44]. We believe, however, that those two steps include specific problems that require different treatments, and prefer to handle them separately. Our solutions to those problems rely on our own earlier work, first on the geometry of shapes for performing the parametrization, and on the generation of the minimal action path between two conformations of a biomolecule [40,41] for the deformation step. As such, it carries both their advantages and their limitations. With respect to parametrization, for example, our method relies on a conformal registration method that is described in [45–47]. We note that this method is currently limited to meshes representing surfaces of genus zero, with or without boundaries. Much work remains to be done for performing a parametrization that is not limited to a specific genus.

The method proposed to generate a path between two conformations of a shape is loosely derived from physics. We equate the deformation of a shape to a diffusive process. The probability of a trajectory for such a diffusive process can be computed as an integration over the trajectory of a functional, given by the Onsager–Machlup function. This argument can be reversed to find the trajectory with the highest probability, or equivalently of minimal action. This leads to a second order two-point boundary value ODE problem. With a special choice for the energy of a shape, those equations can be solved analytically. The elastic energy we have introduced that satisfies this condition is much akin to the energy of an ENM and as such suffers from the same drawbacks. First, it should be noted that it is really a geometry-based energy function, and not a physics-based energy function. As such, it lacks the ability to capture the properties of the surface, such as stretching and bending [34,51]. In addition, it does not account for steric hindrance, which may lead to self-intersection of the surface during the deformation. There is a second problem with the energy function that comes from our definition of the (simplified) energy surface for the space of conformations for the shape considered. By defining this energy surface as the lower envelope of the quadratic wells at two prescribed minima, namely the start and target conformations of the shape, we observe problems of smoothness at the crossing between the wells. This leads to the presence of a cusp in energy at the transition (figure 3), which in turn leads to unphysical deformations of the surfaces, as observed in figure 1 for both the cat and the dog near the transition. We are currently working on implementing solutions that circumvent those limitations, in line with our recent developments when solving for minimum action paths for biomolecules [41].

MeshPath is a method for solving the Euler–Lagrange equations corresponding to the constraint of minimum action along the deformation path for a shape. As it solves these equations of motion analytically, it is extremely efficient and directly amenable to study large systems, despite the fact that it relies on the computation of Hessians of the energy functions. We have shown applications on meshes that include tens of thousands of vertices (from the high-resolution dataset from TOSCA), with computing times of the order of thousands of seconds on a unique thread, and of the order of hundreds of seconds on multiple threads.

The main application of PathMesh presented in this paper falls in the field of geometric morphometry, using three datasets representing bones and teeth of primates. Experiments on these datasets have shown that it performs remarkably well both in shape recognition and in identifying evolutionary patterns. We note that while we show successful taxon recognition based on geometric information for one of the datasets considered (the RADIUS dataset, see figure 6), taxonomy is not the intended purpose of this method. Instead, we restrict its current applications to providing robust estimates of the distances between three-dimensional shapes, with one finality being to support phylogeny reconstruction. As the advent of computers and the developments of robust and fast scientific computing techniques to quantify phenotypes, there has been a wealth of studies attempting to provide a quantitative view of evolution in biology, starting with classification and taxonomy. The development of genomics in the last three decades has led many to question the relevance of geometrics morphometrics for such studies. Indeed, the assessments of genetic variations from automated genomic analyses have greatly improved our understanding of the relationships between the genotype and phenotype of individuals of many species (for a review, see [68]). Those relationships serve as ground for building the phylogeny of organisms, i.e. the history of their lineages as they change through time. As a consequence, phylogenetics is playing a central role in the study of evolution [69]. In comparison, it is much harder to come up with good models of the genetics that underlie morphological changes [70]. As a consequence, the utility of morphological data in phylogenic research has become increasingly questioned. As a response to the criticisms expressed against geometric morphometry, MacLeod *et al.* [71] and more recently Boyer *et al.* [72] have emphasized the need to automate and standardize morphological studies, starting with the determination of geometric correspondence between shapes. The method developed in this paper is one attempt towards this goal.

## Data accessibility

The TOSCA high-resolution dataset is available at http://tosca.cs.technion.ac.il. The biological datasets, representing three regions of the skeletal anatomy of a collection of primates, are available on Yaron Lipman's website (http://www.wisdom.weizmann.ac.il/~ylipman/CPsurfcomp/).

## Competing interests

I declare I have no competing interests.

## Funding

The Ministry of Education of Singapore is acknowledged for financial support through grant no. MOE2012-T3-1-008.

## Acknowledgments

I thank Yaron Lipman for making the data from Boyer *et al.* [22] freely available. I also thank the anonymous reviewers for their insightful comments and suggestions.

- Received January 17, 2017.
- Accepted April 24, 2017.

- © 2017 The Author(s)

Published by the Royal Society. All rights reserved.