Abstract
In the paper, a variant of the semismooth\(^{*}\) Newton method is developed for the numerical solution of generalized equations, in which the multi-valued part is a so-called SCD (subspace containing derivative) mapping. Under a rather mild regularity requirement, the method exhibits (locally) superlinear convergence behavior. From the main conceptual algorithm, two implementable variants are derived whose efficiency is tested via a generalized equation modeling a discretized static contact problem with Coulomb friction.
Similar content being viewed by others
1 Introduction
Consider the generalized equation (GE)
where \(f:\mathbb {R}^n\rightarrow \mathbb {R}^n\) is continuously differentiable and \(Q:\mathbb {R}^n\rightrightarrows \mathbb {R}^n\) is a set-valued mapping with a closed graph. For the numerical solution of (1.1) various methods are available, including the semismooth\(^{*}\) Newton method developed in [13]. In this method, the approximation/linearization of the multi-valued term in (1.1) is performed on the basis of either the graph of the respective strict derivative or the limiting coderivative. In each Newton step, one has to solve a linear square system with a non-singular matrix. The method thus differs both from the approach of Josephy [20, 21], where the multi-valued part is not approximated at all, and from the Newton-type methods in [1] and [8], where the multi-valued term is approximated in a different way.
In [14] the authors have shown that for a class of the so-called SCD (subspace containing derivatives) mappings the semismooth\(^{*}\) Newton can be improved. In particular, at these mappings, we dispose at each point with linear subspaces belonging to the graphs of the above-mentioned generalized derivatives, which generate the linear systems in the Newton step in a straightforward way. Moreover, the "regularity" requirement, needed to ensure the (locally) superlinear convergence, could have been substantially relaxed. In [15] this so-called SCD semismooth\(^{*}\) Newton method has been implemented in a class of variational inequalities (VIs) of the second kind that includes, among various problems of practical importance, also a class of discretized contact problems with Tresca friction ([16]). The very good performance of the new method, when applied to those problems, has led us to consider more complicated problems, in which Q does not amount to the subdifferential of a convex function. This clearly requires a generalization of the theory of [15]. As a concrete representative problem of this type, we have chosen a discretized static contact problem, where the Tresca friction is replaced by the (physically more realistic) Coulomb friction.
Starting with the pioneering paper [27] there are many papers and a comprehensive monograph [11] devoted to static, quasi-static and dynamic contact problems with Coulomb friction for various types of material of the bodies in contact. Concerning the static contact of two elastic bodies or an elastic body with a rigid obstacle, it is known [27] that this problem has a (not necessarily unique) solution whenever the friction coefficient belongs to the interval (0, b], where \(b > 0\) is a bound depending on the Poisson constant. This is a great difference from the corresponding discretized problems where, for small (mesh-dependent) values of the friction coefficient, one has to do with a unique solution. However, when the friction coefficient increases, discretized models may allow multiple solutions, as shown, e.g., in [26, 30],
For simplicity, we consider only the contact of an elastic body with a rigid obstacle (Signorini problem with friction). From the algebraic point of view, each two-body problem can be rewritten formally as a one-body problem, see [33]. Such problems can be modeled as quasi-variational inequalities (QVIs) expressed in terms of the so-called dual variables (having the physical meaning of stresses). This enables us to employ a variety of methods developed for the numerical solution of QVIs, cf., e.g., [18, Chapter 5], [12]. As shown in [23, 28], to these methods we can also count the classical semismooth Newton method [25, 29]. Another approach has been used in [2, 3], where a solution is computed as a fixed point of a mapping generated by solving the corresponding contact problems with the Tresca friction. These problems can be solved, for example, by a specially tailored minimization routine; see [24]. In this paper, we use a new discrete model formulated in displacements, which is obtained by a modification of the GE used in [3]. We thus work with a purely "primal" model, well suited for a direct application of the SCD semismooth\(^{*}\) Newton method.
The plan of the paper is as follows: In Sect. 2 we recall some standard notions from variational analysis, which are extensively used throughout the paper. Section 3 is devoted to the SCD mappings; we list their basic properties and provide some calculus rules, indispensable in the construction of the SCD semismooth\(^{*}\) Newton method in Sect. 4 and its subsequent implementation. In Sect. 4 we present the main conceptual algorithm which is thereafter, in Sect. 5, implemented to the numerical solution of GE (1.1). As a result, we obtain an efficient tool, applicable to a wide range of equilibrium models including VIs of the first and second kind, hemivariational inequalities [19] and many others. The first part of Sect. 6 is devoted to the construction of the new model of the discrete contact problem with Coulomb friction mentioned above (Section 6.1). In Section 6.2, it is shown that the multifunction, which arises in the respective GE, is an SCD mapping and possesses the semismooth\(^{*}\) property. This finally enables us to specialize the formulas for the approximation and Newton step, developed in Sect. 5, for the GE considered. The results of the numerical experiments are presented in Sect. 7.
Our notation is basically standard. Given a linear subspace \(L \subseteq \mathbb {R}^n\), \(L^\perp\) denotes its orthogonal complement. For an element \(u \in \mathbb {R}^n\), \(\Vert u\Vert\) denotes its Euclidean norm, \({\mathscr{B}}_\epsilon (u)\) denotes the closed ball around u with radius \(\epsilon\) and \(\mathbb {S}_{\mathbb {R}^n}\) stands for the unit sphere in \(\mathbb {R}^n\). For a matrix A, \(\mathrm{rge\;}A\) signifies its range. To avoid possible confusion, in some situations the dimension of a unit matrix I will be indicated by a subscript (\(I_n\)). Given a set \(\Omega \subset \mathbb {R}^s\), we define the distance from a point x to \(\Omega\) by \(d_{\Omega }(x):= \mathrm{dist}(x, \Omega ):= \inf \{\Vert y-x\Vert \,\big |\,y\in \Omega \}\), the respective indicator function is denoted by \(\delta _{\Omega }\) and \({\mathop {x \rightarrow \bar{x}}\limits ^{\Omega }}\) means convergence within \(\Omega\). When a mapping \(F: \mathbb {R}^n \rightarrow \mathbb {R}^m\) is differentiable at x, we denote by \(\nabla F(x)\) its Jacobian.
2 Preliminaries
Throughout the paper, we will frequently use the following basic notions of modern variational analysis.
Definition 2.1
Let A be a set in \(\mathbb {R}^{s}\), \(\bar{x} \in A\) and A be locally closed around \(\bar{x}\). Then
-
(i)
The tangent (contingent, Bouligand) cone to A at \(\bar{x}\) is given by
$$\begin{aligned} T_{A}(\bar{x}):=\mathop {\mathrm{Lim}\,\mathrm{sup}}\limits _{t\downarrow 0} \frac{A-\bar{x}}{t}. \end{aligned}$$A tangent \(u\in T_A(\bar{x})\) is called derivable if \(\lim _{t\downarrow 0}\mathrm{dist}(\bar{x}+tu,A)/t=0\). The set A is geometrically derivable at \(\bar{x}\) if every tangent vector u to A at \(\bar{x}\) is derivable.
-
(ii)
The set
$$\begin{aligned} \widehat{N}_{A}(\bar{x}):=(T_{A}(\bar{x}))^{\circ } \end{aligned}$$is the regular (Fréchet) normal cone to A at \(\bar{x}\), and
$$\begin{aligned}N_{A}(\bar{x}):=\mathop {\mathrm{Lim}\,\mathrm{sup}}\limits _{{\mathop {x \rightarrow \bar{x}}\limits ^{A}}} \widehat{N}_{A}(x)\end{aligned}$$is the limiting (Mordukhovich) normal cone to A at \(\bar{x}\).
In this definition ”\(\mathop {\mathrm{Lim}\,\mathrm{sup}}\)” stands for the Painlevé-Kuratowski outer (upper) set limit, see, e.g., [32]. The above listed cones enable us to describe the local behavior of set-valued maps via various generalized derivatives. Let \(F:\mathbb {R}^n\rightrightarrows \mathbb {R}^m\) be a (set-valued) mapping with the domain and the graph
Definition 2.2
Consider a (set-valued) mapping \(F:\mathbb {R}^n\rightrightarrows \mathbb {R}^m\) and let \(\mathrm {gph}\,F\) be locally closed around some \((\bar{x},\bar{y})\in \mathrm {gph}\,F\).
-
(i)
The multifunction \(DF(\bar{x},\bar{y}):\mathbb {R}^n\rightrightarrows \mathbb {R}^m\), given by \(\mathrm {gph}\,DF(\bar{x},\bar{y})=T_{\mathrm {gph}\,F}(\bar{x},\bar{y})\), is called the graphical derivative of F at \((\bar{x},\bar{y})\).
-
(ii)
The multifunction \(D^*F(\bar{x},\bar{y}): \mathbb {R}^m \rightrightarrows \mathbb {R}^n\), defined by
$$\begin{aligned} \mathrm {gph}\,D^*F(\bar{x},\bar{y})=\{(y^*,x^*)\,\big |\,(x^*,-y^*)\in N_{\mathrm {gph}\,F}(\bar{x},\bar{y})\}\end{aligned}$$is called the limiting coderivative of F at \((\bar{x},\bar{y})\).
Let us now recall the following regularity notions.
Definition 2.3
Let \(F:\mathbb {R}^n\rightrightarrows \mathbb {R}^m\) be a (set-valued) mapping and let \((\bar{x},\bar{y})\in \mathrm {gph}\,F\).
-
1.
F is said to be metrically subregular at \((\bar{x},\bar{y})\) if there exists a real \(\kappa \ge 0\) along with some neighborhood X of \(\bar{x}\) such that
$$\begin{aligned} \mathrm{dist}(x,F^{-1}(\bar{y}))\le \kappa \,\mathrm{dist}(\bar{y},F(x))\ \forall x\in X. \end{aligned}$$ -
2.
F is said to be strongly metrically subregular at \((\bar{x},\bar{y})\) if it is metrically subregular at \((\bar{x},\bar{y})\) and there exists a neighborhood \(X'\) of \(\bar{x}\) such that \(F^{-1}(\bar{y})\cap X'=\{\bar{x}\}\).
-
3.
F is said to be metrically regular around \((\bar{x},\bar{y})\) if there is some \(\kappa \ge 0\) together with neighborhoods X of \(\bar{x}\) and Y of \(\bar{y}\) such that
$$\begin{aligned} \mathrm{dist}(x,F^{-1}(y))\le \kappa \,\mathrm{dist}(y,F(x))\ \forall (x,y)\in X\times Y. \end{aligned}$$ -
4.
F is said to be strongly metrically regular around \((\bar{x},\bar{y})\) if it is metrically regular around \((\bar{x},\bar{y})\) and \(F^{-1}\) has a single-valued localization around \((\bar{y},\bar{y})\), i.e., there are open neighborhoods \(Y'\) of \(\bar{y}\), \(X'\) of \(\bar{x}\) and a mapping \(h:Y'\rightarrow \mathbb {R}^n\) with \(h(\bar{y})=\bar{x}\) such that \(\mathrm {gph}\,F\cap (X'\times Y')=\{(h(y),y)\,\big |\,y\in Y'\}\).
It is easy to see that the strong metric regularity around \((\bar{x},\bar{y})\) implies the strong metric subregularity at \((\bar{x},\bar{y})\) and the metric regularity around \((\bar{x},\bar{y})\) implies the metric subregularity at \((\bar{x},\bar{y})\). To check the metric regularity one often employs the so-called Mordukhovich criterion, see, e.g. [32, Theorem 9.43], according to which this property around \((\bar{x},\bar{y})\) is equivalent to the condition
Another useful characterization of metric regularity in terms of the graphical derivative is provided by the so-called Aubin-criterion by Dontchev et al [9]. For pointwise characterizations of the other stability properties from Definition 2.3, the reader is referred to [14, Theorem 2.7].
In this preparatory section, we end with a definition of the semismooth\(^{*}\) property, which paved the way for both the semismooth\(^{*}\) Newton method in [13] and the SCD semismooth\(^{*}\) Newton method in [14].
Definition 2.4
We say that \(F:\mathbb {R}^n\rightrightarrows \mathbb {R}^n\) is semismooth\(^{*}\) at \((\bar{x},\bar{y})\in \mathrm {gph}\,F\) if for every \(\epsilon >0\) there is some \(\delta >0\) such that the inequality
holds for all \((x,y)\in \mathrm {gph}\,F\cap {\mathscr{B}}_\delta (\bar{x},\bar{y})\) and all \((y^*,x^*)\) belonging to \(\mathrm {gph}\,D^*F(x,y)\).
3 On SCD mappings
3.1 Basic properties
In this section, we wish to recall the basic definitions and characteristics of the SCD property introduced in the recent paper [14].
In what follows, we denote by \(\mathcal{Z}_n\) the metric space of all n-dimensional subspaces of \(\mathbb {R}^{2n}\) equipped with the metric
where \(P_{L_i}\) is the symmetric \(2n\times 2n\) matrix representing the orthogonal projection onto \(L_i\), \(i=1,2\).
Sometimes, we will also work with bases for the subspaces \(L\in \mathcal{Z}_n\). Let \({\mathscr{M}}_n\) denote the collection of all \(2n\times n\) matrices with full column rank n and define for \(L\in \mathcal{Z}_n\) the set
i.e., the columns of \(Z\in {\mathscr{M}}(L)\) create a basis for L.
We treat every element of \(\mathbb {R}^{2n}\) as a column vector. To keep the notation simple, we write (u, v) instead of \(\begin{pmatrix}u\\ v\end{pmatrix}\in \mathbb {R}^{2n}\) when this does not cause confusion.
Let \(L\in \mathcal{Z}_n\) and consider \(Z\in {\mathscr{M}}(L)\). Then we can divide Z into two \(n\times n\) matrices A and B and write \(Z=(A,B)\) instead of \(Z=\begin{pmatrix}A\\ B\end{pmatrix}\). It follows that \(\mathrm{rge\;}(A,B):=\{(Au,Bu)\,\big |\,u\in \mathbb {R}^n\}\doteq \{\begin{pmatrix}Au\\ Bu\end{pmatrix}\,\big |\,u\in \mathbb {R}^n\}=L\).
Furthermore, for every \(L\in \mathcal{Z}_n\) we can define the adjoint space
It can be shown that \((L^*)^*=L\) and \(d_{\mathcal{Z}_n}(L_1,L_2)=d_{\mathcal{Z}_n}(L_1^*,L_2^*)\). Thus, the mapping \(L\rightarrow L^*\) defines an isometry on \(\mathcal{Z}_n\).
Definition 3.1
Consider a mapping \(F:\mathbb {R}^n\rightrightarrows \mathbb {R}^n\).
-
1.
We call F graphically smooth of dimension n at \((x,y)\in \mathrm {gph}\,F\), if \(T_{\mathrm {gph}\,F}(x,y)=\mathrm {gph}\,DF(x,y)\in \mathcal{Z}_n\). In addition, we denote by \(\mathcal{O}_F\) the set of all points where F is graphically smooth of dimension n.
-
2.
We associate with F the four mappings \(\widehat{\mathscr{S}}F:\mathrm {gph}\,F\rightrightarrows \mathcal{Z}_n\), \(\widehat{\mathscr{S}}^* F:\mathrm {gph}\,F\rightrightarrows \mathcal{Z}_n\), \({\mathscr{S}}F:\mathrm {gph}\,F\rightrightarrows \mathcal{Z}_n\), \({\mathscr{S}}^* F:\mathrm {gph}\,F\rightrightarrows \mathcal{Z}_n\), given by
$$\begin{aligned}\widehat{\mathscr{S}}F(x,y)&:={\left\{ \begin{array}{ll}\{\mathrm {gph}\,DF(x,y)\}&{} \text{ if }\quad (x,y)\in \mathcal{O}_F,\\ \emptyset &{}\text{ else, }\end{array}\right. }\\ \widehat{\mathscr{S}}^* F(x,y)&:={\left\{ \begin{array}{ll}\{\mathrm {gph}\,DF(x,y)^*\}&{} \text{ if }\quad (x,y)\in \mathcal{O}_F,\\ \emptyset &{}\text{ else, }\end{array}\right. }\\ {\mathscr{S}}F(x,y)&:=\mathop {\mathrm{Lim}\,\mathrm{sup}}_{(u,v)\mathop {\longrightarrow }\limits ^{{\mathrm {gph}\,F}}(x,y)} \widehat{\mathscr{S}}F(u,v) \\&=\{L\in \mathcal{Z}_n\,\big |\,\exists (x_k,y_k)\mathop {\longrightarrow }\limits ^{{\mathcal{O}_F}}(x,y):\ \lim _{k\rightarrow \infty } d_{\mathcal{Z}_n}(L,\mathrm {gph}\,DF(x_k,y_k))=0\},\\ {\mathscr{S}}^* F(x,y)&:=\mathop {\mathrm{Lim}\,\mathrm{sup}}_{(u,v)\mathop {\longrightarrow }\limits ^{{\mathrm {gph}\,F}}(x,y)} \widehat{\mathscr{S}}^* F(u,v)\\&=\{L\in \mathcal{Z}_n\,\big |\,\exists (x_k,y_k)\mathop {\longrightarrow }\limits ^{{\mathcal{O}_F}}(x,y):\ \lim _{k\rightarrow \infty } d_{\mathcal{Z}_n}(L,\mathrm {gph}\,DF(x_k,y_k)^*)=0\}. \end{aligned}$$ -
3.
We say that F has the SCD (subspace containing derivative) property at \((x,y)\in \mathrm {gph}\,F\), if \({\mathscr{S}}^*F(x,y)\not =\emptyset\). F is said to have the SCD property around \((x,y)\in \mathrm {gph}\,F\), if there is a neighborhood W of (x, y) such that F has the SCD property at every \((x',y')\in \mathrm {gph}\,F\cap W\). Finally, we call F an SCD mapping if F has the SCD property at every point of its graph.
Since \(L\rightarrow L^*\) is an isometry on \(\mathcal{Z}_n\) and \((L^*)^*=L\), the mappings \({\mathscr{S}}^*F\) and \({\mathscr{S}}F\) are related through
The name SCD property is motivated by the following statement.
Lemma 3.2
(cf.[14, Lemma 3.7]) Let \(F:\mathbb {R}^n\rightrightarrows \mathbb {R}^n\) and let \((x,y)\in \mathrm {gph}\,F\). Then \(L\subseteq \mathrm {gph}\,D^*F(x,y)\) \(\forall L\in {\mathscr{S}}^*F(x,y)\).
In the recent paper [14] one can find several calculus rules to work with the SCD property including the next result.
Proposition 3.3
(cf.[14, Proposition 3.15]) Let \(F:\mathbb {R}^n\rightrightarrows \mathbb {R}^n\) have the SCD property at \((x,y)\in \mathrm {gph}\,F\) and let \(h:U\rightarrow \mathbb {R}^n\) be continuously differentiable at \(x\in U\) where \(U\subseteq \mathbb {R}^n\) is open. Then \(F+h\) has the SCD property at \((x,y+h(x))\) and
Note that these sum rules are also valid at the points \((x,y)\in \mathrm {gph}\,F\) where F does not have the SCD property: In this case, we simply have \({\mathscr{S}}(F+h)(x,y+h(x))={\mathscr{S}}^*(F+h)(x,y+h(x))={\mathscr{S}}F(x,y)={\mathscr{S}}^* F(x,y)=\emptyset\). In addition, we will need some calculus rules for the Cartesian product of mappings. Consider the mapping \(F:\prod _{i=1}^p\mathbb {R}^{n_i}\rightrightarrows \prod _{i=1}^p\mathbb {R}^{n_i}\) defined by
where each multifunction \(F_i:\mathbb {R}^{n_i}\rightrightarrows \mathbb {R}^{n_i}\), \(i=1,\ldots ,p\), has a closed graph. Note that
where the first equation follows from the identity
together with [32, Exercise 6.7] and the inclusion (3.7) is a consequence of [32, Proposition 6.41].
Lemma 3.4
Let F be given by (3.5) and let \((x,y):=\big ((x_1,\ldots ,x_p),(y_1,\ldots ,y_p)\big )\in \mathrm {gph}\,F\). Then we have the following:
-
1.
If \((x,y)\in \mathcal{O}_F\) then each tangent cone \(T_{\mathrm {gph}\,F_i}(x_i,y_i)\), \(i=1,\ldots ,p\), is a subspace of \(\mathbb {R}^{2n_i}\).
-
2.
On the contrary, if \((x_i,y_i)\in \mathcal{O}_{F_i}\), \(i=1,\ldots ,p\), and all but at most one of the sets \(\mathrm {gph}\,F_i\), \(i=1,\ldots ,p\), are geometrically derivable at \((x_i,y_i)\), then \((x,y)\in \mathcal{O}_F\).
Proof
In order to prove the first assertion, let \((x,y)\in \mathcal{O}_F\), let \(i\in \{1,\ldots ,p\}\) be arbitrarily fixed and consider the set
Then it readily follows from Definition 2.1(i) that \(L_i\) is a subset of \(T:=T_{\mathrm {gph}\,F_1\times \ldots \times \mathrm {gph}\,F_p}\big ((x_1,y_1),\ldots ,(x_p,y_p)\big )\), which is a subspace by (3.6). Consider two tangents \(t_1,t_2\in T_{\mathrm {gph}\,F_i}(x_i,y_i)\) together with two scalars \(\mu _1,\mu _2\). Since \((0_{m_1},t_j,0_{m_2})\in L_i\subset T\), \(j=1,2\), we conclude
and from (3.7) we deduce \(\mu _1t_1+\mu _2t_2\in T_{\mathrm {gph}\,F_i}(x_i,y_i)\). This proves that \(T_{\mathrm {gph}\,F_i}(x_i,y_i)\) is a subspace.
The second statement follows from the fact that under the stated assumption, inclusion (3.7) holds with equality, cf. [17, Proposition 1].
Proposition 3.5
Let F be given by (3.5) and assume that all the mappings \(F_i\), \(i=1,\ldots ,p\), are SCD mappings. If all, but at most one, of the mappings \(F_i\), \(i=1,\ldots ,p\), have the property that \(\mathrm {gph}\,F_i\) is geometrically derivable at every point \((x_i,y_i)\in \mathcal{O}_{F_i}\), then F is an SCD mapping, and
Moreover, for every \((x,y)=\big ((x_1,\ldots ,x_p),(y_1,\ldots ,y_p)\big )\in \mathrm {gph}\,F\) there holds
The equality holds in the inclusions (3.8), (3.9) and (3.10) if, in addition, all but at most one of the mappings \(F_i\), \(i=1,\ldots ,p\), have the following property: For every \((x_i,y_i)\in \mathrm {gph}\,F_i\) such that \(T_{\mathrm {gph}\,F_i}(x_i,y_i)\) is a subspace, the dimension of this subspace is \(n_i\).
Proof
The inclusions (3.8) and (3.9) follow immediately from Lemma 3.4. Since \(F_i\), \(i=1,\ldots , p\), are SCD mappings, \(\mathcal{O}_{F_i}\) is dense in \(\mathrm {gph}\,F_i\) and from (3.8) we conclude that \(\mathcal{O}_F\) is dense in \(\mathrm {gph}\,F\). This proves that F is an SCD mapping. Now consider subspaces \(L_i\in {\mathscr{S}}F_i(x_i,y_i)\), \(i=1,\ldots ,p\). By taking into account the relation
the inclusion (3.10) follows from (3.9) and (3.2). The statement about equality in (3.8) is again a consequence of Lemma 3.4 implying equality in (3.9) and (3.10).
The following large class of graphically Lipschitzian mappings covers many mappings important in applications; cf. [31], and is also important in the context of SCD mappings.
Definition 3.6
(cf.[32, Definition 9.66]) A mapping \(F:\mathbb {R}^n\rightrightarrows \mathbb {R}^m\) is graphically Lipschitzian of dimension d at \((\bar{x},\bar{y})\in \mathrm {gph}\,F\) if there is an open neighborhood W of \((\bar{x},\bar{y})\) and a one-to-one mapping \(\Phi\) from W onto an open subset of \(\mathbb {R}^{n+m}\) with \(\Phi\) and \(\Phi ^{-1}\) continuously differentiable, such that \(\Phi (\mathrm {gph}\,F\cap W)\) is the graph of a Lipschitz continuous mapping \(f:U\rightarrow \mathbb {R}^{n+m-d}\), where U is an open set in \(\mathbb {R}^d\).
Every multifunction \(F:\mathbb {R}^n\rightrightarrows \mathbb {R}^n\) that is graphically Lischitzian of dimension n at some point \((x,y)\in \mathrm {gph}\,F\), has the SCD property around (x, y) by [14, Proposition 3.17]. We now state another property related to Proposition 3.5.
Lemma 3.7
Let \(F:\mathbb {R}^n\rightrightarrows \mathbb {R}^n\) be graphically Lipschitzian of dimension n at \((\bar{x},\bar{y})\in \mathrm {gph}\,F\). Then there is an open neighborhood W of \((\bar{x},\bar{y})\) such that for all \((x,y)\in \mathrm {gph}\,F\cap W\) the following properties hold:
-
(i)
If \((x,y)\in \mathcal{O}_F\) then \(\mathrm {gph}\,F\) is geometrically derivable at (x, y).
-
(ii)
If \(T_{\mathrm {gph}\,F}(x,y)\) is a subspace, then the dimension of this subspace is n.
Proof
Regarding property (i), we refer to [14, Remark 3.18] and [32, Proposition 8.41]. To show the second statement, let W, \(\Phi\), U and f be as in Definition 3.6 and consider \((x,y)\in \mathrm {gph}\,F\cap W\) such that \(T_{\mathrm {gph}\,F}(x,y)\) is a subspace. Denoting \((u,f(u))= \Phi (x,y)\), we obtain that
is a subspace with the same dimension as \(T_{\mathrm {gph}\,F}(x,y)\). Therefore, the graphical derivative Df(u, f(u)) is a linear mapping and f is Fréchet differentiable at u by [32, Exercise 9.25]. Since \(T_{\mathrm {gph}\,f}(u,f(u))=\mathrm{rge\;}(I,\nabla f(u))\), the dimension of \(T_{\mathrm {gph}\,f}(u,f(u))\) is n, which is the same as the dimension of \(T_{\mathrm {gph}\,F}(x,y)\).
Next, we turn to the notion of SCD regularity.
Definition 3.8
-
1.
We denote by \(\mathcal{Z}_n^\mathrm{reg}\) the collection of all subspaces \(L\in \mathcal{Z}_n\) such that
$$\begin{aligned} (y^*,0)\in L\ \Rightarrow \ y^*=0. \end{aligned}$$ -
2.
A mapping \(F:\mathbb {R}^n\rightrightarrows \mathbb {R}^n\) is called SCD regular around \((x,y)\in \mathrm {gph}\,F\), if F has the SCD property around (x, y) and
$$\begin{aligned} (y^*,0)\in L \Rightarrow \ y^*=0\ \forall L\in {\mathscr{S}}^*F(x,y), \end{aligned}$$(3.11)i.e., \(L\in \mathcal{Z}_n^\mathrm{reg}\) for all \(L\in {\mathscr{S}}^*F(x,y)\). Further, we will denote by
$$\begin{aligned}\mathrm{scd\,reg\;}F(x,y):=\sup \{\Vert y^*\Vert \,\big |\,(y^*,x^*)\in L, L\in {\mathscr{S}}^*F(x,y), \Vert x^*\Vert \le 1\}\end{aligned}$$the modulus of SCD regularity of F around (x, y).
Since the elements of \({\mathscr{S}}^*F(x,y)\) are contained in \(\mathrm {gph}\,D^*F(x,y)\), it follows from the Mordukhovich criterion (2.1) that SCD regularity is weaker than metric regularity, and consequently SCD regularity is also weaker than strong metric regularity.
In the following proposition, we state some basic properties of subspaces \(L\in \mathcal{Z}_n^\mathrm{reg}\).
Proposition 3.9
(cf.[14, Proposition 4.2]) We have \(L\in \mathcal{Z}_n^\mathrm{reg}\) if and only if for every \((A,B)\in {\mathscr{M}}(L)\) the matrix B is not singular. Thus, for every \(L\in \mathcal{Z}_n^\mathrm{reg}\) there is a unique \(n\times n\) matrix \(C_L\) such that \(L=\mathrm{rge\;}(C_L,I)\). Further, \(L^*=\mathrm{rge\;}(C_L^T,I)\in \mathcal{Z}_n^\mathrm{reg}\),
and
Note that for every \(L\in \mathcal{Z}_n^{\mathrm{reg\,}}\) there is \(C_L=AB^{-1}\) for all \((A,B)\in {\mathscr{M}}(L)\). Combining [14, Equation (34), Lemma 4.7 and Proposition 4.8] we obtain the following lemma.
Lemma 3.10
Assume that \(F:\mathbb {R}^n\rightrightarrows \mathbb {R}^n\) is SCD regular around \((\bar{x},\bar{y})\in \mathrm {gph}\,F\). Then
Moreover, F is SCD regular around every \((x,y)\in \mathrm {gph}\,F\) sufficiently close to \((\bar{x},\bar{y})\) and
4 On semismooth* Newton methods for SCD mappings
In this section we recall the general framework for the semismooth\(^{*}\) Newton method introduced in [13] and adapted to SCD mappings in [14]. Consider the inclusion
where \(F:\mathbb {R}^n\rightrightarrows \mathbb {R}^n\) is a mapping having the SCD property around some point \((\bar{x},0)\in \mathrm {gph}\,F\).
The following notion relaxes the semismooth\(^{*}\) property from Definition 2.4.
Definition 4.1
We say that \(F:\mathbb {R}^n\rightrightarrows \mathbb {R}^n\) is SCD semismooth\(^{*}\) at \((\bar{x},\bar{y})\in \mathrm {gph}\,F\) if F has the SCD property around \((\bar{x},\bar{y})\) and for every \(\epsilon >0\) there is some \(\delta >0\) such that the inequality
holds for all \((x,y)\in \mathrm {gph}\,F\cap {\mathscr{B}}_\delta (\bar{x},\bar{y})\) and all \((y^*,x^*)\) belonging to any \(L\in {\mathscr{S}}^*F(x,y)\).
Clearly, every mapping with the SCD property around \((\bar{x},\bar{y}) \in \mathrm {gph}\,F\) which is semismooth\(^{*}\) at \((\bar{x},\bar{y})\) is automatically SCD semismooth\(^{*}\) at \((\bar{x},\bar{y})\). Therefore, the class of SCD semismooth\(^{*}\) mappings is even richer than the class of semismooth\(^{*}\) maps. In particular, it follows from [22, Theorem 2] that every mapping whose graph is a closed subanalytic set is SCD semismooth\(^{*}\) , cf. [14].
The following proposition provides the key estimate for the semismooth\(^{*}\) Newton method for SCD mappings.
Proposition 4.2
([14, Proposition 5.3]) Assume that \(F:\mathbb {R}^n\rightrightarrows \mathbb {R}^n\) is SCD semismooth\(^{*}\) at \((\bar{x},\bar{y})\in \mathrm {gph}\,F\). Then for every \(\epsilon >0\) there is some \(\delta >0\) such that the estimate
holds for every \((x,y)\in \mathrm {gph}\,F\cap {\mathscr{B}}_\delta (\bar{x},\bar{y})\) and every \(L\in {\mathscr{S}}^*F(x,y)\cap \mathcal{Z}_n^\mathrm{reg}\).
We now describe the SCD variant of the semismooth\(^{*}\) Newton method. Given a solution \(\bar{x}\in F^{-1}(0)\) of (4.1) and some positive scalar, we define the mappings \(\mathcal{A}_{\eta ,\bar{x}}:\mathbb {R}^n\rightrightarrows \mathbb {R}^n\times \mathbb {R}^n\) and \(\mathcal{N}_{\eta ,\bar{x}}:\mathbb {R}^n\rightrightarrows \mathbb {R}^n\) by
Proposition 4.3
([15, Proposition 4.3]) Assume that F is SCD semismooth\(^{*}\) in \((\bar{x},0) \in \mathrm {gph}\,F\) and SCD regular around \((\bar{x},0)\) and let \(\eta >0\). Then there is some \(\bar{\delta }>0\) such that for every \(x\in {\mathscr{B}}_{\bar{\delta }}(\bar{x})\) the mapping F is SCD regular around every point \((\hat{x},\hat{y})\in \mathcal{A}_{\eta ,\bar{x}}(x)\). Furthermore, for every \(\epsilon >0\) there is some \(\delta \in (0,\bar{\delta }]\) such that
Assuming that we are given some iterate \(x^{(k)}\), the next iterate is given formally by \({x}^{(k+1)}\in \mathcal{N}_{\eta ,\bar{x}}({x}^{(k)})\). Let us take a closer look at this rule. Since we are dealing with set-valued mappings F, we cannot expect, in general, that \(F(x^{(k)})\not =\emptyset\) or that 0 is close to \(F(x^{(k)})\), even if \(x^{(k)}\) is close to a solution \(\bar{x}\). Therefore, we first perform some step that produces \((\hat{x}^{(k)},\hat{y}^{(k)})\in \mathrm {gph}\,F\) as an approximate projection of \((x^{(k)},0)\) onto \(\mathrm {gph}\,F\). We require that
for some constant \(\eta >0\), i.e., \(({\hat{x}}^{(k)},{\hat{y}}^{(k)})\in \mathcal{A}_{\eta ,\bar{x}}({x}^{(k)})\). For instance, if
is true with some \(\beta \ge 1\), then
and thus (4.2) holds with \(\eta =\beta +1\) and we can fulfill the inequality (4.2) without knowing the solution \(\bar{x}\). Furthermore, we require that \({\mathscr{S}}^*F(\hat{x}^{(k)},\hat{y}^{(k)})\cap \mathcal{Z}_n^\mathrm{reg}\not =\emptyset\) and compute the new iterate as \(x^{(k+1)}=\hat{x}^{(k)}-C_L^T\hat{y}^{(k)}\) for some \(L\in {\mathscr{S}}^*F(\hat{x}^{(k)},\hat{y}^{(k)})\cap \mathcal{Z}_n^\mathrm{reg}\). In fact, in our numerical implementation we will not compute the matrix \(C_L\), but two \(n\times n\) matrices A, B such that \(L=\mathrm{rge\;}(B^T,A^T)\). The next iterate \(x^{(k+1)}\) is then obtained by \(x^{(k+1)}=\hat{x}^{(k)}+\Delta x^{(k)},\) where \(\Delta x^{(k)}\) is a solution of the system \(A\Delta x=-B\hat{y}^{(k)}\). Alternatively, in view of Proposition 3.9, we can also choose a subspace \(L\in {\mathscr{S}}F(\hat{x}^{(k)},\hat{y}^{(k)})\cap \mathcal{Z}_n^\mathrm{reg}\) and compute the Newton direction as \({\Delta x}^{(k)}=-C_L{\hat{y}}^{(k)}\), that is, given \((A,B)\in {\mathscr{M}}(L)\) we have \({\Delta x}^{(k)}=-Ap\) where p solves \(Bp={y}^{(k)}\).
This leads to the following conceptual algorithm.
For the choice between the two approaches to calculate the Newton direction, it is important to consider whether an element from \({\mathscr{S}}^*F(\hat{x}^{(k)},\hat{y}^{(k)})\) or from \({\mathscr{S}}F(\hat{x}^{(k)},\hat{y}^{(k)})\) is easier to compute.
For this algorithm, locally superlinear convergence follows from Proposition 4.3, see also [14, Corollary 5.6].
Theorem 4.4
Assume that F is SCD semismooth\(^{*}\) at \((\bar{x},0) \in \mathrm {gph}\,F\) and SCD regular around \((\bar{x},0)\). Then for every \(\eta >0\) there is a neighborhood U of \(\bar{x}\) such that for every starting point \(x^{(0)}\in U\) Algorithm 1 is well defined and stops after finitely many iterations at a solution of (4.1) or produces a sequence \(x^{(k)}\) that superlinearly converges to \(\bar{x}\) for any choice of \((\hat{x}^{(k)},\hat{y}^{(k)})\) satisfying (4.2) and any \(L^{(k)}\in {\mathscr{S}}^*F(\hat{x}^{(k)},\hat{y}^{(k)})\) in Step 4.a) and any \(L^{(k)}\in {\mathscr{S}}F(\hat{x}^{(k)},\hat{y}^{(k)})\) in Step 4.b).
In particular, if F is strongly metrically regular around \((\bar{x},0)\), then F has the SCD property around \((\bar{x},0)\) by [14, Corollary 3.19] and it is also SCD regular around \((\bar{x},0)\) as pointed out in the previous section. Therefore, if F also happens to be SCD semismooth\(^{*}\) around \((\bar{x},0)\), then the assumptions of the above statement are fulfilled.
Note that for an implementation of the Newton step, we need not know the whole derivative \({\mathscr{S}}^*F(\hat{x}^{(k)},\hat{y}^{(k)})\) (or \({\mathscr{S}}F(\hat{x}^{(k)},\hat{y}^{(k)})\)) but only one element \(L^{(k)}\).
5 On the implementation of the SCD semismooth* Newton method
When trying to implement the SCD semismooth\(^{*}\) Newton method directly for (1.1), it turns out that it can be rather difficult to perform the approximation step. Hence, we consider another equivalent approach which is more flexible. Consider an equivalent reformulation of (1.1) by the (decoupled) GE
in variables \((x,d)\in \mathbb {R}^n\times \mathbb {R}^n\). Obviously, \(0\in H(\bar{x})\) holds if and only if \((0,0)\in G(\bar{x},\bar{x})\).
The new variable d acts only as an auxiliary variable. The approximation step now reads as follows: Given \({x}^{(k)}\) close to a solution \(\bar{x}\) (and arbitrary \({d}^{(k)}\), for example, \({d}^{(k)}={x}^{(k)}\)), set \({\hat{x}}^{(k)}:={x}^{(k)}\) and find a point \({\hat{d}}^{(k)}\) close to \({x}^{(k)}\) such that \(\mathrm{dist}(0,f({x}^{(k)})+Q({\hat{d}}^{(k)}))\) is small. An approach to solving this problem could be to rewrite GE (1.1) in fixed point form \(x\in T(x)\) and select \({\hat{d}}^{(k)}\in T({x}^{(k)})\). For example, for any \(\lambda >0\) there is
If we choose \({\hat{d}}^{(k)}\in (I+\lambda Q)^{-1}\big ({x}^{(k)}-\lambda f({x}^{(k)})\big )\), we have \({x}^{(k)}-\lambda f({x}^{(k)})\in {\hat{d}}^{(k)}+\lambda Q({\hat{d}}^{(k)})\) and
follows. In order to show that this approach is feasible as an approximation step, we have to verify that a bound of the form (4.2) holds, at least for \({x}^{(k)}\) close to \(\bar{x}\).
Proposition 5.1
Let \(\bar{x}\) be a solution of (1.1) and assume that there is some \(\lambda >0\) such that the resolvent \((I+\lambda Q)^{-1}\) has a single-valued Lipschitz continuous localization S around \(\bar{x}-\lambda f(\bar{x})\) for \(\bar{x}\), i.e., there are neighborhoods V of \(\bar{x}-\lambda f(\bar{x})\) and U of \(\bar{x}\) such that \(S:V\rightarrow U\) is Lipschitz continuous and \((I+\lambda Q)^{-1}(z)\cap U =\{S(z)\}\) for \(z\in V\). Then there is some \(\delta >0\) and some \(\eta >0\) such that for every \(x\in {\mathscr{B}}_\delta (\bar{x})\) there holds \(x-\lambda f(x)\in V\) and the vectors \(\hat{d}:=S(x-\lambda f(x))\), \(\hat{y}:=(\frac{1}{\lambda }(x-\hat{d}), x-\hat{d})\in \mathrm {gph}\,G(x,\hat{d})\) satisfy the estimate
Proof
Choose \(\delta >0\) such that \({\mathscr{B}}_\delta (\bar{x})\subseteq (I-\lambda f)^{-1}(V)\) and let \(L_f, L_S>0\) denote the moduli of Lipschitz continuity of f on \({\mathscr{B}}_\delta (\bar{x})\) and of S on V, respectively. Consider \(x\in {\mathscr{B}}_\delta (\bar{x})\). Then, by construction, \(x-\lambda f(x)\in V\) and hence \(\hat{d}:=S(x-\lambda f(x))\) is well defined. Further, by (5.2), we have \(\bar{x}=S(\bar{x}-\lambda f(\bar{x}))\) implying
In addition we have \(x-\lambda f(x)\in \hat{d}+\lambda Q(\hat{d})\) and \(\hat{y}\in \mathrm {gph}\,G(x,\hat{d})\) follows. Since \(\Vert \hat{y}\Vert \le (1+1/\lambda )\Vert \hat{d}-x\Vert \le (1+1/\lambda )(\Vert \hat{d}-\bar{x}\Vert +\Vert x-\bar{x}\Vert )\), we obtain
and the assertion follows.
In particular, if Q is a maximal hypomonotone mapping, i.e., there exists some \(\gamma \ge 0\) such that \(\gamma I+Q\) is maximal monotone, then for every \(0<\lambda <1/\gamma\) the mapping \((I+\lambda Q)\) is strongly monotone and hence \((I+\lambda Q)^{-1}\) is a single-valued Lipschitz continuous function on \(\mathbb {R}^n\), cf. [32, Proposition 12.54]. However, hypomonotonicity is only a sufficient condition ensuring that \((I+\lambda Q)^{-1}\) has this property. In Sect. 6 we will encounter a non-hypomonotone mapping \(\tilde{Q}\), such that \((I+\lambda \tilde{Q})^{-1}\) is single-valued and Lipschitz continuous for every \(\lambda >0\).
Note that the choice \(\hat{d}\in (I+\lambda Q)^{-1}\big (x-\lambda f(x)\big )\) corresponds to one step of the so-called Forward-Backward method for solving (1.1).
In the next proposition, we summarize some properties of G.
Proposition 5.2
-
(i)
For every \(x\in \mathbb {R}^n\) and \((d,z)\in \mathrm {gph}\,Q\) we have
$$\begin{aligned}{\mathscr{S}}G\big ((x,d),(f(x)+z,x-d)\big )&=\left\{ \mathrm{rge\;}\left[ \left( \begin{matrix}I&{}0\\ 0&{}X\end{matrix}\right) , \left( \begin{matrix}\nabla f(x)&{}Y\\ I&{}-X\end{matrix}\right) \right] \,\bigg |\,\mathrm{rge\;}(X,Y)\in {\mathscr{S}}Q(d,z)\right\} ,\\ {\mathscr{S}}^* G\big ((x,d),(f(x)+z,x-d)\big )&=\left\{ \mathrm{rge\;}\left[ \left( \begin{matrix}Y^*&{}0\\ 0&{}I\end{matrix}\right) , \left( \begin{matrix}\nabla f(x)^TY^*&{}I\\ X^*&{}-I\end{matrix}\right) \right] \,\bigg |\,\mathrm{rge\;}(Y^*,X^*)\in {\mathscr{S}}^* Q(d,z)\right\} . \end{aligned}$$ -
(ii)
Let \(x\in \mathbb {R}^n\) and assume that Q has the SCD property around \((d,z)\in \mathrm {gph}\,Q\). Then the following statements are equivalent:
-
(a)
G is SCD regular around \(\big ((x,d),( f(x)+z,x-d)\big )\).
-
(b)
For every \(L\in {\mathscr{S}}^* Q(d,z)\) and every \((X,Y)\in {\mathscr{M}}(L)\) the matrix \(\nabla f(x)X+Y\) is nonsingular.
-
(c)
For every \(L\in {\mathscr{S}}^* Q(d,z)\) and every \((Y^*,X^*)\in {\mathscr{M}}(L)\) the matrix \(\nabla f(x)^TY^*+X^*\) is nonsingular.
-
(a)
-
(iii)
The mapping H is SCD regular around \((\bar{x},0)\) if and only if G is SCD regular around \(\big ((\bar{x},\bar{x}),(0,0))\).
Proof
G has the representation \(G(x,d)=h(x,d)+\tilde{Q}(x,d)\) with
Since \(\mathrm {gph}\,D\tilde{Q}\big ((x,d),(z,0)\big )=\{\big ((u,e), (v,0)\big )\,\big |\,(e,v)\in \mathrm {gph}\,DQ(d,z)\}\), we obtain \(\mathcal{O}_{\tilde{Q}}=\mathbb {R}^n\times \mathcal{O}_Q\times \{0\}\). For every \((d,z)\in \mathcal{O}_Q\) and every \(x\in \mathbb {R}^n\) the orthogonal projection onto \(\tilde{L}:=\mathrm {gph}\,D\tilde{Q}\big ((x,d),(z,0)\big )=\{\big ((u,e), (v,0)\big )\,\big |\,(e,v)\in \mathrm {gph}\,DQ(d,z)\}\) is represented by the matrix
where \(P_L\) corresponds to the orthogonal projection onto the subspace \(L:=\mathrm {gph}\,DQ(d,z)\). Hence, for every \((d,z)\in \mathrm {gph}\,Q\) and every \(x\in \mathbb {R}^n\) we obtain
and from Proposition 3.3 we conclude
Similarly, we have
yielding, together with Proposition 3.3,
By virtue of (i) and the definition of SCD regularity, G is SCD regular around \(\big ((x,d),(f(x)+z,x-d)\big )\) if and only if for every pair X, Y with \(\mathrm{rge\;}(X,Y)\in {\mathscr{S}}Q(d,z)\) the matrix
is nonsingular. By the representation above, this holds if and only if \(\nabla f(x)X+Y\) is nonsingular. Thus the equivalence between (a) and (b) is established. Similarly, G is SCD regular around \(\big ((x,d),(f(x)+z,x-d)\big )\) if and only if for every pair \(Y^*,X^*\) with \(\mathrm{rge\;}(Y^*,X^*)\in {\mathscr{S}}^* Q(d,z)\) the matrix
is nonsingular and the equivalence between (a) and (c) follows.
To establish (iii), just note that by Proposition 3.3 we have
and the assertion follows from (ii) and the definition of SCD regularity.
Let us now consider the Newton step. Assume that, emanating from the iterate \({x}^{(k)}\), we have computed \(\big (({\hat{x}}^{(k)},{\hat{d}}^{(k)}),({\hat{y}_1}^{(k)}, {\hat{y}_2}^{(k)})\big )\in \mathrm {gph}\,G\) as the result of the approximation step.
Case (i): We compute the Newton direction according to step 4.a) of Algorithm 1.
By Proposition 5.2, we have to compute two \(n\times n\) matrices \({{Y^*}}^{(k)}, {X^*}^{(k)}\) with \(\mathrm{rge\;}({Y^*}^{(k)}, {X^*}^{(k)})\in {\mathscr{S}}^*Q({\hat{d}}^{(k)},{\hat{y}_1}^{(k)}-\nabla f({\hat{x}}^{(k)}))\) and to solve the system
Using the second equation we can eliminate \({\Delta d}^{(k)}={\Delta x}^{(k)}+{\hat{y}_2}^{(k)}\) and arrive at the linear system
Case (ii): The Newton direction is computed by step 4.b) of Algorithm 1.
In this case we determine two \(n\times n\) matrices \({X}^{(k)},{Y}^{(k)}\) with \(\mathrm{rge\;}({X}^{(k)},{Y}^{(k)})\in {\mathscr{S}}Q({\hat{d}}^{(k)},{\hat{y}_1}^{(k)}-\nabla f({\hat{x}}^{(k)}))\), solve the linear system
and set
By eliminating \(p_1={X}^{(k)} p_2-{\hat{y}_2}^{(k)}\) we obtain the linear system
whose solution yields
In both cases, the new iterate is given by \({x}^{(k+1)}:={\hat{x}}^{(k)}+{\Delta x}^{(k)}\). Further, we have \({\Delta x}^{(k)}-{\Delta d}^{(k)}=-{\hat{y}_2}^{(k)}={\hat{d}}^{(k)}-{\hat{x}}^{(k)}\) resulting in \({x}^{(k+1)}={\hat{d}}^{(k)}+{\Delta d}^{(k)}\).
6 Algebraic form of the discrete contact problem with Coulomb friction
We consider an elastic body represented by a bounded domain \(\Omega \subset \mathbb {R}^3\) with a sufficiently smooth boundary \(\partial \Omega\). The body is made of elastic, homogeneous, and isotropic material. The boundary consists of three non-empty disjoint parts: \(\partial \Omega =\overline{\Gamma _u}\cup \overline{\Gamma _p}\cup \overline{\Gamma _c}\). Zero displacements are prescribed on \(\Gamma _u\), surface tractions act on \(\Gamma _p\), and the body is subject to volume forces. We seek a displacement field and a corresponding stress field satisfying the Lamé system of PDEs in \(\Omega\), the homogeneous Dirichlet boundary conditions on \(\Gamma _u\), and the Neumann boundary conditions on \(\Gamma _p\). The body is unilaterally supported along \(\Gamma _c\) by some flat rigid foundation given by the halfspace \(\mathbb {R}^2\times \mathbb {R}_-\) and the initial gap between the body and the rigid foundation is denoted by d(x), \(x\in \Gamma _c\). In the contact zone, we consider a static Coulomb Friction condition.
This problem can be described by partial differential equations and boundary conditions for the displacements, which we are looking for. We refer the reader to, e.g., [11], where also a weak formulation can be found. We consider here only the discrete algebraic problem, which arises after some suitable finite element approximation.
Let n denote the number of degrees of freedom of the nodal displacement vector and let p denote the number of contact nodes \(x_i\in \overline{\Gamma _c}\setminus \overline{\Gamma _u}\). After some suitable reordering of the variables, such that the first 3p positions are occupied by the displacements of the nodes lying in the contact part of the boundary, we arrive at the following nodal block structure for an arbitrary vector \(y\in \mathbb {R}^n\):
In what follows, \(A\in \mathbb {R}^{n\times n}\), \(\tilde{l}\in \mathbb {R}^n\) are the stiffness matrix and the load vector, respectively. Further we are given two matrices \(N\in \mathbb {R}^{p\times n}\), \(T\in \mathbb {R}^{2p\times n}\), where, for a given displacement vector v, Nv yields the normal components at the p contact points, and \(Tv=(T^1v,\ldots ,T^pv)\), where \(T^iv\in \mathbb {R}^2\) is the tangential nodal displacement vector at the i-th contact node. The symbol \(\vert Tv\vert \in \mathbb {R}^p\) denotes a vector defined by
We denote with \(\alpha \in \mathbb {R}^p\) the vector of nodal distances with \(\alpha _i:=d(x_i)\) and the friction coefficient is denoted by \({\mathscr{F}}\).
Definition 6.1
([3, Definition 3.6]) As a solution of a discrete contact problem with Coulomb friction we declare any couple \((\tilde{u}, \lambda )\in \mathbb {R}^n\times \mathbb {R}^p_+\) satisfying
Since the stiffness matrix A is positive definite and \(\lambda \ge 0\), condition (6.1) is equivalent to the requirement that \(\tilde{u}\) is a minimizer of the convex minimization problem
Given a vector \(z=(z_1,z_2,z_3)^T\in \mathbb {R}^3\), we denote by \(z_{12}:=(z_1,z_2)^T\in \mathbb {R}^2\) the vector formed by the first two components. Using this notation, we have
due to the ordering of the nodal displacements.
Next consider the transformation of variables \(u=\tilde{u}+d\), where \(d=(d^1,\ldots ,d^p,d^R)^T\in \mathbb {R}^n\) is given by
Then u is a solution of the problem
where \(l:=\tilde{l}-Ad\). Since the objective in this minimization problem is convex, this is in turn equivalent to the first-order optimality condition
Further, (6.2) is the same as \(-\lambda _i\in N_{\mathbb {R}_+}(\tilde{u}^i_3+\alpha _i)=N_{\mathbb {R}_+}(u^i_3)\), \(i=1,\ldots ,p\). After eliminating \(\lambda\) from explicit variables, we have thus arrived at the GE
where
with \(\tilde{Q}:\mathbb {R}^3\rightrightarrows \mathbb {R}^3\) and \(Q^R:\mathbb {R}^{n-3p}\rightrightarrows \mathbb {R}^{n-3p}\) defined by
GE (6.3) is dependent solely on transformed displacements u. Multipliers \(\lambda _i\) appear only implicitly as \(-\vartheta\) in the description of \(\tilde{Q}\). This is a big difference with respect to other approaches, where the semismooth Newton method for equations is applied to mixed primal-dual systems or purely dual systems using some NCP-functions, see, e.g., [33, 5, 28].
Remark 6.2
We have derived the GE (6.3) for the Signorini problem with static Coulomb friction. We claim also that, for other contact problems with friction involving two elastic bodies, one can derive a GE of the same type. The interested reader is referred to [33, Section 5.2] for an algebraic transformation of a two-body problem to a one-body problem.
Note that
which enables us to prove the following statement.
Proposition 6.3
H is semismooth\(^{*}\) at each point in its graph.
Proof
Consider a point \((\bar{u}, \bar{w}) \in \mathrm {gph}\,H\). By [13, Proposition 3.6] it suffices to show that Q is semismooth\(^{*}\) at \((\bar{u},\bar{w}-A\bar{u}-l)\), which definitely holds true provided \(\tilde{Q}\) is semismooth\(^{*}\) at all points of its graph. Thus, invoking [22, Theorem 3] and using the connection between the semismooth\(^{*}\) property of sets and the respective distance functions, it suffices to show that \(\mathrm {gph}\,\tilde{Q}\) is a subanalytic set. Let us pick a reference point \((\bar{v},\bar{g},\bar{\vartheta }) \in \mathrm {gph}\,\tilde{Q}\) and consider the set
Clearly, P is semianalytic (as the intersection of finitely many polynomial equalities and inequalities, it is even semialgebraic) and compact. Moreover, by construction, \(\mathrm {gph}\,\widetilde{Q} \cap \mathcal {B}_1(\bar{v},\bar{g},\bar{\vartheta })\) is the canonical projection of P onto the space of variables \((v,g,\vartheta )\) and hence subanalytic, cp. [4]. The proof is complete.
Proposition 6.4
For every \(\gamma >0\), the mapping \((\gamma I_3+\tilde{Q})^{-1}:\mathbb {R}^3\rightrightarrows \mathbb {R}^3\) is single-valued and Lipschitz continuous on \(\mathbb {R}^3\). In particular, \(\tilde{Q}\) is graphically Lipschitzian of dimension 3 at every point of its graph.
Proof
We have \(v\in (\gamma I_3+\tilde{Q})^{-1}(w)\) if and only if
for some \(\vartheta \in N_{\mathbb {R}_+}(v_3)\) and some \(v_{12}^*\in \partial \Vert v_{12}\Vert\). Since \(\gamma I_1+N_{\mathbb {R}_+}\) is both maximal monotone and strongly monotone, \(v_3\) and \(\vartheta\) are uniquely given by
For given \(\vartheta \le 0\) the mapping \(\gamma I_2+{\mathscr{F}}(-\vartheta )\partial \Vert \cdot \Vert\) is again maximal monotone and strongly monotone and thus \(v_{12}\) is uniquely given by
These arguments prove that \((\gamma I_3+\tilde{Q})^{-1}\) is single-valued on \(\mathbb {R}^3\) and there remains to show the Lipschitz continuity. Consider two points \(w^j\), \(j=1,2\), together with \((\gamma I_3+\tilde{Q})^{-1}(w^j)=\{v^j\}\) and the corresponding \(\vartheta ^j\in N_{\mathbb {R}_+}(v_3^j)\), \(v_{12}^{*j}\in \partial \Vert v_{12}^j\Vert\), \(j=1,2\), according to (6.6), (6.7). Then we deduce from (6.7) that
where we have used the facts that \(-\vartheta ^1\ge 0\), that the subdifferential mapping \(\partial \Vert \cdot \Vert\) is monotone and that \(\Vert v_{12}^{*2}\Vert \le 1\). Since the functions \(t\rightarrow \min \{t,0\}\) and \(t\rightarrow \max \{t,0\}\) are Lipschitz continuous on \(\mathbb {R}\) with constant 1, we obtain from (6.8) that \(\vert \vartheta ^1-\vartheta ^2\vert \le \vert w_3^1-w_3^2\vert\) yielding
and consequently \(\gamma \Vert v^1_{12}-v^2_{12}\Vert \le \Vert w^1_{12}-w^2_{12}\Vert +{\mathscr{F}}\vert w_3^1-w_3^2\vert\). Since we also have \(\gamma \vert v_3^1-v_3^2\vert \le \vert w_3^1-w_3^2\vert\), we obtain
establishing Lipschitz continuity of \((\gamma I_3+\tilde{Q})^{-1}\). To see that \(\tilde{Q}\) is graphically Lipschitzian of dimension 3, just take \(\Phi (x,y)=(\gamma x+y,x)\) and \(f:=(\gamma I+\tilde{Q})^{-1}\) to obtain \(\mathrm {gph}\,f=\Phi (\mathrm {gph}\,\tilde{Q})\).
Remark 6.5
Note that the mapping \(\tilde{Q}\) is not hypomonotone, that is, for every \(\gamma >0\) the mapping \(\gamma I_3+\tilde{Q}\) is not monotone. Indeed, consider \(\gamma >0\) and let
Then
and therefore \(\gamma I_3+\tilde{Q}\) is not monotone.
Further note that in the case when \(\vartheta =0\) and \(v_{12}=0\) the subgradient \(v_{12}^*\in \partial \Vert v_{12}\Vert\) fulfilling (6.7) is not uniquely given.
Throughout the sequel, it is convenient to refer to [3] and express the graph of \(\tilde{Q}\) in the form
where the single sets arising in (6.9) do have a clear mechanical interpretation. Their definitions are provided in the following table.
Note that in Table 1 the impossible combinations of variables are crossed out.
Proposition 6.6
\(\tilde{Q}\) is an SCD mapping and \(\mathcal {O}_{\tilde{Q}}= L \cup M_{1} \cup M_{3}^{+}\). In particular, for \((\bar{v},\bar{g},\bar{\vartheta }) \in L\)
for \((\bar{v},\bar{g},\bar{\vartheta })\in M_{3}^{+}\)
and for \((\bar{v},\bar{g},\bar{\vartheta })\in M_{1}\) one has
Proof
Since \(\tilde{Q}\) is graphically Lipschitzian of dimension 3 at every point of its graph, it is an SCD mapping.
Note that the sets \(L, M_1\) and \(M_{3}^{+}\) exhibit a stable behavior in the sense that, for a sufficiently small \(\varrho > 0\),
In particular, we have
It follows from Definition 2.1 that
In all three cases, we have therefore to do with linear subspaces of dimension three, which yield \(\mathcal {O}_{\tilde{Q}} \supset L \cup M_{1} \cup M_{3}^{+}\) and the expressions for \(\widehat{\mathscr{S}}\tilde{Q}(\bar{v},\bar{g},\bar{\vartheta })\) in (6.10), (6.11), and (6.12). Concerning the expressions for \(\widehat{\mathscr{S}}^* \tilde{Q}(\bar{v},\bar{g},\bar{\vartheta })\), they can be derived by first computing the respective orthogonal complements and then using the relation (3.1). The equalities \(\widehat{\mathscr{S}}\tilde{Q}(\bar{v},\bar{g},\bar{\vartheta })= {\mathscr{S}}\tilde{Q}(\bar{v},\bar{g},\bar{\vartheta })\) and \(\widehat{\mathscr{S}}^* \tilde{Q}(\bar{v},\bar{g},\bar{\vartheta })={\mathscr{S}}^* \tilde{Q}(\bar{v},\bar{g},\bar{\vartheta })\) in (6.10)–(6.13) follow from the observation that the matrices that describe the subspaces continuously depend on the argument \((\bar{v},\bar{g},\bar{\vartheta })\).
It remains to show that actually \(\mathcal{O}_{\tilde{Q}} = L \cup M_{1} \cup M_{3}^{+}\), that is, \((M_2\cup M_4\cup M_3^-)\cap \mathcal{O}_{\tilde{Q}}=\emptyset\). Consider first \((\bar{v},\bar{g},\bar{\vartheta })\in M_2\cup M_4\). Then \(\bar{v}_3=\bar{\vartheta }=0\) and it follows that \(\big ((0,0,1),(0,0,0)\big )\in T_{\mathrm {gph}\,\tilde{Q}}(\bar{v},\bar{g},\bar{\vartheta })\), but the opposite direction \(\big ((0,0,-1),(0,0,0)\big )\) cannot belong to the tangent cone because \(-1\not \in T_{\mathbb {R}_+}(0)\). Hence, \(T_{\mathrm {gph}\,\tilde{Q}}(\bar{v},\bar{g},\bar{\vartheta })\) is not a subspace and \((\bar{v},\bar{g},\bar{\vartheta })\not \in \mathcal{O}_{\tilde{Q}}\) follows. Finally, let \((\bar{v},\bar{g},\bar{\vartheta })\in M_3^-\). Then for all \(t>0\) we have \(\big ((t\bar{g},0),(-{\mathscr{F}}\bar{\vartheta }\bar{g},\bar{\vartheta })\big )\in \mathrm {gph}\,\tilde{Q}\) implying \(\big ((\bar{g},0,0),(0,0,0)\big )\in T_{\mathrm {gph}\,\tilde{Q}}(\bar{v},\bar{g},\bar{\vartheta })\). Now assume that \(-\big ((\bar{g},0,0),(0,0,0)\big )\in T_{\mathrm {gph}\,\tilde{Q}}(\bar{v},\bar{g},\bar{\vartheta })\). By definition, there are sequences \(t_k\downarrow 0\) and \((v^k,g^k,\vartheta ^k)\mathop {\longrightarrow }\limits ^{\mathrm {gph}\,\tilde{Q}}(\bar{v},\bar{g},\bar{\vartheta })\) that satisfy \(\big ((v^k,g^k,\vartheta ^k)-(\bar{v},\bar{g},\bar{\vartheta })\big )/t_k\rightarrow \big ((-\bar{g},0),(0,0,0)\big )\). From \((v^k_{12}-\bar{v}_{12})/t_k=v^k_{12}/t_k\rightarrow -\bar{g}\), \(\vartheta ^k\rightarrow \bar{\vartheta }\) and \(\Vert \bar{g}\Vert =-{\mathscr{F}}\bar{\vartheta }\) we deduce
contradicting \(g^k\rightarrow \bar{g}\) and we conclude that \((\bar{v},\bar{g},\bar{\vartheta })\not \in \mathcal{O}_{\tilde{Q}}\). This completes the proof.
Note that in the formulas (6.12) and (6.13) the matrices \(-{\mathscr{F}}\bar{\vartheta }/\Vert \bar{v}_{12}\Vert \left( I_2- \bar{v}_{12}\bar{v}_{12}^{T}/\Vert \bar{v}_{12} \Vert ^{2} \right)\) are unbounded for \(\bar{v}_{12}\rightarrow 0\). Theoretically, this does not cause problems, because convergence is related to SCD regularity, which is independent from the basis representation used of the underlying subspaces. However, the use of ill-conditioned bases might produce numerical instability when computing the Newton direction. For this reason, we present another representation of the collections \({\mathscr{S}}\tilde{Q}(\bar{v},\bar{g},\bar{\vartheta })\) and \({\mathscr{S}}^*\tilde{Q}(\bar{v},\bar{g},\bar{\vartheta })\) with a well-conditioned base when \((\bar{v},\bar{g},\bar{\vartheta })\in M_1\). Observe that for any two \(n\times n\) matrices A, B and every nonsingular \(n\times n\) matrix C there are \(\mathrm{rge\;}(A,B)=\mathrm{rge\;}(AC,BC)\). Thus, the following corollary follows from (6.12), (6.13) by using the scaling matrix
Corollary 6.7
For \((\bar{v},\bar{g},\bar{\vartheta })\in M_1\) we have
From Table 1 one can further infer that
-
(i)
every point from \(M_{2}\) is accessible by sequences belonging solely to L or to \(M_{1}\);
-
(ii)
every point from \(M_{3}^{-}\) is accessible by sequences belonging to \(M_{1}\) or to \(M_{3}^{+}\), and
-
(iii)
the singleton \(M_{4}=\{(0,0,0)\}\) is accessible by sequences belonging to L or to \(M_{1}\) or to \(M_3^-\) or to \(M_{3}^{+}\).
This implies in particular that
Formulas (6.16) are used in the implementation of the Newton step of the SCD semismooth\(^{*}\) Newton method to the numerical solution of (6.3) in the next section.
Obviously, the mapping \(Q^R\) given by (6.5) is Lipschitzian and, therefore, graphically Lipschitzian of dimension \(n-3p\) as well. Further \(\mathcal{O}_{Q^R}=\mathrm {gph}\,Q^R=\mathbb {R}^{n-3p}\times \{0\}\) and
Combining Lemma 3.4, Proposition 3.5 and Lemma 3.7 with Proposition 6.4 and Proposition 6.6 we arrive at the following corollary.
Corollary 6.8
The mapping Q given by (6.4) is a SCD mapping,
and for every \((u,w)=\big ((u^1,\ldots ,u^p,u^R),(w^1,\ldots ,w^p,0)\big )\in \mathrm {gph}\,Q\) we have
To implement the semismooth\(^{*}\) Newton method, we must also specify the approximation step. For every \(\gamma >0\) the mapping \(\big (\gamma I_{n-3p}+Q^R\big )^{-1}=I_{n-3p}/\gamma\) is obviously single-valued and Lipschitzian on \(\mathbb {R}^{n-3p}\). Together with Proposition 6.4 we obtain that
and consequently also the mapping \((I_n+\frac{1}{\gamma }Q)^{-1}=(\gamma I_n+Q)^{-1}\circ \gamma I_n\) has these properties. Thus, for a given iterate \({u}^{(k)}\), the choice
is feasible for the approximation step by Proposition 5.1.
7 Numerical experiments
We treat various geometries arising from the cuboid \((0,2)\times (0,1)\times (0,1) \subset \mathbb {R}^3\) by modifying its bottom surface. Given a function \(d:(0,2)\times (0,1)\mapsto \mathbb {R}\), we consider the elastic body represented by the domain
At the left surface \(x_1=0\) of the body, we impose Dirichlet boundary conditions and on the top surface \(x_3=1\) and the right surface \(x_1=2\) act surface tractions with densities \(P_T\) and \(P_R\). The rigid obstacle is given by the half-space \(\mathbb {R}^2\times \mathbb {R}_-\) so that the contact boundary is the bottom surface of the body.
Depending on the discretization parameter lev, the elastic body is uniformly cut into \(n_{x_1}\cdot n_{x_2}\cdot n_{x_3}\) hexahedrons, where
This results in a hexahedral mesh with \((n_{x_1}+1)\cdot (n_{x_2}+1)\cdot (n_{x_3}+1)\) vertices, where \((n_{x_2}+1)\cdot (n_{x_3}+1)\) vertices are in the Dirichlet part of the boundary and \(p:=n_{x_1}\cdot (n_{x_2}+1)\) vertices belong to the contact part of the boundary. The total number of degrees of freedom of the nodal displacements is
The setting is shown in Fig. 1.
The resulting GE (6.3) is solved with the SCD semismooth\(^{*}\) Newton method described in Sects. 5 and 6 and the implementation was carried out in MATLAB on a PC with an i7-7700 CPU. The part of the code that describes the model is built on the original code of [3] with the accelerated assembly of the elastic stiffness matrix as described in [6]. This part of the code was also applied to the Tresca friction solver of [16]. The main difference between the implementations is that in the new code, no reduction is done to the nodes on the contact boundary, and the complete problem is treated with all domain nodes.
For the scalar \(\gamma\) used in the approximation step (6.17) we used an approximation of the largest eigenvalue of A obtained by five iterations of the power method. The system (5.4) that defines the Newton direction was solved iteratively using the GMRES method with ILU factorization as a preconditioner. We stopped the GMRES method when the relative residual (non-preconditioned) is less than the prescribed tolerance tol. Clearly, in this case, we will lose the superlinear convergence, but we can expect linear convergence with the rate tol. Of course, we can use more advanced methods to solve the linear system that determines the Newton direction, but the main task of this paper is to demonstrate the efficiency of the semismooth\(^{*}\) Newton method and not the solution of linear systems.
To improve the global convergence properties, we use a non-monotone line search heuristic as introduced in [15]. Recall that the quantity \({\hat{y}}^{(k)}\) given by (5.3) acts as a residual for GE (5.1) at the point \(({x}^{(k)}, {\hat{d}}^{(k)})\). In the context of our contact problem with Coulomb friction, given an iterate \({u}^{(k)}\) of nodal displacements and \({\hat{d}}^{(k)}\) by (6.17), we consider
as a residual. If the Newton direction used is denoted by \(\Delta {u}^{(k)}\), the next iterate is calculated as \({u}^{(k+1)}={u}^{(k)}+{\alpha }^{(k)}\Delta {u}^{(k)}\), where \({\alpha }^{(k)}\) is chosen as the first element of the sequence \(1,\frac{1}{2},\frac{1}{4},\frac{1}{8},\frac{1}{32},\frac{1}{128}, \frac{0.1}{128},\frac{0.01}{128},\ldots\) such that
We considered three different elastic bodies \(\Omega (d_i)\), \(i=1,2,3\), given by
and two load cases \(L_1,L_2\) with surface tractions
As material parameters, we chose Young’s modulus \(E:=70\,\mathrm{GPa}\) and Poisson’s ratio \(\nu =0.334\) (aluminum). The friction coefficient was always chosen as \({\mathscr{F}}=0.23\). In Table 2 we report for different discretization levels lev the number p of nodes in the contact part of the boundary and the number n of degrees of freedom. Furthermore, using the relative accuracy \(tol=0.1\) to calculate Newton’s direction, for each of the six possible combinations of geometries \(d_1,d_2,d_3\) and load cases \(L_1,L_2\) we list the number of Newton iterations it and the total number gmres of GMRES iterations needed to reduce the initial residual \(\Vert {\hat{y}}^{(0)}\Vert\) by a factor of \(10^{-12}\). The starting point \({u}^{(0)}\) for the semismooth\(^{*}\) Newton method was always chosen as the origin. The value gmres characterizes the computational complexity.
We can see that for every geometry and every load case the iteration numbers are nearly equal and increase only slightly with finer discretizations.
In Fig. 2 we depict for the different cases the undeformed bottom surface and the contact states for the deformed contact boundary.
The number of iterations will decrease when a better starting point is available. In Table 3 we display the iteration numbers when we use as a starting point the solution of the previous discretization level interpolated to the finer mesh.
We observe that the number of GMRES iterations is reduced by \(35-45\%\) at the highest discretization level. A closer analysis shows that, as expected, we essentially avoid iterations to localize the solution. In fact, after 1–3 iterations, we have identified the correct state of all nodes in the contact part of the boundary, and the remaining iterations are only needed to reach the desired accuracy. Note that we use \(tol=0.1\) and therefore expect linear convergence with the rate 0.1. Since we want to reduce the residual by a factor of \(10^{-12}\), we expect about 12 semismooth\(^{*}\) Newton steps to achieve this goal. In most cases, we need fewer iterations: The reason is that the relative residual of the calculated direction is sometimes significantly less than tol.
Finally, we investigate the impact of the parameter tol on the performance of the semismooth\(^{*}\) Newton method. Here, we consider only the load case \(L_2\) and that the bottom surface is given by \(d_3\) with the discretization level \(lev=10\). In Table 4 we report the iteration numbers it of the semismooth\(^{*}\) Newton method and the total number gmres of GMRES iterations for the starting point \({u}^{(0)}=0\). We can see that for \(tol=0.1\) we need the most semismooth\(^{*}\) Newton steps; however, the total number of GMRES iterations, which measures computational complexity, is the lowest.
We show the convergence of the semismooth\(^{*}\) Newton method for the four values of tol in Fig. 3. We see that during the first 5 or 6 iterations, when the semismooth\(^{*}\) Newton method tries to localize the solution, the accuracy tol does not play any role in decreasing the residual, and we only need a lot of GMRES iterations to compute the search directions with higher accuracy. As soon as we are sufficiently close to the solution, the increased accuracy for computing the search direction also yields better convergence rates and consequently fewer iterations for the semismooth\(^{*}\) Newton method. However, we also need more GMRES iterations to calculate the search direction, which defeats the advantage of a better convergence rate.
8 Conclusion
The paper shows the abilities of the SCD semismooth\(^{*}\) Newton method to compute, apart from the variational inequalities of the first and second kind, also a more complicated class of equilibria which can be modeled as GEs with an SCD and semismooth\(^{*}\) multi-valued part. This is documented by a large-scale highly complicated contact problem, where the efficiency of the method enables us, in contrast to most existing approaches, to solve the respective GE on the whole domain without the time-consuming reduction to nodes lying on the contact boundary. We do hope that the SCD semismooth\(^{*}\) Newton method will exhibit a similar performance also in some other mechanical problems having a similar structure as the considered contact problem with Coulomb friction.
The paper is dedicated to our friend A.L. Dontchev, for whom nonsmooth Newton methods definitely belonged to favorite research topics and who contributed to the development of this area in a remarkable way, cf., e.g., [7, 10].
Data availibility
Data sharing is not applicable to this article as no datasets were generated or analyzed during the current study.
References
Azé, D., Chou, C.C.: On a Newton type iterative method for solving inclusions. Math. Oper. Res. 20, 790–800 (1995). https://doi.org/10.1287/moor.20.4.790
Beremlijski, P., Haslinger, J., Kočvara, M., Outrata, J.V.: Shape optimization in contact problems with Coulomb friction. SIAM J. Optim. 13, 561–587 (2002). https://doi.org/10.1137/S1052623401395061
Beremlijski, P., Haslinger, J., Kočvara, M., Kučera, R., Outrata, J.V.: Shape optimization in three-dimensional contact problems with Coulomb friction. SIAM J. Optim. 20, 416–444 (2009). https://doi.org/10.1137/080714427
Bierstone, E., Milman, P.D.: Semianalytic and subanalytic sets. Publications Mathématiques de l’IHÉS, Tome 67, 5–42 (1988). https://doi.org/10.1007/BF02699126
Blum, H., Frohne, H., Frohne, J., Rademacher, A.: Semi-smooth Newton methods for mixed FEM discretizations of higher-order for frictional, elasto-plastic two-body contact problems. Comput. Methods Appl. Mech. Eng. 309, 131–151 (2016). https://doi.org/10.1016/j.cma.2016.06.004
Čermák, M., Sysala, S., Valdman, J.: Efficient and flexible MATLAB implementation of 2D and 3D elastoplastic problems. Appl. Math. Comput. 355, 595–614 (2019). https://doi.org/10.1016/j.amc.2019.02.054
Cibulka, R., Dontchev, A.L., Kruger, A.Y.: Strong metric subregularity of mappings in variational analysis and optimization. J. Math. Anal. Appl. 457, 1247–1282 (2018). https://doi.org/10.1016/j.jmaa.2016.11.045
Dias, S., Smirnov, G.: On the Newton method for set-valued maps. Nonlinear Anal. 75, 1219–1230 (2012). https://doi.org/10.1016/j.na.2011.04.005
Dontchev, A.L., Quincampoix, M., Zlateva, N.: Aubin criterion for metric regularity. J. Convex Anal. 13, 281–297 (2006)
Dontchev, A.L., Rockafellar, R.T.: Implicit Function and Solution Mappings, 2nd edn. Springer, New York (2014)
Eck, C., Jarušek, J., Krbec, M.: Unilateral Control Problems: Variational Methods and Existence Theorems. CRC Press, New York (2005)
Facchinei, F., Kanzow, C., Sagratella, S.: QVILIB: a library of quasi-variational inequality test problems. Pac. J. Optim. 9, 225–250 (2013)
Gfrerer, H., Outrata, J.V.: On a semismooth* Newton method for solving generalized equations. SIAM J. Optim. 31, 489–517 (2021). https://doi.org/10.1137/19M1257408
Gfrerer, H., Outrata, J.V.: On (local) analysis of multifunctions via subspaces contained in graphs of generalized derivatives. J. Math. Anal. Appl. (2022). https://doi.org/10.1016/j.jmaa.2021.125895
Gfrerer, H., Outrata, J.V., Valdman, J.: On the application of the SCD semismooth* Newton method to variational inequalities of the second kind. (2021) Submitted, arXiv:2112.08080
Gfrerer, H., Outrata, J.V., Valdman, J.: On the solution of contact problems with Tresca friction by the semismooth* Newton method. Lirkov, I., Margenov, S. (eds.): LSSC 2021, LNCS 13127, pp. 479–487 (2022)
Gfrerer, H., Ye, J.J.: New constraint qualifications for mathematical programs with equilibrium constraints via variational analysis. SIAM J. Optim. 27, 842–865 (2017). https://doi.org/10.1137/16M1088752
Gwinner, J., Jadamba, B., Khan, A.A., Raciti, F.: Uncertainty Quantification in Variational Inequalities. Chapman & Hall/CRC (2021)
Haslinger, J., Mietinen, M., Panagiotopoulos, P.D.: Finite Element Method for Hemivariational Inequalities. Kluwer Academic Publishers, Dordrecht (1999)
Josephy, N.H.: Newton’s method for generalized equations and the PIES energy model, Ph.D. Dissertation, Department of Industrial Engineering, University of Wisconsin-Madison (1979)
Josephy, N.H.: Quasi-Newton method for generalized equations. Technical Summary Report 1966. University of Wisconsin-Madison, Mathematics Research Center (1979)
Jourani, A.: Radiality and semismoothness. Control Cybern 36, 669–680 (2007)
Kočvara, M., Outrata, J.V.: On a class of quasi-variational inequalities. Optim. Methods Softw. 5, 275–295 (1995). https://doi.org/10.1080/10556789508805617
Kučera, R.: Minimizing quadratic functions with separable quadratic constraints. Optim. Methods Softw. 22, 453–467 (2007). https://doi.org/10.1080/10556780600609246
Kummer, B.: Newton’s method for non-differentiable functions. In: Guddat J. et al, (eds.) Advances in Mathematical Optimization. Series in Mathematical Research, vol.45, pp. 114–125. Akademie-Verlag, Berlin (1988)
Ligurský, T.: Theoretical analysis of discrete contact problems with Coulomb friction. Appl. Math. 57, 263–295 (2012). https://doi.org/10.1007/s10492-012-0016-9
Nečas, J., Jarušek, J., Haslinger, J.: On the solution of the variational inequality to the Signorini problem with small friction. Boll. Unione Mat. Ital. V. Ser.B, vol. 17, pp. 796–811 (1980)
Outrata, J.V., Kočvara, M., Zowe, J.: Nonsmooth Approach to Optimization Problems with Equilibrium Constraints. Kluwer Academic Publishers, Dordrecht (1998)
Qi, L., Sun, J.: A nonsmooth version of Newton’s method. Math. Program. 58, 353–367 (1993). https://doi.org/10.1007/BF01581275
Renard, Y.: A uniqueness criterion for the Signorini problem with Coulomb friction. SIAM J. Math. Anal. 38, 452–467 (2006). https://doi.org/10.1137/050635936
Rockafellar, R.T.: Maximal monotone relations and the second derivatives of nonsmooth functions. Ann. Inst. H. Poincaré Anal. Non Linéaire 2, 167–184 (1985)
Rockafellar, R.T., Wets, R.J.-B.: Variational Analysis. Springer, Berlin (1998)
Wohlmuth, B.I.: Variationally consistent discretization schemes and numerical algorithms for contact problems. Acta Numer. 20, 569–734 (2011). https://doi.org/10.1017/S0962492911000079
Acknowledgements
The authors are indebted to J. Haslinger and P. Beremlijski for valuable advices concerning the modeling and discretization of static contact problems with Coulomb friction. Further, our sincere thanks are due to both reviewers for their valuable suggestions.
Funding
Open access funding provided by Austrian Science Fund (FWF). The authors express their gratitude for the support from the Austrian Science Fund (FWF) grant P29190-N32 (H. Gfrerer, M. Mandlmayr) and the Czech Science Foundation (GACR) grant GA22-15524 S (J.V.Outrata), grant GF21-06569K and the MSMT CR grant 8J21AT001 (J.Valdman).
Author information
Authors and Affiliations
Corresponding author
Ethics declarations
Conflict of interest
The authors have no competing interests to declare that are relevant to the content of this article.
Additional information
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Appendix
Appendix
The next statement completes the assertion of Proposition 6.6.
Proposition 9.1
Consider the mapping \(\widetilde{Q}\) and assume that \((\bar{v},\bar{g},\bar{\vartheta })\in M_{2}\). Then
For \((\bar{v},\bar{g},\bar{\vartheta }) \in M_{3}^{-}\) one has
and for \((\bar{v},\bar{g},\bar{\vartheta })=(0,0,0)\in M_{4}\) we have
Proof
The case where \((\bar{x},\bar{g},\bar{\vartheta })\in M_{2}\) is a simple consequence of the relations (6.10) and (6.12). Formula (9.1) follows from (6.11) and (6.14). Finally, in case of \(M_{4}\) one has to analyze various sequences \((v,g,\vartheta )\rightarrow (0,0,0)\) belonging to \(L\cup M_1 \cup M_3^+\). For the cases where \((v,g,\vartheta )\) belongs to \(L\cup M_3^+\) we can simply apply (6.10) and (6.11) to calculate the limits. When \((v,g,\vartheta )\in M_1\) one has \(-{\mathscr{F}}\vartheta \,/\,(\Vert v_{12}\Vert -{\mathscr{F}}\vartheta )\rightarrow \alpha \in [0,1]\) and \(v_{12}\,/\,\Vert v_{12}\Vert \rightarrow w\) for suitable subsequences, and (9.2) follows from (6.14). All subspaces arising in the above formulas have dimension three, and so the statement has been established.
Note that for \((\bar{x},\bar{g},\bar{\vartheta })\in M_{4}\) the collection \({\mathscr{S}}\widetilde{Q}(\bar{v},\bar{g},\bar{\vartheta })\) contains an infinite number of subspaces parameterized by \(\alpha\) and w.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.
About this article
Cite this article
Gfrerer, H., Mandlmayr, M., Outrata, J.V. et al. On the SCD semismooth* Newton method for generalized equations with application to a class of static contact problems with Coulomb friction. Comput Optim Appl 86, 1159–1191 (2023). https://doi.org/10.1007/s10589-022-00429-0
Received:
Accepted:
Published:
Issue Date:
DOI: https://doi.org/10.1007/s10589-022-00429-0