---
title: "Appendix: Proofs"
---

(appendix)=

# Appendix: Proofs

**Notation.** In the proofs below, subscripts on value functions denote partial derivatives (e.g., $v^1_{\lRat\HRat} \equiv \partial^2 v^1/\partial \lRat \partial \HRat$). Superscripts serve two roles: on stage-numbered functions ($v^1, v^2$), superscripts are stage labels and subscripts always denote partials; on $\wFunc$, which carries no stage number, superscripts denote partial derivatives directly ($\wFunc^\aRat \equiv \partial \wFunc/\partial \aRat$, $\wFunc^{\aRat\HRat} \equiv \partial^2 \wFunc/\partial \aRat \partial \HRat$). This follows the convention in the main text ([Section %s](#multdim)).

(proof-grid-homeomorphism)=
## Proof of Grid Homeomorphism Theorem

```{prf:proof} Proof of Grid Homeomorphism Theorem
:label: prf-grid-homeomorphism

The proof establishes that the bilinear extension $\tilde{\Phi}: [1,J] \times [1,K] \to \mathbb{R}^2$ is a homeomorphism by showing it is a continuous injection from a compact set to a Hausdorff space.

**Step 1 (Cell-level injectivity).** Within each cell $[j, j+1] \times [k, k+1]$, the bilinear map in parametric coordinates $(s,t) \in [0,1]^2$ takes the form $x(s,t) = (1-s)(1-t)x_{jk} + s(1-t)x_{j+1,k} + (1-s)t\,x_{j,k+1} + st\,x_{j+1,k+1}$, and similarly for $y$. The partial derivative $\partial x/\partial s = (1-t)(x_{j+1,k} - x_{jk}) + t(x_{j+1,k+1} - x_{j,k+1})$ is a convex combination of the row increments $x_{j+1,k} - x_{jk}$ and $x_{j+1,k+1} - x_{j,k+1}$, both strictly positive by IPM row monotonicity. Hence $\partial x/\partial s > 0$ throughout the cell. Similarly, $\partial y/\partial t > 0$ by column monotonicity.

The Jacobian determinant $\mathbf{J}(s,t) = \frac{\partial x}{\partial s}\frac{\partial y}{\partial t} - \frac{\partial x}{\partial t}\frac{\partial y}{\partial s}$ is bilinear in $(s,t)$, so it attains its minimum at a corner of $[0,1]^2$. Since all four corner Jacobians are positive by the fold-free condition ([Definition %s](#def-fold-free)), $\mathbf{J}(s,t) > 0$ throughout the cell. We establish cell-level injectivity in two steps. First, the boundary of the cell maps injectively: each edge is the image of an affine map with strictly positive slope (since $\partial x/\partial s > 0$ along horizontal edges and $\partial y/\partial t > 0$ along vertical edges), so the boundary image is a simple closed quadrilateral by IPM. Second, an orientation-preserving bilinear map that is injective on the boundary with $\mathbf{J} > 0$ throughout is injective on the entire cell {cite:p}`Knupp1999`.

**Step 2 (Global injectivity).** We show that distinct points in $[1,J] \times [1,K]$ map to distinct points in $\mathbb{R}^2$.

*Points in the same cell:* Injective by Step 1.

*Points in non-adjacent cells:* For any fixed $x$-coordinate $x_0$ in the domain, the interpolated $y$-value on row boundary $k$ at $x = x_0$ is strictly increasing in $k$: since $\partial y/\partial t > 0$ in every cell, the $y$-value along any vertical line through the grid increases monotonically across row boundaries. Therefore, two points in distinct row bands $[k, k+1]$ and $[k', k'+1]$ with $|k - k'| \geq 2$ sharing the same image $x$-coordinate must have distinct image $y$-coordinates. Similarly, cells in different column bands with $|j - j'| \geq 2$ have strictly separated $x$-ranges by the symmetric argument using $\partial x/\partial s > 0$.

*Points in adjacent cells:* Adjacent cells sharing an edge have identical bilinear maps along that shared edge (both cells use the same vertex coordinates). The interiors of adjacent cells cannot overlap: for cells sharing a vertical edge at column $j+1$, the left cell satisfies $x(1,t) = (1-t)\,x_{j+1,k} + t\,x_{j+1,k+1}$ while the right cell satisfies $x(0,t) = (1-t)\,x_{j+1,k} + t\,x_{j+1,k+1}$, the boundary values inherited from the shared column vertices. Since $\partial x / \partial s > 0$ in both cells, the left cell's interior lies strictly to the left of the shared edge and the right cell's interior lies strictly to the right.

For cells sharing a horizontal edge at row $k+1$, a symmetric argument using $\partial y / \partial t > 0$ separates the interiors vertically. For diagonally adjacent cells (sharing only the vertex at $(j+1, k+1)$), the two cells differ in both row and column index. Their $x$-ranges may overlap, but only at the single shared vertex: the lower-left cell's $x$-values along row $k+1$ end at $x_{j+1,k+1}$, while the upper-right cell's $x$-values along that same row start at $x_{j+1,k+1}$. Since $\partial y/\partial t > 0$ separates their $y$-ranges at any shared $x$-coordinate (the lower cell's maximum $y$ at the boundary equals the upper cell's minimum $y$), the interiors are disjoint.

Together, $\tilde{\Phi}$ is a continuous injection from the compact set $[1,J] \times [1,K]$ to $\mathbb{R}^2$. Since continuous bijections from compact spaces to Hausdorff spaces are homeomorphisms, the grid mapping is a homeomorphism onto its image.
```
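
The two hypotheses of the theorem — IPM and the fold-free corner-Jacobian condition — are both finitely checkable on a stored grid. The following sketch (hypothetical helper names, a made-up sheared grid, not from the model) verifies them numerically, exploiting the fact proved above that $\mathbf{J}(s,t)$ is bilinear and therefore attains its minimum over each cell at a corner:

```python
import numpy as np

def ipm_holds(x, y):
    """IPM check on vertex arrays x[j,k], y[j,k]: x strictly increasing
    along rows (in j) and y strictly increasing along columns (in k)."""
    return bool(np.all(np.diff(x, axis=0) > 0) and np.all(np.diff(y, axis=1) > 0))

def min_corner_jacobian(x, y):
    """Minimum corner Jacobian determinant over all cells.  J(s,t) is
    bilinear on [0,1]^2, so its minimum over each cell is attained at a
    corner; a positive return value certifies the fold-free condition."""
    J, K = x.shape
    dxs, dys = np.diff(x, axis=0), np.diff(y, axis=0)  # s-direction increments, (J-1, K)
    dxt, dyt = np.diff(x, axis=1), np.diff(y, axis=1)  # t-direction increments, (J, K-1)
    worst = np.inf
    for s in (0, 1):        # x_t, y_t are evaluated at s = 0 or s = 1
        for t in (0, 1):    # x_s, y_s are evaluated at t = 0 or t = 1
            jac = (dxs[:, t:K - 1 + t] * dyt[s:J - 1 + s, :]
                   - dxt[s:J - 1 + s, :] * dys[:, t:K - 1 + t])
            worst = min(worst, jac.min())
    return worst

# Hypothetical mildly sheared grid satisfying both conditions.
jj, kk = np.meshgrid(np.arange(8.0), np.arange(6.0), indexing="ij")
x = jj + 0.3 * kk
y = kk + 0.2 * np.tanh(jj)
assert ipm_holds(x, y) and min_corner_jacobian(x, y) > 0
```

In practice such a check is cheap ($O(JK)$) and can guard an ENGINE call against a folded endogenous grid.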

(proof-egm-regularity)=
## Proof of EGM Regularity Theorem

```{prf:proof} Proof of EGM Regularity Theorem
:label: prf-egm-regularity

**Fold-free (1D case):** For the consumption-savings EGM, the inversion $\mathfrak{m}(\aRat) = \aRat + \cEndFunc(\aRat)$ where $\cEndFunc(\aRat) = (\util')^{-1}((v^2)'(\aRat))$ is a composition of smooth functions when $\util$ and $v^2$ are twice differentiable. By the inverse function theorem, the mapping is locally diffeomorphic wherever its Jacobian is non-singular. Strict concavity of $v^2$ ensures $(v^2)'' < 0$, and strict concavity of $\util$ ensures $\util'' < 0$, so $\partial \cEndFunc / \partial \aRat = \frac{(v^2)''(\aRat)}{\util''(\cEndFunc(\aRat))} > 0$ (both numerator and denominator are negative, yielding a positive ratio). Thus $\partial \mathfrak{m} / \partial \aRat = 1 + \partial \cEndFunc / \partial \aRat > 1 > 0$, establishing positive Jacobian.

**Fold-free (2D case):** For the health investment EGM, the mapping $\Phi: (\lRat, \HRat) \mapsto (\mRat, \hRat)$ is defined by $\mRat = \lRat + \nEndFunc(\lRat, \HRat)$ and $\hRat = \HRat - \gFunc(\nEndFunc(\lRat, \HRat))$. The Jacobian matrix is:
$$\mathbf{J} = \begin{pmatrix} 1 + \partial \nEndFunc/\partial \lRat & \partial \nEndFunc/\partial \HRat \\ -\gFunc' \cdot \partial \nEndFunc/\partial \lRat & 1 - \gFunc' \cdot \partial \nEndFunc/\partial \HRat \end{pmatrix}$$
The determinant is $\det(\mathbf{J}) = (1 + \partial \nEndFunc/\partial \lRat)(1 - \gFunc' \cdot \partial \nEndFunc/\partial \HRat) + \gFunc' \cdot \partial \nEndFunc/\partial \HRat \cdot \partial \nEndFunc/\partial \lRat = 1 + \partial \nEndFunc/\partial \lRat - \gFunc' \cdot \partial \nEndFunc/\partial \HRat$.

To establish $\det(\mathbf{J}) > 0$, we derive the partial derivatives of $\nEndFunc$ explicitly. The first-order condition $\gFunc'(\nEndFunc) = v^1_\lRat / v^1_\HRat$ implicitly defines $\nEndFunc(\lRat, \HRat)$. Differentiating both sides with respect to $\lRat$:
$$\gFunc'' \cdot \frac{\partial \nEndFunc}{\partial \lRat} = \frac{v^1_{\lRat\lRat} \cdot v^1_\HRat - v^1_\lRat \cdot v^1_{\lRat\HRat}}{(v^1_\HRat)^2}$$
Solving for $\partial \nEndFunc/\partial \lRat$:
$$\frac{\partial \nEndFunc}{\partial \lRat} = \frac{v^1_{\lRat\lRat} \cdot v^1_\HRat - v^1_\lRat \cdot v^1_{\lRat\HRat}}{\gFunc'' \cdot (v^1_\HRat)^2}$$
Similarly, differentiating with respect to $\HRat$:
$$\frac{\partial \nEndFunc}{\partial \HRat} = \frac{v^1_{\lRat\HRat} \cdot v^1_\HRat - v^1_\lRat \cdot v^1_{\HRat\HRat}}{\gFunc'' \cdot (v^1_\HRat)^2}$$

Joint strict concavity of $v^1$ ensures the Hessian is negative definite: $v^1_{\lRat\lRat} < 0$, $v^1_{\HRat\HRat} < 0$, and $v^1_{\lRat\lRat} v^1_{\HRat\HRat} - (v^1_{\lRat\HRat})^2 > 0$. Since $\gFunc'' < 0$ (concave production) and $v^1_\HRat > 0$, we can sign the derivatives. The condition $v^1_\HRat > 0$ holds because $v^1_\HRat = \wFunc^\HRat$, which is positive: higher post-investment health $\HRat$ increases both the survival probability $\Surv = 1 - \DiePrb/(1+\HRat)$ and labor income $\yRat_{t+1} = \omega_{t+1}\HRat$, both of which raise continuation value.

We assume $v^1_{\lRat\HRat} \geq 0$, meaning wealth and health are complements or independent in the continuation value; for the health model this is imposed as an assumption rather than derived. Economic intuition suggests health capital raises the marginal value of wealth (through higher future income and survival probability), but a formal derivation from the model's primitives would require differentiating through the expectation and survival probability, which we do not pursue here. If this complementarity condition fails ($v^1_{\lRat\HRat} < 0$, so that health and wealth are substitutes), $\partial \nEndFunc/\partial \HRat$ can become positive, causing column monotonicity to fail and the endogenous grid to fold, which violates IPM.[^complementarity-derivation]

[^complementarity-derivation]: By the envelope theorem, $v^1_\lRat = \wFunc^\aRat$ and $v^1_\HRat = \wFunc^\HRat$, so $v^1_{\lRat\HRat} = \partial/\partial \HRat[\wFunc^\aRat(\aRat^*(\lRat,\HRat), \HRat)] = \wFunc^{\aRat\HRat} + \wFunc^{\aRat\aRat}\,\partial\aRat^*/\partial\HRat$, where superscripts on $\wFunc$ denote partial derivatives. The first term $\wFunc^{\aRat\HRat} \geq 0$ reflects health raising the return to saving through higher future income. The second term involves $\wFunc^{\aRat\aRat} < 0$ (concavity in assets) multiplied by $\partial\aRat^*/\partial\HRat$, the response of optimal savings to health. The sign of this cross-partial depends on the balance of these two terms; we impose $v^1_{\lRat\HRat} \geq 0$ as an assumption rather than deriving it from primitives.

For $\partial \nEndFunc/\partial \lRat$: the numerator $v^1_{\lRat\lRat} \cdot v^1_\HRat - v^1_\lRat \cdot v^1_{\lRat\HRat}$ has $v^1_{\lRat\lRat} < 0$ and $v^1_\HRat > 0$, giving a negative first term; with $v^1_\lRat > 0$ and $v^1_{\lRat\HRat} \geq 0$, the second term $-v^1_\lRat \cdot v^1_{\lRat\HRat}$ is non-positive, so the numerator is negative. Dividing by $\gFunc'' < 0$ yields $\partial \nEndFunc/\partial \lRat > 0$, hence $1 + \partial \nEndFunc/\partial \lRat > 1 > 0$.

For $\partial \nEndFunc/\partial \HRat$: the numerator is $v^1_{\lRat\HRat} \cdot v^1_\HRat - v^1_\lRat \cdot v^1_{\HRat\HRat}$. With $v^1_{\HRat\HRat} < 0$ and $v^1_\lRat > 0$, the second term $-v^1_\lRat \cdot v^1_{\HRat\HRat} > 0$. If $v^1_{\lRat\HRat} \geq 0$, the first term is also non-negative, so the numerator is positive. Dividing by $\gFunc'' < 0$ gives $\partial \nEndFunc/\partial \HRat \leq 0$. Since $\gFunc' > 0$, this yields $-\gFunc' \cdot \partial \nEndFunc/\partial \HRat \geq 0$, contributing non-negatively to $\det(\mathbf{J})$.

Combining: $\det(\mathbf{J}) = 1 + \partial \nEndFunc/\partial \lRat - \gFunc' \cdot \partial \nEndFunc/\partial \HRat \geq 1 + 0 + 0 = 1 > 0$, where the first inequality uses $\partial \nEndFunc/\partial \lRat > 0$ and $-\gFunc' \cdot \partial \nEndFunc/\partial \HRat \geq 0$. The argument extends to higher dimensions by induction on the number of EGM stages, with each stage's Jacobian inheriting positivity from the composition of positive-Jacobian mappings. The inductive step requires that each stage's transition satisfies the same regularity conditions (strict concavity, complementarity) established above for the 2D case; the precise conditions on cross-partials become more restrictive as dimensionality increases.

**IPM:** Since $\mathfrak{m}'(\aRat) = 1 + \partial\cEndFunc/\partial\aRat > 1$ (established in the fold-free step), $\mathfrak{m}$ is strictly increasing. For adjacent exogenous grid points $\aRat_1 < \aRat_2$, we have $\mathfrak{m}(\aRat_1) < \mathfrak{m}(\aRat_2)$, preserving coordinate ordering.

For the 2D case, IPM requires: (i) row monotonicity, where for fixed $\HRat$ the coordinate $\mRat$ increases with $\lRat$; and (ii) column monotonicity, where for fixed $\lRat$ the coordinate $\hRat$ increases with $\HRat$. From the fold-free analysis above, $\partial \mRat/\partial \lRat = 1 + \partial \nEndFunc/\partial \lRat > 1 > 0$, establishing row monotonicity. For column monotonicity, $\hRat = \HRat - \gFunc(\nEndFunc)$, so $\partial \hRat/\partial \HRat = 1 - \gFunc' \cdot \partial \nEndFunc/\partial \HRat$. Under the complementarity assumption ($v^1_{\lRat\HRat} \geq 0$), we showed $\partial \nEndFunc/\partial \HRat \leq 0$, so $-\gFunc' \cdot \partial \nEndFunc/\partial \HRat \geq 0$ (since $\gFunc' > 0$), yielding $\partial \hRat/\partial \HRat \geq 1 > 0$.

**IPO:** Twice-differentiability of $\vFunc$ implies $(v^2)'$ is Lipschitz continuous. From $\cEndFunc = (\util')^{-1}((v^2)')$, the chain rule gives $L_{\cEndFunc} = \sup |d\cEndFunc/d\aRat| = \sup |(v^2)''|/|\util''|$, which is finite when $|(v^2)''|$ is bounded above and $|\util''|$ is bounded away from zero. Thus the EGM mapping $\mathfrak{m}(\aRat) = \aRat + \cEndFunc(\aRat)$ is Lipschitz with constant $L = 1 + L_{\cEndFunc}$. Since $v^2$ is $C^2$ on a compact domain (by assumption), $|(v^2)''|$ is bounded above by the extreme value theorem, and CRRA utility bounds $|\util''|$ away from zero on compact domains, so $L$ is finite. For grid spacing $\Delta$, adjacent points map to points at most $L\Delta$ apart. If adjacent columns have spacing $\Delta_j$ and adjacent rows have spacing $\Delta_k$, the IPO constant is $\alpha = L \cdot \Delta_k / \Delta_j$; for uniform grids, $\alpha = L$.

For the 2D health investment EGM, the relevant Lipschitz constant is the cross-directional $L_\HRat = \sup |\partial \mRat/\partial \HRat|$. This quantity is bounded under the same smoothness conditions assumed for the value function, and we verify numerically that $\alpha$ remains small in all computed examples.

Strict concavity bounds the second derivatives of $\vFunc$, which bounds $L$ and hence $\alpha$. Small $\alpha$ (smooth value function) means brackets change slowly across rows, yielding efficient $O(\log J \cdot \log K)$ complexity with empirical performance approaching $O(\log J + \log K)$ for typical problems. Thus, the same smoothness conditions that make EGM applicable also make ENGINE efficient.
```
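
The 1D fold-free argument is easy to illustrate numerically. The sketch below uses CRRA utility and a stand-in strictly concave continuation value with a hypothetical closed-form marginal value $(v^2)'(\aRat) = \beta R (R\aRat + 1)^{-\rho}$ (chosen for convenience, not taken from the main text), and checks that $\mathfrak{m}'(\aRat) > 1$, i.e. that the endogenous grid is strictly increasing with spacing everywhere exceeding the exogenous spacing:

```python
import numpy as np

rho, beta, R = 2.0, 0.96, 1.03                 # hypothetical CRRA / discounting parameters
u_prime_inv = lambda q: q ** (-1.0 / rho)      # (u')^{-1} for u(c) = c^(1-rho)/(1-rho)

# Stand-in strictly concave continuation value (hypothetical closed form).
v2_prime = lambda a: beta * R * (R * a + 1.0) ** (-rho)

a = np.linspace(0.1, 10.0, 200)                # exogenous post-decision grid
c_end = u_prime_inv(v2_prime(a))               # c = (u')^{-1}((v2)'(a))
m_end = a + c_end                              # endogenous cash-on-hand grid

# Fold-free in 1D: m'(a) = 1 + c'(a) > 1, so endogenous spacing
# strictly exceeds exogenous spacing everywhere.
assert np.all(np.diff(m_end) > np.diff(a))
```

Any strictly concave $v^2$ and strictly concave $\util$ would pass the same check, by the sign argument in the proof.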

(proof-engine-complexity)=
## Proof of ENGINE Complexity

```{prf:proof} Proof of ENGINE Complexity
:label: prf-engine-complexity

ENGINE locates the containing cell for a query $(x^*, y^*)$ via binary search on rows, with each probe requiring a bracket search along that row.

**Binary search structure.** ENGINE performs binary search over the $K$ rows to find the row band $[k^*, k^*+1]$ satisfying $\hat{y}_{k^*} \leq y^* < \hat{y}_{k^*+1}$. Each probe at row $k$ computes the intermediate coordinate $\hat{y}_k$ by interpolating along that row at $x = x^*$. The comparison $\hat{y}_k \lessgtr y^*$ determines whether to search higher or lower rows. This requires $O(\log K)$ probes. Binary search is valid because the interpolated $y$-coordinate at any fixed $x$ is strictly increasing in $k$: by fold-free ($\partial y/\partial t > 0$ in every cell) and the continuity of the bilinear extension, the $y$-value along any vertical line through the grid increases monotonically across row boundaries.

**Cost per probe.** Each probe must find the $j$-bracket satisfying $x_{jk} \leq x^* < x_{j+1,k}$. The first probe uses binary search over $J$ column indices, costing $O(\log J)$. Subsequent probes exploit the IPO property: if the previous probe at row $k'$ found bracket $j'$, then the bracket $j^*$ at row $k$ satisfies $|j^* - j'| \leq \alpha |k - k'|$. Binary search within the constrained window of size $O(\alpha \cdot |k - k'|)$ costs $O(\log(\alpha \cdot |k - k'|)) = O(\log J)$ in the worst case. With $O(\log K)$ probes, each costing $O(\log J)$, the total complexity is $O(\log J \cdot \log K)$.

**Empirical performance for small $\alpha$.** When $\alpha = O(1)$, the bracket window at each probe is bounded by $O(\alpha \cdot |k - k'|)$. As binary search on rows converges, the bracket window at each probe narrows: at depth $i$, probed rows are $O(K/2^i)$ apart, so the bracket window has size $O(\alpha K/2^i)$. For typical smooth EGM problems where $\alpha \leq 2$, the bracket windows shrink rapidly enough that empirical per-probe costs are $O(1)$ after the first probe, yielding observed $O(\log J + \log K)$ total complexity. A formal proof of this tighter bound would require a potential function argument showing that the total bracket search work across all probes telescopes; the $O(\log J \cdot \log K)$ bound is what the analysis rigorously establishes.

**Regridding complexity.** For $P \times Q$ query points, each query is independent, giving $O(PQ \log J \cdot \log K)$. When queries are sorted (as in meshgrid evaluation), the walking-pointer optimization processes each row's queries in order, reducing the bracket search to $O(1)$ amortized per query within each row.
```
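
The walking-pointer optimization in the regridding step can be sketched in a few lines (hypothetical helper name; assumes queries sorted ascending, as in meshgrid evaluation). Because the pointer only moves forward, the total work across a row's queries is $O(J + Q)$, i.e. $O(1)$ amortized per query:

```python
def brackets_sorted(knots, queries):
    """Bracket indices for ascending queries against ascending knots:
    returns j with knots[j] <= q < knots[j+1], clamped to the interior.
    The pointer j never moves backward, so a whole sorted row of queries
    costs O(len(knots) + len(queries)) total."""
    j, out = 0, []
    for q in queries:
        while j + 1 < len(knots) - 1 and knots[j + 1] <= q:
            j += 1
        out.append(j)
    return out

# Example: queries past the last knot are clamped to the final bracket.
print(brackets_sorted([0.0, 1.0, 2.0, 3.0], [0.5, 1.5, 2.5, 3.5]))  # [0, 1, 2, 2]
```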

(engine-algorithm)=
## ENGINE Algorithm

```{raw} latex
\begin{algorithm}[htbp]
\caption{ENGINE Two-Pass Interpolation}\label{alg:engine}
\begin{algorithmic}[1]
\Require Grid $\{(x_{jk}, y_{jk}, f_{jk})\}$ for $j=1,\ldots,J$ and $k=1,\ldots,K$; query point $(x^*, y^*)$
\Ensure Interpolated value $f(x^*, y^*)$
\State
\State \textbf{Pass 1: Binary search on rows}
\State $k_l \gets 1$, $k_h \gets K$
\State $j_l \gets \textnormal{BinarySearch}(x^*, \{x_{j,k_l}\}_j)$ \Comment{Initial bracket at bottom row}
\State $\hat{y}_l \gets \textnormal{Interp1D}(x^*, x_{j_l,k_l}, x_{j_l+1,k_l}, y_{j_l,k_l}, y_{j_l+1,k_l})$
\While{$k_h - k_l > 1$}
    \State $k_m \gets \lfloor (k_l + k_h) / 2 \rfloor$
    \State $j_m \gets \textnormal{BracketSearch}(x^*, \{x_{j,k_m}\}_j, j_l)$ \Comment{Search near $j_l$ via IPO}
    \State $\hat{y}_m \gets \textnormal{Interp1D}(x^*, x_{j_m,k_m}, x_{j_m+1,k_m}, y_{j_m,k_m}, y_{j_m+1,k_m})$
    \If{$\hat{y}_m \leq y^*$}
        \State $k_l \gets k_m$, $j_l \gets j_m$, $\hat{y}_l \gets \hat{y}_m$
    \Else
        \State $k_h \gets k_m$
    \EndIf
\EndWhile
\State
\State \textbf{Pass 2: Final interpolation}
\State $j_h \gets \textnormal{BracketSearch}(x^*, \{x_{j,k_h}\}_j, j_l)$
\State $\hat{y}_h \gets \textnormal{Interp1D}(x^*, x_{j_h,k_h}, x_{j_h+1,k_h}, y_{j_h,k_h}, y_{j_h+1,k_h})$
\State $\hat{f}_l \gets \textnormal{Interp1D}(x^*, x_{j_l,k_l}, x_{j_l+1,k_l}, f_{j_l,k_l}, f_{j_l+1,k_l})$
\State $\hat{f}_h \gets \textnormal{Interp1D}(x^*, x_{j_h,k_h}, x_{j_h+1,k_h}, f_{j_h,k_h}, f_{j_h+1,k_h})$
\State $t \gets (y^* - \hat{y}_l) / (\hat{y}_h - \hat{y}_l)$
\State \textbf{return} $(1-t) \hat{f}_l + t \hat{f}_h$
\end{algorithmic}
\end{algorithm}
```

The algorithm uses subscripts $l$, $m$, $h$ for low, mid, and high indices respectively. Pass 1 performs binary search over rows to find the bounding row band $[k_l, k_h]$ satisfying $\hat{y}_l \leq y^* < \hat{y}_h$. Each probe at row $k$ finds the column bracket $j$ such that $x_{jk} \leq x^* < x_{j+1,k}$, then interpolates $\hat{y}$ along that row. The IPO property bounds the bracket search range at each probe: consecutive probes have brackets within $\alpha$ positions of each other, so binary search within the constrained window costs $O(\log J)$ per probe in the worst case. Pass 2 interpolates function values at the bounding rows and combines them linearly.
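A minimal Python sketch of the two-pass procedure follows, mirroring the pseudocode above. For simplicity it uses a full binary search per probe (via the standard-library `bisect_right`) rather than the IPO-restricted window, so each probe costs $O(\log J)$; the function and variable names are illustrative, not part of any published implementation:

```python
import numpy as np
from bisect import bisect_right

def interp1d(xq, x0, x1, f0, f1):
    """Linear interpolation (or extrapolation) through (x0,f0), (x1,f1)."""
    return f0 + (xq - x0) * (f1 - f0) / (x1 - x0)

def bracket(xs, xq):
    """Index j with xs[j] <= xq < xs[j+1], clamped to the grid interior
    so that interp1d extrapolates past the boundary."""
    return min(max(bisect_right(xs, xq) - 1, 0), len(xs) - 2)

def engine_query(x, y, f, xq, yq):
    """Two-pass ENGINE lookup on a curvilinear grid x[j,k], y[j,k], f[j,k].
    Pass 1: binary search on rows k, comparing the interpolated y at xq.
    Pass 2: interpolate f along the two bounding rows, then blend in y."""
    J, K = x.shape

    def probe(k):
        j = bracket(x[:, k], xq)
        yk = interp1d(xq, x[j, k], x[j + 1, k], y[j, k], y[j + 1, k])
        return j, yk

    k_lo, k_hi = 0, K - 1
    j_lo, y_lo = probe(k_lo)
    while k_hi - k_lo > 1:
        k_mid = (k_lo + k_hi) // 2
        j_mid, y_mid = probe(k_mid)
        if y_mid <= yq:
            k_lo, j_lo, y_lo = k_mid, j_mid, y_mid
        else:
            k_hi = k_mid
    j_hi, y_hi = probe(k_hi)

    f_lo = interp1d(xq, x[j_lo, k_lo], x[j_lo + 1, k_lo],
                    f[j_lo, k_lo], f[j_lo + 1, k_lo])
    f_hi = interp1d(xq, x[j_hi, k_hi], x[j_hi + 1, k_hi],
                    f[j_hi, k_hi], f[j_hi + 1, k_hi])
    t = (yq - y_lo) / (y_hi - y_lo)
    return (1 - t) * f_lo + t * f_hi
```

On a fold-free grid this reproduces the bilinear interpolant's row-blended value; for $f$ affine in $(x, y)$ the result is exact, which makes a convenient sanity check.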

For query points outside the grid domain, ENGINE performs linear extrapolation by extending boundary slopes. Each 1D interpolation step handles out-of-bounds queries by using the slope between the two nearest boundary points, ensuring continuous extension of the approximation beyond the grid's convex hull.

EGM applications require extrapolation: simulation may occasionally produce state values slightly outside the solution grid due to shock realizations or numerical precision. The extrapolation extends boundary slopes: for $x^* < x_1$, the value is $f_1 + (x^* - x_1)(f_2 - f_1)/(x_2 - x_1)$, and symmetrically for $x^* > x_J$. Extrapolation reliability depends critically on grid placement. Near binding constraints, policy functions exhibit high curvature or kinks, making linear extrapolation unreliable. In wealth-rich regions, by contrast, policies typically become smooth and approximately linear as precautionary motives diminish and consumption approaches permanent income rules. Standard practice therefore extends the asset grid well beyond the constraint region, typically to several multiples of target wealth (steady-state assets), ensuring that (i) interpolation handles the economically relevant constrained region exactly, and (ii) any extrapolation occurs only in the smooth unconstrained region where linear approximation is accurate. Extrapolation to economically invalid regions (e.g., negative assets when borrowing is constrained) produces meaningless results regardless of smoothness.
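
The boundary-slope rule stated above can be sketched in a few lines for the 1D case (`extrapolate_linear` is a hypothetical helper, not part of ENGINE):

```python
def extrapolate_linear(xs, fs, xq):
    """Linear interpolation with boundary-slope extrapolation: below
    xs[0] reuse the first segment's slope, above xs[-1] reuse the last
    segment's slope, matching f_1 + (x* - x_1)(f_2 - f_1)/(x_2 - x_1)."""
    if xq < xs[0]:
        i = 0                       # extend first segment leftward
    elif xq >= xs[-1]:
        i = len(xs) - 2             # extend last segment rightward
    else:
        i = next(k for k in range(len(xs) - 1) if xs[k] <= xq < xs[k + 1])
    slope = (fs[i + 1] - fs[i]) / (xs[i + 1] - xs[i])
    return fs[i] + (xq - xs[i]) * slope
```

For example, with knots $xs = (1, 2, 3)$ and values $fs = (10, 12, 16)$, a query at $x^* = 4$ extends the last segment's slope of $4$ to return $20$.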
