Clearly the function \(f(z)=e^z\) is an entire function that satisfies \(f(\log{n})=n\) for every \(n\in\mathbb{N}\). Are there any other such entire functions?
The answer is no if we insist that \(f\) does not grow "too quickly". We can show this by studying the relationship between the growth of an entire function and the distribution of its zeros. The fundamental result in this theory is Jensen's formula.

Theorem (Jensen's formula): Let \(f\colon G\to\mathbb{C}\) be holomorphic with \(\overline{B_r(0)}\subseteq G\). Let \(a_1,\dots,a_n\) be the zeros of \(f\) in \(B_r(0)\), listed with multiplicity, and suppose \(f(0)\neq0\). Then, \[\log{|f(0)|}+\sum_{k=1}^{n}{\log{\frac{r}{|a_k|}}}=\frac{1}{2\pi}\int_{0}^{2\pi}{\log{|f(re^{it})|}\ \mathrm{d}t}.\]

Jensen's formula tells us that the distribution of the zeros of an entire function is controlled by the growth of the function in the following sense.

Corollary: Let \(f\colon\mathbb{C}\to\mathbb{C}\) be an entire function with \(f(0)=1\). For every \(r>0\), let \(N(r)\) denote the number of zeros of \(f\) in the ball \(B_r(0)\) and let \(M(r):=\sup_{z\in B_r(0)}{|f(z)|}\). Then, for every \(r>0\), \[N(r)\log{2}\leq\log{M(2r)}.\]

Proof: Pick \(r>0\). Let \(a_1,\dots,a_n\) be the roots of \(f\) in \(B_{2r}(0)\). Observe that \(\log{\left|\frac{2r}{a_k}\right|}>0\) for each \(k\) by construction. Hence, by Jensen's formula, \[\log{M(2r)}\geq\frac{1}{2\pi}\int_{0}^{2\pi}{\log{|f(2re^{it})|\ \mathrm{d}t}}=\log{|f(0)|}+\sum_{k=1}^{n}{\log{\left|\frac{2r}{a_k}\right|}}=\sum_{k=1}^{n}{\log{\left|\frac{2r}{a_k}\right|}}.\] We can break up the sum on the right-hand side by considering the indices \(k\) for which \(|a_k|<r\) and the indices \(k\) for which \(r\leq|a_k|<2r\). This gives us \[\sum_{k=1}^{n}{\log{\left|\frac{2r}{a_k}\right|}}=\sum_{|a_k|<r}{\log{\left|\frac{2r}{a_k}\right|}}+\sum_{r\leq|a_k|<2r}{\log{\left|\frac{2r}{a_k}\right|}}\geq \sum_{|a_k|<r}{\log{\left|\frac{2r}{a_k}\right|}}\geq N(r)\log{2},\] with the inequalities coming from the fact that the terms in the first sum are at least \(\log{2}\) and the terms in the second sum are positive.
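As a numerical sanity check of the corollary, here is a small sketch (assuming Python; the test function \(f(z)=\cos{z}\) is my choice, with \(f(0)=1\), real zeros at \((k+\frac{1}{2})\pi\), and maximum modulus \(\cosh{r}\) on the closed disk of radius \(r\)):

```python
import math

# f(z) = cos(z): f(0) = 1, zeros at (k + 1/2)*pi on the real axis,
# and max |cos(z)| on |z| <= r is cosh(r), attained at z = +/- i*r.

def N(r):
    """Number of zeros of cos in the open disk |z| < r."""
    count, k = 0, 0
    while (k + 0.5) * math.pi < r:
        count += 2  # the zeros come in +/- pairs
        k += 1
    return count

def log_M(r):
    """log of the maximum modulus of cos on |z| <= r."""
    return math.log(math.cosh(r))

# the corollary: N(r) * log 2 <= log M(2r)
for r in (1.0, 5.0, 10.0, 50.0):
    assert N(r) * math.log(2) <= log_M(2 * r)
```

The bound holds with plenty of room to spare here: both sides grow linearly in \(r\), but \(\log{M(2r)}\approx 2r\) while \(N(r)\log{2}\approx \frac{2\log{2}}{\pi}r\).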
\(\square\)

This corollary leads to another way in which the growth of an entire function controls the distribution of the zeros. Let \(f\) be an entire function. Suppose \(f\) has the nonzero roots \(\{a_n\}_{n\in S}\), where \(S\) is finite or countable. The critical exponent of \(f\), denoted \(\alpha\), is defined to be \[\alpha:=\inf{\left\{t>0\colon\sum_{n\in S}{\frac{1}{|a_n|^t}}<\infty\right\}}.\] Clearly, \(\alpha\) quantifies the distribution of the zeros of \(f\) by measuring how quickly the roots of \(f\) grow: the larger \(\alpha\) is, the slower the roots of \(f\) must grow. Notice that if \(S\) is countably infinite and \(\alpha\) is finite, then for any \(\epsilon>0\) we have \(\sum_{n\in S}{\frac{1}{|a_n|^{\alpha+\epsilon}}}<\infty\) while \(\sum_{n\in S}{\frac{1}{|a_n|^{\alpha-\epsilon}}}=\infty\). Hence, \(\alpha\) is another way to quantify the rate of decay of the terms of a series. While the behavior of the series is easy to understand when we take powers above and below \(\alpha\), it is unclear what actually happens when the power is exactly \(\alpha\). In fact, the series may either converge or diverge when we take the power to be exactly \(\alpha\). This is another sense in which the barrier between convergent and divergent series is fuzzy.

Now, suppose \(f\) is an arbitrary entire function. We define the order of \(f\), denoted \(\lambda\), to be \[\lambda:=\limsup_{r\to\infty}{\frac{\log{\log{M(r)}}}{\log{r}}}.\] Clearly, \(\lambda\) is a measure of how quickly \(f\) grows. In particular, \(\lambda\) detects "exponentially polynomial" growth. That is, the order of the entire function \(\exp{z^d}\) is \(d\). The order of any polynomial is simply zero. It is fairly straightforward to show that the order of a sum or product of entire functions is at most the maximal order of the addends or factors, respectively. There is a relationship between the critical exponent and the order of an entire function.
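To make the definition of the order concrete, here is a rough numerical sketch (assuming Python; sampling the circle only approximates \(M(r)\), and evaluating at a finite radius only approximates the limsup):

```python
import cmath
import math

def log_max_modulus(f, r, samples=2000):
    """Crude approximation of log M(r) by sampling |f| on the circle |z| = r."""
    return max(math.log(abs(f(r * cmath.exp(2j * math.pi * k / samples))))
               for k in range(samples))

def order_estimate(f, r):
    """log log M(r) / log r, which tends to the order as r -> infinity."""
    return math.log(log_max_modulus(f, r)) / math.log(r)

# exp(z^d) has order d: for these functions log M(r) = r^d,
# so the estimate is essentially exact even at moderate radii
assert abs(order_estimate(lambda z: cmath.exp(z), 100.0) - 1.0) < 0.05
assert abs(order_estimate(lambda z: cmath.exp(z * z), 20.0) - 2.0) < 0.05
```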
This result is morally identical to our corollary: the distribution of the zeros of an entire function is controlled by the growth of the function.

Proposition: Let \(f\) be an entire function with critical exponent \(\alpha\) and order \(\lambda\). Then, \(\alpha\leq\lambda\).

Proof: The order and critical exponent of an entire function are invariant under multiplication of the function by a nonzero constant, so we may assume without loss of generality that \(f(0)=1\). First, note that when \(f\) has finitely many zeros, we have \(\alpha=0\) and the conclusion is immediate. So we may assume that \(f\) has countably infinitely many zeros. Suppose that the zeros of \(f\) are \(a_1,a_2,\dots\) where \(|a_1|\leq|a_2|\leq\dots\). Note that \(|a_n|\to\infty\), since otherwise the zeros of \(f\) would have a limit point and \(f\) would be the constant zero function, contradicting our assumption that \(f\) has countably many roots. Then by the corollary, for every \(n\in\mathbb{N}\) we have \[n-1\leq N(|a_n|)\leq \frac{\log{M(2|a_n|)}}{\log{2}}.\] (If several zeros share the modulus \(|a_n|\), read the first inequality by applying the corollary at radii slightly larger than \(|a_n|\) and letting them decrease; the chain survives in the limit since \(M\) is continuous.) Pick \(\epsilon>0\). By the definition of order, there exists \(R\) so large that \(\log{M(r)}\leq r^{\lambda+\frac{\epsilon}{2}}\) for all \(r>R\).
Since \(|a_n|\to\infty\), there exists \(N\) so that for every \(n>N\) we have \[n-1\leq N(|a_n|)\leq \frac{\log{M(2|a_n|)}}{\log{2}}\leq\frac{(2|a_n|)^{\lambda+\frac{\epsilon}{2}}}{\log{2}}.\] Rearranging this inequality we obtain \[\frac{1}{|a_n|}\leq \frac{2}{[(n-1)\log{2}]^{\frac{1}{\lambda+\frac{\epsilon}{2}}}}.\] Therefore, \[\frac{1}{|a_n|^{\lambda+\epsilon}}\leq\frac{2^{\lambda+\epsilon}}{[(n-1)\log{2}]^{\frac{\lambda+\epsilon}{\lambda+\frac{\epsilon}{2}}}}.\] Since \(\frac{\lambda+\epsilon}{\lambda+\frac{\epsilon}{2}}>1\), we have that \[\sum_{n=N+1}^{\infty}{\frac{1}{|a_n|^{\lambda+\epsilon}}}\leq\sum_{n=N+1}^{\infty}{\frac{2^{\lambda+\epsilon}}{[(n-1)\log{2}]^{\frac{\lambda+\epsilon}{\lambda+\frac{\epsilon}{2}}}}}<\infty.\] Therefore, \(\sum_{n=1}^{\infty}{\frac{1}{|a_n|^{\lambda+\epsilon}}}<\infty\), so that \(\alpha\leq\lambda+\epsilon\). Since \(\epsilon\) was arbitrary, the conclusion follows. \(\square\)

Notice that in our proof above, we used the fact that if the \(a_n\) were not unbounded, \(f\) would have to be the constant zero function. This is essentially due to the Bolzano-Weierstrass theorem, which would guarantee that the zeros of \(f\) have a limit point, from which it follows by the identity theorem that \(f\) is the constant zero function. This subtle point makes a reappearance in our main result, which we are now able to state.

Theorem: Let \(f\) and \(g\) be entire functions with order at most \(\delta<\infty\). Suppose that \(\{a_n\}_{n\in\mathbb{N}}\) is a sequence of nonzero complex numbers such that \(f(a_n)=g(a_n)\) for every \(n\in\mathbb{N}\) and \[\sum_{n=1}^{\infty}{\frac{1}{|a_n|^{1+\delta}}}=\infty.\] Then, \(f=g\).

Proof: Consider the entire function \(F=f-g\). Let \(0\) be a root of \(F\) with multiplicity \(m\). Then we can write \(F=z^mG\) for some entire function \(G\) where \(G(0)\neq0\) but \(G(a_n)=0\) for every \(n\). Let \(\lambda\) be the order of \(G\).
We know that \(\lambda\leq\delta\) since the order of a sum is at most the maximal order of the addends. By the condition given on \(\{a_n\}_{n\in\mathbb{N}}\), we have \(\lambda+1\leq\delta+1\leq\alpha\), where \(\alpha\) is the critical exponent of \(G\). But by the previous proposition, we also have \(\alpha\leq\lambda\), hence \(\lambda+1\leq\lambda\), which is absurd since \(\lambda\leq\delta<\infty\). So where is the mistake? The error is in assuming that the critical exponent of \(G\) is well-defined to begin with. Recall that the definition of the critical exponent requires the entire function to have at most countably many roots. Hence, \(G\) must have uncountably many roots. Now it is simple to show that \(G\) must be the constant zero function. We may reason as follows. Since \(\mathbb{C}\) is \(\sigma\)-compact, we can let \(\{S_n\}_{n\in\mathbb{N}}\) be a countable collection of compact subsets of \(\mathbb{C}\) such that \(\mathbb{C}=\bigcup_{n\in\mathbb{N}}{S_n}\) (one can choose, for example, closed unit squares). Let \(Z\subseteq\mathbb{C}\) be the zero set of \(G\). Suppose \(Z\cap S_n\) is finite for every \(n\in\mathbb{N}\). Then \(Z=\bigcup_{n\in\mathbb{N}}{(Z\cap S_n)}\) is countable as a countable union of finite sets, which contradicts our observation that \(Z\) is uncountable. Hence, there exists \(N\in\mathbb{N}\) such that \(Z\cap S_N\) is infinite. But since \(S_N\) is compact, it is sequentially compact, and thus \(Z\cap S_N\) has a limit point in \(S_N\). In particular, \(Z\) has a limit point, so \(F\) is identically zero by the identity theorem. \(\square\)

This is a great example of why it is very important to check the hypotheses of not just theorems, propositions, and lemmas, but also definitions! The theorem essentially tells us the following. Suppose \(f\) is an entire function with order \(\lambda\) and zeros \(\{a_n\}_{n\in\mathbb{N}}\).
If \(f\) grows too slowly (that is, \(\lambda\) is small) and the zeros of \(f\) do not grow very quickly (that is, \(\frac{1}{|a_n|}\) does not decay rapidly), then \(f\) is the constant zero function. We can use this idea to show that if \(g(z)=e^z\) and \(f\) is an entire function of finite order such that \(f(\log{n})=n\) for all \(n\in\mathbb{N}\), then the entire function \(f-g\) is the constant zero function. In essence, we will show that if \(f\) does not grow too quickly, the roots \(\log{n}\) grow so slowly that the assumption that \(f\) is entire forces \(f(z)-e^z\) to be identically zero. Let \(g(z)=e^z\). Suppose \(f\) is an entire function of finite order such that \(f(\log{n})=n\) for all \(n\in\mathbb{N}\). The function \(g\) is of order \(1\) and agrees with \(f\) on \(\{\log{n}\}_{n\geq2}\) (we discard \(n=1\), since \(\log{1}=0\) and the theorem requires nonzero points). Let \(\lambda_f\) be the order of \(f\) and \(\delta=\max{(1,\lambda_f)}\). Notice that the orders of \(f\) and \(g\) are at most \(\delta<\infty\). Observe that if we apply L'Hôpital's rule \(\left\lfloor 1+\delta\right\rfloor\) times, we obtain \[\lim_{x\to\infty}{\frac{x}{(\log{x})^{1+\delta}}}=\lim_{x\to\infty}{\frac{x}{C_\delta(\log{x})^{\{\delta\}}}}\geq\frac{1}{C_\delta}\lim_{x\to\infty}{\frac{x}{\log{x}}}=\infty,\] where \(\{\delta\}\) is the fractional part of \(\delta\), \(C_\delta=(1+\delta)\delta\cdots(1+\delta-\lfloor\delta\rfloor)\) is the constant produced by the repeated differentiation, and the last equality follows from one more application of L'Hôpital's rule. So \(\lim_{x\to\infty}{\frac{x}{(\log{x})^{1+\delta}}}=\infty\). This means there exists some \(N\in\mathbb{N}\) so that for all \(n>N\), we have \(n>(\log{n})^{1+\delta}\) and hence \(\frac{1}{n}<\frac{1}{(\log{n})^{1+\delta}}\). Since the harmonic series \(\sum_{n=1}^{\infty}{\frac{1}{n}}\) diverges, we conclude by direct comparison that \[\sum_{n=2}^{\infty}{\frac{1}{|\log{n}|^{1+\delta}}}=\infty.\] Now by the previous theorem, it must be true that \(f=g\). So there is only one function \(f\) that can exist as described, and it is \(f(z)=e^z\).
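The comparison argument can be illustrated numerically for \(\delta=1\) (a Python sketch; the partial sums merely illustrate what the term-by-term comparison proves):

```python
import math

# For delta = 1: (log n)^2 < n for every n >= 2, so term-by-term
# 1/(log n)^2 > 1/n, and the divergence of the harmonic series
# forces the divergence of sum 1/(log n)^2.
N = 100_000
assert all(math.log(n) ** 2 < n for n in range(2, N))

s_log = sum(1.0 / math.log(n) ** 2 for n in range(2, N))
s_harmonic = sum(1.0 / n for n in range(2, N))
assert s_log > s_harmonic  # the partial sums dominate the harmonic ones
```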
Note that the assumption that \(f\) has finite order is crucial. Often, in these types of arguments, one really needs some control over the growth of the entire function being studied. I am not sure if much can be said if we remove the requirement that \(f\) must have finite order. This theory can be pushed further. The growth of the zeros of an entire function can be quantified in a way distinct from the critical exponent. The quantity that does this is known as the genus of the entire function, and it is somewhat related to the critical exponent. If we denote the genus of an entire function as \(h\) and the order of the function as \(\lambda\), then it is true that \(h\leq\lambda\leq h+1\). This is a pretty strong relationship: it gives us a very good understanding of the growth of an entire function given the growth of its zeros. This result can be used to prove weak versions of the Picard theorems, which I think are some of the most interesting results in complex analysis.
In high school calculus, one is often inundated with various series convergence tests. It is often a headache to determine which convergence test to use on a particular series. None of the high school convergence tests works on every series. This raises a natural question: is there a single algorithm that can decide, for every series, whether it converges?
The answer is no. A short and clever argument shows that an algorithm that could determine whether any given sequence (or equivalently, series) converges would be capable of solving the halting problem, and thus no such algorithm can exist. This result may be disappointing to high school math students. It also suggests that any notion of a "barrier" between the convergent series and the divergent series would be fuzzy at best. There are many ways to define what such a threshold could be. Intuitively, we would like to say that the threshold is some "critical rate of decay" on the terms of a series (modulo trivial modifications to the series) such that a series converges if and only if the rate of decay of its terms is at least the critical rate of decay. There are several ways of making this precise. We will discuss the following one.

Definition: Let \(\{a_n\}_{n\in\mathbb{N}}\) be a sequence of positive numbers. The series \(\sum_{n=1}^{\infty}{a_n}\) is said to be a threshold series (and the sequence \(\{a_n\}_{n\in\mathbb{N}}\) a threshold sequence) if the following holds: \(\sum_{n=1}^{\infty}{a_n|c_n|}<\infty\) if and only if the sequence \(\{c_n\}_{n\in\mathbb{N}}\) is bounded.

Indeed, this notion of a threshold series captures what we intuitively want. Of course, modifying the terms of a convergent series of positive terms by a bounded collection of coefficients cannot destroy convergence. Our notion of threshold series thus includes all such modifications (this is what we meant by quantifying a rate of decay of the terms of a series modulo trivial modifications). The main point is that modification by an unbounded collection of coefficients will always tip the threshold series over the edge into the realm of divergent series, no matter how slowly our coefficients grow. Thus, a threshold series is a series that exhibits a "critical rate of decay" in its terms.
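For intuition, the familiar series \(\sum\frac{1}{n^2}\) is not a threshold series: the unbounded coefficients \(c_n=\sqrt{n}\) still leave it convergent (a Python sketch; the comparison with \(\sum n^{-3/2}\) is what actually proves convergence):

```python
import math

# a_n = 1/n^2 with unbounded coefficients c_n = sqrt(n):
# sum a_n * c_n = sum n^(-3/2), which converges (to zeta(3/2) ~ 2.612),
# so 1/n^2 fails the threshold property.
def partial_sum(N):
    return sum(math.sqrt(n) / n ** 2 for n in range(1, N + 1))

# the partial sums increase but stay below the limiting value
assert partial_sum(10_000) < partial_sum(100_000) < 2.62
```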
It turns out that our hunch that the barrier between convergent and divergent series is fuzzy is correct, in the sense that threshold series do not exist. To prove this result, we will need the following lemma.

Lemma: Let \(X\) and \(Y\) be topological spaces and let \(\Omega\) be a dense subset of \(Y\). If \(\varphi\colon X\to Y\) is an open map, then \(\varphi^{-1}(\Omega)\) is dense in \(X\).

Proof: Pick \(x\in X\) and an open neighborhood \(U\) of \(x\). Since \(\varphi\) is an open map and \(U\) is nonempty, \(\varphi(U)\) is a nonempty open subset of \(Y\). Since \(\Omega\) is dense, there exists \(y\in\varphi(U)\cap\Omega\). In particular, since \(y\in\varphi(U)\), there exists \(x'\in U\) such that \(\varphi(x')=y\in\Omega\). Hence, \(x'\in\varphi^{-1}(\Omega)\), so \(U\cap\varphi^{-1}(\Omega)\) is nonempty. \(\square\)

Now we may proceed with our main argument. We will reason by contradiction. Suppose that there exists a threshold sequence \(\{a_n\}_{n\in\mathbb{N}}\). Let \(B(\mathbb{N})\) be the space of bounded functions on \(\mathbb{N}\). We can interpret \(B(\mathbb{N})\) as a Banach space with the uniform norm (i.e. the supremum norm). Let \(\mu\) be the counting measure on \(\mathbb{N}\), so that \(L^1(\mu)\) is precisely the collection of absolutely convergent sequences. Define the map \(T\colon B(\mathbb{N})\to L^1(\mu)\) given by \((Tf)(n)=a_nf(n)\) for every \(n\in\mathbb{N}\) and \(f\in B(\mathbb{N})\). Note that the image of \(T\) is a subset of \(L^1(\mu)\), and so \(L^1(\mu)\) is a valid codomain, because \(\{a_n\}_{n\in\mathbb{N}}\) is a threshold sequence. \(T\) is clearly linear, and it is surjective by the threshold property: given \(h\in L^1(\mu)\), the sequence \(f(n)=h(n)/a_n\) satisfies \(\sum_{n=1}^{\infty}{a_n|f(n)|}=\sum_{n=1}^{\infty}{|h(n)|}<\infty\), so \(f\) is bounded and \(Tf=h\). Hence \(T\) is a surjective linear map between Banach spaces. Suppose that we have a sequence of functions \(\{g_n\}_{n\in\mathbb{N}}\subseteq B(\mathbb{N})\) such that \(g_n\to g\) and \(Tg_n\to h\), where the convergence occurs with respect to the norms of the relevant Banach spaces. Pick \(\epsilon>0\) and fix \(k\in\mathbb{N}\).
Since \(g_n\to g\) in uniform norm, we have that \(g\) is the pointwise limit of the \(g_n\). In particular, we may pick \(N_1\) so large that for all \(n>N_1\) we have \(|g_n(k)-g(k)|<\frac{\epsilon}{2a_k}\). Since \(Tg_n\to h\) in the \(L^1\) norm, we pick \(N_2\) so large that for all \(n>N_2\) we have \[|a_kg_n(k)-h(k)|\leq\sum_{j=1}^{\infty}{|a_jg_n(j)-h(j)|}=\sum_{j=1}^{\infty}{|Tg_n(j)-h(j)|}=\|Tg_n-h\|_1<\frac{\epsilon}{2}.\] Now, for \(n>\max{(N_1,N_2)}\), we have \[\begin{split} |(Tg)(k)-h(k)|&=|a_kg(k)-h(k)|\\ &=|a_kg(k)-a_kg_n(k)+a_kg_n(k)-h(k)|\\ &\leq|a_kg(k)-a_kg_n(k)|+|a_kg_n(k)-h(k)|\\ &<\frac{\epsilon}{2}+\frac{\epsilon}{2}=\epsilon. \end{split}\] Since \(\epsilon\) is arbitrary, we have that \((Tg)(k)=h(k)\). Thus, \(Tg=h\). We have shown that \(T\) is a closed linear map, hence \(T\) is continuous by the closed graph theorem. Now since \(T\) is a surjective continuous linear map, \(T\) is open by the open mapping theorem. Let us define \[S=\left\{f\in B(\mathbb{N}): \left|f^{-1}(\mathbb{R}\setminus\{0\})\right|<\infty\right\}.\] That is, \(S\) is the set of finitely supported sequences; we may regard \(S\) as a subset of both \(B(\mathbb{N})\) and \(L^1(\mu)\), and with this identification it is clear that \(T^{-1}(S)=S\). Notice that for any \(f\in S\), if \(\chi_{\mathbb{N}}\in B(\mathbb{N})\) is the constant function \(1\), \[\sup_{n\in\mathbb{N}}{|\chi_{\mathbb{N}}(n)-f(n)|}\geq\sup_{n\in f^{-1}(\{0\})}{|\chi_{\mathbb{N}}(n)-f(n)|}=1.\] Hence, \(S\) is not dense in \(B(\mathbb{N})\). Now pick \(\epsilon>0\) and \(f\in L^1(\mu)\). By definition, \(\sum_{n=1}^{\infty}{|f(n)|}<\infty\), so we may pick \(N\) so large that for \(m\geq N\) we have \(\sum_{n=m}^{\infty}{|f(n)|}<\epsilon\). Define the function \(g\) by \[g(n)=\begin{cases} f(n) & n<N\\ 0 & n\geq N. \end{cases}\] Note that \(g\in S\) by construction. Moreover, \[\|f-g\|_1=\sum_{n=1}^{\infty}{|f(n)-g(n)|}=\sum_{n=N}^{\infty}{|f(n)|}<\epsilon.\] This means that \(S\) is dense in \(L^1(\mu)\). So \(S\) is dense in \(L^1(\mu)\) but not in \(B(\mathbb{N})\), and \(T\colon B(\mathbb{N})\to L^1(\mu)\) is an open map with \(T^{-1}(S)=S\).
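Both density claims can be seen concretely on an example sequence (a Python sketch; the sequence \(f(n)=2^{-n}\) and the truncation point are illustrative choices):

```python
# S is l^1-dense: truncating an absolutely summable sequence yields a
# finitely supported sequence arbitrarily close in the l^1 norm.
f = [2.0 ** -n for n in range(200)]
tail = lambda N: sum(f[N:])  # ||f - g||_1 for g = truncation of f at N
assert tail(20) < 1e-5

# S is not sup-norm dense: any finitely supported sequence sits at
# sup-norm distance >= 1 from the constant sequence (1, 1, 1, ...).
g = f[:20] + [0.0] * 180
assert max(abs(1.0 - x) for x in g) >= 1.0
```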
This contradicts the lemma, completing the argument. This is one of my favorite problems because it provides some insight into something I've wondered about since I was younger (the "barrier" between convergent and divergent series) using some comparatively abstract techniques from functional analysis. A lot was swept under the rug via the open mapping and closed graph theorems. I find it fascinating that these abstract results can tell us something relatively tangible about series. The problem of measuring how quickly the terms of a series decay isn't just one from my big bag of problems that I find interesting. It is in fact well-studied in complex analysis, where many results are known regarding how quickly the zeros of an entire function grow. Allegedly, this has significant ramifications in analytic number theory. At some point in the future, I will talk about the critical exponent and the order of an entire function and the relationship between the growth of an entire function and the distribution of its zeros.

One of my favorite ideas in all of mathematics is to study the topology of a space by studying functions on the space. This is the underlying idea of Morse theory, which I hope to learn more about. A huge set of examples of this idea that I am more familiar with comes from complex analysis.
One may observe that in the examples that I have given, the functions we are studying to probe the topology possess some non-topological properties. For instance, holomorphic functions famously have some extremely strong properties, most of which are not topological in nature at all. The same goes for harmonic functions, which share many properties with holomorphic functions. In general, differentiability is not a property of a function that interacts much with the topology of the domain. So it is natural to wonder if we can study the topology of a space by studying functions that obey no assumption other than that they interact somehow with the topology. It also seems reasonable that the topology should be uniquely determined by such functions. More precisely, let \(X\) be a topological space and let \(C(X)\) be the ring of real-valued continuous functions on \(X\).

Question: Can one recover the topology on \(X\) given \(C(X)\)?

In this blog post, we will show that the answer is in the affirmative. We will focus on the following special case: fix \(X\) to be a compact Hausdorff topological space. Consider the spectrum of the ring \(C(X)\), which we denote \(\text{Spec }C(X)\). We can interpret the spectrum as a topological space by giving it the Zariski topology. Let \(\mathscr{M}\) be the set of maximal ideals of \(C(X)\). Since every maximal ideal is prime, \(\mathscr{M}\subseteq\text{Spec }C(X)\), and we can endow \(\mathscr{M}\) with the subspace topology. We sometimes refer to \(\mathscr{M}\) as the maximal spectrum of \(C(X)\). The incredible fact which we will prove is that \(X\) is homeomorphic to \(\mathscr{M}\). For every \(x\in X\), define \(I_x=\left\{f\in C(X)\colon f(x)=0\right\}\). Clearly, \(I_x\) is an ideal of \(C(X)\). What is less clear is that \(I_x\) is always a maximal ideal. We will show this in two different ways. In the first method, we will show that any ideal properly containing \(I_x\) is the full ring \(C(X)\).
Fix \(x\in X\) and pick \(g\in C(X)\setminus I_x\). Since \(X\) is Hausdorff, \(\{x\}\) is closed, and since \(g\) is continuous, \(g^{-1}(\{0\})\) is closed (and disjoint from \(\{x\}\), since \(g\notin I_x\)). Recall that every compact Hausdorff space is normal (\(T_4\)), so by Urysohn's lemma, there exists a continuous function \(f\colon X\to\mathbb{R}\) such that \(f(x)=0\) but \(f(y)=1\) for all \(y\in g^{-1}(\{0\})\). Notice that \(f\in I_x\). Moreover, \(f\) and \(g\) have no common zeros by construction. Therefore, \(f^2+g^2\in\langle f,g\rangle\) is always positive, so the multiplicative inverse \(\frac{1}{f^2+g^2}\) exists in \(C(X)\). Since ideals are closed under multiplication by arbitrary ring elements, \(\chi_X=(f^2+g^2)\cdot\frac{1}{f^2+g^2}\in\langle f,g\rangle\). So the ideal \(\langle f,g\rangle\) contains the identity element of the ring, and thus \[C(X)=\langle f,g\rangle\subseteq\langle I_x,g\rangle\subseteq C(X).\] Hence, \(I_x\) is a maximal ideal as claimed. We have established that \(\{I_x\}_{x\in X}\subseteq\mathscr{M}\). It turns out that this method is quite clumsy. A quicker way to establish that \(I_x\) is maximal is to notice that it is the kernel of the evaluation homomorphism \(C(X)\to\mathbb{R}\) that maps \(f\mapsto f(x)\). Since the homomorphism is clearly surjective, the first isomorphism theorem tells us that \(C(X)/I_x\cong\mathbb{R}\), which is a field. This immediately tells us that \(I_x\) is maximal. Hence, Urysohn's lemma is not (yet) required. The fact that \(\{I_x\}_{x\in X}\subseteq\mathscr{M}\) is purely algebraic. We want to establish the reverse inclusion as well. This is tantamount to showing that every maximal ideal of \(C(X)\) is of the form \(I_x\) for an appropriate choice of \(x\in X\). Let us study a "rogue" maximal ideal \(I\) that is not of the form \(I_x\) for any \(x\in X\). Since \(I\) is maximal and \(I_x\) is maximal for every \(x\in X\), the containment \(I\subseteq I_x\) would immediately imply \(I=I_x\).
Hence, \(I\) is not contained in any ideal of the form \(I_x\). This means that for each \(x\in X\), there exists \(f_x\in I\) such that \(f_x(x)\neq0\). For each \(x\in X\), by the continuity of each \(f_x\) and the fact that \(f_x(x)\neq0\), there exists an open neighborhood \(U_x\) of \(x\) such that \(0\notin f_x(U_x)\). This gives us an open cover \(\{U_x\}_{x\in X}\) (notice that to form this open cover, we are invoking the axiom of choice). By compactness, we may extract a finite subcover \(\left\{U_{x_j}\right\}_{j=1}^{n}\). By construction, for each \(x\in X\), there exists at least one \(1\leq j\leq n\) such that \(f_{x_j}(x)\neq0\). So the functions \(f_{x_1},\dots,f_{x_n}\) have no common zero. This means that the function \(f_{x_1}^2+\dots+f_{x_n}^2\) is always positive and so \(\frac{1}{f_{x_1}^2+\dots+f_{x_n}^2}\) is a well-defined continuous function on \(X\). Since \(f_{x_1}^2+\dots+f_{x_n}^2\in I\), we have that \(\chi_X=(f_{x_1}^2+\dots+f_{x_n}^2)\cdot\frac{1}{f_{x_1}^2+\dots+f_{x_n}^2}\in I\). This is a contradiction: no maximal ideal is the unit ideal. Hence, no "rogue" maximal ideals exist. This establishes that \(\mathscr{M}=\{I_x\}_{x\in X}\). Notice the paragraph above uses the same sum of squares trick that we used when we clumsily showed that \(\{I_x\}_{x\in X}\subseteq\mathscr{M}\). In particular, we are using the general fact that the ideal generated by any finite collection of functions in \(C(X)\) that share no common zero is the unit ideal. This is what we have essentially proven in the previous paragraph. Now consider the well-defined map \(\varphi\colon X\to\mathscr{M}\) defined by \(\varphi(x)=I_x\). The above establishes that this map is a surjection. A more subtle point is injectivity. This is where we truly need Urysohn's lemma. Pick \(x,y\in X\) to be distinct points. 
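The sum-of-squares trick can be sketched concretely for \(X=[0,1]\) (a Python sketch; \(f(x)=x\) and \(g(x)=1-x\) are illustrative functions with no common zero):

```python
# f(x) = x and g(x) = 1 - x have no common zero on [0, 1], so
# f^2 + g^2 is bounded away from 0, its reciprocal lies in C(X),
# and the product exhibits the constant function 1 inside the ideal <f, g>.
f = lambda x: x
g = lambda x: 1.0 - x
inv = lambda x: 1.0 / (f(x) ** 2 + g(x) ** 2)

xs = [i / 1000 for i in range(1001)]  # sample points of [0, 1]
assert min(f(x) ** 2 + g(x) ** 2 for x in xs) >= 0.5   # bounded below
assert all(abs((f(x) ** 2 + g(x) ** 2) * inv(x) - 1.0) < 1e-12 for x in xs)
```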
Since compact Hausdorff spaces are normal, and \(\{x\}\) and \(\{y\}\) are disjoint closed sets, by Urysohn's lemma there exists \(f\in C(X)\) such that \(f(x)=0\) and \(f(y)=1\neq0\). This shows that \(I_x\neq I_y\), which establishes that \(\varphi\) is an injection and thus a bijection. We will establish that \(\varphi\) is in fact a homeomorphism. To do this, we will construct a basis for the topology of \(X\) and for the topology of \(\mathscr{M}\), and show that \(\varphi\) induces a bijection between those bases. For each \(f\in C(X)\), define \[U_f=f^{-1}\left(\mathbb{R}\setminus\{0\}\right),\qquad \tilde{U}_f=\left\{I\in\mathscr{M}\colon f\notin I\right\}.\] We claim that \(\{U_f\}_{f\in C(X)}\) and \(\{\tilde{U}_f\}_{f\in C(X)}\) form bases for the topologies on \(X\) and \(\mathscr{M}\), respectively. To check this, we will use the following standard result from point-set topology. A collection of open subsets \(\mathscr{E}\) of a topological space is a basis for the topology if and only if for every open set \(U\) and every point \(x\in U\), there exists \(E\in\mathscr{E}\) such that \(x\in E\subseteq U\).
We continue to establish the claim that \(\{\tilde{U}_f\}_{f\in C(X)}\) forms a basis for the topology on \(\mathscr{M}\). This is easy with a little knowledge of the Zariski topology on the spectrum of a ring. Define \[X_f=\left\{I\in\text{Spec }C(X)\colon f\notin I\right\}.\] It is a standard fact that \(\{X_f\}_{f\in C(X)}\) forms a basis for the Zariski topology. It is also clear that since \(\tilde{U}_f=\mathscr{M}\cap X_f\) for every \(f\in C(X)\), we have that \(\{\tilde{U}_f\}_{f\in C(X)}\) forms a basis for the subspace topology on \(\mathscr{M}\). Finally, we will establish that for every \(f\in C(X)\), we have \(\varphi(U_f)=\tilde{U}_f\). But this can be done in a single line. \[\varphi(U_f)=\left\{I_x\in\mathscr{M}\colon f(x)\neq0\right\}=\left\{I\in\mathscr{M}\colon f\notin I\right\}=\tilde{U}_f.\] We conclude that \(\varphi\) is a homeomorphism. What is interesting is how we employed the assumptions that \(X\) is Hausdorff and compact. Urysohn's lemma was used in a crucial way to establish that \(\varphi\) is injective, and for this we needed that \(X\) is normal (which uses both assumptions). The compactness assumption was used by itself in the proof that \(\varphi\) is surjective (i.e., the proof of the fact that \(\mathscr{M}=\{I_x\}_{x\in X}\)). However, the astute reader may argue that by proving that \(\{U_f\}_{f\in C(X)}\) forms a basis for the topology on \(X\), we accomplished exactly what we wanted to: we found a way to reconstruct the topology of \(X\) given \(C(X)\). In particular, we used the elements of \(C(X)\) to construct a basis for the topology on \(X\). In doing this, we used no assumption on \(X\) at all; we did not use the assumptions that \(X\) is Hausdorff and compact. Indeed, this construction is valid for any topological space. The issue is that the construction relies heavily on an understanding of the individual continuous functions in \(C(X)\). 
Usually, it is very difficult to compute preimages of arbitrary continuous functions on \(X\). Hence, we would like a better, more direct way to characterize the topology on \(X\). Showing that \(X\) is homeomorphic to \(\mathscr{M}\) (at the expense of some assumptions) gives us a complete picture of the topology (not just a basis), and it relies more on the ring structure of \(C(X)\) than on the actual behavior of the functions in \(C(X)\). From a theoretical point of view, this is a "nicer" characterization of the topology. It is an entirely algebraic characterization. So while it is true that \(C(X)\) always uniquely determines the topology on \(X\), there is an especially nice algebraic way to represent this topology in the case that \(X\) is compact and Hausdorff. This raises the question: what goes wrong with our algebraic characterization when we remove either the assumption of compactness or the Hausdorff assumption? Since the Hausdorff assumption is a separation axiom, it is fairly intuitive why things may go wrong if it is removed. What is more interesting is what happens if we remove compactness. Let us study what happens when we remove the compactness assumption from a topological subspace \(X\subseteq\mathbb{R}\). By the Heine-Borel theorem, compactness in this context is equivalent to being closed and bounded, so let us separately remove the assumption of being closed and the assumption of being bounded to see what goes wrong in both cases. First, suppose \(X=(0,1)\). This is a set that is bounded but not closed. Let \(J\) be the set of functions in \(C(X)\) that vanish on a neighborhood of \(1\), that is, on an interval \((1-\epsilon,1)\) for some \(\epsilon>0\). It is easy to check that \(J\) is a proper ideal. (Note that a condition like \(\lim_{y\to 1^-}{f(y)}=0\) would not define an ideal, since \(C((0,1))\) contains functions that are unbounded near \(1\).) However, \(J\) is not contained in \(I_x\) for any \(x\in X\): given \(x\), a continuous bump function that equals \(1\) at \(x\) and vanishes near \(1\) lies in \(J\) but not in \(I_x\). Therefore, any maximal ideal containing \(J\) (one exists by Zorn's lemma, since \(J\) is proper) is not of the form \(I_x\). So in this case, the inclusion \(\{I_x\}_{x\in X}\subseteq\mathscr{M}\) is strict.
Now, suppose that \(X=[0,\infty)\). This is a set that is closed but not bounded. In this case, let \(J\) be the set of functions in \(C(X)\) that vanish outside a bounded set. Once again, this is a proper ideal (a condition like \(\lim_{y\to\infty}{f(y)}=0\) would not define an ideal, since \(C(X)\) contains functions that grow without bound). For any \(x\in X\), a continuous bump function equal to \(1\) at \(x\) and vanishing outside a bounded set lies in \(J\) but not in \(I_x\), so \(J\) is not contained in any \(I_x\). So in this case as well, the inclusion \(\{I_x\}_{x\in X}\subseteq\mathscr{M}\) is strict. There is one last interesting note. Recall that when we formed an open cover in the argument, we remarked that we were invoking the axiom of choice. This was used to establish that \(\mathscr{M}=\{I_x\}_{x\in X}\). It turns out that this equality can be proven without the axiom of choice using only the assumptions that \(X\) is a complete, totally bounded metric space. See here. Here is an interesting problem that was walked through in my homework for physics class (PHYS 4B at UCSD).
Problem: Consider water (or some inviscid fluid) rotating in a large bucket with constant angular velocity \(\omega\) (about the axis through the center of the bucket). What shape does the surface of the water take?

The solution was unfortunately given away by the structure of the problem in the homework (it was split up into parts). Nonetheless, I think that it is a very cool problem and worth expositing. The key idea is Bernoulli's principle.

Solution: We leave the lab frame and enter a frame, which we will call \(S\), that rotates along with the water in the bucket. Observe that this frame is non-inertial, since it accelerates with respect to the lab frame. So we perceive fictitious forces in \(S\). Importantly, a packet of water at a constant distance from the axis will stay at that constant distance, and the outward centrifugal force it feels in \(S\) is the fictitious counterpart of the inward centripetal force that, in the lab frame, keeps it in uniform circular motion. This means that the force vector in our frame has a clean form in cylindrical coordinates \((r,\theta,z)\). Let the axis of rotation hit the floor of the bucket at the origin, \((0,0,0)\). The body force on a packet of water is thus given by \[\vec{F}=mr\omega^2\hat{e}_r-mg\hat{e}_z.\] That is, the net force felt in this frame is a combination of the centrifugal force and weight. Since the centrifugal force is proportional to the mass, we can apply the equivalence principle. That is, the laws of physics do not change in our frame, as long as we account for an extra fictitious force that is proportional to mass. One may question why we take the radial component of our force (which is the centrifugal force here) to be positive. The reason is that in \(S\), each packet of water is stationary! So it has no inward centripetal acceleration; the fictitious centrifugal force is genuinely directed radially outward, and it is balanced by the other forces to yield equilibrium in \(S\).
A free body diagram will be balanced with the outward centrifugal force, weight, and contact force normal to the surface. I won't say more here, because seeing things this way actually leads to a simpler solution of the problem, which is not what I am trying to demonstrate at the moment! Going back to our problem, now that we have a force function, we can come up with a potential \(\varphi\) such that \(\vec{F}=-m\vec{\nabla}\varphi\) (observe the similarities with other forms of potential, such as electric potential, namely that we keep mass out of \(\varphi\) as we would keep charge out of the electric potential). This is just a standard exercise in basic calculus. We have that \[\frac{\partial\varphi}{\partial r}=-r\omega^2\Rightarrow\varphi=\int{-r\omega^2\textrm{ d}r}=-\frac{1}{2}\omega^2r^2+A(\theta,z),\] \[\frac{\partial\varphi}{\partial z}=g\Rightarrow\varphi=\int{g\textrm{ d}z}=gz+B(r,\theta).\] From equating these two expressions for \(\varphi\), we have that \(A(\theta,z)=gz+C\) and \(B(r,\theta)=-\frac{1}{2}\omega^2r^2+C\). Putting everything together, we have \[\varphi(r,z)=gz-\frac{1}{2}\omega^2r^2+C,\] for some constant \(C\). The choice of \(C\) is irrelevant, since a potential by itself is meaningless; only a potential difference has an interpretation. WLOG, take the potential at the origin to be zero, so that \(C=0\). Next, consider a tube along a radial direction on the surface of the water. By Bernoulli's principle in the lab frame, we have that \[p+\frac{1}{2}\rho v_{\textrm{tube}}^2+\rho gz=C,\] where \(p\) is the pressure, \(\rho\) is the density of water, \(v_{\textrm{tube}}\) is the speed of flow in the tube, and \(C\) is some constant. But for \(v_{\textrm{tube}}\) to be nonzero, a fluid packet must have some radial component of velocity (again, remember that we are now in the lab frame). And this is not the case, since the water is just rotating in the bucket!
So Bernoulli's principle reduces to the statement \[p+\rho gz=C.\] Now, let us add and subtract \(\frac{1}{2}\rho\omega^2r^2\) on the LHS of the above equation. We obtain \[p+\frac{1}{2}\rho\omega^2r^2+\rho gz-\frac{1}{2}\rho\omega^2 r^2=C.\] Now \(\omega r=v\), the true speed of any packet of water rotating in the bucket. And the last two terms are just \(\rho\varphi\)! So we obtain a modified form of Bernoulli's principle: \[p+\frac{1}{2}\rho v^2+\rho\varphi=C.\] This statement could also have been derived by understanding what exactly Bernoulli's principle asserts. In particular, it states that the sum of pressure, kinetic energy density, and potential energy density is an invariant. We could have just modified the potential energy density term by introducing \(\varphi\). More implicitly, what's going on here is that we are using the equivalence principle. That is, we are saying that the combination of the true gravitational force and the centrifugal force is indistinguishable from a very strange modified gravitational force in \(S\). Furthermore, since in \(S\) we have \(v=0\), our modified Bernoulli's principle now reads \[p+\rho\varphi=C.\] Now we take a point on the surface, say \((0,\theta_0,z_0)\). For elegance, let this be the point on the surface that is also on the axis of rotation. For any point on the surface, we must have that \(p=p_0\), the atmospheric pressure! In particular, we have \[p_0+\rho\varphi(0,z_0)=C.\] And now we are done, because the surface of the water is just the locus of points where the pressure is equal to atmospheric pressure. In particular, we will have \[p_0+\rho\varphi(r,z)=p_0+\rho\varphi(0,z_0).\] Rearranging this gives \[\varphi(r,z)=\varphi(0,z_0)\Rightarrow z=\frac{1}{2g}\omega^2r^2+\frac{1}{g}\varphi(0,z_0).\] This is, written in cylindrical coordinates, a paraboloid of revolution. \(\square\) It is interesting to note that the solution is entirely determined by energy conservation and the equivalence principle.
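As a quick numerical sanity check, we can confirm that the predicted surface really is an equipotential of \(\varphi\), which is exactly the condition \(p=p_0\) derived above. This is just a sketch; the values of \(g\), \(\omega\), and \(z_0\) are arbitrary illustrative choices, not from the original problem.

```python
import math

# Hypothetical parameters chosen only for illustration
g = 9.8        # gravitational acceleration, m/s^2
omega = 3.0    # angular velocity, rad/s
z0 = 0.10      # height of the surface on the axis, m

def phi(r, z):
    # Effective potential in the rotating frame, with C = 0 at the origin
    return g * z - 0.5 * omega**2 * r**2

def surface(r):
    # Predicted free-surface profile: z = z0 + omega^2 r^2 / (2 g)
    return z0 + omega**2 * r**2 / (2 * g)

# Every point on the predicted surface lies on the same equipotential
for r in (0.0, 0.05, 0.1, 0.2):
    assert abs(phi(r, surface(r)) - phi(0.0, z0)) < 1e-12
```

The assertion passing for every \(r\) is just the statement \(\varphi(r,z)=\varphi(0,z_0)\) along the paraboloid.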
But any problem that can be solved with energy conservation can also be solved directly by Newton's laws. So it is in fact possible to find a simpler solution, which I mentioned before. The idea there is to simply note that an observer in \(S\) would see that the system is in static equilibrium. I'll leave that solution to the reader. Let's solve a cool problem.
Suppose there is an \(n\)-person committee. Let an arrangement of this committee be a derangement if none of the committee members are sitting where they are supposed to be sitting. Let \(P(n)\) be the probability that a randomly chosen arrangement of the \(n\)-person committee is a derangement. What is \(\lim_{n\rightarrow\infty}{P(n)}\)? That is, what does this probability approach as the number of members grows arbitrarily large? First, let us define some notation. Let \(D_n\) be the number of derangements of an \(n\)-person committee. How can we count \(D_n\)? There are \(n!\) possible arrangements. Perhaps we can use complementary counting by finding the total number of arrangements where at least one member is sitting in their assigned seat, and then subtract that number from \(n!\). It turns out that it is pretty much impossible to directly count how many different ways we can have an arrangement with exactly \(1\leq k\leq n\) members in their correct seats. Try it for yourself and see what problems you run into. There is no simple algorithm to derange a set. Note that I say simple. It is certainly possible to construct an algorithm based on swapping elements that are in the correct spots. Good luck keeping track of that for counting purposes. Instead, we take a step back and think. We only need the sum of the number of arrangements with exactly \(k\) members in their correct seats from \(k=1\) to \(k=n\). Observe that while it is difficult to count the number of arrangements with exactly \(k\) correct seatings, it is much easier to count the number of arrangements with at least \(k\) correct seatings. We can then use the principle of inclusion-exclusion to deduce the sum of the number of arrangements with exactly \(k\) correctly seated members without counting a single such case! So we calculate the number of possible arrangements where at least \(k\) people are sitting in their own assigned seats.
There are \(\binom{n}{k}\) ways to choose which \(k\) people are guaranteed to sit in their correct seats. This leaves \((n-k)!\) ways to arrange the remaining people however we wish. It is totally possible that while arranging these remaining people, someone gets seated where they are supposed to, hence why \(\binom{n}{k}(n-k)!=\frac{n!}{k!}\) is only the number of arrangements where at least \(k\) people are seated correctly. By the principle of inclusion-exclusion, the alternating sum of these must be the desired value. Hence, \[\begin{split} D_n&=n!-\left[\frac{n!}{1!}-\frac{n!}{2!}+\cdots+(-1)^{n+1}\frac{n!}{n!}\right]\\ &=n!\sum_{k=0}^{n}{\frac{(-1)^{k}}{k!}}. \end{split}\] Since the total number of arrangements is \(n!\), we can easily calculate the probability: \[P(n)=\frac{D_n}{n!}=\sum_{k=0}^{n}{\frac{(-1)^{k}}{k!}}.\] Taking the limit, \[\lim_{n\rightarrow\infty}{P(n)}=\sum_{k=0}^{\infty}{\frac{(-1)^{k}}{k!}}.\] But now the RHS is the Taylor series of \(e^x\) evaluated at \(x=-1\)! Hence our requested limit is simply \(\boxed{\frac{1}{e}}\). This is approximately 36.8%. The function \(D_n\) enjoys a number of cool properties. One of them is: \[\sum_{k=0}^{n}{\binom{n}{k}D_{n-k}}=n!.\] See if you can find a combinatorial proof of this. Hint: The construction is similar in spirit to the one we used to find the sum of the number of arrangements with exactly \(k\) correctly seated members. It's a shame that quantum and relativistic corrections are necessary. Though perhaps it's self-entitled of me to wish that the universe would be simple enough for me to understand it.
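Both the limit and the identity above are easy to check by machine. Here is a short sketch that computes \(D_n\) exactly using the standard integer recurrence \(D_n=nD_{n-1}+(-1)^n\) (which is equivalent to the alternating-sum formula, though the equivalence is not proven in this post):

```python
from math import comb, factorial, e

def derangements(n):
    # D_n via the recurrence D_n = n * D_{n-1} + (-1)^n, with D_0 = 1;
    # this agrees with D_n = n! * sum_{k=0}^{n} (-1)^k / k!
    d = 1
    for k in range(1, n + 1):
        d = k * d + (-1) ** k
    return d

# P(n) = D_n / n! approaches 1/e quickly
n = 12
p = derangements(n) / factorial(n)
assert abs(p - 1 / e) < 1e-9

# The identity sum_k C(n,k) * D_{n-k} = n!
assert sum(comb(n, k) * derangements(n - k) for k in range(n + 1)) == factorial(n)
```

The convergence is extremely fast: the error in \(P(n)\) is on the order of \(1/(n+1)!\), since the series for \(e^{-1}\) is alternating.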
It turns out that current flows in such a way that power loss in parallel resistors is minimized. This claim comes from problem 2 from a pset here. Consider two resistors in parallel. The current into a terminal is \(I\). Let the currents through the resistors be \(I_1\) and \(I_2\), and let the resistances be \(R_1\) and \(R_2\), respectively. Then by conservation of charge: $$I=I_1+I_2,$$ and since power dissipated through a resistor is \(I^2R\), we have a total power loss of: $$P(I_1,I_2)=I_1^2R_1+I_2^2R_2.$$ Now we have a simple optimization problem. We must minimize the function above subject to the constraint \(S(I_1,I_2)=I_1+I_2=I\). We proceed with Lagrange multipliers: $$\nabla P=\lambda\nabla S\Rightarrow\left<2I_1R_1,2I_2R_2\right>=\left<\lambda,\lambda\right>.$$ From this, we immediately obtain: $$I_1R_1=I_2R_2,$$ which, if Ohm's law holds, is simply \(V\), the potential difference across the resistors! This physically makes sense. Potential difference is constant over resistors in parallel, since all of the resistors share the same pair of terminals and potential difference is path-independent. The calculation above easily generalizes to an arbitrary number of resistors. I am awaiting a response on StackExchange for a greater physical insight into why currents arrange themselves such that power dissipation is minimized. This seems to be a manifestation of a common theme throughout physics. For instance, Fermat's principle from optics states that light will always choose a path of stationary (typically minimal) travel time. This is an interesting principle, but it is actually due to the Huygens principle, in which waves are framed as propagating by new wavelets emanating from each wavefront. I am simply wondering if there is a similar underlying principle behind why currents travel so that power dissipation is minimized in parallel resistors. Here are two problems in a pset from MIT 18.02 (multivariable calculus).
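A brute-force numerical sketch makes the Lagrange condition concrete: scan over possible splits of the current and check that the power-minimizing split satisfies \(I_1R_1=I_2R_2\). The values of \(I\), \(R_1\), \(R_2\) below are made up for illustration.

```python
# Arbitrary illustrative values for total current and resistances
I, R1, R2 = 2.0, 3.0, 6.0

def power(i1):
    # Total dissipation when i1 flows through R1 and I - i1 through R2
    return i1**2 * R1 + (I - i1)**2 * R2

# Brute-force scan over 10^5 candidate splits of the current
i1 = min(range(10**5 + 1), key=lambda k: power(k * I / 10**5)) * I / 10**5
i2 = I - i1

# The Lagrange condition predicts I1 = I R2 / (R1 + R2),
# i.e. equal voltage drops I1 R1 = I2 R2 across the two branches
assert abs(i1 - I * R2 / (R1 + R2)) < 1e-4
assert abs(i1 * R1 - i2 * R2) < 1e-3
```

A closed-form check: with these values the minimizer is \(I_1=IR_2/(R_1+R_2)=4/3\), giving equal drops \(I_1R_1=I_2R_2=4\,\mathrm{V}\).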
Problem 2: Let \(f(x,y,z,t)\) be a smooth function, and let \(\nabla f=\left<f_x,f_y,f_z\right>\) be the gradient in space variables only. Let \(\mathbf{r}=\mathbf{r}(t)=\left<x(t),y(t),z(t)\right>\) be a smooth curve, and \(\mathbf{v}=\mathbf{r}'(t)\); and suppose we use the notation \(\frac{\textrm{D}f}{\textrm{D}t}=\frac{\textrm{d}}{\textrm{d}t}f(\mathbf{r}(t),t)\). Use the Chain Rule to show that \(\frac{\textrm{D}f}{\textrm{D}t}=\frac{\partial f}{\partial t}+\mathbf{v}\cdot\nabla f\). Solution: We have: \[f(\mathbf{r}(t),t)=f(x(t),y(t),z(t),t).\] By the Chain Rule: \[\frac{\textrm{d}}{\textrm{d}t}f(x(t),y(t),z(t),t)=\frac{\partial f}{\partial x}\frac{\textrm{d}x}{\textrm{d}t}+\frac{\partial f}{\partial y}\frac{\textrm{d}y}{\textrm{d}t}+\frac{\partial f}{\partial z}\frac{\textrm{d}z}{\textrm{d}t}+\frac{\partial f}{\partial t},\] which becomes: \[\frac{\textrm{d}}{\textrm{d}t}f(x(t),y(t),z(t),t)=\frac{\partial f}{\partial t}+\left<\frac{\textrm{d}x}{\textrm{d}t},\frac{\textrm{d}y}{\textrm{d}t},\frac{\textrm{d}z}{\textrm{d}t}\right>\cdot\left<\frac{\partial f}{\partial x},\frac{\partial f}{\partial y},\frac{\partial f}{\partial z}\right>,\] which is: \[\frac{\partial f}{\partial t}+\mathbf{v}\cdot\nabla f,\] as desired. \(\square\) This function, \(\frac{\textrm{D}f}{\textrm{D}t}\), is called the convective derivative or the material derivative. In fact, there are quite a few names for this. It is important to realize that \(\mathbf{r}\) defines a path or trajectory through space. The function \(f\) then describes something that is changing along a trajectory with time. The next problem makes this clear, letting \(f=\rho\), the density of a fluid. When \(\rho\) is constant in \(t\), the flow is termed steady. Unsteady flow, as one can imagine by this definition, must be enormously complicated, and it includes phenomena such as turbulence. In the case of steady flow, each trajectory is called a streamline.
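The identity can also be checked symbolically. Here is a sketch using sympy, with a field \(f\) and a trajectory \(\mathbf{r}(t)\) both chosen arbitrarily for illustration:

```python
import sympy as sp

t = sp.symbols('t')
x, y, z = sp.symbols('x y z')

# A sample scalar field and trajectory (arbitrary illustrative choices)
f = x**2 * y + sp.sin(z) * t          # f(x, y, z, t)
r = (sp.cos(t), sp.sin(t), t**2)      # r(t) = <x(t), y(t), z(t)>

# Left side: Df/Dt = d/dt f(r(t), t)
f_on_path = f.subs({x: r[0], y: r[1], z: r[2]})
lhs = sp.diff(f_on_path, t)

# Right side: df/dt + v . grad(f), evaluated along the path
v = tuple(sp.diff(c, t) for c in r)                  # velocity r'(t)
grad_f = tuple(sp.diff(f, s) for s in (x, y, z))     # spatial gradient only
rhs = sp.diff(f, t) + sum(vi * gi for vi, gi in zip(v, grad_f))
rhs = rhs.subs({x: r[0], y: r[1], z: r[2]})

assert sp.simplify(lhs - rhs) == 0
```

Of course this only verifies the identity for one choice of \(f\) and \(\mathbf{r}\); the chain rule proof above is what establishes it in general.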
A fluid flow is called incompressible if the convective derivative of \(\rho\) is zero. In steady flow, this means that there is no density change along a streamline (which makes sense!). Problem 3a: Suppose that the density function depends only on time \(t\) but is constant in the space variables \((x,y,z)\), that is, \(\rho=\rho(t)\). Then show that the flow is incompressible if and only if the density \(\rho(t)\) is constant in all the variables \((x,y,z,t)\) (in other words, the flow must be steady). Solution: We want: \[\frac{\partial \rho}{\partial t}+\mathbf{v}\cdot\nabla\rho=0.\] But since \(\rho\) does not depend on spatial variables, \(\nabla\rho=0\) and \(\frac{\partial \rho}{\partial t}=\frac{\textrm{d}\rho}{\textrm{d}t}\). Hence: \[\frac{\textrm{d}\rho}{\textrm{d}t}=0.\] Integrating both sides WRT \(t\): \[\int{\frac{\textrm{d}\rho}{\textrm{d}t}\textrm{ d}t}=\rho(t)=C.\] Hence, the flow is steady. \(\square\) Problem 3b: Next suppose instead that the density depends only on the space variables \((x,y,z)\) but not (explicitly) on \(t\), so that \(\rho=\rho(x,y,z)\). An incompressible flow in this case is called stratified. Use the result of problem 2 to give the condition on \(\rho\) and \(\mathbf{v}\) for stratified flow. Solution: In this case: \[\mathbf{v}\cdot\nabla\rho=0.\] So the velocity is always orthogonal to the direction in which \(\rho\) changes the most. But recall that \(\nabla\rho\) is always orthogonal to the contour surfaces of \(\rho\). It follows that the velocity must always be parallel to, and hence tangent to, the surfaces of equal density. \(\square\) Both gravity and electric force satisfy an inverse square law, and thus, both forces satisfy the shell theorem. That is, in a uniformly massive (or uniformly charged) spherical shell, there is no net force on any massive (or charged) object at any location within the shell.
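A concrete stratified flow makes the condition \(\mathbf{v}\cdot\nabla\rho=0\) tangible. In this sketch the density varies only with height and the velocity field is a horizontal swirl; both are arbitrary illustrative choices, not from the pset.

```python
import sympy as sp

x, y, z, t = sp.symbols('x y z t')

# Density stratified in z only, and a purely horizontal velocity field
rho = sp.exp(-z)     # rho = rho(x, y, z), here depending only on z
v = (y, -x, 0)       # horizontal swirl: no vertical component

grad_rho = (sp.diff(rho, x), sp.diff(rho, y), sp.diff(rho, z))
convective = sp.diff(rho, t) + sum(vi * gi for vi, gi in zip(v, grad_rho))

# v . grad(rho) = 0, so the convective derivative vanishes:
# the flow is incompressible and stratified
assert sp.simplify(convective) == 0
```

Here \(\nabla\rho\) points straight down while \(\mathbf{v}\) is horizontal, so packets move along the level surfaces of \(\rho\), exactly as the solution to 3b describes.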
Is this a unique property of inverse square functions? Or are there other functions that obey this? I reckon that I'll have to solve a differential equation of some sort (or perhaps an integral equation). My gut tells me that this is a unique property of inverse square functions (what sort of differential equation would be satisfied by inverse square functions and another class of functions?). I'll be investigating this further soon. School's out; the fun begins. I was up till 3:30. There was no way that I would sleep before the problem would.
As it turns out, the integral equation that I discussed earlier is well-known. It is an example of a linear Volterra equation of the first kind. \[f(c)=\int_{a}^{c}{K(c,x)\rho(x)\textrm{ d}x}\] The function \(K\) is known as the kernel. In our case, the kernel is: \[K(c,x)=c+b-x\] The idea here is to use the Leibniz integral rule (which follows from the Fundamental Theorem of Calculus). Writing the improper integral as \(\lim_{R\rightarrow\infty}{G(c,R)}\), where \(G(c,R)=\int_{c}^{R}{(c+b-x)\rho(x)\textrm{ d}x}\), and differentiating both sides with respect to \(c\), we obtain: \[\frac{\textrm{d}}{\textrm{d}c}\left(\int_{c}^{\infty}{(c+b-x)\rho(x)\textrm{ d}x}\right)=\lim_{R\rightarrow\infty}{\left[(c+b-R)\rho(R)\frac{\textrm{d}R}{\textrm{d}c}\right]}-b\rho(c)+\int_{c}^{\infty}{\rho(x)\textrm{ d}x}\] Observe that if \(G\) is well-behaved, then we may interchange the limit and derivative above; since \(R\) does not depend on \(c\), the boundary term vanishes: \[\lim_{R\rightarrow\infty}{\left[(c+b-R)\rho(R)\frac{\textrm{d}R}{\textrm{d}c}\right]}=0\] As cited, the conditions we want on \(G\) for this manipulation to be valid are:
Anyway, we have: \[b\rho(c)=\int_{c}^{\infty}{\rho(x)\textrm{ d}x}\] Let \(P'=\rho\). Then, by the Fundamental Theorem: \[bP'(c)=\lim_{R\rightarrow\infty}{P(R)}-P(c)\] Let \(L=\lim_{R\rightarrow\infty}{P(R)}\). Then, the equation above rearranges to: \[P'+\frac{1}{b}P=\frac{L}{b}\] This is simply a first-order linear ODE. We solve this by using an integrating factor of \(\exp{\int{\frac{1}{b}\textrm{ d}x}}\). This yields: \[P(x)=L+Ce^{-x/b}\] Differentiating this (and absorbing the resulting factor of \(-1/b\) into a relabeled constant), we obtain: \[\boxed{\rho(x)=Ce^{-x/b}}\] Obviously with \(C>0\), so that the density is positive. The convergence conditions are forced by the fact that this solution satisfies our Volterra equation. They can probably also be verified directly with \(\epsilon\)-\(\delta\) calculations. That's for another time. And I'm done with mechanics for this weekend, I think. Here's what I do know.
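As a sanity check, the boxed solution does satisfy \(b\rho(c)=\int_{c}^{\infty}{\rho(x)\textrm{ d}x}\): analytically, \(\int_{c}^{\infty}{Ce^{-x/b}\textrm{ d}x}=Cbe^{-c/b}=b\rho(c)\). Here is a numerical sketch of the same check, with arbitrary illustrative values of \(C\) and \(b\):

```python
import math

C, b = 2.0, 1.5   # arbitrary illustrative constants

def rho(x):
    # Candidate density from the boxed solution: rho(x) = C e^{-x/b}
    return C * math.exp(-x / b)

def tail_integral(c, R=60.0, n=200000):
    # Trapezoid-rule approximation of the improper integral of rho from c
    # to infinity, truncated at R; the tail beyond R (~40 e-foldings) is
    # negligible compared to the tolerance below
    h = (R - c) / n
    s = 0.5 * (rho(c) + rho(R))
    for i in range(1, n):
        s += rho(c + i * h)
    return s * h

# Check the integral equation b * rho(c) = integral of rho from c to infinity
for c in (0.0, 1.0, 3.0):
    assert abs(b * rho(c) - tail_integral(c)) < 1e-6
```

The check holding at every \(c\) is what makes the exponential special; a power law \(Ax^n\) with \(n<0\), for instance, fails it.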
For brevity, let \(K=c+b\). Let \(u=\rho(x)\) and \(\textrm{d}v=(K-x)\textrm{ d}x\). Then, integrating by parts: \[\int{(K-x)\rho(x)\textrm{ d}x}=\rho(x)\left(Kx-\frac{1}{2}x^2\right)-\int{\left(Kx-\frac{1}{2}x^2\right)\rho'(x)\textrm{ d}x}\] We continue integrating by parts, letting the polynomial in each new integrand be \(\frac{\textrm{d}v}{\textrm{d}x}\), and the derivatives of \(\rho(x)\) be \(u\). This then yields: \[\int{(K-x)\rho(x)\textrm{ d}x}=\sum_{n=0}^{\infty}{(-1)^n\rho^{(n)}(x)\left(\frac{Kx^{n+1}}{(n+1)!}-\frac{x^{n+2}}{(n+2)!}\right)}\] Ok Andrew, cool. Now what? Ah, simply observe. Our desired integral is the improper integral evaluated from \(c\) to \(\infty\). This means that the limit of the antiderivative (above) as \(x\) approaches \(\infty\) must converge! In other words, \[\lim_{x\rightarrow\infty}{\sum_{n=0}^{\infty}{(-1)^n\rho^{(n)}(x)\left(\frac{Kx^{n+1}}{(n+1)!}-\frac{x^{n+2}}{(n+2)!}\right)}}\] must converge! For this to occur (granting that each term should converge individually, with no cancellation between terms), observe that the polynomials in each term of the summation tend to \(\infty\), hence for convergence to occur, every derivative of \(\rho(x)\), along with \(\rho(x)\) itself, must tend to \(0\) (since \(\rho^{(n)}(x)\) is then a convergent quantity divided by a polynomial that tends to \(\infty\)). In other words, we have derived: \[\lim_{x\rightarrow\infty}{\rho^{(n)}(x)}=0\textrm{ }\forall n\in\mathbb{N}_0\] This in and of itself hints at possible functions. For instance, an exponential decay function, or something of the form \(Ax^n\) for some \(n<0\). Ok sure, we can substitute these functions into our original integral equation and find that one of them actually works, but I want a little more insight. And rigor. And what sort of lame solution is that? Deriving one property and guessing and checking? Ew. The next thing I will use is the Leibniz integral rule.
That is: \[\frac{\textrm{d}}{\textrm{d}x}\left(\int_{a(x)}^{b(x)}{f(x,t)\textrm{ d}t}\right)=f(x,b(x))\cdot\frac{\textrm{d}}{\textrm{d}x}b(x)-f(x,a(x))\cdot\frac{\textrm{d}}{\textrm{d}x}a(x)+\int_{a(x)}^{b(x)}{\frac{\partial}{\partial x}f(x,t)\textrm{ d}t}\] While this looks complicated, it just follows from the Fundamental Theorem of Calculus. And while I'd love to continue this post with the application of this rule, I should probably do my history for once. The next post will include this, along with a discussion of uniform continuity, and uniform and pointwise convergence. Stay tuned.