
The Curious Case of the Vitali Set ...

... or
How I Learnt to Stop Worrying
and Accept Mathematical Rigour.

In Measure Theory, among other things, we try to generalize the concept of 'volume' (or 'area' or 'length', whichever you prefer). In other words, just as some sets come with an inherent 'measure' (for example, intervals in $\mathbb R$ come with a length, and "well behaved" shapes come with an area or volume depending on how many dimensions you count), we want to assign a measure to any general subset.

Since a general subset may not look as structured as the sets that come with their own volume, we are in principle free to associate any number we want to the set. But, as in all of mathematics, we want these numbers to satisfy certain properties.

So, we will impose on these numbers a few properties that are very intuitive and geometric but, to our utter shock, inconsistent. These properties are so much in tune with our everyday experience that it will be hard to believe that such pathological examples can exist. But such is often the fate of mathematicians: the inescapability of a formal proof doesn't care about human emotions.

In favour of mathematical rigour, we can devise a plethora of examples that absolutely destroy geometric intuition. But this one is my most recent favourite. Another justification for the title is that the construction of this set is quite Kubrickesque!


Axioms

For the rest of this post, we will only be concerned with the real line $\mathbb R$ and try to assign a measure to all its subsets. For a subset $X$ of $\mathbb R$, we will denote the 'volume' of $X$ by $\mu(X)$.

Here are the properties that we want our measures to abide by:

1. $\mu(\emptyset)=0$
i.e., we don't want the empty set to have any volume. If this condition isn't imposed, we can just add nothing to a square and change its area by $\mu(\emptyset)$. Of course, we don't want something to be born out of nothing.

2. If $E_1,E_2,\dots$ is a (finite or infinite) sequence of disjoint subsets of $\mathbb R$, then
$$\mu(E_1\cup E_2\cup \dots)=\mu(E_1)+\mu(E_2)+\dots$$
i.e., if $E_1,E_2,\dots$ don't share anything in common, then the set containing all of $E_1,E_2,\dots$ has volume equal to the sum of the volumes of $E_1,E_2,\dots$. To convince yourself, you can think of a few disjoint shapes on a plane, and think of what 'area' you would assign to the union of these shapes if you had to assign one. I would also urge the reader to notice that the first property follows from this one as soon as some set has finite measure. So it wasn't strictly necessary to mention the first one, but I did it to give you an idea of the geometry we are trying to follow.
Readers familiar with probability may compare this property with probability spaces and probabilities.
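To spell out why $\mu(\emptyset)=0$ comes for free, here is a short derivation. It assumes that at least one set $E$ has finite measure; property 4 below supplies such a set.

$$\begin{aligned}
E &= E\cup\emptyset, \qquad E\cap\emptyset=\emptyset\\
\mu(E) &= \mu(E)+\mu(\emptyset) \quad \text{(property 2)}\\
\mu(\emptyset) &= 0 \quad \text{(subtracting the finite number } \mu(E)\text{)}
\end{aligned}$$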

3. If $E$ is a translation, rotation, or reflection of $F$, then
$$\mu(E)=\mu(F)$$
This property is purely geometric. It just says that you cannot change the volume of a shape by rotating, reflecting or translating it.

4. The measure of the unit interval (denoted by $S$ for the rest of the post) is unity, i.e.,
$$\mu(S)=\mu([0,1))=1$$
The reader may have noticed that the first three properties are trivially satisfied by the measure $\mu(X)=0$ for every set $X$. Although that is a valid choice, we want to abandon it because it is useless. To do that, we have to start the process by giving some set a nonzero value, and for that, we use the most familiar choice.
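As a quick sanity check on how the four properties interact, one can already compute the measure of a half interval. This is only an illustrative sketch; the set $H$ is a name introduced here and is not used later in the post.

$$\begin{aligned}
H &= [0,\tfrac12), \qquad H+\tfrac12=[\tfrac12,1)\\
1=\mu(S) &= \mu(H)+\mu(H+\tfrac12) \quad \text{(properties 4 and 2)}\\
&= 2\mu(H) \quad \text{(property 3)}
\end{aligned}$$

So any measure obeying the four properties must give $\mu([0,\tfrac12))=\tfrac12$, as expected.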

Breaking Intuitions

The set that we are about to discuss is meant to establish that the four axioms don't go well together. If this set were an everyday geometric object, then, as we can guess, it wouldn't do the job. So, we need to construct a wilder object, something that requires 'more discontinuities' than we would normally expect; and, as often happens, the way the rationals are distributed among the reals provides exactly this. With that in mind, let us head straight into the construction.

Let us call $y$ a rational shift of $x$ if $x=y+q$ for some rational number $q$. Now, let us create a subset $V$ of $S$ which satisfies the following two properties:
1. No two elements in $V$ are rational shifts of each other.
2. For any real number $r$, there is a rational shift of $r$ in $V$.
One can think of constructing it in this way: picture the number line, and connect every two numbers that are rational shifts of each other with a wire. Now, from each of the resulting networks, pick exactly one element lying in $S$.
For the more mathy people, all we did was define an equivalence relation $x\sim y\iff x-y \in \mathbb Q$ and invoke the Axiom of Choice on the equivalence classes.
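In symbols, the construction can be sketched as follows. The notation $[x]$ for the equivalence class of $x$ is an assumption of this sketch and is not used elsewhere in the post.

$$\begin{aligned}
[x] &= \{x+q : q\in\mathbb Q\}\\
[x]\cap S &\neq \emptyset \quad \text{(take } q=-\lfloor x\rfloor \text{, so that } x+q\in[0,1)\text{)}\\
V &= \{\text{one chosen point of } [x]\cap S \text{ for each class } [x]\}
\end{aligned}$$

The first property of $V$ holds because only one point is chosen per class, and the second holds because every real number lies in some class.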

Now, take any rational $q$ in $S$, and shift $V$ to the right by $q$ units and then shift the part that sticks out beyond $S$ one unit to the left, and call this set $V_q$. Mathematically,
$$V_q=\{x+q:x\in V\cap [0,1-q)\}\cup \{x+q-1:x\in V\cap [1-q,1)\}$$
We leave it to the readers as an (easy) exercise to convince themselves that for any $q$, $V_q\subset S$ and any $x$ in $S$ belongs to one and exactly one of these $V_q$'s (i.e., the $V_q$'s are disjoint, and all the $V_q$'s taken together cover all of $S$).
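One possible way to carry out the exercise is sketched below. Given $x\in S$, let $v$ denote the unique element of $V$ that is a rational shift of $x$ (the symbol $v$ is introduced only for this sketch). Since $x,v\in[0,1)$, the difference $x-v$ is a rational number in $(-1,1)$, and there are two cases:

$$\begin{aligned}
x-v\ge 0:&\quad q=x-v\in[0,1),\quad v=x-q\in V\cap[0,1-q),\quad x=v+q\in V_q\\
x-v<0:&\quad q=x-v+1\in(0,1),\quad v=x-q+1\in V\cap[1-q,1),\quad x=v+q-1\in V_q
\end{aligned}$$

Conversely, if $x\in V_q\cap V_{q'}$, then $x$ is a rational shift of two elements of $V$, which must therefore coincide. Comparing the two expressions for $x$ gives $q-q'\in\{-1,0,1\}$, and since both $q$ and $q'$ lie in $[0,1)$, we get $q=q'$.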

With that in hand, let us assume that a measure $\mu$ satisfying our four properties exists, and use property 2 to see that for any $q$,
$$\mu(V)=\mu(V\cap [0,1-q))+\mu(V\cap [1-q,1))$$
But, hold on... the right hand side of the above equality looks like the very definition of $V_q$, only translated a little bit, doesn't it? And since the measure is translation invariant, no matter what $q$ we have,
$$\mu(V)=\mu(V_q)$$
In other words, all these $V_q$'s 'weigh' the same, and their weight is precisely $\mu(V)$.
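The translation step can be written out in full. The shorthand $A+t=\{a+t:a\in A\}$ is introduced here for brevity.

$$\begin{aligned}
V_q &= \big((V\cap[0,1-q))+q\big)\cup\big((V\cap[1-q,1))+q-1\big)\\
\mu(V_q) &= \mu\big((V\cap[0,1-q))+q\big)+\mu\big((V\cap[1-q,1))+q-1\big) \quad\text{(property 2)}\\
&= \mu(V\cap[0,1-q))+\mu(V\cap[1-q,1)) \quad\text{(property 3)}\\
&= \mu(V)
\end{aligned}$$

Property 2 applies in the second line because the first piece lies in $[q,1)$ and the second in $[0,q)$, so the two are disjoint.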

Now, since any $x\in S$ belongs to one and exactly one of these $V_q$'s,
$$\mu(S)=\sum_{q\in \mathbb Q\cap S} \mu(V_q)$$
because we have property 2 in hand.

But, by property 4, we have $\mu(S)=1$ and so,
$$\sum_{q\in \mathbb Q\cap S} \mu(V_q)=1$$
(Un/)fortunately, the sum on the left hand side, being an infinite sum of the same constant $\mu(V)$, either diverges to infinity or is equal to $0$.
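Written out, with $c=\mu(V)$ and using $\mu(V_q)=c$ for each of the countably many rationals $q$ in $S$ (the letter $c$ is just shorthand for this sketch):

$$\sum_{q\in\mathbb Q\cap S}\mu(V_q)=\sum_{q\in\mathbb Q\cap S}c=\begin{cases}0, & \text{if } c=0,\\ \infty, & \text{if } c>0.\end{cases}$$

Neither value equals $1$.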

So, no matter what weight we associate with $V$ (which is called the "Vitali set"), the measure can never satisfy all the very fundamental properties we want it to.


Concluding Remarks

A keen reader is of course expected to ask what that means for our goal of imposing a measure. Well, as we have realised, the power set of our original set is not the ideal place to impose a measure (unless we decide to declare a war against the Axiom of Choice).


But, this is not the perfect battlefield. So, we single out a special subset of the power set and impose our measure there. In $\mathbb R$, this special subset is called the "Borel $\sigma$-algebra" on $\mathbb R$.

Another valid question to ask is whether it is possible to replicate this construction in higher dimensions. The answer is not just a 'yes', but much more. A similar idea in $\mathbb R^3$ gives rise to what is maybe the most counter-intuitive theorem ever in mathematics, known as the "Banach-Tarski Paradox". I refer the reader to the VSauce video on it and urge them to notice the use of rational numbers in the proof.
