What is e^{it}

Forget what you have seen - numbers on the number line, complex numbers on the xy graph.

Numbers are things. All that we know about these things are the equations that relate them to other things (numbers). Numbers don't actually lie on the number line; that is a convenient visualization, but like all analogies it starts to be harmful when we insist on mistaking the map for the territory.

All that can be said about a number is the relations that describe it in terms of other numbers. Everything else is commentary.


A common pattern in these equations is to find an expression of the sort

e^{it}

This confused me for the longest time. Now I know what it means - it is a number on the unit circle! These numbers are common since they are a convenient way of describing a rotation, or a phase shift of a periodic process. Let us arrive at this conclusion piece by piece.


Firstly, what is e?

Consider the following equation:

f(x) = 1 + x/1! + x^2/2! + x^3/3! + ⋯

There are reasons† why the sequence of operations described by the right-hand side arises commonly in nature's, and in our own, undertakings. So much so that we have given a name to this function - it is called the exponential function.

† e.g., it is the fixed point of derivation. That is, exp(x) is the function whose derivative is equal to itself (the only one, once we also require exp(0) = 1). That is, the value of exp(x) at any number x is also equal to the rate of change of exp(x) at x.

exp(x) = 1 + x/1! + x^2/2! + x^3/3! + ⋯

What is the value of this function when we evaluate it at 1?

$ node
> factorial = (n) => n == 0 ? 1 : n * factorial(n - 1)
[Function: factorial]
> exp = (x) => Array(99).fill().reduce((s, _, i) => s + x ** i / factorial(i), 0)
[Function: exp]
> exp(1)
2.7182818284590455

That looks familiar - it is e.

So e is the value of the exponential function when it is evaluated at 1. Since e, the constant, is so ubiquitous, it has somewhat come to overshadow the function it originates from, and sometimes it just stands in for the entire function itself. For example, people sometimes write e^x when what they mean is exp(x).

While confusing, this is mathematically correct, because the exponential function obeys the following identity

exp(x + y) = exp(x)·exp(y)

This can be seen by plugging the values into the definition of exp(x) above and doing some symbolic algebra to convince ourselves. Assuming we're convinced, then it is easy to see that, for example,

e^3
= exp(1)^3
= exp(1)·exp(1)·exp(1)
= exp(1 + 1 + 1)
= exp(3)

So e^x is equivalent to exp(x).
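
As a quick numerical sanity check of this identity, we can reuse the exp function from the REPL session above (the inputs 2 and 3 are arbitrary choices of mine):

$ node
> factorial = (n) => n == 0 ? 1 : n * factorial(n - 1)
[Function: factorial]
> exp = (x) => Array(99).fill().reduce((s, _, i) => s + x ** i / factorial(i), 0)
[Function: exp]
> exp(2 + 3).toFixed(6)
'148.413159'
> (exp(2) * exp(3)).toFixed(6)
'148.413159'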

This equivalence seems irrelevant: multiplying e by itself x times seems a good enough way to arrive at the same result, and it matches the notation for raising to a power that we're already familiar with. So why remind ourselves that it is a shorthand for an underlying function evaluation?

Because it allows us to see what complex exponentiation means.


Complex numbers are a tragedy of nomenclature. They are neither complex, nor imaginary.

This is perhaps an opportune point to repeat the diatribe I started with. Numbers are just things defined by their relations; nothing more can be said about them. We however get misled by the labels attached to some of them (e.g. complex numbers) or by their visual analogies (showing complex numbers on a 2D plane) into thinking of them as something they're not.

So forget the fact that a complex number has two components, or that it is different from a real number. Just think of all of them, real or complex, as things that we can relate to each other. All the ways they relate to other numbers are their very definition; they have no inherent existence of their own.

One such relation is

x·x = -1

The number x that satisfies this relationship is called i. It is a complex number, a kind of number we're less familiar with intuitively (though I'm sure that if we give it a few hundred years, children will find complex numbers as intuitive as we find real numbers today), but it is in no way unreal. It is as real as the real numbers, and in fact more so, because complex numbers are closed while real numbers are incomplete: the square root of minus 1, an eminently reasonable thing to ask for, does not have its answer within the real numbers, but no matter what you do to complex numbers, you still get back a complex number.


Figuratively and literally, complex numbers open another dimension to us. But when we start putting complex numbers in the relations we had previously used with only real numbers, we face new questions. For example, what does it mean to raise e to the power of a complex number? Or more specifically, since we've only seen one complex number so far, i, what does e^i mean?

The repeated multiplication doesn't work. We effortlessly think of e^3 as e multiplied by itself 3 times, e·e·e, but how do we multiply e by itself i times?

We're asking the wrong question, and getting confused, because we're mistaking the notational shorthand for the real thing. e^x is a shorthand for exp(x), so instead of wondering what e^i is, what we actually want to know is: what is exp(i)?

This is a simple question with an easy to obtain answer. We can just plug i into the definition of exp(x)

exp(i) = 1 + i/1! + i^2/2! + i^3/3! + ⋯

Hmm. All these symbols, Manav, what do they mean?


Unless you're the sort of person who skips footnotes, you'd remember how we talked about exp(x) being a fixed point of derivation. Let's putter around with that thought.

At the simplest, we can think of a function whose value never changes. The number of suns in the sky†. No matter how long after birth (i.e., x) I'm trying to evaluate the number of suns in the sky (i.e., f(x)), I still get back the same answer 1, which never changes (i.e., it has a rate of change of 0, or df(x)/dx = 0).

† It is hard to come up with physical examples that are strictly true. One can come up with exotic examples, say considering f(x) as the speed of light in a vacuum at some space-time coordinate x, something that is generally taken as unchanging, but I have my doubts. I feel that the more we refine physics, the more we'll find these fundamental constants also changing. The only sure-shot examples of constant functions that I can think of are mathematical in nature, say defining f(x) as the ratio of the circumference of a circle to its radius.

As Heraclitus mentioned, perhaps the only constant is change, though I'm not sure how to formulate that as an equation. d/dx( d/dx ) = a?‡

‡ There is a mathematical convention (I think?) that I realized after an embarrassingly long time: constants are represented by a, b, c, etc., while variables are represented by x, y, z, etc.

Next up, we can think of a function whose rate of change is constant - that is, its rate of change does not depend on the input x. Every day my age increases by one day (i.e., df(x)/dx = a, where a is some constant, in this case 1), independent of my age (i.e., f(x)) or how long after birth I'm trying to evaluate my age (i.e., x).

These functions look like lines, which is why they are called linear. Let us mentally tag this group as functions whose derivative is a constant.

While I can't explain what the derivative is in a footnote, what I do want to re-emphasize is that the derivative is an operator - it takes a function and returns a function. This is different from regular functions, which take a number and return a new number. So the derivative of a function f(x), written d/dx( f(x) ), is another function, say f′(x). This sounds complicated when put into words, but those of you who have written code would recognize this sort of meta-function as quite common in programming, and they're not all that complicated either.
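
To make this concrete, here is a sketch of such a meta-function in the same node REPL style: a numerical derivative operator that takes a function f and returns an approximation of its derivative function. The central-difference formula and the step size h are my choices for illustration, not something prescribed above:

> d = (f, h = 1e-5) => (x) => (f(x + h) - f(x - h)) / (2 * h)
[Function: d]
> d((x) => x * x)(3).toFixed(3)  // derivative of x^2 at 3, i.e. 2·3
'6.000'

Note that d(f) is itself a function; we then call it with a number, just as the prose describes.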

Note that we can think of 0 as just another constant, so our previous group of unchanging functions really is part of this current group of functions that we are considering. Indeed, constant functions are also lines, just horizontal ones, so both these groups are the same thing that way.

If we consider them as distinct groups, we can start building a tower of changeability. That sounds interesting, let's try that.

f            df/dx
f(x) = a     0
f(x) = a·x   a

This allows us to find the rate of change of the rate of change, i.e., the second derivative. We just go one step up in the tower. So if we start with a linear function f(x) = x, like my age, its derivative would be a constant function, and we can look up its derivative one row up in the tower to find that the second derivative of my age is 0.

Enough words, I think I'm belabouring the point. Let's move on with our sequence.

Next up, we can think of functions whose rate of change when evaluated for some number x is in some way related to the value of x itself. These are functions of the form f(x) = x^2. For example, if I go to a meeting and before the meeting starts, each person present does a handshake with each other person present there, then the number of handshakes we'll end up doing, f(x) = x·(x - 1)/2, is (roughly) half the square of the number of people x. These numbers grow large very quickly - for a meeting of 7 people, there will be 21 handshakes - because for each new person added to the meeting, the number of handshakes increases by (roughly) the number of people now in the meeting.
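
A quick check of the handshake arithmetic in the REPL (a sketch; the meeting sizes are arbitrary):

> handshakes = (x) => x * (x - 1) / 2
[Function: handshakes]
> handshakes(7)
21
> handshakes(8) - handshakes(7)  // an 8th person shakes hands with the 7 already there
7

Adding an 8th person adds 7 new handshakes - the increase tracks the number of people, which is exactly the "rate of change related to x" behaviour we're describing.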

We can continue this pattern. For example, we can think of functions whose rate of change when evaluated for some number x is in some way related to x^2. These turn out to be functions of the form f(x) = x^3. Putting these guys in our tower,

f            df/dx
f(x) = a     0
f(x) = a·x   a
f(x) = x^2   2·x
f(x) = x^3   3·x^2

Does this tower ever reach a function f whose rate of change, when evaluated for some value x, is in some way related to f(x) itself?

Yes! Consider

f(x) = 1 + x/1! + x^2/2! + x^3/3! + ⋯

For this function, the rate of change is equal to the function itself. That is, d/dx( f(x) ) = f(x).

What is this f? Why, it is the exponential function, exp, that we'd been talking about earlier. And this is what it means for it to be a fixed point - unlike the other functions we've seen so far, no matter how many times we take the derivative of exp(x), we get back the exact same function exp(x) again. Put differently, it forms a (1-)cycle, like the ouroboros, the snake eating its own tail.

So a natural next question to ask would be - is there a function that forms a 2-cycle? That is, one where, if we take its derivative twice, we get back the same function again?

Somewhat surprisingly, there is! The pair sin(x) and cos(x) cycle back to each other after two steps.

d/dx( d/dx( sin(x) ) ) = d/dx( cos(x) ) = -sin(x)

There is a slight asymmetry: we get back the negative of what we started with. I don't know what to make of it, except perhaps that it is what makes this a 2-cycle instead of a 1-cycle.

This same 2-cycle works even if we start with the cosine function instead.

d/dx( d/dx( cos(x) ) ) = d/dx( -sin(x) ) = -cos(x)

So these functions are like yin and yang, each engendering the other, again and again, ad infinitum. Philosophical glee aside, this behaviour is indeed quite curious, and one would imagine that there must be some internal similarity between these two functions and the exponential function, which forms a cycle by itself, for these two to form a cyclic pair. Or viewed from the other end, it is natural to wonder: can we somehow combine sin and cos to get exp?
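
We can eyeball this 2-cycle numerically with the derivative operator d sketched earlier (redefined here for completeness; the step size and the sample point x = 1 are arbitrary choices of mine):

> d = (f, h = 1e-5) => (x) => (f(x + h) - f(x - h)) / (2 * h)
[Function: d]
> d(d(Math.sin))(1).toFixed(3)  // second derivative of sin, at 1
'-0.841'
> (-Math.sin(1)).toFixed(3)     // -sin(1)
'-0.841'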


Let's look at the formula for the exponential function again

exp(x) = 1 + x/1! + x^2/2! + x^3/3! + ⋯

I gave you the formula, and someone gave it to me, but what if I told you we can derive it from first principles?

Alright, here goes. Let us try to come up with a polynomial to approximate exp(x) without using the definition above. As a reminder, a polynomial is a function that looks like this:

polynomial(x) = c_0 + c_1·x + c_2·x^2 + c_3·x^3 + ⋯

That is, it is a sum of successive powers of the input x to the function. Each power has a constant factor c_i associated with it, to "scale" its contribution to the function. This constant can also be zero, in which case that particular power of x will not be involved at all.

The highest power of x with a non-zero constant associated with it is called the degree of the polynomial. Polynomials of degree 0 are constants, of degree 1 are lines, and of degree 2 are parabolas.

Since polynomials of degree two are quite common, they also have a nickname - they're called quadratic functions, or quadratic polynomials. Let's start by approximating exp(x) using one of them. It will have the form:

f(x) = c_0 + c_1·x + c_2·x^2

We will make use of two facts:

  1. exp(x) is a fixed point of derivation, that is, the derivative of exp(x) is exp(x) itself.
  2. exp(0) = 1 (this fact can either be given to us, or we could guess at it by considering the interpretation of exp(x) as e^x, and then recalling that anything raised to the power of 0 is 1)

Where do we start? Well, since exp(0) = 1, we can start by making our approximation f(0) also equal to 1. This lets us deduce the value of the constant c_0.

f(0) = 1
c_0 + c_1·0 + c_2·0^2 = 1
c_0 = 1

Alright. Next up, if our approximation is to have the same "shape" as exp around the input 0, then it should have the same derivative as exp at 0. Let us find the derivative of f.

d/dx( f(x) ) = d/dx( c_0 + c_1·x + c_2·x^2 ) = c_1 + 2·c_2·x

And we can easily deduce that the derivative of exp at 0 is 1.

d/dx( exp )(0) = exp(0) = 1

This lets us deduce the value of the constant c_1, since we're setting the derivative of f at 0 to be equal to the derivative of exp at 0, to give our approximation the same slope.

d/dx( f )(0) = 1
c_1 + 2·c_2·0 = 1
c_1 = 1

Continuing with this shape chasing, we can set the second derivative of f at 0 to be equal to the second derivative of exp at 0, so that the "shape" of our approximation around 0 is the same as the shape of exp around 0. The second derivative of f is

d/dx( d/dx( f(x) ) ) = d/dx( c_1 + 2·c_2·x ) = 2·c_2

And the second derivative of exp at 0 is 1.

d/dx( d/dx( exp ) )(0) = d/dx( exp )(0) = exp(0) = 1

And we can combine these two pieces of information to deduce the value of c_2

d/dx( d/dx( f ) )(0) = 1
2·c_2 = 1
c_2 = 1/2

So we have managed to deduce all three of the constants for our quadratic approximation, giving us

f(x) = 1 + x + x^2/2
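
Let's check in the REPL how well this quadratic tracks exp - close near 0, drifting as we move away (the sample inputs are mine):

> f = (x) => 1 + x + x ** 2 / 2
[Function: f]
> [f(0.1).toFixed(4), Math.exp(0.1).toFixed(4)]  // close near 0
[ '1.1050', '1.1052' ]
> [f(1).toFixed(4), Math.exp(1).toFixed(4)]      // drifts further out
[ '2.5000', '2.7183' ]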

If you notice above, when we were deducing the constant for each term, the previous constants did not matter - they effectively get wiped out when we take the nth derivative. This means that to extend our approximation by a degree, we just need to consider the next derivative to compute the constant for the latest degree; the previous approximation still remains valid otherwise.

So let us go one degree higher, and consider degree 3 polynomials, affectionately called cubic polynomials. That is, let us find an approximation to exp of the form

f(x) = c_0 + c_1·x + c_2·x^2 + c_3·x^3

As I mentioned, the way we're deducing these constants means that the previous ones remain untouched, so we already know the values of c_0, c_1 and c_2; we just need to deduce c_3.

We continue our shape chasing, and set the third derivative of f at 0 to be equal to the third derivative of exp at 0. The third derivative of f is

d/dx( d/dx( d/dx( f(x) ) ) )
= d/dx( d/dx( d/dx( c_0 + c_1·x + c_2·x^2 + c_3·x^3 ) ) )
= d/dx( d/dx( c_1 + 2·c_2·x + 3·c_3·x^2 ) )
= d/dx( 2·c_2 + 2·3·c_3·x )
= 1·2·3·c_3
= 3!·c_3

And the third derivative of exp at 0 is 1.

d/dx( d/dx( d/dx( exp ) ) )(0) = d/dx( d/dx( exp ) )(0) = d/dx( exp )(0) = exp(0) = 1

Combining these two pieces of information, we can deduce the value of c_3

d/dx( d/dx( d/dx( f ) ) )(0) = 1
3!·c_3 = 1
c_3 = 1/3!

Giving us the cubic approximation

f(x) = 1 + x + x^2/2 + x^3/3!

Do you see where we're going? We're getting something that very closely resembles the formula for the exponential function!

exp(x) = 1 + x/1! + x^2/2! + x^3/3! + ⋯

This is not a coincidence. If we continue this process, increasing the degree of the polynomial one by one, we will get this same formula. It is a polynomial of infinite degree, and such infinite sums are called series†. Further, this turns out not to be an approximation, but exactly equal to exp(x); indeed, that's how we first defined exp.

† This particular one is called a Taylor series.
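
We can watch the partial sums converge to e as we crank up the degree (a sketch reusing the factorial from the first snippet; the sample degrees are arbitrary):

> factorial = (n) => n == 0 ? 1 : n * factorial(n - 1)
[Function: factorial]
> expUptoDegree = (x, deg) => Array(deg + 1).fill().reduce((s, _, i) => s + x ** i / factorial(i), 0)
[Function: expUptoDegree]
> [2, 3, 5, 10].map((deg) => expUptoDegree(1, deg).toFixed(6))
[ '2.500000', '2.666667', '2.716667', '2.718282' ]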


We should pause here for a minute to marvel at the magnificent vista we've reached. We were able to deduce the exact definition of a function just from the information of its derivatives at a single input. The derivatives at a single value contained the definition of the entire function!?

You might think that this is a special case: exp is the fixed point of the derivative operator, and maybe there is some hoogedly poogedly going on for that special case. Well, you're not entirely wrong, but you're not entirely right either - exp isn't the only function for which we can use the above method to derive its definition just from the values of its derivatives at some input. There are other such functions†. Can you guess two of them?

sin and cos!

† You're partially right because this trick cannot be applied to all functions. It works here because the functions we're considering are "smooth", that is, infinitely differentiable (and, moreover, their Taylor series actually converge back to them; such functions are called analytic).

If you were to repeat a process very similar to the one we used above (no advanced mathematics needed - just knowledge of the derivatives of sin and cos, and their values at 0; indeed, the Indian mathematician Madhava was able to derive them 700 years ago, when calculus as we now know it hadn't even been formulated), you would be able to arrive at the following series representations of them.

sin(x) = x/1! - x^3/3! + x^5/5! - x^7/7! + ⋯
cos(x) = 1 - x^2/2! + x^4/4! - x^6/6! + ⋯
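
As a sanity check, here is the sin series truncated after four terms, compared against Math.sin (the truncation point and the input are my choices; factorial is from the first snippet):

> factorial = (n) => n == 0 ? 1 : n * factorial(n - 1)
[Function: factorial]
> sinApprox = (x) => x - x ** 3 / factorial(3) + x ** 5 / factorial(5) - x ** 7 / factorial(7)
[Function: sinApprox]
> [sinApprox(1).toFixed(4), Math.sin(1).toFixed(4)]
[ '0.8415', '0.8415' ]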

Whoa. If you squint at these closely, and ignore the minus signs for a minute, you'd see how the series for sin and cos seem like two pieces of a jigsaw puzzle that combine to give us the series for exp.


Of course, we can't ignore the minus signs. But even before that, it is possible that you might be feeling lost as to what it is that we're trying to do here.

Let me rewind the tape. We were trying to find the meaning of the following equation.

exp(i) = 1 + i/1! + i^2/2! + i^3/3! + ⋯

This was the equation we'd obtained when we had plugged in the complex number i into the definition of exp.

Since we didn't know how to make sense of it, I had taken you on a diversion, where we'd seen how smooth functions can be written in terms of their Taylor series, and in particular, we found out the Taylor series for sin and cos.

This diversion was useful, since I didn't want to make it appear as if I simply mandated the series for sin and cos; we were able to get a sense of how they arise.

Either way, now let us do some basic algebra with the series expansion of exp(i). The two things you need to keep in mind are:

  1. The series expansion of sin and cos (we'll use it below),
  2. i is the square root of -1 (that's its very definition: i^2 = i·i = -1)

Alright, let's go.

exp(i) = 1 + i/1! + i^2/2! + i^3/3! + i^4/4! + i^5/5! + i^6/6! + i^7/7! + ⋯
= 1 + i/1! + i^2/2! + (i·i^2)/3! + (i^2·i^2)/4! + (i·i^2·i^2)/5! + (i^2·i^2·i^2)/6! + (i·i^2·i^2·i^2)/7! + ⋯
= 1 + i/1! - 1/2! - i/3! + 1/4! + i/5! - 1/6! - i/7! + ⋯
= ( 1 - 1/2! + 1/4! - 1/6! + ⋯ ) + i·( 1/1! - 1/3! + 1/5! - 1/7! + ⋯ )
= cos(1) + i·sin(1)

Aha. So exp(i) = cos(1) + i·sin(1). That is, the value exp(i), which is the result of evaluating the exponential function exp(x) at the complex number i, is a complex number whose "real" part (i.e., the x-component) is cos(1), and whose "imaginary" part (i.e., the y-component) is sin(1).
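
JavaScript has no built-in complex numbers, so here is a minimal hand-rolled sketch - a complex multiply plus a truncated version of the exp series - to verify this numerically. The 30-term cutoff is an arbitrary choice of mine (plenty for inputs this small):

> mul = (a, b) => ({ re: a.re * b.re - a.im * b.im, im: a.re * b.im + a.im * b.re })
[Function: mul]
> cexp = (z) => { let term = { re: 1, im: 0 }, sum = { re: 1, im: 0 }; for (let n = 1; n < 30; n++) { term = mul(term, { re: z.re / n, im: z.im / n }); sum = { re: sum.re + term.re, im: sum.im + term.im }; } return sum }
[Function: cexp]
> cexp({ re: 0, im: 1 }).re.toFixed(6)
'0.540302'
> Math.cos(1).toFixed(6)
'0.540302'
> cexp({ re: 0, im: 1 }).im.toFixed(6)
'0.841471'
> Math.sin(1).toFixed(6)
'0.841471'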

But that's not all. The above algebra works even if we multiply the input by any arbitrary number t. That is,

exp(i·t) = cos(t) + i·sin(t)

This is known as Euler's formula, after the Swiss mathematician Leonhard Euler, who noticed it 300 years ago. It makes precise our intuitive guess in the last section, where we thought that cos and sin were like pieces of a jigsaw puzzle that should fit together to form exp. Except that this jigsaw puzzle cannot be assembled in the domain of so-called real numbers; we need to level up and use the "really real" numbers - i.e., complex numbers - to see this truth.

The pudding is yet to come. So now that we know that the result of evaluating the exponential function exp(x) at any multiple i·t of the complex number i will be a complex number with the rectangular form cos(t) + i·sin(t), and recalling that the length (aka magnitude) of a complex number is the square root of the sum of the squares of its two rectangular form components, we can use basic arithmetic to deduce that the length of such numbers is always 1.

magnitude(exp(i·t)) = magnitude(cos(t) + i·sin(t)) = √( cos(t)^2 + sin(t)^2 ) = 1

The last step follows from the fact that if you draw a right-angled triangle from the origin to any point on a circle of radius r, then Pythagoras' 2500-year-old discovery tells us that

r^2 = x^2 + y^2

where x and y are, respectively, the projections of this point on the x- and y-axes. Dividing both sides by r^2, and noticing that y/r and x/r are the definitions of the sine and cosine functions,

r^2 = x^2 + y^2
1 = x^2/r^2 + y^2/r^2
1 = (x/r)^2 + (y/r)^2
1 = cos(t)^2 + sin(t)^2

So we've deduced that the value of exp(i·t) is always a complex number of length 1. What does that mean? It means that these points are always on the unit circle!
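
Numerically, reusing the cexp sketch from above: whichever t we pick, the magnitude comes out as 1 (the sample values of t are arbitrary; cexp's 30 terms are enough for inputs this small):

> magnitude = (c) => Math.hypot(c.re, c.im)
[Function: magnitude]
> [0.5, 1, 2, 3].map((t) => magnitude(cexp({ re: 0, im: t })).toFixed(6))
[ '1.000000', '1.000000', '1.000000', '1.000000' ]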

So there you have it. The exponential function exp takes complex numbers that lie on a straight line, the so-called imaginary axis (any and all multiples of i, i.e. complex numbers of the form i·t for any arbitrary value t), and maps them onto complex numbers that all lie on the unit circle (a circle is a 2D creature, and lives on the complex plane). Because of an abuse of notation, this is sometimes written as

e^{it} = cos(t) + i·sin(t)

but to avoid confusing ourselves, it is better to see that the above is a shorthand for the relationship

exp(i·t) = cos(t) + i·sin(t)

A circle epitomizes periodic motion, and so this relationship is ubiquitous whenever we're dealing with phenomena that exhibit periodic motion, as it allows us to map linear inputs (points on the imaginary axis) to periodic outputs (points on the unit circle).

We also took a detour that showed us that two other functions that are commonly brought up as examples of periodic motion - sin and cos - are actually components of the quintessential periodic motion: movement around a circle.