Mar 13, 2009

tensors and transformations

Transformation, in the most literal of senses, means changing the form of something... Since I dabbled in computer graphics a lot in my earlier years, I would usually think of transformations as graphical manipulations like resizing, rotating, skewing, reflecting, etc.

Indeed, this is closely related to the mathematical concept of transformations, except that one is often more interested in transforming the "canvas" rather than the "image". In stricter terms, one often transforms the coordinate system rather than the objects.

At earlier levels, I could never understand why it was even useful to perform transformations (as they were often introduced along with matrix algebra), but eventually I realized that transformations are one of the most useful tools in mathematics.

The necessity of transformations is often obscured by the ubiquity of the Cartesian coordinate system (i.e. x, y, and z). It just so happens that it isn't the only coordinate system available, nor can it describe all kinds of spaces. The Cartesian system describes Euclidean space satisfactorily (the zero-curvature kind of space that is nearly always used in "ordinary" math), but there are other kinds of spaces out there. Not to mention the fact that there are other coordinate systems that can be defined in Euclidean spaces, like the cylindrical coordinate system or the spherical coordinate system.

In calculus, one of the most frequently used and most powerful rules is the chain rule, which I'm quite sure many have heard of:
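In Leibniz notation, for a quantity y that depends on u, where u in turn depends on x (u and x are the letters used in the next paragraph; y is just a label assumed here for illustration), the single-variable rule reads:

```latex
\[
\frac{dy}{dx} = \frac{dy}{du}\,\frac{du}{dx}
\]
```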

What most people may not realize is that the chain rule is also a kind of transformation rule, i.e. it allows the transformation from one set of coordinates to another. The one-dimensional (single-variable) chain rule allows the transformation of a derivative in u-coordinates to a derivative in x-coordinates.

The multidimensional "general" chain rule, shown below, does the same thing, but for higher dimensions.
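Written out term by term, and assuming for illustration a function f of n coordinates u_1, ..., u_n, each of which depends on n other coordinates x_1, ..., x_n (these symbol names are an assumption, not fixed by anything above), the general rule looks like this:

```latex
\begin{align*}
\frac{\partial f}{\partial x_1} &= \frac{\partial f}{\partial u_1}\frac{\partial u_1}{\partial x_1}
  + \frac{\partial f}{\partial u_2}\frac{\partial u_2}{\partial x_1}
  + \cdots
  + \frac{\partial f}{\partial u_n}\frac{\partial u_n}{\partial x_1} \\
\frac{\partial f}{\partial x_2} &= \frac{\partial f}{\partial u_1}\frac{\partial u_1}{\partial x_2}
  + \frac{\partial f}{\partial u_2}\frac{\partial u_2}{\partial x_2}
  + \cdots
  + \frac{\partial f}{\partial u_n}\frac{\partial u_n}{\partial x_2} \\
&\;\;\vdots \\
\frac{\partial f}{\partial x_n} &= \frac{\partial f}{\partial u_1}\frac{\partial u_1}{\partial x_n}
  + \frac{\partial f}{\partial u_2}\frac{\partial u_2}{\partial x_n}
  + \cdots
  + \frac{\partial f}{\partial u_n}\frac{\partial u_n}{\partial x_n}
\end{align*}
```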

It may look unwieldy, but it's really simple if you put it in summation form:
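With the same kind of assumed symbols (a function f of coordinates u_1, ..., u_n, which themselves depend on x_1, ..., x_n), the whole family of equations collapses into a single line:

```latex
\[
\frac{\partial f}{\partial x_i} = \sum_{j=1}^{n} \frac{\partial f}{\partial u_j}\,\frac{\partial u_j}{\partial x_i}
\]
```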

Here I just put i to represent any particular index, and used j as a running index. You might think that's the end of the story, but physicists (and some mathematicians) would do anything and everything to make their equations look neat and elegant... like dropping the summation sign altogether:
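Under the usual summation convention, an index that appears twice in one term (here j) is summed over automatically. Writing the coordinate labels as upper indices, and again assuming a function f of coordinates u that depend on coordinates x, the rule shrinks to:

```latex
\[
\frac{\partial f}{\partial x^i} = \frac{\partial f}{\partial u^j}\,\frac{\partial u^j}{\partial x^i}
\]
```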

Now, I'm (partly) using tensor notation. Take note that "upper indices" are not meant to represent powers like e^x or something. Rather, they are just indices, just like the lower index in "x_1", except tensor algebra requires a way to distinguish between two kinds of indices, the upper or "contravariant" indices and the lower or "covariant" indices. Funny names, I'd say. Doesn't that look like our "ordinary", single-variable chain rule with a few indices attached?
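As a quick numerical sanity check of the summed chain rule, here is a minimal Python sketch. Everything in it is an assumption made for illustration: the test function f = r^2 sin(theta), the choice of polar coordinates (r, theta) as the "u" coordinates and Cartesian (x, y) as the "x" coordinates, and all the function names. It sums over the index j exactly as the index notation prescribes and compares the result with a plain finite-difference estimate.

```python
import math

def f_polar(r, theta):
    """Example function given in polar coordinates (chosen arbitrarily)."""
    return r**2 * math.sin(theta)

def grad_by_chain_rule(x, y):
    """Return [df/dx, df/dy] by summing over the polar index j."""
    r = math.hypot(x, y)
    theta = math.atan2(y, x)
    # df/du^j for u = (r, theta), differentiated by hand.
    df_du = [2 * r * math.sin(theta), r**2 * math.cos(theta)]
    # du_dx[j][i] = d u^j / d x^i for x = (x, y).
    du_dx = [
        [x / r, y / r],
        [-y / r**2, x / r**2],
    ]
    # The repeated index j is the one being summed over.
    return [sum(df_du[j] * du_dx[j][i] for j in range(2)) for i in range(2)]

def grad_by_finite_difference(x, y, h=1e-6):
    """Estimate [df/dx, df/dy] directly with central differences."""
    def f_xy(a, b):
        return f_polar(math.hypot(a, b), math.atan2(b, a))
    return [
        (f_xy(x + h, y) - f_xy(x - h, y)) / (2 * h),
        (f_xy(x, y + h) - f_xy(x, y - h)) / (2 * h),
    ]

print(grad_by_chain_rule(3.0, 4.0))
print(grad_by_finite_difference(3.0, 4.0))
```

The two printed lists should agree to several decimal places, which is all the index notation is really claiming: the same derivative, computed through a different set of coordinates.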

Now that I've already "digressed" here... I might as well describe what a tensor is first. A tensor is a "kind of generalized quantity" that includes both vectors and matrices, and much more. Originally, I used to think of them as multidimensional "arrays" as in computer programming, but that was too simplistic. Tensors are not "just arrays of components", because they are really "geometrical" quantities that exist without a coordinate system, yet can be expressed differently in terms of numeric ("scalar") components when a coordinate system is specified. It's very much like vectors: you can express vectors (i.e. "arrows", if you like picturing them) in different coordinate systems, and doing so will give you different values for their components, but regardless of the system you use, these geometrical entities exist and don't vary with the coordinate system that you choose. And this is why tensors are useful... in fact, tensor algebra is all about transformations of tensors, because they obey very "straightforward" rules.
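The "same arrow, different components" idea can be sketched in a few lines of Python. This is only an illustration under assumed choices: a 2D vector, a second coordinate system obtained by rotating the axes by 30 degrees, and hypothetical helper names. The components change when the axes turn, but the length, which belongs to the arrow itself, does not.

```python
import math

def components_in_rotated_axes(v, angle):
    """Components of the same 2D arrow in axes rotated counterclockwise by angle."""
    x, y = v
    c, s = math.cos(angle), math.sin(angle)
    # Passive rotation: the axes turn, the arrow stays where it is.
    return (c * x + s * y, -s * x + c * y)

def length(v):
    """Euclidean length, a quantity that should not depend on the axes."""
    return math.hypot(v[0], v[1])

v = (3.0, 4.0)
v_rotated = components_in_rotated_axes(v, math.radians(30))

print("original axes:", v, "length", length(v))
print("rotated axes: ", v_rotated, "length", length(v_rotated))
```

A vector is the simplest case; the point of the paragraph above is that tensors with more indices behave analogously, with a transformation factor for each index.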

It's hard to describe what tensors really are, since I could only describe how they behave - I'd say it's an interesting property of abstract mathematics: one can talk all day about what those entities can and cannot do and what rules they follow, but one hardly bothers with what they truly are. Definitions are, after all, just descriptions of what they do. I can't "visualize" tensors as easily as vectors, which was why I had a hard time understanding them.

All these belong to some of the most elegant* branches of mathematics, which are still intimately attached to theoretical physics - and that's why I'm so interested in them. On one side, there's the beauty of tensor algebra (no kidding), especially in those unusually elegant equations I see; on the other side, there's a comforting feeling of union when I realize that much of what I learned about vector calculus (and matrices) is united into a grand set of equations in tensor algebra - just look at the chain rule, for example, or perhaps the fundamental theorem of calculus that I didn't have the space to mention. It feels like we are going on to higher and higher math, but in retrospect, we are really digging deeper and deeper into the fundamentals of mathematics.

* I would call these tensor algebra and multivariable calculus, but I'm not absolutely sure.
