A brief foray into hypercubes

The discussion on error-correcting codes is about to get a little hypercube heavy (never a good state to be in), and a brief foray into how to construct/visualize them may be in order.  I’ll take the liberty of defining an n-dimensional (unit) hypercube as a shape whose

1. vertices are located at coordinates made of entirely 0′s and 1′s, and
2. has an edge wherever two vertices are distance 1 apart.

This would take two more things to make a complete definition: I should let you move the cube about however you like (no reason to have it fixed is space), and I should tell you about the 2-D faces, 3-D hyperfaces, and so on up to the (n-1)-D hyperfaces.  You can use that first one if you want, but I’ll ignore the second.  I think I did a good job of defining what’s called the 1-skeleton of a very particular n-dimensional hypercube.

Two sample vertices representing (1,1,0,0) and (1,0,1,0). These will not be connected in the hypercube.

Anyways.  Wednesday had pictures of a 2-cube and 3-cube.  What about the 4-cube?  Or 5-cube?  It will help to consider this all from a less analytic, more graph theory (or, if that sounds technical, “pictures and problem solving”) point of view.  Condition 1 for a hypercube says that there are 2n vertices, all the binary sequences of length n. Then condition 2 says that two vertices are connected if you can change one vertex’s binary sequence to the other’s by changing a single bit.   We’ll go one step further, by just coloring particles on a line: white for 0, black for 1 (this is something of a homage to my undergraduate thesis advisor’s work with polyhedra).

The only two things left to do are to draw the vertices and arrange them in nice ways (that is, fine a “nice” projection).

A projection of a 4-cube, with vertices labelled.

Below is the image from the wikipedia 5-, 6-, and 7- cubes.  Note the some of the vertices are laying on top of eachother.  I’ll leave it as an exercise to the reader to label these vertices with the appropriate binary sequences.

5-cube

6-cube

7-cube

Hamming Error-Correcting Codes

This is part 1 in hopefully a series on the Hamming error-correcting codes, to be continued on Friday.  The problem is this:  suppose you wish to send a message composed entirely of 0′s and 1′s, but suspect part of it might get “corrupted”.  More concretely, I am going to send you the message

01101,

but there’s a chance that you receive instead something like 00101 (so the second bit was flipped), or 01111 (so the fourth bit was flipped).  Is there any way to control for this?  An idea that is both natural and wrong is to send two copies of the message.  In this case, I’d send along

0110101101.

Now if a bit gets flipped (note that there are now more bits that could be flipped), you can see exactly where — perhaps you receive

0010101101,
where the non-matching bits are highlighted.  The problem here is that you cannot tell whether the offending bit should be interpreted as a 0 or a 1 (which might be ok if you are only interested in error detection).  But if we want to be able to correct the error, we’ll have to be a little more clever.  As a very broad outline of our strategy, we are going to take each of the symbols we would like to transmit and encode them into a “big enough” space so that each symbol has a buffer zone around it to account for any errors in transmission.

In some sense, this is a familiar strategy: on touchscreen keyboards you do not have to hit the exact spot for a key to work, only be close enough.  In this case, the “symbols” I’m referring to are letters, and the “buffer zone” is the area of the screen you can touch so the computer recognizes that you are trying to touch this particular letter.

The trick here (and what is sort of confusing about this) is that the symbols we wish to transmit are bits, and the space that we will be embedding our symbols into will also be bits (rather than a shiny retina display!) As a hint of things to come, I’ve displayed below a representation of the space of 2-bit strings (which will not be big enough to detect 1-bit errors), and a representation of the space of 3-bit strings, which is of perfect size to detect 1-bit errors in a 1 bit message.

All 2-bit strings, with a line connecting them if the distance between them is 1, so you can get to one from the other by switching a single bit.

All 3-bit strings, connected whenever the distance between them is 1. Notice that the points (0,0,0) and (1,1,1) have disjoint balls of radius 1 around them. This will be the basis of the error correcting code, by agreeing that anything received in the red “ball” will be interpreted as the symbol (0,0,0), and anything received in the blue “ball” with be interpreted as (1,1,1). Then this can be turned into a 1 bit error correcting code.

Bouncing balls

Inspired, as usual, by Leonid’s recent post, I decided to first write a script that would mimic his.  After that, since I had all the numbers worked out, I wrote two more MATLAB programs: one that mimicked elastic collisions in 2-dimensions, and one that mimics them in 3.

In theory, you can specify the number of particles and their radius, as well as the mass, position, and initial velocity for each (I didn’t vectorize radius for some reason, so I cannot model balls of different sizes bouncing around).  However, in practice I just generate random vectors for each of these numbers.  The final aspect is that the domain I put the balls in was a pool table of 9 x 4.5 units, or 9 x 4.5 x 4.5 for the 3D version.  This was just to make calculating the reflecting angle easier when a ball hit the wall.

As with Leonid’s code, mine works by checking whether the next step will cause any collisions, then adjusting the velocity vector so that the collision didn’t happen (using conservation of momentum and kinetic energy).  This algorithm is not “smart” in the sense that by avoiding one collision, it might get pushed into a second collision which it does not detect, and if a particle gets going fast enough, it can reflect off a wall from a large distance (my time step is just 0.01).  You can spot this in some of the figures below.

Anyways, here are some of the outputs.  I did not go through the trouble of turning these into .gifs, but they play fairly smoothly.  What happens is I simulate N particles of varying masses and velocities bouncing around in a 2- or 3- dimensional box for T seconds, then plot the path of one of the particles.  The end position of all the particles, plus this path, is in each picture below (with the “tracked” particle colored in red).

4 particles for 20 seconds. Not many collisions... I can spot 3, I think.

40 particles bouncing for 20 seconds. This particle is involved in a few more collisions than last time (and looks to have been moving faster, too.

400 particles bouncing for 20 seconds. Quite a few collisions here.

4 particles bouncing for 20 seconds in 3 dimensions. Again, not many collisions (the ball radius is 0.1, compared to the volume of the box, 9x4.5x4.5 = 182).

Now 40 particles for 20 seconds in 3d. A few more collisions, and it looks like it was going pretty fast in the middle there for a while.

400 particles for 20seconds. Starting to look more like the "random walk" of Robert Brown's pollen, though I would certainly have to mess with how heavy the particles were to more accurately model that.

Another nice theorem

Trying to visualize the projection map using fibers. You'll have to take my word that the lines stop before getting to the origin.

Today’s Theorem of the Day (TM) I used to compute the Jacobian of a radial projection.  In particular, consider the map $F: \mathbb{R}^n \to \mathbb{R}^n$ where $x \mapsto x/|x|$ for all $|x| > 1$.  This projects all of n-space onto the surface of the unit ball, and leaves the interior untouched.  Then we may compute the derivative $\frac{\partial F_j}{\partial x_k} = \frac{\delta_{j,k}|x|^2 - x_k^2}{|x|^3}$.

To calculate the Jacobian of means we have to calculate the determinant of that matrix.  With a little figuring, we can write that last sentence as $|JF(x)| = \det \left(\frac{1}{|x|} ( I - \frac{x^Tx}{|x|^2} ) \right) = \frac{1}{|x|^n} \det \left(I-\frac{x^Tx}{|x|^2}\right)$.

Now we apply The Theorem, which Terry Tao quoted Percy Deift as calling (half-jokingly) “the most important identity in mathematics”, and wikipedia calls, less impressively, “Sylvester’s determinant formula“.  Its usefulness derives from turning the computation of a very large determinant into a much smaller determinant.  At the extreme, we apply the formula to vectors u and v, and it says that $\det (I+u^Tv) = 1+v^Tu$.  In our case, it yields $|JF(x)| = 0$.  Thus we turned the problem of calculating the determinant of an n x n matrix into calculating the determinant of a 1 1 matrix.

Pretty nifty.

More with fibers of functions

I posted earlier on a way of visualizing the fibers of certain maps from high dimensions to low dimensions.  Specifically, if the range can be embedded in the domain so that f is the identity of the image of the range, then we can draw the inverse image at each point.  I had some images of functions whose inverse image was a torus, but had trouble making these sorts of images for maps $f: \Omega \subset \mathbb{R}^3 \to \mathbb{R}^2$, so that the inverse image of a point is a line.  Well, no more!  Here are two images, one is the projection of a cube onto a square, and the other is somewhat more complicated, and is the string hyperboloid map.  See the previous post for more details on these specific maps, but I just thought these were nice images!

Fibers of the projection map from the cube to the square.

Fibers of the "twisted cylinder", which are again straight lines.

Slices followup

More from yesterday.  Of course I see the following .gif in the course of my daily reading, which blows all my animations out of the water:

This is a sequence of 2-D slices of the human body, collected from the Visible Human Project.  For those who don’t go to the wikipedia page, this data was collected in a much more… invasive manner than the MRI I described yesterday.  Specifically, a cadaver was “frozen in a gelatin and water mixture”, then they grinded away 1mm, take a picture, and repeat.    There is also a female data set with 0.33mm resolution.  Also, apparently this guy was a Texan murderer, which is a far less pleasant tie-in than, say, Tibor Rado being at Rice before solving Plateau’s problem, or Milgram being at Syracuse before proving his theorem.

There is also a fantastic viewer for this dataset hosted here.

In any case, this sort of image is a good introduction to the next logical place I was going to go with visualizing data.  That is, keeping track of five dimensions of data at once.  Yesterday’s images had three spatial dimension and one color dimension.  Today’s .gif has two spatial dimension, one color dimension, and one time dimension.

Long exposures and fire: another way to "see" another dimension.

This suggests (and I plan to provide examples of this) having three space dimensions, one color dimension, and one time dimension.  A great example would be watching a person’s temperature change over time, or how hot an oven is over time.  For today though, we have two somewhat boring examples, in that the color and time dimensions are equal to each other (i.e., I colored these by recording the height, and making that the intensity.

This first is an implementation of a variant of the heat equation with “random shocks” introduced, and Von-Neumann boundary conditions (which means: no energy escapes the system).  It is meant to model how head would travel, and the “random shocks” are small points of heat introduced into the system.

The second image is an implementation of a variant of the wave equation (in both these cases, when I say “variant”, I am allowing just a little bit of energy to leave the system so that there is not too much “heat” at the end of the simulation… things tended to just white-out without this).  The von Neumann boundary conditions are much more evident in this one, where you can see waves “bouncing” off the walls.

A little more imaging.

Now I want to talk about slightly more extravagant ways of viewing high dimensional maps.  ”Extravagant” does not mean “little used”, though.  I’ve included two pictures and one video of its use at varying-levels-of-importance.  Specifically, I want to think about visualizing maps $f: \mathbb{R}^3 \to \mathbb{R}$, which I have done before with the fibers of those maps.  In this case, we will use color or intensity to stand in for the fourth dimension that we are trying to display: remember, 3 for the domain, one for the range.  We will use the space dimensions for our domain and the color dimension for the range.

If the pieces of plywood making this bear had pictures of bear innards in it, it would be both way more interesting for this post, and way less reasonable to use as a bookshelf.

Examples!  One immediate problem with this plan is that if you assign a color to every point inside a three dimensional space, the space becomes opaque, and you cannot see it.  This is why with these plots, one often takes “slices” instead.  Literally in the case of this video, where a chef maps the heat in an oven by placing slices of toast in it (and I love experimental math).

For a more important example, MRIs can give an idea of the inside of a person’s body by mapping slices.

Slices of a person's head, not arranged spatially like some of these other plots.

For some more traditional math plots, I want to first imitate an oven by giving a plot in the spirit of those above, colored according to distance from the origin (more realistic oven imitations will come sometime in the future with discussions of the heat equation).  Specifically, we will plot $f(x,y,z) = \sqrt{x^2+y^2+z^2}$, and assign an intensity of color to each point.  Recall that if we wanted to visualize this plot instead with the fibers of the map, it would be a series of nested spheres, which you can see in this picture if you sort of squint.

Slices colored according to distance from the origin.

Finally for comparison, the map of the torus from my previous post, as well as this slice plot of the torus.  Recall the the map we are visualizing takes a circle in the x-y plane, and returns the distance of a point (x,y,z) from that circle.  Hence, the level sets are tori as can be seen in either of the figures below.  I guess which method of visualization you use depends on what you are trying to do with your data:

The picture of fibers of the map.

Slices of the torus map.

Higher resolution

So my normal workflow for creating images here is to sketch ideas, try to create them in MATLAB, and then annotate them in Skitch, if necessary (or sketch things directly into Skitch).  Both these programs are fantastic for typical image manipulation and creating diagrams, though Skitch is (necessarily) a little underpowered.  I have run into this problem once before, where Skitch will take screenshots off your computer (from, for example, a webpage, or a MATLAB figure window), but you cannot layer multiple screenshots into the same image without working a little too hard.  Also, yesterday’s post had some pretty terrible resolution images that I produced in Skitch by trying to just put a textbox in, and then save it.

The 1000th line. Click for full resolution!

The problem with both of these is that I crossed a line into more serious image editing- creating layers, or looking for very high resolution.  I finally worked out that open-source Gimp (and I would assume it’s commercial analogue, Photoshop), a program intended for this sort of image manipulation, is a better choice.  Hence I can offer today a high res version of the 1000th line (1.3MB) and 2000th line (3.1MB) of Pascal’s triangle.  The 1000th is 6pt font, and usually readable with compression.  The 2000th line is 5pt font (otherwise the canvas was too big… it is a 100MB+ file before compression), and only sometimes readable.

The 2000th line. Click for higher resolution!

Compare this to Wikipedia’s image that helped to inspire this experiment, which they display by coloring the pixels in a grayscale from 1 to 10, one pixel per digit.  This has the benefit of higher compression, since we do not need to keep track of the value of each pixel, and easier reading from a macro level.

From Wikipedia.

Finally, just to keep things a little original, here is a similar plot of the first 2000 terms of the Fibonacci sequence.  Again, the font is pretty small, but it gives a good idea of a famous relationship, which is that for large numbers, the nth Fibonacci number is close to $\phi^n/\sqrt{5}$, where $\phi = \frac{1+\sqrt{5}}{2}$ is the golden ratio.  This is an easy and beautiful derivation using eigenvalues in linear algebra.  But for now, just notice that the number of digits (that is, the log of the number) looks like it increases linearly.  This could lead us to try to prove that $\log{F_n} \approx k\cdot n$, where $F_n$ is the nth Fibonacci number, and k is just some number.  It turns out we would find that $\log{F_n} \approx \log{\phi}\cdot n-\log{\sqrt{5}}$ which, while sort of ugly, is a linear function in n.

The first 1000 Fibonacci numbers, written vertically. Click for full resolution.