# Some code.

Decided to publish the code from yesterday’s example.  WordPress only allows certain somewhat odd file types to be shared, so I am just copy/pasting below.  There are 5 functions, and as usual, you save each to the same folder in MATLAB, open that folder, and then typing

>>billiardgame3(25,10,15,20)

will generate a 3-dimensional billiard game with 25 balls with masses between 0 and 10 and speeds between 0 and 15, bouncing for 20 seconds, and then play it back.  It also returns the position vectors [X,Y,Z] from the simulation.  I’d be glad to also publish the 2-dimensional version, though it should not be too hard to wade through my code and cut it down to two dimensions.

function [v1n,v2n] = collide3(m1,m2,v1,v2)
%solves conservation of momentum and kinetic energy for two balls of mass
%m1 and m2 colliding head-on at velocities v1 and v2, and returns their new
%velocities.
C1 = (m1-m2)/(m1+m2);
C2 = (2*m2)/(m1+m2);
C3 = (2*m1)/(m1+m2);
v1n = C1*v1+C2*v2;
v2n = -C1*v2+C3*v1;
end
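As a quick sanity check on these formulas, here is the same update written in Python (just a sketch for testing, separate from the game code): momentum and kinetic energy both come out unchanged.

```python
def collide(m1, m2, v1, v2):
    """1-D elastic collision: same formulas as collide3 above."""
    c1 = (m1 - m2) / (m1 + m2)
    c2 = (2 * m2) / (m1 + m2)
    c3 = (2 * m1) / (m1 + m2)
    return c1 * v1 + c2 * v2, -c1 * v2 + c3 * v1

m1, m2, v1, v2 = 3.0, 5.0, 2.0, -1.0
w1, w2 = collide(m1, m2, v1, v2)

# both conserved quantities, before and after the collision
momentum = (m1 * v1 + m2 * v2, m1 * w1 + m2 * w2)
energy = (0.5 * m1 * v1**2 + 0.5 * m2 * v2**2,
          0.5 * m1 * w1**2 + 0.5 * m2 * w2**2)
```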

function [A,W] = checkcollisions3(X,Y,Z,w,l,h)
%header reconstructed: the original was lost in the copy/paste, so the
%function name here is assumed.
%takes vectors X, Y, and Z where each entry contains the x-, y-, or z-position of
%a billiard ball and returns an n x 2 matrix A containing the pairs of indices
%of balls that collided with each other, and a vector W containing the indices
%of balls that collided with walls in a w x l x h box
%check which balls have hit (or passed) one of the six walls
C = (X>=w) + (X<=0) + (Y>=l) + (Y<=0) + (Z>=h) + (Z<=0);
W = find(C(:) > 0); %indices of balls at a wall

X = ones(length(X),1)*X; %stack copies of the (row) position vectors so we can take pairwise differences
Y = ones(length(Y),1)*Y;
Z = ones(length(Z),1)*Z;

B = sqrt((X-X.').^2+(Y-Y.').^2+(Z-Z.').^2); %a matrix whose (i,j)th entry is the distance between particle i and particle j
A = [];
for j = 1:size(B,2)
for k = 1:j-1
if B(k,j) <= 1 %balls have diameter 1; an exact == test on floating point distances would almost never fire
A = [A;k,j];
end
end
end
end
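For anyone not reading MATLAB, the ball-to-ball half of the check does nothing more than this Python sketch (the function name is mine, and the threshold 1 is the hard-coded ball diameter from above):

```python
import math

def find_collisions(pos, diameter=1.0):
    """Return index pairs (i, j), i < j, of balls within one diameter of each other."""
    pairs = []
    for j in range(len(pos)):
        for i in range(j):
            if math.dist(pos[i], pos[j]) <= diameter:
                pairs.append((i, j))
    return pairs

balls = [(0.0, 0.0, 0.0), (0.5, 0.0, 0.0), (3.0, 3.0, 3.0)]
hits = find_collisions(balls)  # only the first two balls touch
```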

function [X,Y,Z] = billiards3(w,l,h,M,vx,vy,vz,X0,Y0,Z0,T,tstep)
%header reconstructed: the original was lost in the copy/paste, so the
%function name and argument order here are assumed.
%simulates billiards with elastic collisions in a w x l x h box.  M
%should be a vector recording the (positive) masses of the billiard balls
%(the function will create as many balls as the length of M).  vx, vy, vz
%and X0, Y0, Z0 will similarly be vectors giving the initial velocities and
%positions of each billiard ball.  The program runs for T seconds with time
%step tstep.  A reasonable setup (for the 2-dimensional version) is
%billiards(9,4.5,randi(10,1,9),3*rand(1,9),3*rand(1,9),.5+randi(8,1,9),.4+randi(4,1,9),5,.2,.01)

X = zeros(floor(T/tstep),length(M)); %initialize the three position arrays, one column per particle
Y = zeros(floor(T/tstep),length(M));
Z = zeros(floor(T/tstep),length(M));

X(1,:) = X0; %set initial position
Y(1,:) = Y0;
Z(1,:) = Z0;

for k = 2:floor(T/tstep)
tryxpos = X(k-1,:)+tstep*vx; %here we check if any collisions will happen in the next step
tryypos = Y(k-1,:)+tstep*vy;
tryzpos = Z(k-1,:)+tstep*vz;

[A,W] = checkcollisions3(tryxpos,tryypos,tryzpos,w,l,h); %this call to the collision checker was lost in the copy/paste; the name is assumed
for j = 1:size(A,1)
[vx(A(j,1)),vx(A(j,2))] = collide3(M(A(j,1)),M(A(j,2)),vx(A(j,1)),vx(A(j,2))); %avoiding collisions with particles
[vy(A(j,1)),vy(A(j,2))] = collide3(M(A(j,1)),M(A(j,2)),vy(A(j,1)),vy(A(j,2)));
[vz(A(j,1)),vz(A(j,2))] = collide3(M(A(j,1)),M(A(j,2)),vz(A(j,1)),vz(A(j,2)));
end
for j = 1:length(W)
if tryxpos(W(j)) >= w || tryxpos(W(j))<=0 %reflecting off walls; separate ifs so a ball in a corner reflects in every direction at once
vx(W(j)) = -vx(W(j));
end
if tryypos(W(j))>=l || tryypos(W(j))<=0
vy(W(j)) = -vy(W(j));
end
if tryzpos(W(j)) >=h || tryzpos(W(j))<=0
vz(W(j)) = -vz(W(j));
end
end
X(k,:) = X(k-1,:)+tstep*vx;%updating the position with the “fixed” velocity vectors
Y(k,:) = Y(k-1,:)+tstep*vy;
Z(k,:) = Z(k-1,:)+tstep*vz;
end
end

function billiardplayer3(X,Y,Z,ptime,w,l,h)
%a program for visualizing billiard movement in 3 dimensions.

figure(1)
axis equal
%maxwindow
for j = 2:size(X,1)
%color the first particle red
plot3(X(j,1),Y(j,1),Z(j,1),'ro','MarkerFaceColor','r')
hold on
%keep the rest of the particles blue circles
plot3(X(j,2:size(X,2)),Y(j,2:size(Y,2)),Z(j,2:size(Z,2)),'o')
hold off
axis([0 w 0 l 0 h])
grid on
pause(ptime)
end
hold on
%draw the path of the first particle
plot3(X(:,1),Y(:,1),Z(:,1),'k','LineSmoothing','on')
end

function [X,Y,Z] = billiardgame3(N,massmax,vmax,T)
%generates and plays a game of billiards with N randomly placed billiard
%balls in a box.  The balls have random masses between 0 and massmax,
%velocity components between -vmax and vmax, and play for T seconds, with timestep .01.
w = 9;
l = 4.5;
h = 4.5;

M = massmax*rand(1,N);

vx = 2*vmax*rand(1,N)-vmax;
vy = 2*vmax*rand(1,N)-vmax;
vz = 2*vmax*rand(1,N)-vmax;

X0 = w*rand(1,N);
Y0 = l*rand(1,N);
Z0 = h*rand(1,N);

tstep = 0.01;

[X,Y,Z] = billiards3(w,l,h,M,vx,vy,vz,X0,Y0,Z0,T,tstep); %this call to the simulator was lost in the copy/paste; the name and argument order are assumed

billiardplayer3(X,Y,Z,0.01,w,l,h)
end

# Bouncing balls

Inspired, as usual, by Leonid’s recent post, I decided to first write a script that would mimic his.  After that, since I had all the numbers worked out, I wrote two more MATLAB programs: one that mimics elastic collisions in 2 dimensions, and one that mimics them in 3.

In theory, you can specify the number of particles and their radius, as well as the mass, position, and initial velocity for each (I didn’t vectorize radius for some reason, so I cannot model balls of different sizes bouncing around).  However, in practice I just generate random vectors for each of these numbers.  The final aspect is that the domain I put the balls in was a pool table of 9 x 4.5 units, or 9 x 4.5 x 4.5 for the 3D version.  This was just to make calculating the reflecting angle easier when a ball hit the wall.

As with Leonid’s code, mine works by checking whether the next step will cause any collisions, then adjusting the velocity vectors so that the collision does not happen (using conservation of momentum and kinetic energy).  This algorithm is not “smart” in the sense that by avoiding one collision, it might get pushed into a second collision which it does not detect, and if a particle gets going fast enough, it can reflect off a wall from a large distance (my time step is just 0.01).  You can spot this in some of the figures below.

Anyways, here are some of the outputs.  I did not go through the trouble of turning these into .gifs, but they play fairly smoothly.  What happens is I simulate N particles of varying masses and velocities bouncing around in a 2- or 3- dimensional box for T seconds, then plot the path of one of the particles.  The end position of all the particles, plus this path, is in each picture below (with the “tracked” particle colored in red).

4 particles for 20 seconds. Not many collisions... I can spot 3, I think.

40 particles bouncing for 20 seconds. This particle is involved in a few more collisions than last time (and looks to have been moving faster, too).

400 particles bouncing for 20 seconds. Quite a few collisions here.

4 particles bouncing for 20 seconds in 3 dimensions. Again, not many collisions (the ball radius is 0.1, compared to the volume of the box, 9 x 4.5 x 4.5 ≈ 182).

Now 40 particles for 20 seconds in 3d. A few more collisions, and it looks like it was going pretty fast in the middle there for a while.

400 particles for 20 seconds. Starting to look more like the "random walk" of Robert Brown's pollen, though I would certainly have to mess with how heavy the particles were to more accurately model that.

# Large data sets

1200 points on the sphere, each connected to its 7 nearest neighbors.

I was thinking about the topic that Rice’s VIGRE program has focused on in the past, namely how to visualize large data sets.  Now, discrete data is not the greatest thing for an analyst, as opposed to, say, a manifold.  Along this line of thinking, one might ask:

If you know that a data set comes from some manifold, is there a way to detect which manifold?

Towards answering this, I wrote a quick program in MATLAB that:
1. Draws random points from a parametrized surface (so far just a sphere and a torus, but it would be easy to use any parametrized surface),
2. Draws a line connecting each point to its n nearest neighbors.
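The two steps can be sketched in plain Python (a toy reconstruction with made-up helper names, not the MATLAB program that produced these figures):

```python
import math
import random

def sphere_points(N, seed=0):
    """Draw N roughly uniform points on the unit sphere by normalizing Gaussian vectors."""
    rng = random.Random(seed)
    pts = []
    for _ in range(N):
        v = [rng.gauss(0, 1) for _ in range(3)]
        r = math.sqrt(sum(x * x for x in v))
        pts.append(tuple(x / r for x in v))
    return pts

def nearest_edges(pts, n):
    """Connect each point to its n nearest neighbors, deduplicating shared edges."""
    edges = set()
    for i, p in enumerate(pts):
        dists = sorted((math.dist(p, q), j) for j, q in enumerate(pts) if j != i)
        for _, j in dists[:n]:
            edges.add((min(i, j), max(i, j)))
    return edges

pts = sphere_points(100)
edges = nearest_edges(pts, 3)
```

Since “is one of the n nearest” is not symmetric, the deduplicated edge count lands somewhere between N·n/2 and N·n.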

The data without a "ghosted" surface. You can still tell the data set has a nonzero genus (i.e., a "hole"), but it is somewhat hard to see depth without rotating the picture (so I've included the surface in all the others). This is 400 points on a torus, each connected to its 3 nearest neighbors.

The program also “ghosts” in the parametrized surface to see how well this method does at illustrating the surface.

At this stage, the program is just something to play around with a bit, but I think the images you get from it are really great!   One of the problems it still has is that not every vertex has n edges leaving from it (since the relation “is one of the n nearest vertices” is not symmetric, which is easy to see with a sketch).  This also slows down computation since really I am drawing 2*N*n lines, even though only about N*n of them are displayed.  Scroll down to see all the images.

The sphere with N = 400, n = 3.

The torus with N = 400, n = 3.

N = 400, n = 7.

N = 1200, n = 3.

N = 1200, n = 3.

N = 2000, n = 7.

# Late to the recursion game.

Ok, confession time: every once in a while, you find out something that apparently everyone except for you knows about (I’m looking at you, pronunciation-of-rendezvous).  This time, I totally missed out on recursion in the course of teaching myself how to program.  For those in the same boat as me, this means a function can call itself from within its own definition.  So long as you specify a base case (and make sure you reach this base case), you can get a reasonable answer.

In the course of rectifying this, I’ve written two toy programs.  The first is very un-mathy, but essentially solves the game “TextTwist”.  First I use recursion to find all possible substrings of a collection of letters, then I import an English dictionary (thank you, Python!), and check these substrings against the dictionary to come up with a list of English-language words.  See below for the print-out.

The dictionary I am using is a little bit generous with what it allows as a word, but I'd rather err on the side of being thorough in this case.
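The program itself lived in that print-out, so here is a minimal sketch of the recursive half in Python (the tiny word set stands in for the real dictionary): the base case is the empty string, and the recursive step picks each unused letter in turn.

```python
def arrangements(letters):
    """All strings formed by using each of the given letters at most once, in any order."""
    if not letters:
        return {""}          # base case: the empty word
    out = set()
    for i, ch in enumerate(letters):
        rest = letters[:i] + letters[i + 1:]
        for tail in arrangements(rest):
            out.add(ch + tail)   # use this letter...
            out.add(tail)        # ...or skip it entirely
    return out

dictionary = {"cat", "act", "at", "a", "tack"}   # stand-in for the imported dictionary
found = sorted(w for w in arrangements("tca") if w in dictionary)
```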

The second program I wrote was in MATLAB, and the code is a little cleaner, so I can in fact post it here.  It uses recursion to again generate the Sierpinski gasket from the flurry of fractal posts I had earlier, though this is a deterministic approach.  What it does is take a column of points determining the vertices of a triangle (or any other shape), then, for each point, return a column determining the points of a triangle whose vertices are halfway to the previous vertex and halfway to the next vertex.  The code may be easier to read, noticing especially that I call gasket on the third to last line:

function [A,B] = gasket(X,Y,N)
%returns the coordinates of the vertices for a Sierpinski gasket with N
%iterations
m = size(X,1)-1;
A = [];
B = [];
for k = 1:m-1
A = [A,[X(cmod(k,m),:);(X(cmod(k+1,m),:)+X(cmod(k,m),:))/2;(X(cmod(k-1,m),:)+X(cmod(k,m),:))/2;X(cmod(k,m),:)]];
B = [B,[Y(cmod(k,m),:);(Y(cmod(k+1,m),:)+Y(cmod(k,m),:))/2;(Y(cmod(k-1,m),:)+Y(cmod(k,m),:))/2;Y(cmod(k,m),:)]];
end
if N~=1
[A,B] = gasket(A,B,N-1);
end
end

The function “cmod” just changes the “mod” function so that “cmod(m,m) = m” instead of the usual behavior where “mod(m,m) = 0”.  Running this code with N = 8, we get the following image:
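(In Python terms, assuming I have read its intent correctly, cmod is just a shifted mod:)

```python
def cmod(k, m):
    """1-based modulus: maps any integer k into 1..m, with cmod(m, m) == m."""
    return (k - 1) % m + 1
```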

Noticeably cleaner than at least some of the randomly generated fractals from earlier, and at the very least faster.  After 8 iterations, there are 3^8 = 6561 triangles, and for each we store 4 pairs of numbers to generate a triangle (I am calling the “line” command, which means I need to specify that the start and end points are the same), so the total number of numbers I store is 52,488, compared to 100,000 for this much lower resolution version from earlier.  Also, I can start with any set of points I want with this program, so here is a belated Valentine for everyone:

It is somewhat hard to find interesting, mostly convex parametric curves, ok?

# Floating point, continued

As a followup to yesterday’s post on a floating point problem I encountered using the arctangent, I want to discuss how floating point numbers are stored in computers.  Apologies in advance for my misrepresenting a lot of this (and corrections are, as always, welcome), but it is helpful for me to work it out.  If you want to just get straight to experimenting, I have written a Python script, to be posted Wednesday, that converts decimals to bits, and for mac users, a quick guide to getting started with Python will appear tomorrow.

Each of these bits is either 0 or 1, and goes towards storing a "double". From wikipedia.

We represent each (double precision) number x as (to use the lingo)

$x \approx sign \cdot M \cdot 2^{E-1075}$,

where the mantissa M and the exponent E are each nonnegative integers (the odd-looking 1075 is the usual exponent bias of 1023, plus 52 to account for reading the mantissa as an integer rather than as a binary fraction).  We will deal with doubles, which means we get 64 bits, each either 0 or 1, to work with (see the nice figure above).  Then

1. The first bit (#63 above) records the sign of the number: 0 for positive, 1 for negative.
2. Bits 62-52 (11 bits) record the exponent E, the “size” of the number.
3. Bits 51-0 (52 bits) record the “mantissa” M of the number itself.  We always choose E so that M is a 53-bit integer (between $2^{52}$ and $2^{53}-1$), which means its leading digit is always 1, so we actually squeeze out 53 bits of information by leaving out that leading 1.

Also notice that with 64 bits, we can hope to distinguish at most $2^{64}$ unique numbers.

Some nice hills in Montana.

As an example, we might store the number 1 as follows:
1. The sign is positive, so the first bit is 0.
2. E should be chosen so that the mantissa lies between $2^{52}$ and $2^{53}-1$.  Some inspection shows that E = 1023 is the only choice, so the scale factor is $2^{1023-1075} = 2^{-52}$.  Note that in binary, 1023 = 512+256+128+64+32+16+8+4+2+1 = 01111111111.
3. M should then be $2^{52}$, stored in binary, with the leading 1 omitted.  This is just a string of 52 zeros.
4. So the 64 bits to store 1 are 0 01111111111 0000000000000000000000000000000000000000000000000000.

Now the problem: we cannot store most decimals exactly.  Let’s write the number 1.1.  Notice that if you type 1.1 into a Python prompt and hit return, 1.1000000000000001 is displayed, which is already a problem.  But anyways:

1. The sign is positive, so the first bit is 0.
2. Again we choose E so the mantissa is in the correct range.  Again E will be 1023.
3. Now we want to choose M so that $M\cdot 2^{-52}$ is as close to 1.1 as possible, so we want to round the number $\frac{11}{10}\cdot 2^{52}$ to the nearest integer, and then encode that integer as a binary number.  This is easy to do with the Python command
 v,r = divmod(11*(2**52),10),
noting that the remainder is greater than 5 (namely, 6), so we should round up v = 4953959590107545 to v+1. Then
bin(v+1) outputs v+1 as a binary number.  In particular, we find that
1.1 “=” 0 01111111111 0001100110011001100110011001100110011001100110011010

This is awfully close, but not exactly it.  As a last two points:
1. The relative error of the approximation is at most $2^{-53}$.  So the approximation to 1.1 was this close, whereas 4.1 might be wrong by $2^{-51}$, since its leading digit is about 4 times as big.
2. We can represent all integers between 0 and $2^{53}$ precisely.  After that, there is some error.
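If you want to check any of this yourself, Python’s standard struct module will hand you the 64 bits of a double directly (a quick check, separate from the script mentioned above):

```python
import struct

def double_bits(x):
    """The 64 bits of a double as a '0'/'1' string, sign bit first."""
    (as_int,) = struct.unpack(">Q", struct.pack(">d", x))
    return format(as_int, "064b")

bits = double_bits(1.1)
sign, exponent, fraction = bits[0], bits[1:12], bits[12:]
```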

# An identity and floating point arithmetic

A geometric interpretation of arctan.

The other day I was looking at a string of calculations which were supposed to come out to zero.  Instead, the conclusion of these calculations was written as:

$\arctan{5} + \arctan{1/5}-\arctan{3}-\arctan{1/3}.$

Realizing that not only must this be wrong, but severely wrong, I plugged it into a few calculators just to make sure.  The results:
1. Via wolframalpha.com: -5.551115123125783×10^-17 is the answer given while typing it in, though after hitting “enter”, we get 0, but we also get a whole bunch of series expansions and continued fractions, which is odd.
2. Via MATLAB: -5.551115123125783e-17
3. Via Python: -5.5511151231257827e-17, which is just one extra digit compared to wolfram and MATLAB’s output.

First, the right answer is in fact 0.  Remember that arctan(x) takes the ratio of the opposite and adjacent sides of a right triangle (remember, side-angle-side completely determines a triangle), and returns the angle of the triangle, in radians.  It follows that since a triangle has pi radians total, and the right angle uses up pi/2 of those radians, we have the following identity for all positive x (for negative x, the sum is -pi/2):

$\arctan{x} + \arctan{1/x} = \pi/2.$

It follows, then, that

$\arctan{5} + \arctan{1/5}-\arctan{3}-\arctan{1/3} = 0,$

though this is one of the most… interesting ways I have ever seen the number 0 written.
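A quick Python check of the identity, and of the tiny residue the calculators reported:

```python
import math

def arctan_pair(x):
    """arctan(x) + arctan(1/x), which should be exactly pi/2 for positive x."""
    return math.atan(x) + math.atan(1 / x)

# exactly 0 in exact arithmetic; a hair off in floating point
residue = arctan_pair(5) - arctan_pair(3)
```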

Tomorrow: more on why the calculators were wrong, and a rough discussion of exactly how computers calculate with floating point numbers.

# Prime matrices II

Someday soon I'll have a post where figures will be useful. Until then, more New England.

Yesterday, I asked for the “smallest” 3×3 singular matrix, each of whose entries is a distinct prime.  By “smallest”, I mean minimizing the sum of all the entries in the matrix.

It turns out that there is an optimal matrix, by which I mean there is a way to arrange the smallest 9 primes so that the matrix is singular, and the sum is 100.  One such way (remember, we can exchange rows, exchange columns, or take transposes and still have a solution) is this:

$M=\begin{bmatrix}3 & 2 & 5\\ 11 & 13 & 7\\ 19 & 17 & 23\end{bmatrix}$

As to how I found this solution, I used the following observation for such a solution M: the columns of M are linearly dependent over the integers.  Thus I wrote a short program in MATLAB that looked through the first primes, took them two at a time, say p and q, and then checked whether ap+bq was a prime, for parameters a and b that I could change.

Using a=1, b=-2, I found the optimal matrix is

$M=\begin{bmatrix}11 & 2 & 7\\ 23 & 3 & 17\\ 31 & 13 & 5\end{bmatrix},$

So much New England.

whose sum is 112, and I thought this was pretty good, since it only missed 19 and 29.  The next choice of a = 3, b = -2 turned out to be the winner above (notice, as expected, that 3 times the first column minus twice the second is the third).  I was also surprised to find that choosing a = 5, b = -6 did very well, with a matrix total of 106 (this is, by necessity, the second best answer, since it only swaps out a 23 for a 29).

Since yesterday, I was able to use the same sort of ad-hoc method to find a similarly optimal 4×4 “prime” matrix (again, “optimal” means I use only the first 16 primes):

$M=\begin{bmatrix}7 & 3 & 2 & 13\\ 23 & 5 & 11 & 29\\ 31 & 19 & 17 & 47\\ 43 & 41 & 37 & 53\end{bmatrix},$

which is enough for me to ask the following question, verified only for the first two cases:

For n > 2, is there a singular n x n matrix whose entries are the first n^2 prime numbers?
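Both matrices above can be verified with exact integer arithmetic; here is a small Python sketch using cofactor expansion (so no floating point is involved). Both determinants come out to exactly 0, and the entry sums are 100 and 381 (the sum of the first 16 primes).

```python
def det(M):
    """Determinant by cofactor expansion along the first row; exact on integer entries."""
    if len(M) == 1:
        return M[0][0]
    total = 0
    for j in range(len(M)):
        minor = [row[:j] + row[j + 1:] for row in M[1:]]
        total += (-1) ** j * M[0][j] * det(minor)
    return total

M3 = [[3, 2, 5], [11, 13, 7], [19, 17, 23]]
M4 = [[7, 3, 2, 13], [23, 5, 11, 29], [31, 19, 17, 47], [43, 41, 37, 53]]
```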

Too much New England! Creepy woods women!

# A little Python

Two filler photos for today. Neither of them contain pythons.

I had thought of starting today’s post with a picture of a snake, but I think yesterday featured enough unpleasantness.  We had two colloquia today, including a great talk by Luis Caffarelli on diffusion equations which had some great insights into regularity for their solutions.  But I’d like to talk about the second one, which was on the Frobenius coin problem: what is the largest amount that coins of given denominations cannot make?  The idea is that if you lived in a (rather silly) country with only 3 and 5 cent pieces, you would be at a loss to give someone 7 cents.

But this is not the problem I want to talk about.  I want to think about countries that are moderately less silly.  They will always have coins to make any amount of change (this is equivalent to always having 1 cent pieces).  But the observation I seek to generalize is the following:

With [common] US coins, it is possible to have $1.19 in change, but not be able to make change for a dollar (i.e., 3 quarters, 4 dimes, and 4 pennies).

Houston, looking positively bucolic.

I want to imagine a country where a dollar is worth d pennies, and to investigate the worst choice of denominations for coins, so that a person can have a lot of money in coins, but not make change for a dollar.  I also impose the following restrictions:
1. The country has at most 4 denominations of coin.
2. Each denomination of coin divides the value of a dollar.

I wrote a short Python script to solve this.  It turns out that as long as you have 4-cent and 25-cent pieces, you can hold $1.71 without being able to make change for a dollar.
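The script itself is not posted, but a brute-force reconstruction of the US-coin case (my own sketch, not the original code) looks like this: for every possible holding, check with a subset-sum bitmask whether some sub-collection makes exactly one dollar.

```python
from itertools import product

def reachable_sums(coins, cap):
    """Bitmask with bit s set iff some sub-collection of `coins` sums to s (for s <= cap)."""
    full = (1 << (cap + 1)) - 1
    mask = 1  # the empty sum
    for v in coins:
        mask = (mask | (mask << v)) & full
    return mask

def worst_holdings(dollar=100, denoms=(1, 5, 10, 25)):
    """Most money you can hold in coins while unable to make change for exactly `dollar`.

    Each denomination divides the dollar, so a valid holding has at most
    dollar//v - 1 coins of value v (any more and some of them sum to a dollar).
    """
    best = 0
    for counts in product(*(range(dollar // v) for v in denoms)):
        total = sum(c * v for c, v in zip(counts, denoms))
        if total <= best:
            continue
        coins = [v for c, v in zip(counts, denoms) for _ in range(c)]
        if not (reachable_sums(coins, dollar) >> dollar) & 1:
            best = total
    return best
```

For pennies, nickels, dimes, and quarters this returns 119 cents, the classic $1.19 answer.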

The first time you get over $5 in coins is when 1 dollar = 210 cents: if you have pennies and 30-, 35-, and 42-cent pieces, you can have $5.23 in coins without being able to break a dollar.  Similarly, if 1 dollar = 396 cents, you can have $10.09 in coins, if the coins have values 1, 36, 44, and 99.

I have put a plot below.  It reminds me vaguely of the plot of Euler’s totient function, and certainly a thorough description of a solution to this should have to do with how many factors the denominations of coins have, but I certainly do not have a rigorous explanation for why these two plots look similar.  The points along the bottom correspond to prime numbers, where condition 2 requires that we only have pennies.

A plot for value of a dollar, and the most change you can have without being able to break a dollar (assuming a really bad choice of denominations of coins).

Euler's phi function, for comparison.

# Slices followup

More from yesterday.  Of course I see the following .gif in the course of my daily reading, which blows all my animations out of the water:

This is a sequence of 2-D slices of the human body, collected from the Visible Human Project.  For those who don’t go to the wikipedia page, this data was collected in a much more… invasive manner than the MRI I described yesterday.  Specifically, a cadaver was “frozen in a gelatin and water mixture”, then they ground away 1mm, took a picture, and repeated.  There is also a female data set with 0.33mm resolution.  Also, apparently this guy was a Texan murderer, which is a far less pleasant tie-in than, say, Tibor Rado being at Rice before solving Plateau’s problem, or Milgram being at Syracuse before proving his theorem.

There is also a fantastic viewer for this dataset hosted here.

In any case, this sort of image is a good introduction to the next logical place I was going to go with visualizing data.  That is, keeping track of five dimensions of data at once.  Yesterday’s images had three spatial dimensions and one color dimension.  Today’s .gif has two spatial dimensions, one color dimension, and one time dimension.

Long exposures and fire: another way to "see" another dimension.

This suggests (and I plan to provide examples of this) having three space dimensions, one color dimension, and one time dimension.  A great example would be watching a person’s temperature change over time, or how hot an oven is over time.  For today though, we have two somewhat boring examples, in that the color and time dimensions are equal to each other (i.e., I colored these by recording the height, and making that the intensity).

This first is an implementation of a variant of the heat equation with “random shocks” introduced, and Neumann boundary conditions (which means: no energy escapes the system).  It is meant to model how heat would travel, and the “random shocks” are small points of heat introduced into the system.

The second image is an implementation of a variant of the wave equation (in both these cases, when I say “variant”, I am allowing just a little bit of energy to leave the system so that there is not too much “heat” at the end of the simulation… things tended to just white-out without this).  The Neumann boundary conditions are much more evident in this one, where you can see waves “bouncing” off the walls.
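As a tiny 1-D illustration of those no-flux boundary conditions (a sketch, not the code that made these images): with an explicit finite-difference step and mirrored ghost cells at the walls, the total heat in the system never changes.

```python
def heat_step(u, r=0.25):
    """One explicit step of the heat equation u_t = u_xx on a rod.

    The ghost cells beyond each end mirror the edge values (no-flux
    boundaries), so heat is conserved; r <= 0.5 keeps the scheme stable.
    """
    n = len(u)
    new = []
    for i in range(n):
        left = u[i - 1] if i > 0 else u[i]        # mirrored ghost cell
        right = u[i + 1] if i < n - 1 else u[i]   # mirrored ghost cell
        new.append(u[i] + r * (left - 2 * u[i] + right))
    return new

u = [0.0] * 20
u[10] = 1.0  # a single "random shock" of heat in the middle
for _ in range(50):
    u = heat_step(u)
```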

# A little more imaging.

Now I want to talk about slightly more extravagant ways of viewing high dimensional maps.  “Extravagant” does not mean “little used”, though.  I’ve included two pictures and one video of its use, at varying levels of importance.  Specifically, I want to think about visualizing maps $f: \mathbb{R}^3 \to \mathbb{R}$, which I have done before with the fibers of those maps.  In this case, we will use color or intensity to stand in for the fourth dimension that we are trying to display: remember, three for the domain, one for the range.  We will use the space dimensions for the domain and the color dimension for the range.

If the pieces of plywood making this bear had pictures of bear innards in it, it would be both way more interesting for this post, and way less reasonable to use as a bookshelf.

Examples!  One immediate problem with this plan is that if you assign a color to every point inside a three dimensional space, the space becomes opaque, and you cannot see it.  This is why with these plots, one often takes “slices” instead.  Literally in the case of this video, where a chef maps the heat in an oven by placing slices of toast in it (and I love experimental math).

For a more important example, MRIs can give an idea of the inside of a person’s body by mapping slices.

Slices of a person's head, not arranged spatially like some of these other plots.

For some more traditional math plots, I want to first imitate an oven by giving a plot in the spirit of those above, colored according to distance from the origin (more realistic oven imitations will come sometime in the future with discussions of the heat equation).  Specifically, we will plot $f(x,y,z) = \sqrt{x^2+y^2+z^2}$, and assign an intensity of color to each point.  Recall that if we wanted to visualize this plot instead with the fibers of the map, it would be a series of nested spheres, which you can see in this picture if you sort of squint.

Slices colored according to distance from the origin.

Finally, for comparison, here is the map of the torus from my previous post, as well as a slice plot of the torus.  Recall that the map we are visualizing takes a circle in the x-y plane, and returns the distance of a point (x,y,z) from that circle.  Hence, the level sets are tori, as can be seen in either of the figures below.  I guess which method of visualization you use depends on what you are trying to do with your data:

The picture of fibers of the map.

Slices of the torus map.