In 2014 I took a class in C++ programming (that was required for my course). I was expecting to do boring assignments and to be forced to write terrible code using whatever constructs the instructor likes. I was wrong. The assignments were "self-assessed" which is one of the best ideas I've ever come across in education. The idea is simple: Rather than setting a task that all students have to complete, the objective is to set yourself a task of appropriate difficulty that demonstrates you understood the course. So I decided to hand in the most awesome programs I could think of. In C, naturally. Of course they all had to be completed in a reasonable amount of time which was the most limiting constraint. Because people have expressed interest in seeing my work, I've uploaded them below. You can also read the original description that I included. Text in italics was not originally included.

Session 1: Arbitrary precision arithmetic and pi
This program implements arbitrary precision (aka bignum) arithmetic and implements a subset of the dc language in order to run a program that calculates pi digit by digit using a spigot algorithm. It is in principle capable of calculating an infinite number of digits -- it will however slow down very quickly and eventually run out of memory. (And the bignum code may have subtle bugs I'm not aware of that kick in with too large numbers...) The pi program was written on another occasion so the main challenge was implementing bignum arithmetic. Addition, subtraction and, to some extent, multiplication were relatively straightforward, however finding and implementing a suitable algorithm for division took some time. Multiplication and division use very crude algorithms with lots of potential for optimization.


Session 2: Bitvector Prime Number Sieve
The program calculates the first N primes using the classic sieve algorithm. It uses 1 bit for each prime (with 0 = prime) and is so very efficient in memory use. In addition it tries to use parallel bitwise operations as much as possible in order to operate in parallel on 32 numbers at once. This is most relevant for the small factors (<= 31), which are done in the first for() loop by first preparing a bitmask and then rotating that bitmask while iterating over the array. The second for() loop just iterates over all the "0" bits in the bitvector, printing them and also marking any multiples. Writing this program was really just practice with bit operations.

It's also an attempt at obfuscated programming and to show off bizarre things that are actually legal in C.


Session 3: Monte Carlo Integration using SSE
This program takes a comparison (<,>,<=,>=) between arithmetic expression (using +,-,*,/) and compiles it to SSE2 code, then runs it a specified number of times and records how often the comparison succeeded. x and y are available as random variables between 0 and 1. It uses a full grammar for the expressions and knows about precedence, parentheses and so on.

It runs four passes: First it parses the expression and generates a tree, using recursive-descent parsing. Second it simplifies the tree by folding constants (e.g. replace 1+1 by 2). Third it compiles to an immediate representation (basically binary assembly language), allocating registers in a straightforward, but reasonably effective way. Fourth it assembles to binary machine language, ready to be executed.

The assembler could relatively easily be extended to produce 64-bit code. Currently the counters are signed 32 bits, meaning they overflow very quickly (considering the code does over 100 million iterations/sec on my laptop!). It is easy to run out of registers or "constant space". The code probably has many bugs.

The random number generator is the "Even Quicker Generator" from "Numerical Recipes in C", 2nd edition by Press et al. It is very crude and was chosen for its simplicity.


Session 4: Atari 2600 emulator
The included program is an emulator for the Atari 2600, arguably the first successful home video game console. It emulates the CPU, graphics and sound chip of the system with (ideally) cycle for cycle accuracy. The game SPACE INVADERS is included for demonstration and is fully functional. This project was a lot of work but it was very fun to work on so I don't mind. I had originally hoped to reuse my code for the 6502 CPU from a previous project, however it proved not accurate enough and had to be significantly improved in that regard. Emulating the Atari 2600 graphics is tricky because the system is extremely sensitive to CPU timing, with the CPU code essentially running in lockstep with the (on a real system) analog video output tracing the screen. It uses SDL to interface with the operating system.

SPACE INVADERS is (C) 1980 Atari, Inc. Everything else is my own work


The last two sessions were required to be more about physics and required us to submit a more detailed explanation.

Session 5: Simulating the Solar System and Interplanetary Travel
The idea behind this program was to explore the physics behind interplanetary travel, in particular the Voyager missions to the outer planets. The first part was to develop a model of the solar system, integrating the equation of motion using the Runge-Kutta algorithm. (I originally planned to calculate the orbital precession of mercury and so implemented Runge-Kutta, but even that proved too inaccurate with sensible timesteps) One somewhat tricky part is finding the initial conditions, but using the data for the Keplerian Elements from Solar System Dynamics Group at NASA's JPL and the corresponding equations it is not too difficult.

The next part was implementing the actual space ship. For simplicity, and unlike the actual Voyager ship, the space ship is imagined to be a projectile, launched at a certain initial velocity from Earth, without any thrusters or the like to change velocity during flight. One might think that the space ship would simply follow a simple Keplerian orbit (i.e. ellipse, hyperbola, etc.), however the space ship is also under the influence of the planets' gravity. The obvious question is whether one can take advantage of this for interplanetary travel and indeed this is the case.

The classic maneuver that does this is called a gravitational slingshot or gravity assist maneuver. The included program demonstrates this maneuver, flying by Jupiter to gain speed and pass Saturn this way. To understand the idea behind this maneuver as a simple first order approximation we can ignore the effect of the Sun and move to the reference frame of Jupiter. In this frame we essentially do a parabolic/hyperbolic trajectory, coming from one direction and essentially leaving in the opposite direction (more or less). This may seem rather pointless, as the speed is unchanged before and after the maneuver. But it is important to remember that we are [interested] in the motion of our space craft relative to the Sun, not Jupiter. In the sun's reference frame, by appropriately aligning our orbits, that we start with a slow speed (|vJ-vS|, vJ: velocity of jupiter, vS: velocity of ship) and then gain speed as we fly by Jupiter (|vJ+vS|). This is similar to bouncing a projectile off a moving target. The energy gained in this maneuver is extracted from the planet's orbital motion. Minimizing the amount of fuel is important for space travel as any fuel requires extra fuel to bring the fuel into space, the amount of fuel depends exponentially on the Δv needed.

This maneuver sounds relatively straightforward on paper, but actually trying it on simulation shows how difficult it is to execute in practice. It requires rather precise initial conditions which took me quite a while to work out. (Incidentally, it is important not to forget the 3rd dimension, even though the orbits of Jupiter and Earth are only inclined 1.5 deg C to each other) In hindsight it may have been a good idea to work this out analytically rather than fudging numbers until it worked. (I did add code that tracks the minimum separation between Jupiter and the space ship which is not included in the uploaded version) There is four degrees of freedom (time, x-y-z velocity), some values are rather surprising in hindsight (for instance, to meet Jupiter at a positive +z value it is necessary to use an initial negative z-component for the velocity).

An interesting (not implemented) extension of this maneuver is the powered flyby or Oberth maneuver. By firing the thrusters we gain an amount of speed Δv, which turns out to be a constant for the amount of fuel, type of engine etc. The total amount of kinetic energy gained however is m((v+Δv)²-v²)/2 ~= m Δv v, i.e. linearly proportional to the current speed. We thus want to fire our thrusters at a time when we are going very fast. In a flyby maneuver we are converting potential to kinetic energy so we do have such a point: the point of closest approach.


Session 6: Molecular Dynamics and a simple Phase Transition
The objective behind this program was investigate molecular dynamics and in particular the gas-solid phase transition in a simple system. The model system was simply N classical particles that interact with each other according to the Lennard-Jones potential, i.e. the force has the form (α r^-6 - β r^-12)*(x,y) where α,β are constants. The two terms in the potential represent van-der-Waals attraction and Pauli repulsion (i.e. overlap between filled orbitals), respectively. There is also a constant gravitational force on each particle (discussed below). To simulate the system the equations of motion according to Newtonian dynamics are integrated using an adaptive Runge-Kutta-Fehlberg method. For a given stepsize the derivative is evaluated at six points. One linear combination of derivatives then gives a result that is correct to fifth order and another one correct to fourth order. By comparing the two we can determine how accurate the step at this stepsize was and decrease or increase the stepsize as necessary in order to maintain constant relative error eps. (For further details about the method see Press et al., Numerical Recipes in C, 2nd ed., chapter 16)

The walls are simulated in a relatively ad-hoc fashion by evaluating a collision with a wall using the one-dimensional elastic collision equations, with a second particle of mass 1/MU (compile-time parameter) times the mass of a molecule and velocity |T*u| perpendicular to the wall, where T is the temperature (a parameter in an arbitrary unit that can be adjusted at runtime) and u is a normal deviate. This mechanism seems to adjust the temperature reasonably well, by pressing 'x' the mean molecular speed can be printed and it's found to be proportional to temperature (roughly times a factor of sqrt(2) or so).

By adjusting the temperature behaviour (which can be done at runtime using the '1'/'2' keys, the parameter is displayed in the window title), a phase transition can clearly be observed around temperature 0.02-0.04. Below this temperature almost all the molecules stick together to form a hexagonal lattice, whereas above they spread out to fill all space. (The sole purpose of the gravity term mentioned above is to make the transition easier to observe with a small number of particles, which significantly reduces computational effort and allows real-time observation) Of course, even in the solid state, there is still considerable movement at finite temperature as the lattice vibrates; in fact lattice vibrations (phonons) can be seen rather nicely in the horizontal direction. Another observation is that the top layer is relatively volatile, with a relatively large amount of motion, possibly corresponding to the real-life phenomena of evaporation and surface melting. There also seems to be intermediate range of temperatures where they are clustered together near the top, but not exactly in any type of lattice; this could represent a liquid phase or simply a very dense gas that's strongly affected by gravity. One important point with this simulation is that the very small number of molecules and the relatively large value of gravity means that predictions from it have to be taken with a grain of salt and it is always not obvious what real-life behaviour, if any, observations from the simulation correspond to. It would be nice to have some quantitative evidence of the apparent phase transition but that is also made difficult by the small number of particles.

For display it simply plots the current positions on a bit-mapped screen. By also storing the current frame number in a second array each frame can erase the pixels set 50 frames earlier to give a very nice "snake" effect, somewhat similar to a trace in a cloud chamber An improved version of this program might offer an option to output some kind of animation, to relax the requirement of real-time computation.