This is basically a dumping ground for various explanatory notes or code relating to talks, papers, or conversations I've had about technical problems in the past.
Code is typically compilable using gcc (or now clang) under OS X or Linux, as these have been my research dev environments over the last decade. You should also be able to drop the files into a project for the Visual Studio du jour and run them there.
You're welcome to use any of this info or source in any way you wish. If it ends up in a product, a mention would be appreciated, but isn't required.
This is a real oldie, and not really relevant to modern graphics, but it's come up a few times in chats with other devs.
Back in the SC4/TS2 era, GPUs weren't powerful enough to do proper depth-tested shadows that handled self-shadowing. Projective (cookie-cutter) shadows were it, and even those were pricey -- render targets were pretty new tech, and high-end cards had 32 MB of VRAM at most. (There was a horrible ATI card at the time that essentially was 2 x 16 MB cards strapped together, which would fail to allocate some surfaces it should have been able to handle due to memory fragmentation.)
Most of our games had some form of height map-based terrain, and we found getting some kind of shadowing from the terrain was crucial to the look we were after. But terrain shadows are all about self-shadowing, by what was (at the time) easily the most expensive mesh in the game to draw. The various hacks around to get some sort of object-object shadowing, such as using object indices, obviously wouldn't cut it.
So instead we generated terrain shadows on the CPU, using a row scan approach. Rather than generating a map of shadow depths from the point of view of the sun, as is the usual practice, the shadow depth samples were aligned with the height map. That is, each cell of the shadow map stored the height at which you would transition from sun to shadow, at the corresponding terrain position. Thus these things got called height map shadows. The alignment meant not only was it very cheap to calculate the shadow factor for the terrain (you just looked up the same cell as for the height map, and compared heights), it was also cheap to calculate terrain/object shadowing, by using the object's (x,y) terrain location and its height. Moreover, you could get a nice soft shadow transition by using some kind of smooth step function on the height difference.
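As a concrete illustration, here's a minimal sketch of the lookup side. The function names and the softness constant are hypothetical, not the shipped code; the point is just the height comparison against the stored sun/shadow transition, plus a smooth step for the soft edge:

```cpp
#include <algorithm>
#include <cassert>

// Standard smooth step: 0 below a, 1 above b, smooth in between.
float SmoothStep(float a, float b, float x)
{
    float t = std::clamp((x - a) / (b - a), 0.0f, 1.0f);
    return t * t * (3.0f - 2.0f * t);
}

// shadowHeight: the value from the shadow map cell at the point's (x, y)
// terrain location, i.e., the height of the sun/shadow boundary there.
// h: the height of the point being shaded. softness: transition half-width.
// Returns 1 for fully lit, 0 for fully shadowed.
float ShadowFactor(float shadowHeight, float h, float softness = 0.5f)
{
    return SmoothStep(shadowHeight - softness, shadowHeight + softness, h);
}
```

The same call works for terrain texels and for objects standing on the terrain, which is the cheapness the alignment buys you.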
The basic algorithm is to start with one side of the height map (the one closest to the sun), and initialise a 'shadow crest' to the corresponding row of height map samples. You then step through rows away from the sun: at each new row, you shift the current crest by some fractional amount dictated by the sun direction (a simple lerp-and-add between two adjacent samples in the previous crest), and take the max of it and the corresponding height map row. It's basically a single linear pass through the height map.
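A minimal sketch of that sweep, under the assumption that the sun direction reduces to a lateral shift `dx` and a height drop `dh` per row, with row 0 closest to the sun (all names hypothetical):

```cpp
#include <algorithm>
#include <cmath>
#include <vector>

// 'height' is a width x rows height map stored row-major. The output shadow
// map stores, per cell, the height of the sun/shadow boundary at that cell.
std::vector<float> GenerateHeightMapShadows
(
    const std::vector<float>& height, int width, int rows, float dx, float dh
)
{
    std::vector<float> shadow(height);  // row 0 is its own crest
    std::vector<float> crest(height.begin(), height.begin() + width);

    for (int r = 1; r < rows; r++)
    {
        std::vector<float> newCrest(width);

        for (int x = 0; x < width; x++)
        {
            // Shift the previous crest by dx via a lerp of adjacent samples,
            // and drop it by dh to account for the sun's elevation...
            float sx = float(x) - dx;
            int   i0 = std::clamp(int(std::floor(sx)), 0, width - 1);
            int   i1 = std::min(i0 + 1, width - 1);
            float f  = sx - std::floor(sx);
            float shifted = crest[i0] + f * (crest[i1] - crest[i0]) - dh;

            // ...record the boundary height, then take the max with the
            // height map row to form the new crest.
            shadow[r * width + x] = shifted;
            newCrest[x] = std::max(shifted, height[r * width + x]);
        }

        crest.swap(newCrest);
    }

    return shadow;
}
```

A terrain sample is then lit wherever its height exceeds the stored boundary height, which is exactly the comparison described above.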
(Another hack on SC4 was that building shadows were generated by warping sprite alpha masks from the lower LODs, which meant in terms of generation or additional storage, they were free. Also on TS2 we added some terrain-attached objects into the height map before generating shadows, to get some cheap shadowing for things like walls for free.)
Anyway, mostly for curiosity's sake, example source code is here. (Though who knows, it might be useful on mobile platforms, as the technology wheel turns once again.) Compiling with TEST_HMS defined will generate a set of 32 shadow TGAs covering all (horizontal) sun directions from a test height map.
It's obviously predominantly a scalar algorithm, though I did think of trying a CUDA implementation with some basic parallelism via row striping a couple of years ago. Another approach is to have one thread per sample op, as per Timonen's "Scalable Height Field Self-Shadowing", as then each row maps to a warp. (That paper also contains a neat trick that uses a bounded stack to track the max subtended angle.) For a bit more CPU cost, you can also use this approach to generate directed ambient occlusion, via four sweeps through the data, with some additional crest storage depending on how many directional samples you want to take.
In dynamically lit games we've seen a steady progression from scene lighting being done completely on the CPU to almost completely on the GPU, per the deferred rendering approaches so popular these days. For a while, a good sweet spot was to calculate environment object lighting on the CPU using a representation that could be (relatively) efficiently evaluated on the GPU, namely, Spherical Harmonics. This was particularly because in the PC space the SH approach lent itself to scaling across GPUs of varying capabilities. For less powerful GPUs, evaluation could be restricted to one or two bands. The per-pixel computational workload could also be split across vertex and pixel shaders, with the lower frequency bands evaluated in the VS, and the higher bands added in the PS. There's also a neat trick whereby you can evaluate the full SH term for a particular direction, and scale bands as you go to get the diffuse convolution of that at the same time, giving you separate diffuse and glossy terms that can then be mixed per texel.
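The band-scaling trick can be sketched as follows for a 3-band, single-channel coefficient set. The basis constants are the standard real SH ones, and the Lambertian convolution scales bands 0..2 by 1, 2/3, and 1/4 respectively; the function shape and names are hypothetical, not taken from any particular engine:

```cpp
#include <cmath>

// Evaluate a 3-band SH coefficient set for a unit direction (x, y, z),
// accumulating both the raw (glossy) value and the diffuse-convolved value
// in one pass by scaling each band's contribution as we go.
void EvalSH3
(
    const float coeffs[9],      // 3-band SH coefficients, one channel
    float x, float y, float z,  // unit direction
    float* glossy, float* diffuse
)
{
    // Standard real SH basis constants for bands 0..2.
    const float k0  = 0.282095f;    // Y_0^0
    const float k1  = 0.488603f;    // Y_1^m
    const float k2a = 1.092548f;    // Y_2^{-2,-1,1}
    const float k2b = 0.315392f;    // Y_2^0
    const float k2c = 0.546274f;    // Y_2^2

    float b0 = coeffs[0] * k0;

    float b1 = coeffs[1] * k1 * y
             + coeffs[2] * k1 * z
             + coeffs[3] * k1 * x;

    float b2 = coeffs[4] * k2a * x * y
             + coeffs[5] * k2a * y * z
             + coeffs[6] * k2b * (3.0f * z * z - 1.0f)
             + coeffs[7] * k2a * x * z
             + coeffs[8] * k2c * (x * x - y * y);

    *glossy  = b0 + b1 + b2;                            // raw evaluation
    *diffuse = b0 + (2.0f / 3.0f) * b1 + 0.25f * b2;    // cosine-convolved
}
```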
Because of this I did a fair bit of work with spherical harmonics techniques, assembling various utilities from all over the place, most particularly from articles by Robin Green and Peter-Pike Sloan. The focus was on 1-5 band SH coefficient sets, and 1-7 band ZH (Zonal Harmonics) coefficient sets, so code generation was used to produce the corresponding optimised routines, rather than a more general routine that could handle any number of bands.
The resulting (single file) library can be downloaded here. It provides the usual utilities for projecting environments into SH coefficients, evaluating them, and performing cheap(ish) operations such as symmetric rotation and mirroring. However, it also contains a set of ZH routines for constructing rotationally symmetric BRDFs that can then be rotated into the full 3D SH space, or convolved with a set of SH coefficients. There's nothing hugely novel here (apart from maybe the use of Henyey-Greenstein phase functions to quickly construct local sky/ground environments), but it's useful to have everything in one place.
Our SIGGRAPH presentation on Spore's spherical planet generation system didn't really have scope for getting into technical details, beyond the mention of our use of inflated cube maps, and a quick allusion to the Jacobian we derived to transform properties from cube map space to the object space the corresponding sphere was in. We mostly used that to cheaply generate normal maps for the sphere terrain, by using the standard finite difference approach in 2D, and then transforming the results.
Another key detail glossed over was coming up with a consistent parameterisation of the cube map, and establishing utilities to make it easier to adapt standard grid-based simulation code to the sphere. Mostly this was a matter of making it possible to transition between cube map faces in a consistent way. A diagram showing the parameterisation I settled on is here. The local uv coordinates of a face were chosen to be a consistent permutation of XYZ, with the u axis negated for negative faces, to maintain right-handedness.
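One possible concrete form of such a parameterisation, as a sketch only: the face ordering and the exact cyclic permutation below are assumptions for illustration, not necessarily the ones in the diagram. Each face's (u, v) axes are a cyclic permutation of XYZ relative to the face normal, with u negated on negative faces so every (u, v, normal) frame stays right-handed:

```cpp
#include <cmath>

// Map a face index and local (u, v) in [-1, 1] to a unit sphere direction.
// Faces are numbered +X, -X, +Y, -Y, +Z, -Z (an assumed convention).
void FaceUVToDir(int face, float u, float v, float dir[3])
{
    int   axis = face / 2;                  // 0 = X, 1 = Y, 2 = Z
    float sign = (face & 1) ? -1.0f : 1.0f; // +1 for positive faces

    float p[3];
    p[axis]           = sign;       // face normal
    p[(axis + 1) % 3] = sign * u;   // u axis: next axis, negated on -ve faces
    p[(axis + 2) % 3] = v;          // v axis: next-next axis

    // Project the cube point onto the sphere.
    float len = std::sqrt(p[0] * p[0] + p[1] * p[1] + p[2] * p[2]);
    dir[0] = p[0] / len;
    dir[1] = p[1] / len;
    dir[2] = p[2] / len;
}
```

Because both the normal and the u axis flip sign together on negative faces, the cross product of the u and v axes always equals the face normal, which is the right-handedness property mentioned above.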
Research code illustrating how all this fits together can be downloaded from here. It contains source for the aforementioned Jacobian, and code that will use that to generate normal maps for a sphere, given the underlying height field. In particular, the full-blown routine UpdateFaceNormalMap handles partial updates, and seams between faces.
It also contains a suite of more general utilities for navigating around a cube map. (As mentioned in the talk, the projection employed ensures straight lines in cube map space are great circles on the sphere, so much gameplay manipulation reduces to the standard case of handling a 2 1/2 D play space, plus some additional logic for handling wrapping between faces.)
Finally, here are a few example planet cube maps made with our planet-authoring system, in case they're useful for research purposes.
For ambient occlusion in SimCity, we used the occlusion volume technique (more here). Directional occlusion was generated by adapting a sweeping technique I'd previously used for generating distance fields. In fact, the original prototype came from hacking up some code I'd written to explore different algorithms for generating distance fields, namely the Chamfer and Danielsson methods, and the "Fast Sweeping" method due to Zhao. (The latter two turn out to be quite closely related.) Once the basic algorithm was working, the most effective extension turned out to be separating the original occlusion from the generated occlusion, which, together with the directionality, helped solve the issue of self-occlusion.
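For reference, the classic two-pass Chamfer transform is simple enough to sketch in a few lines. This is a generic textbook version, not the linked research code: seed cells start at 0, everything else at a large value, and a forward then a backward sweep propagate approximate distances using the standard 3/4 chamfer weights (distances scaled by 3, with 4 for diagonal steps):

```cpp
#include <algorithm>
#include <vector>

// In-place 2D Chamfer distance transform over a width x height grid,
// stored row-major. Seeds hold 0; all other cells a large sentinel.
void ChamferDistance(std::vector<int>& dist, int width, int height)
{
    auto at = [&](int x, int y) -> int& { return dist[y * width + x]; };

    // Forward sweep: top-left to bottom-right, pulling from the
    // already-visited neighbours above and to the left.
    for (int y = 0; y < height; y++)
        for (int x = 0; x < width; x++)
        {
            int d = at(x, y);
            if (x > 0)                  d = std::min(d, at(x - 1, y) + 3);
            if (y > 0)                  d = std::min(d, at(x, y - 1) + 3);
            if (x > 0 && y > 0)         d = std::min(d, at(x - 1, y - 1) + 4);
            if (x < width - 1 && y > 0) d = std::min(d, at(x + 1, y - 1) + 4);
            at(x, y) = d;
        }

    // Backward sweep: bottom-right to top-left, mirroring the neighbourhood.
    for (int y = height - 1; y >= 0; y--)
        for (int x = width - 1; x >= 0; x--)
        {
            int d = at(x, y);
            if (x < width - 1)                   d = std::min(d, at(x + 1, y) + 3);
            if (y < height - 1)                  d = std::min(d, at(x, y + 1) + 3);
            if (x < width - 1 && y < height - 1) d = std::min(d, at(x + 1, y + 1) + 4);
            if (x > 0 && y < height - 1)         d = std::min(d, at(x - 1, y + 1) + 4);
            at(x, y) = d;
        }
}
```

Danielsson's algorithm and Zhao's fast sweeping follow the same sweep structure but propagate vectors or solve a local Eikonal update instead of adding fixed weights, which is why they're closely related.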
I've cleaned up the original research version of this code and placed it here. It includes the distance algorithms mentioned above, and the occlusion generation algorithm, all in both 2D and 3D. It can be compiled as a command-line tool which will generate output distance or occlusion for test images or volumes. It also includes a quick and dirty obj mesh reader so it can generate occlusion volumes direct from mesh data.
(Note: updated with mesh reader and Clang 5 fixes October 2nd.)