Fixed Rate Pig - a fixed logic frame rate demo
----------------------------------------------

This SDL programming example - a simple platform game -
demonstrates the use of a fixed virtual logic frame rate
together with interpolation, for smooth and accurate game
logic that is independent of the rendering frame rate.

The example also demonstrates sprite animation and partial
display updating techniques, suitable for games and
applications that need high frame rates but can do without
updating the whole screen every frame.

Fixed Logic Frame Rate
----------------------

Having a fixed logic frame rate means that the game logic
(that is, what defines the gameplay in terms of object
behavior and user input handling) runs a fixed number of
times per unit of time. This makes it possible to use
"frame count" as a unit of time.

More interestingly, since the logic frame rate can be set
at any sufficient value (say, 20 Hz for a slow turn based
game, or 100 Hz for fast action), the logic code will run
exactly once per logic frame. Thus, there is no need to
take delta times into account, solve equations, or make
calculations on velocity, acceleration, jerk and stuff
like that. You can just deal with hardcoded "step" values
and simple tests.

Perhaps most importantly, you can *still* rely on the game
behaving *exactly* the same way, regardless of the
rendering frame rate or other system dependent parameters -
something that is virtually impossible with delta times,
since you cannot have infinite accuracy in the
calculations.

Virtual Logic Frame Rate
------------------------

By "virtual", I mean that the actual frame rate is not
necessarily stable at the nominal value at all times.
Rather, the *average* logic frame rate is kept at the
nominal value by means of controlling the number of logic
frames processed for each rendered frame.

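As a minimal sketch of this bookkeeping (not the example's actual
engine code; LogicClock, advance_logic and run_logic_frame are names
made up here for illustration), the engine can accumulate elapsed
time in units of logic frames and run the logic once for each whole
frame that has become due:

```c
#include <assert.h>

/* Hypothetical timing state; the names are illustrative only. */
typedef struct {
    double logic_fps;   /* nominal logic frame rate, e.g. 20.0 */
    double pending;     /* logic frames owed but not yet run (fractional) */
    long   frame_count; /* logic frames run so far: the game's clock */
} LogicClock;

/* Stand-in for the game logic. It only ever sees logic time. */
static void run_logic_frame(LogicClock *c)
{
    c->frame_count++;
}

/*
 * Call once per rendered frame with the wall-clock time that passed
 * since the previous rendered frame. Runs zero or more logic frames,
 * so that the average logic rate matches logic_fps, and returns the
 * number of logic frames that were run.
 */
static int advance_logic(LogicClock *c, double elapsed_ms)
{
    int ran = 0;
    assert(elapsed_ms >= 0.0);
    c->pending += elapsed_ms / 1000.0 * c->logic_fps;
    while (c->pending >= 1.0) {
        run_logic_frame(c);
        c->pending -= 1.0;
        ran++;
    }
    return ran;
}
```

With a 20 Hz logic rate, a rendered frame that took 125 ms runs two
logic frames and leaves half a frame pending. A rendered frame that
took 10 ms runs none.
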
That is, if the rendering frame rate is lower than the
nominal logic frame rate, the engine will run the game
logic several times before rendering each frame. Thus, the
game logic may actually be running at tens of kHz for a few
frames at a time, but this doesn't matter, as long as the
game logic code relies entirely on logic time.

So, do not try to read time using SDL_GetTicks() or similar
in the game logic code! Instead, just count logic frames,
like we did back in the C64 and Amiga days, where video
frames were actually a reliable time unit. It really works!

Resampling Distortion
---------------------

Now, there is one problem with fixed logic frame rates:
Resampling distortion. (The same phenomenon that causes
poor audio engines to squeal and feep when playing back
waveforms at certain pitches.)

The object coordinates generated by the game logic engine
can be thought of as streams of values describing signals
(in electrical engineering/DSP terms) with a fixed sample
rate. Each coordinate value is one stream.

Since the logic frame rate is fixed, and the game logic
runs an integer number of times per rendered frame, what we
get is a "nearest point" resampling from the logic frame
rate to the rendering frame rate. That's not very nice,
since only the last set of coordinates after each run of
logic frames is actually used - the rest are thrown away!

What's maybe even worse, especially if the logic frame rate
is low, is that when the rendering frame rate is higher
than the logic frame rate, you get new coordinates only
every now and then, so several rendered frames in a row
show the same positions.

Getting Smooth Animation
------------------------

So, what do we do? Well, given my hint above, the answer is
probably obvious: interpolation! We just need to replace
the basic "nearest sample" method with something better.

Resampling is a science and an art in the audio field, and
countless papers have been written on the subject, most of
which are probably totally incomprehensible to anyone who
hasn't got a degree in maths.

However, our requirements for the resampling can be kept
reasonably low by keeping the logic frame rate relatively
high (i.e. in the same order of magnitude as the expected
rendering frame rate) - and we generally want to do that
anyway, to reduce the game's control latency.

Choosing An Interpolator
------------------------

Since the rendering frame rate can vary constantly in
unpredictable ways, we will have to recalculate the
input/output ratio of the resampling filter for every
rendered frame.

However, using a polynomial interpolator (as opposed to a
FIR resampling filter), we can get away without actually
doing anything special. We just feed the interpolator the
coordinates and the desired fractional frame time, and get
the coordinates calculated.

DSP people will complain that a polynomial resampler (that
is, without a brickwall filter, or oversampling +
bandlimited downsampling) doesn't really solve the whole
problem. Right, it doesn't remove frequencies above the
Nyquist frequency of the rendering frame rate, so those can
cause aliasing distortion.

But let's consider this: Do we actually *have* significant
amounts of energy at such frequencies in the data from the
game logic? Most probably not! You would have to have
objects bounce around or oscillate at insane speed to get
anywhere near Nyquist of (that is, 50% of) any reasonable
(i.e. playable) rendering frame rate. In fact, we can
probably assume that we're dealing with signals in the
range 0..10 Hz. Not even the transients caused by abrupt
changes in speed and direction will cause visible side
effects.

So, in this programming example, I'm just using a simple
linear interpolator. No filters, no oversampling or
anything like that. As simple as it gets, but still an
incredible improvement over "nearest sample" resampling.
You can enable/disable interpolation with the F1 key when
running the example.

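For illustration, a linear interpolator over the coordinates of the
two most recent logic frames could look like the sketch below. The
Sprite struct and its field names are assumptions made for this
sketch, not taken from the example's source. The fraction is assumed
to be the fractional logic frame time left over after the whole
logic frames for this rendered frame have been run.

```c
#include <assert.h>

/* Hypothetical sprite with coordinates from two logic frames. */
typedef struct {
    float ox, oy;   /* position after the previous logic frame */
    float x, y;     /* position after the latest logic frame */
    float gx, gy;   /* interpolated position used for rendering */
} Sprite;

/*
 * frac is the fractional logic frame time, from 0.0 (exactly at the
 * previous logic frame) to 1.0 (exactly at the latest one).
 */
static void interpolate_sprite(Sprite *s, float frac)
{
    assert(frac >= 0.0f && frac <= 1.0f);
    s->gx = s->ox + (s->x - s->ox) * frac;
    s->gy = s->oy + (s->y - s->oy) * frac;
}
```

A sprite that moved from x = 10 to x = 14 during the latest logic
frame is drawn at x = 11 when frac is 0.25. Note that in this sketch
the rendered position always lies between the previous and the
latest logic frame, so what is shown lags the logic by up to one
logic frame.
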
Rendering Sprites
-----------------

In order to cover another animation related FAQ, this
example includes "smart" partial updates of the screen.
Only areas that are affected by moving and/or animated
sprites are updated.

To keep things simple and not annoyingly non-deterministic,
updates are done by removing all sprites, updating their
positions and animation frames, and then rendering all
sprites. This is done every frame, and includes all
sprites, whether they move or not.

So, why not update only the sprites that actually moved?
That would allow for cheap but powerful animated
"backgrounds" and the like.

Well, the problem is that sprites can overlap, and when
they do, they start dragging each other into the update
loop, leading to recursion and potentially circular
dependencies. A non-recursive two-pass (mark + render)
algorithm is probably a better idea than actual recursion.
It's quite doable and neat, if the updates are restricted
by clipping - but I'll leave that for another example.
Pretty much all sprites in Fixed Rate Pig move all the
time, so there's nothing to gain by using a smarter
algorithm.

Efficient Software Rendering
----------------------------

To make it a bit more interesting, I also added alpha
blending for sprite anti-aliasing and effects. Most 2D
graphics APIs and drivers (and as a result, most SDL
backends) lack h/w acceleration of alpha blended blits,
which means the CPU has to perform the blending. That's
relatively expensive, but SDL's software blitters are
pretty fast, and it turns out *that's* usually not a
problem.

However, there is one problem: Alpha blending requires that
data is read from the target surface, modified, and then
written back. Unfortunately, modern video cards handle CPU
reads from VRAM very poorly. The bandwidth for CPU reads -
even on the latest monster AGP 8x card - is on par with
that of an old hard drive. (I'm not kidding!)

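To make the read-modify-write pattern concrete, here is a minimal
sketch of a software alpha blend over a row of 8-bit pixels. It is
not SDL's blitter. blend_channel and blend_row are hypothetical
names, and a single 8-bit channel per pixel is assumed to keep the
sketch short.

```c
#include <assert.h>
#include <stdint.h>

/* Blend one 8-bit channel: alpha 0 keeps dst, alpha 255 gives src. */
static uint8_t blend_channel(uint8_t src, uint8_t dst, uint8_t alpha)
{
    return (uint8_t)((src * alpha + dst * (255 - alpha) + 127) / 255);
}

/* Blend n source pixels onto dst using per-pixel alpha. */
static void blend_row(const uint8_t *src, const uint8_t *alpha,
                      uint8_t *dst, int n)
{
    int i;
    assert(n >= 0);
    for (i = 0; i < n; i++) {
        uint8_t old = dst[i];   /* read from the target surface */
        dst[i] = blend_channel(src[i], old, alpha[i]); /* write back */
    }
}
```

If dst pointed into video memory, every iteration of the loop would
incur one of the slow CPU reads described above.
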
This is why I wanted to demonstrate how to avoid this
problem, by rendering into a s/w back buffer instead of the
h/w display surface. If you're on a system that supports
hardware display surfaces, you can see the difference by
hitting F2 in the game, to enable/disable rendering
directly into VRAM.

Indeed, SDL can set up such a s/w back buffer for you, but
*only* if you ask for a single buffered display - and we do
NOT want that! Single buffered displays cannot sync
animation with the retrace, and as a result, we end up
hogging the CPU (since we never block, but just pump out
new frames) and still get unsmooth animation.

Incidentally, this approach of using a s/w back buffer for
rendering mixes very well with partial update strategies,
so it fits right in.

Smart Dirty Rectangle Management
--------------------------------

The most complicated part of this implementation is keeping
track of the exact areas of the screen that need updating.
Just maintaining one rectangle per sprite would not be
sufficient. A moving sprite has to be removed, animated and
then re-rendered. That's two rectangles that need to be
pushed to the screen; one to remove the old sprite image,
and one for the new position.

On a double buffered display, it gets even worse, as the
rendering is done into two alternating buffers. When we
update a buffer, the old sprites in it are actually *two*
frames old - not one.

I've chosen to implement a "smart" rectangle merging
algorithm that can deal with all of this with a minimum of
support from higher levels. The algorithm merges rectangles
in order to minimize overdraw and rectangle count when
blitting to and updating the screen. See the file
dirtyrects.txt for details. You can (sort of) see what's
going on by hitting F3 in the game.

Here's what's going on:

	1. All sprites are removed from the rendering buffer.
	   The required information is found in the variables
	   that store the results of the interpolation.

	2. The dirtyrect table for the display surface is
	   swapped into a work dirtyrect table. The display
	   surface dirtyrect table is cleared.

	3. New graphic coordinates are calculated, and all
	   sprites are rendered into the rendering buffer.
	   The bounding rectangles are fed into the display
	   surface dirtyrect table.

	4. The dirtyrect table compiled in step 3 is merged
	   into the work dirtyrect table. The result covers
	   all areas that need to be updated to remove old
	   sprites and make the new ones visible.

	5. The dirtyrect table compiled in step 4 is used to
	   blit from the rendering buffer to the display
	   surface.

On a double buffered display, there is one dirtyrect table
for each display page, and there is (obviously) a page flip
operation after step 5, but other than that, the algorithm
is the same.

Command Line Options
--------------------

	-f	Fullscreen
	-s	Single buffer
	<n>	Depth = <n> bits

//David Olofson <david@olofson.net>