Fixed Rate Pig - a fixed logic frame rate demo
----------------------------------------------

This SDL programming example - a simple
platform game - demonstrates the use of
a fixed virtual logic frame rate
together with interpolation, for smooth
and accurate game logic that is
independent of the rendering frame rate.

The example also demonstrates sprite
animation and partial display updating
techniques, suitable for games and
applications that need high frame rates
but can do without updating the whole
screen every frame.


Fixed Logic Frame Rate
----------------------
Having a fixed logic frame rate means that the game
logic (that is, what defines the gameplay in terms
of object behavior and user input handling) runs a
fixed number of times per unit of time. This makes
it possible to use "frame count" as a unit of time.

More interestingly, since the logic frame rate can
be set to any sufficient value (say, 20 Hz for a
slow turn-based game, or 100 Hz for fast action),
the logic code runs exactly once per logic frame.
Thus, there is no need to take delta times into
account, solve equations, or do calculations on
velocity, acceleration, jerk and the like. You can
just deal with hardcoded "step" values and simple
tests.

Perhaps most importantly, you can *still* rely
on the game behaving *exactly* the same way,
regardless of the rendering frame rate or other
system-dependent parameters - something that is
virtually impossible with delta times, since you
cannot have infinite accuracy in the calculations.
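
To make the "hardcoded step values" idea concrete,
here is a minimal sketch of a bouncing object whose
logic is written per logic frame. The names ball_t,
GRAVITY, FLOOR_Y and ball_logic_frame() are made up
for this illustration and are not taken from the
Pig sources.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical object state. Position is in pixels and speed is in
 * pixels per logic frame, so no delta time appears anywhere. */
typedef struct {
    int y;   /* vertical position */
    int vy;  /* vertical speed, per logic frame */
} ball_t;

/* Hardcoded "step" values (illustrative numbers only). */
enum { GRAVITY = 1, FLOOR_Y = 100 };

/* Advance the object by exactly one logic frame. */
static void ball_logic_frame(ball_t *b)
{
    assert(b != NULL);
    b->vy += GRAVITY;
    b->y += b->vy;
    if (b->y >= FLOOR_Y) {
        /* A simple test, instead of solving for the exact hit time. */
        b->y = FLOOR_Y;
        b->vy = -b->vy;
    }
}
```

Since only integers and frame counts are involved,
running the same number of logic frames from the
same state gives the same result on any machine.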


Virtual Logic Frame Rate
------------------------
By "virtual", I mean that the actual frame rate is
not necessarily stable at the nominal value at all
times. Rather, the *average* logic frame rate is
kept at the nominal value by means of controlling
the number of logic frames processed for each
rendered frame.

That is, if the rendering frame rate is lower
than the nominal logic frame rate, the engine will
run the game logic several times before rendering
each frame. Thus, the game logic may actually be
running at tens of kHz for a few frames at a time,
but this doesn't matter, as long as the game logic
code relies entirely on logic time.

So, do not try to read time using SDL_GetTicks()
or similar in the game logic code! Instead, just
count logic frames, like we did back in the C64 and
Amiga days, where video frames were actually a
reliable time unit. It really works!
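
The catch-up scheme described above can be sketched
roughly like this. It is only an illustration under
my own assumptions: ticker_t, ticker_advance() and
ticker_frac() are hypothetical names, and elapsed_ms
stands for whatever the *rendering* loop measured
since the previous rendered frame (for example by
comparing two SDL_GetTicks() readings). The game
logic callback never sees the clock.

```c
#include <assert.h>

/* Hypothetical timing state owned by the rendering loop. */
typedef struct {
    double logic_hz;    /* nominal logic frame rate */
    double logic_time;  /* current time in logic frames, with fraction */
    long   frames_run;  /* whole logic frames processed so far */
} ticker_t;

/* Call once per rendered frame. Runs as many logic frames as are needed
 * to catch up with the elapsed time, and returns how many were run. */
static int ticker_advance(ticker_t *t, double elapsed_ms,
                          void (*logic_frame)(void))
{
    int n = 0;
    assert(t->logic_hz > 0.0 && elapsed_ms >= 0.0);
    t->logic_time += elapsed_ms * t->logic_hz / 1000.0;
    while ((double)(t->frames_run + 1) <= t->logic_time) {
        if (logic_frame)
            logic_frame();  /* one fixed step of game logic */
        t->frames_run++;
        n++;
    }
    return n;
}

/* Fraction of a logic frame that has passed since the last one run. */
static double ticker_frac(const ticker_t *t)
{
    return t->logic_time - (double)t->frames_run;
}
```

At 100 Hz logic and 25 ms per rendered frame, this
runs two logic frames for one rendered frame and
three for the next, which averages out to the
nominal rate.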


Resampling Distortion
---------------------
Now, there is one problem with fixed logic frame
rates: resampling distortion. (The same phenomenon
that causes poor audio engines to squeal and feep
when playing back waveforms at certain pitches.)

The object coordinates generated by the game
logic engine can be thought of as streams of values
describing signals (in electrical engineering/DSP
terms) with a fixed sample rate. Each coordinate
value is one stream.

Since the logic frame rate is fixed, and the
game logic runs an integer number of times per
rendered frame, what we get is a "nearest point"
resampling from the logic frame rate to the
rendering frame rate. That's not very nice, since
only the last set of coordinates after each run of
logic frames is actually used - the rest are thrown
away!

What's maybe even worse, especially if the logic
frame rate is low, is what happens when the
rendering frame rate is higher than the logic frame
rate: you get new coordinates only every now and
then, so several rendered frames in a row show the
same positions.


Getting Smooth Animation
------------------------
So, what do we do? Well, given my hint above, the
answer is probably obvious: interpolation! We just
need to replace the basic "nearest sample" method
with something better.

Resampling is a science and an art in the audio
field, and countless papers have been written on
the subject, most of which are probably totally
incomprehensible to anyone who hasn't got a degree
in maths.

However, our requirements for the resampling can
be kept reasonably low by keeping the logic frame
rate relatively high (i.e. in the same order of
magnitude as the expected rendering frame rate) -
and we generally want to do that anyway, to reduce
the game's control latency.


Choosing An Interpolator
------------------------
Since the rendering frame rate can vary constantly
in unpredictable ways, we will have to recalculate
the input/output ratio of the resampling filter for
every rendered frame.

However, using a polynomial interpolator (as
opposed to a FIR resampling filter), we can get
away without actually doing anything special. We
just feed the interpolator the coordinates and the
desired fractional frame time, and get the
coordinates calculated.

DSP people will complain that a polynomial
resampler (that is, without a brickwall filter, or
oversampling + bandlimited downsampling) doesn't
really solve the whole problem. Right, it doesn't
remove frequencies above the Nyquist frequency of
the rendering frame rate, so those can cause
aliasing distortion. But let's consider this:

Do we actually *have* significant amounts of
energy at such frequencies in the data from the
game logic? Most probably not! You would have to
have objects bounce around or oscillate at insane
speed to get anywhere near Nyquist of (that is, 50%
of) any reasonable (i.e. playable) rendering frame
rate. In fact, we can probably assume that we're
dealing with signals in the range 0..10 Hz. Not
even the transients caused by abrupt changes in
speed and direction will cause visible side
effects.

So, in this programming example, I'm just using
a simple linear interpolator. No filters, no
oversampling or anything like that. As simple as it
gets, but still an incredible improvement over
"nearest sample" resampling. You can enable/disable
interpolation with the F1 key when running the
example.
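
A linear interpolator of this kind is only a few
lines of code. The sketch below assumes that each
object keeps the coordinates from the two most
recent logic frames, and that the caller supplies
the fractional frame time as a value from 0.0 to
1.0. The names point_t, lerp() and
point_graphic_pos() are hypothetical and chosen for
this illustration.

```c
#include <assert.h>

/* Hypothetical per-object coordinates from the last two logic frames. */
typedef struct {
    float ox, oy;  /* position after the previous logic frame */
    float x, y;    /* position after the latest logic frame */
} point_t;

/* Plain linear interpolation: frac = 0 gives a, frac = 1 gives b. */
static float lerp(float a, float b, float frac)
{
    return a + (b - a) * frac;
}

/* Calculate the coordinates to render at, for a fractional logic frame
 * time between the previous (0.0) and the latest (1.0) logic frame. */
static void point_graphic_pos(const point_t *p, float frac,
                              float *gx, float *gy)
{
    assert(frac >= 0.0f && frac <= 1.0f);
    *gx = lerp(p->ox, p->x, frac);
    *gy = lerp(p->oy, p->y, frac);
}
```

Note that interpolating between the two latest
logic frames, as this sketch does, means the
rendered position can lag the game logic by up to
one logic frame.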


Rendering Sprites
-----------------
In order to cover another animation related FAQ,
this example includes "smart" partial updates of
the screen. Only areas that are affected by moving
and/or animated sprites are updated.

To keep things simple and not annoyingly
non-deterministic, updates are done by removing all
sprites, updating their positions and animation
frames, and then rendering all sprites. This is
done every frame, and includes all sprites, whether
they move or not.

So, why not update only the sprites that
actually moved? That would allow for cheap but
powerful animated "backgrounds" and the like.

Well, the problem is that sprites can overlap,
and when they do, they start dragging each other
into the update loop, leading to recursion and
potentially circular dependencies. A non-recursive
two-pass (mark + render) algorithm is probably a
better idea than actual recursion. It's quite
doable and neat, if the updates are restricted by
clipping - but I'll leave that for another example.
Pretty much all sprites in Fixed Rate Pig move all
the time, so there's nothing to gain by using a
smarter algorithm.


Efficient Software Rendering
----------------------------
To make it a bit more interesting, I also added
alpha blending for sprite anti-aliasing and effects.
Most 2D graphics APIs and drivers (and as a result,
most SDL backends) lack h/w acceleration of alpha
blended blits, which means the CPU has to perform
the blending. That's relatively expensive, but
SDL's software blitters are pretty fast, and it
turns out *that's* usually not a problem.

However, there is one problem: alpha blending
requires that data is read from the target surface,
modified, and then written back. Unfortunately,
modern video cards handle CPU reads from VRAM very
poorly. The bandwidth for CPU reads - even on the
latest monster AGP 8x card - is on par with that of
an old hard drive. (I'm not kidding!)

This is why I wanted to demonstrate how to avoid
this problem, by rendering into a s/w back buffer
instead of the h/w display surface. If you're on a
system that supports hardware display surfaces, you
can see the difference by hitting F2 in the game,
to enable/disable rendering directly into VRAM.

Indeed, SDL can set that up for you, but *only*
if you ask for a single buffered display - and we
do NOT want that! Single buffered displays cannot
sync animation with the retrace, and as a result,
we end up hogging the CPU (since we never block,
but just pump out new frames) and still getting
jerky animation.

Incidentally, this approach of using a s/w back
buffer for rendering mixes very well with partial
update strategies, so it fits right in.
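
To show where the read from the target comes in,
here is a minimal sketch of a software alpha blend
over a run of pixels. This is not SDL's blitter; it
assumes a made-up 0x00RRGGBB pixel layout and a
separate per-pixel alpha array, and blend_channel()
and blend_span() are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Blend one 8-bit channel: dst + (src - dst) * alpha / 255. */
static uint8_t blend_channel(uint8_t src, uint8_t dst, uint8_t alpha)
{
    return (uint8_t)(dst + ((src - dst) * alpha) / 255);
}

/* Blend n source pixels into dst (assumed layout: 0x00RRGGBB).
 * Every pixel is read from dst, modified, and written back. That read
 * is cheap when dst is a back buffer in system RAM, and is the part
 * that hurts when dst is a display surface in VRAM. */
static void blend_span(uint32_t *dst, const uint32_t *src,
                       const uint8_t *alpha, size_t n)
{
    size_t i;
    assert(dst != NULL && src != NULL && alpha != NULL);
    for (i = 0; i < n; i++) {
        uint32_t d = dst[i];  /* read from the target */
        uint32_t s = src[i];
        uint8_t a = alpha[i];
        uint8_t r = blend_channel((s >> 16) & 0xff, (d >> 16) & 0xff, a);
        uint8_t g = blend_channel((s >> 8) & 0xff, (d >> 8) & 0xff, a);
        uint8_t b = blend_channel(s & 0xff, d & 0xff, a);
        /* write the result back to the target */
        dst[i] = ((uint32_t)r << 16) | ((uint32_t)g << 8) | b;
    }
}
```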


Smart Dirty Rectangle Management
--------------------------------
The most complicated part of this implementation
is keeping track of the exact areas of the screen
that need updating. Just maintaining one rectangle
per sprite would not be sufficient. A moving sprite
has to be removed, animated and then re-rendered.
That's two rectangles that need to be pushed to the
screen: one to remove the old sprite image, and one
for the new position.

On a double buffered display, it gets even worse,
as the rendering is done into two alternating
buffers. When we update a buffer, the old sprites
in it are actually *two* frames old - not one.

I've chosen to implement a "smart" rectangle
merging algorithm that can deal with all of this
with a minimum of support from higher levels. The
algorithm merges rectangles in order to minimize
overdraw and rectangle count when blitting to and
updating the screen. See the file dirtyrects.txt for
details. You can (sort of) see what's going on by
hitting F3 in the game. The steps are:

  1. All sprites are removed from the rendering
     buffer. The required information is found
     in the variables that store the results of
     the interpolation.

  2. The dirtyrect table for the display surface
     is swapped into a work dirtyrect table. The
     display surface dirtyrect table is cleared.

  3. New graphic coordinates are calculated, and
     all sprites are rendered into the rendering
     buffer. The bounding rectangles are fed
     into the display surface dirtyrect table.

  4. The dirtyrect table compiled in step 3 is
     merged into the work dirtyrect table. The
     result covers all areas that need to be
     updated to remove old sprites and make the
     new ones visible.

  5. The dirtyrect table compiled in step 4 is
     used to blit from the rendering buffer to
     the display surface.

On a double buffered display, there is one
dirtyrect table for each display page, and there
is (obviously) a page flip operation after step 5,
but other than that, the algorithm is the same.
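
The table merge in step 4 can be sketched as below.
The real merging rules are the ones described in
dirtyrects.txt; this sketch substitutes the simplest
rule I can think of, which is to replace any two
overlapping rectangles with their bounding
rectangle. All names here (rect_t, rect_table_t,
table_add(), table_merge()) and the table size are
hypothetical.

```c
#include <assert.h>

typedef struct { int x, y, w, h; } rect_t;

enum { MAX_RECTS = 64 };  /* arbitrary size for this sketch */

typedef struct {
    rect_t rects[MAX_RECTS];
    int count;
} rect_table_t;

static int rects_overlap(const rect_t *a, const rect_t *b)
{
    return a->x < b->x + b->w && b->x < a->x + a->w &&
           a->y < b->y + b->h && b->y < a->y + a->h;
}

/* Smallest rectangle that covers both a and b. */
static rect_t rect_union(const rect_t *a, const rect_t *b)
{
    rect_t r;
    int ax2 = a->x + a->w, bx2 = b->x + b->w;
    int ay2 = a->y + a->h, by2 = b->y + b->h;
    r.x = a->x < b->x ? a->x : b->x;
    r.y = a->y < b->y ? a->y : b->y;
    r.w = (ax2 > bx2 ? ax2 : bx2) - r.x;
    r.h = (ay2 > by2 ? ay2 : by2) - r.y;
    return r;
}

/* Add r to the table, absorbing every rectangle it overlaps. The grown
 * rectangle may overlap further entries, so rescan after each merge. */
static void table_add(rect_table_t *t, rect_t r)
{
    int i = 0;
    while (i < t->count) {
        if (rects_overlap(&t->rects[i], &r)) {
            r = rect_union(&t->rects[i], &r);
            t->rects[i] = t->rects[--t->count];  /* drop absorbed entry */
            i = 0;
        } else {
            i++;
        }
    }
    assert(t->count < MAX_RECTS);
    t->rects[t->count++] = r;
}

/* Merge every rectangle of one table into another (as in step 4). */
static void table_merge(rect_table_t *into, const rect_table_t *from)
{
    int i;
    for (i = 0; i < from->count; i++)
        table_add(into, from->rects[i]);
}
```

Merging on any overlap keeps the rectangle count
low, but it can add overdraw when two rectangles
only touch at a corner, which is the kind of
tradeoff a smarter algorithm has to weigh.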


Command Line Options
--------------------

  -f    Fullscreen
  -s    Single buffer
  <n>   Depth = <n> bits


//David Olofson <david@olofson.net>