I'm working on a synthesizer app for Windows 8. It's a C++ and
DirectX/XAML Windows Store app that uses XAudio2 for audio. As part
of this, I thought it would be fun to add a simple bit
crusher effect with a built-in sample rate reducer. The point of
this effect is to make samples sound like they came from older
machines with lower sample rates and bit depths. To do that, I had to
do two things to the samples:
1. Reduce the bit depth of the samples. This controls the
vertical stepping, as with a traditional bitcrusher.
2. Reduce the sample rate of the samples (the code calls this
BitRate). By default, this is 44100 or 48000 samples per second.
Older systems did quite a bit less, often 8192 samples per second,
sometimes fewer, like 4096. I want to get that old lo-fi vibe as
simply as possible.
I'm not plugging into the XAudio2 effects pipeline or otherwise
using any XAudio2 plumbing for the bit crusher effect here. The
algorithm is simply handed a buffer of stereo samples to process. The
buffer size will eventually be based on performance, but right now,
I just have a looping sample buffer. If you want to learn how to
create your own audio using XAudio2, please refer to my older post on this topic (take
care to notice the comment at the bottom where I pointed out that I
didn't initialize one of the values).
I'm not presenting a complete project here, but will post enough
source for you to have context for what I'm doing.
The StereoSample Structure
Each sample is actually a stereo pair. Stereo oscillators? How
cool :)
struct StereoSample
{
public:
SAMPLE_t Left;
SAMPLE_t Right;
};
SAMPLE_t is defined as float.
The Voice Class
The voice class represents a single voice in the synthesizer.
Among other things, it includes a collection of oscillators. Right
now, all three oscillators are configured to output exactly the
same thing. The Render function handles that output as well as
plugging in the bit crusher effect. Note that I apply the effect to
the output from all three oscillators, but in the real synth, this
is decoupled so I could apply it to a single oscillator in a single
voice.
Voice::Voice(IXAudio2* audioEngine, int stereoBufferSize) :
_audioEngine(audioEngine),
_stereoBufferSize(stereoBufferSize)
{
_bufferData = new StereoSample[stereoBufferSize];
for (int i = 0; i < MAX_OSCILLATORS; i++)
{
_oscillators.push_back(shared_ptr<Oscillator>(new Oscillator()));
}
// set up wave format using my good friend WAVEFORMATEX
WAVEFORMATEX wfx;
wfx.wBitsPerSample = SAMPLE_BITS;
wfx.nAvgBytesPerSec = SAMPLE_RATE * SAMPLE_CHANNELS * SAMPLE_BITS / 8;
wfx.nChannels = SAMPLE_CHANNELS;
wfx.nBlockAlign = SAMPLE_CHANNELS * SAMPLE_BITS / 8;
wfx.wFormatTag = WAVE_FORMAT_IEEE_FLOAT; // or could use WAVE_FORMAT_PCM
wfx.nSamplesPerSec = SAMPLE_RATE;
wfx.cbSize = 0; // set to zero for PCM or IEEE float
DX::ThrowIfFailed(_audioEngine->CreateSourceVoice(
&_xavoice,
(WAVEFORMATEX*)&wfx,
0,
XAUDIO2_DEFAULT_FREQ_RATIO,
reinterpret_cast<IXAudio2VoiceCallback*>(&_voiceCallbackHandler),
nullptr,
nullptr));
}
void Voice::Render(long phase, float noteFrequency)
{
XAUDIO2_BUFFER buffer;
//int byteCount = sizeof(SAMPLE_t) * 2 * _stereoBufferSize;
int byteCount = sizeof(StereoSample) * _stereoBufferSize;
// zero all buffer data
memset((byte*)_bufferData, 0, byteCount);
//_oscillators[0]->Render(phase, noteFrequency, _bufferData, _stereoBufferSize);
vector<shared_ptr<Oscillator>>::const_iterator cii;
for (cii = _oscillators.begin(); cii != _oscillators.end(); ++cii)
{
(*cii)->Render(phase, noteFrequency, _bufferData, _stereoBufferSize);
}
// TEMP! Bit crunching to try out audio processing
BitCruncher cruncher;
cruncher.BitDepth = 24;
cruncher.BitRate = 2048;
cruncher.ProcessSampleBuffer(phase, noteFrequency, _bufferData, _stereoBufferSize);
// TEMP Looping
// the buffer will be looped infinitely
buffer.AudioBytes = byteCount;
buffer.PlayBegin = 0;
buffer.PlayLength = 0; // play entire buffer
buffer.LoopBegin = 0;
buffer.LoopLength = 0; // loop entire buffer
buffer.LoopCount = XAUDIO2_LOOP_INFINITE;
buffer.pAudioData = (const BYTE *)_bufferData;
buffer.pContext = NULL;
buffer.Flags = 0; // this is the value I left out in the previous post
// wire up the buffer
DX::ThrowIfFailed(_xavoice->SubmitSourceBuffer(&buffer));
// start playing sound.
DX::ThrowIfFailed(_xavoice->Start(0));
}
My C++ isn't great yet, but I'm learning. I've even learned the
shared_ptr and vector templates :)
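As a sanity check on the derived wave format fields in the Voice
constructor, here's a minimal, platform-independent sketch of the
arithmetic. WaveFormatSketch and MakeFloatFormat are hypothetical
stand-ins, not XAudio2 types, and the 44100 / 2 / 32 values used in
the note below are assumptions (the post only says SAMPLE_t is float).

```cpp
#include <cstdint>

// Hypothetical stand-in for the WAVEFORMATEX fields used in the post.
struct WaveFormatSketch
{
    std::uint16_t channels;
    std::uint32_t samplesPerSec;
    std::uint16_t bitsPerSample;
    std::uint16_t blockAlign;      // bytes per frame (one sample for every channel)
    std::uint32_t avgBytesPerSec;  // bytes consumed per second of audio
};

// Derives the dependent fields from the three base values.
WaveFormatSketch MakeFloatFormat(std::uint32_t sampleRate,
                                 std::uint16_t channels,
                                 std::uint16_t bitsPerSample)
{
    WaveFormatSketch f;
    f.channels = channels;
    f.samplesPerSec = sampleRate;
    f.bitsPerSample = bitsPerSample;
    f.blockAlign = static_cast<std::uint16_t>(channels * bitsPerSample / 8);
    f.avgBytesPerSec = sampleRate * f.blockAlign;
    return f;
}
```

With 44100 Hz, 2 channels and 32-bit floats, blockAlign comes out
to 8 bytes, which matches sizeof(StereoSample) for two floats. That
is why Render can compute byteCount from sizeof(StereoSample).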
The Oscillators
The oscillators have a lot more to them, but the render
functions are the core:
void Oscillator::RenderSine(long initialPhaseIncrement, float noteFrequency, StereoSample* buffer, long bufferCount)
{
// fill the buffer
for (int i = 0; i < bufferCount; i++)
{
float sample = (sin((i + initialPhaseIncrement) * 2 * PI * noteFrequency / SAMPLE_RATE));
// left audio or mono
buffer->Left += sample * this->_volume;
// right audio
buffer->Right += sample * this->_volume;
buffer++;
}
}
Volume is per-oscillator. Panning etc. is not implemented in
this listing. I have a different render function for each type of
waveform. I switch between them by pointing a std::function member
at the current render function. Thanks to everyone on Twitter last
night (especially Jeremiah Morrill) for helping me sort out how to
use the std::function type.
RenderFunction = std::bind<void>(
&Oscillator::RenderSine, // function pointer
this, // implicit this
std::placeholders::_1, // initialPhaseIncrement
std::placeholders::_2, // noteFrequency
std::placeholders::_3, // buffer
std::placeholders::_4); // bufferCount
The entries in the Oscillator's class definition therefore look
like this:
void RenderSine(long phaseIncrement, float noteFrequency, StereoSample* buffer, long bufferCount);
void RenderPulse(long phaseIncrement, float noteFrequency, StereoSample* buffer, long bufferCount);
void RenderTriangle(long phaseIncrement, float noteFrequency, StereoSample* buffer, long bufferCount);
void RenderSawtooth(long phaseIncrement, float noteFrequency, StereoSample* buffer, long bufferCount);
void RenderRamp(long phaseIncrement, float noteFrequency, StereoSample* buffer, long bufferCount);
void RenderMultiSaw(long phaseIncrement, float noteFrequency, StereoSample* buffer, long bufferCount);
void RenderNoise(long phaseIncrement, float noteFrequency, StereoSample* buffer, long bufferCount);
typedef std::function<void(long, float, StereoSample*, const long)> RenderFunction_t;
RenderFunction_t RenderFunction;
The magic part is the std::function. (Aside: searching for
"std::anything" will get you a nice selection of STD testing ads in
the search engine sidebar).
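To show the switching idea in isolation, here's a minimal sketch
using a mono float buffer and two toy waveforms. MiniOscillator,
SetWaveform and the Waveform enum are hypothetical names, not part
of the synth. It uses lambdas instead of std::bind, which is an
alternative way to fill the same std::function.

```cpp
#include <functional>

// Hypothetical, stripped-down oscillator with a switchable render function.
class MiniOscillator
{
public:
    enum class Waveform { Pulse, Ramp };
    typedef std::function<void(float*, long)> RenderFunction_t;
    RenderFunction_t RenderFunction;

    MiniOscillator() { SetWaveform(Waveform::Pulse); }

    // Points RenderFunction at the member function for the chosen waveform.
    void SetWaveform(Waveform w)
    {
        switch (w)
        {
        case Waveform::Pulse:
            RenderFunction = [this](float* buffer, long count) { RenderPulse(buffer, count); };
            break;
        case Waveform::Ramp:
            RenderFunction = [this](float* buffer, long count) { RenderRamp(buffer, count); };
            break;
        }
    }

private:
    // +1 for the first half of the buffer, -1 for the second half
    void RenderPulse(float* buffer, long count)
    {
        for (long i = 0; i < count; i++)
            buffer[i] += (i < count / 2) ? 1.0f : -1.0f;
    }

    // rises linearly from -1 toward +1 across the buffer
    void RenderRamp(float* buffer, long count)
    {
        for (long i = 0; i < count; i++)
            buffer[i] += -1.0f + 2.0f * i / count;
    }
};
```

Callers only ever invoke RenderFunction(buffer, count), so changing
the waveform never touches the calling code. One caveat that applies
to both the lambda and the std::bind form: the stored function keeps
a pointer to this, so a copied oscillator would still call into the
original object unless the function is rebound.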
Now, just because it sounds so good with the bit crusher, here's
my noise implementation:
void Oscillator::RenderNoise(long initialPhaseIncrement, float noteFrequency, StereoSample* buffer, long bufferCount)
{
std::uniform_real_distribution<float> dist(-1.0f * _volume, 1.0f * _volume);
for (int i = 0; i < bufferCount; i++)
{
float sample = dist(_randomGenerator);
// accumulate, as RenderSine does, so earlier oscillators aren't overwritten
buffer->Left += sample;
buffer->Right += sample;
buffer++;
}
}
For that to work, you'll need to initialize the random number
generator elsewhere in the code. I do it in the constructor:
Oscillator::Oscillator(void)
: _randomGenerator(std::time(0))
{
_volume = DEFAULT_OSCILLATOR_VOLUME;
_pan = 0.0f;
RenderFunction = std::bind<void>(
&Oscillator::RenderSine, // function pointer change to render function you want
this, // implicit this
std::placeholders::_1, // initialPhaseIncrement
std::placeholders::_2, // noteFrequency
std::placeholders::_3, // buffer
std::placeholders::_4); // bufferCount
}
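Here's a self-contained sketch of the noise generator on its own.
The post doesn't show the type of _randomGenerator, so std::mt19937
is an assumption, and NoiseSource is a hypothetical class name. The
seed is passed in rather than taken from std::time(0) so the output
is repeatable.

```cpp
#include <random>

struct StereoSample
{
    float Left;
    float Right;
};

// Hypothetical standalone noise source; std::mt19937 is an assumed engine type.
class NoiseSource
{
public:
    NoiseSource(unsigned int seed, float volume)
        : _randomGenerator(seed), _volume(volume)
    {
    }

    // Adds the same random value to both channels of each sample.
    void Render(StereoSample* buffer, long bufferCount)
    {
        std::uniform_real_distribution<float> dist(-_volume, _volume);
        for (long i = 0; i < bufferCount; i++)
        {
            float sample = dist(_randomGenerator);
            buffer[i].Left += sample;
            buffer[i].Right += sample;
        }
    }

private:
    std::mt19937 _randomGenerator;
    float _volume;
};
```

Two sources built with the same seed produce identical output. That
matters with a time-based seed: oscillators constructed within the
same second all get the same std::time(0) value, so their noise
would be identical rather than independent. Seeding each one from
std::random_device is one way around that.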
The Bit Crusher/Cruncher
This simple class is the real point of this post. Here's the
class's header file. I call it "BitCruncher" because it does bit
crushing plus sample rate reduction.
#pragma once
class BitCruncher
{
public:
int BitRate; // for reducing the sample rate
int BitDepth; // for quantizing the sample values, for example, to make them 8 bit
BitCruncher(void);
~BitCruncher(void);
void ProcessSampleBuffer(long initialPhaseIncrement, float noteFrequency, StereoSample* buffer, long bufferCount);
};
And here's the implementation. Details after the listing.
#include "pch.h"
#include "BitCruncher.h"
#include <cmath>
BitCruncher::BitCruncher(void) :
BitDepth(4),
BitRate(4096)
{
}
BitCruncher::~BitCruncher(void)
{
}
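// Round half away from zero. The argument is parenthesized so that
// expressions passed to the macro expand safely.
#define ROUND(f) ((float)(((f) > 0.0) ? floor((f) + 0.5) : ceil((f) - 0.5)))
void BitCruncher::ProcessSampleBuffer(long initialPhaseIncrement, float noteFrequency, StereoSample* buffer, long bufferCount)
{
// largest integer level for the requested bit depth (at least 1 to avoid dividing by zero)
int max = (int)pow(2.0, BitDepth) - 1;
if (max < 1) max = 1;
// number of output samples each held sample covers. Clamped to at least 1:
// a step of 0 (BitRate above SAMPLE_RATE) would never advance i, and the
// while loop below would never end.
int step = (BitRate > 0 && BitRate < SAMPLE_RATE) ? SAMPLE_RATE / BitRate : 1;
int i = 0;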
while (i < bufferCount)
{
float leftFirstSample = ROUND((buffer->Left + 1.0) * max) / max - 1.0;
float rightFirstSample = ROUND((buffer->Right + 1.0) * max) / max - 1.0;
// this loop causes us to simulate a down-sample to a lower sample rate
for (int j = 0; j < step && i < bufferCount; j++)
{
buffer->Left = leftFirstSample;
buffer->Right = rightFirstSample;
// move on
buffer++;
i++;
}
}
}
For reducing the sample rate, the current algorithm is like a
sample and hold. It takes the first sample and holds it for however
many steps it needs to. In fact, on my modular synth, if I patch an
oscillator into the sample and hold unit, I can get pretty much the
same result.
Note that this happens per-oscillator. Each voice has multiple
oscillators, each of which may or may not be crushed, so I don't
simply submit a low sample rate buffer to XAudio2 and let it do the
expansion. I'm also working on a couple of other algorithms in
addition to the current one.
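The hold behavior is easy to see on a plain mono buffer. This is a
minimal sketch, not the synth's code: HoldLength and SampleAndHold
are hypothetical helpers, and the hold length uses the same integer
division as the listing above.

```cpp
#include <cstddef>
#include <vector>

// Number of output samples each held value covers (integer division,
// never less than 1).
int HoldLength(int sampleRate, int targetRate)
{
    if (targetRate < 1 || targetRate > sampleRate)
        return 1;
    return sampleRate / targetRate;
}

// Keeps every step-th sample and repeats it over the following samples, in place.
void SampleAndHold(std::vector<float>& samples, int step)
{
    const std::size_t hold = (step < 1) ? 1 : static_cast<std::size_t>(step);
    for (std::size_t i = 0; i < samples.size(); i += hold)
    {
        const float held = samples[i];
        for (std::size_t j = i; j < i + hold && j < samples.size(); j++)
            samples[j] = held;
    }
}
```

With a step of 3, the input 0 1 2 3 4 5 6 7 8 9 becomes
0 0 0 3 3 3 6 6 6 9. Because the division truncates, 44100 / 2048
gives a hold of 21 samples, so the effective rate is 44100 / 21 =
2100 samples per second rather than exactly 2048.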
The bit depth reduction is handled by the round statements.
Floating point samples are in the range of -1.0 to +1.0. I first
calculate the maximum number for the specified bit depth (first
statement with pow). Then, I add 1.0 to the sample to get it into
the range of 0..2.0. I then multiply by "max", which gives a value
in the range 0..2*max, and round it to the nearest whole number.
Dividing by "max" gets me back to a now rounded value in the range
of 0..2.0, with at most 2*max + 1 unique possible values (so the
result is roughly one bit finer than the BitDepth setting
suggests). Finally, I subtract 1.0 to get back in the range
of -1.0 to +1.0.
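The quantization step can be tried on its own with a small sketch.
QuantizeSample and CountLevels are hypothetical helper names. The
math mirrors the listing above, using std::round in place of the
ROUND macro (both round halves away from zero) and assuming a bit
depth between 1 and 30 so the shift stays in range.

```cpp
#include <cmath>
#include <set>

// Quantizes a sample in -1.0..+1.0 using the same shift, scale, round,
// unscale, unshift sequence as the listing. Assumes 1 <= bitDepth <= 30.
float QuantizeSample(float sample, int bitDepth)
{
    const int maxLevel = (1 << bitDepth) - 1;
    const double scaled = (static_cast<double>(sample) + 1.0) * maxLevel;
    return static_cast<float>(std::round(scaled) / maxLevel - 1.0);
}

// Sweeps evenly from -1.0 to +1.0 and counts the distinct quantized outputs.
int CountLevels(int bitDepth, int sweepPoints)
{
    std::set<float> seen;
    for (int i = 0; i <= sweepPoints; i++)
    {
        const float sample = -1.0f + 2.0f * i / sweepPoints;
        seen.insert(QuantizeSample(sample, bitDepth));
    }
    return static_cast<int>(seen.size());
}
```

Sweeping a full-range signal through a bit depth of 2 yields 7
distinct output values, and a bit depth of 4 yields 31, because the
shifted sample spans 0..2*max after scaling. Halving the shifted
sample before scaling would be one way to get exactly max + 1
levels, at the cost of changing how the effect sounds.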
UPDATE: @c64_gio on twitter sent me some really great
tips for how I can improve this code. I especially like the
iteration approach and use of vectors instead of raw bytes. You can
see some of his comments here: http://pastebin.com/Q8HqHRDd.
This algorithm appears to work, so let's take a look at the
results.
Results
I hooked up my Rigol to the main output of my sound card (a MOTU
828mk3) to take a look at the generated waveforms.
What follows are the configurations set on the BitCruncher class
and a photo of the resulting waveforms.
BitDepth 24, BitRate 44100. This is about as good as it gets in
terms of the scope's resolution.
BitDepth 24, BitRate 8192 (very slight stepping)
(note that the frequency is warbling a bit between 110 and 220.
I think my Rigol is confused, possibly due to me not completing
cycles in the buffer)
BitDepth 24, BitRate 4096 (more stepping)
BitDepth 24, BitRate 2048 (a whole lot of stepping). This sounds
like a sine wave with whistling harmonics over it. It reminds me
(only louder) of the overtones from some 8-bit synth chips from
'80s computers.
I believe this one was BitDepth 4 and BitRate 44100. Notice the
flats at the peaks. This is due to rounding and contributes a
square-wave overtone to the sound.
BitDepth 2, BitRate 2048 (this sounds much more like a square
wave). This one had a fair bit of jitter to it, presumably because
44100 is not evenly divisible by 2048, and for the final output,
I'm simply looping a buffer of 44100 * 5 seconds.
The sample rate reduction is especially interesting when applied to
white noise. You totally get the Atari/C64 vibe from it.
As I do other interesting things with this synthesizer project,
which will hopefully be in the Windows Store when I complete it,
I'll continue to post about them here.
No downloadable source code for this
project.