A simple bitcrusher and sample rate reducer in C++ for a Windows Store App

Pete Brown - 13 January 2013

I'm working on a Windows 8 synthesizer, built as a C++ and DirectX/XAML Windows Store app using XAudio2. As part of this, I thought it would be fun to add a simple bit crusher effect with a built-in sample rate reducer. The point of this effect is to make samples sound like they came from older machines with lower sample rates and bit depths. To do that, I had to do two things to the samples:

1. Reduce the bit depth of the samples. This would control vertical stepping as with a traditional bitcrusher.

2. Reduce the sample rate of the samples (this is the BitRate setting in the code). By default, this is 44100 or 48000 samples per second. Older systems did quite a bit less, often 8192 samples per second, sometimes fewer, such as 4096. I want to get that old lo-fi vibe as simply as possible.

I'm not plugging into the XAudio2 effects pipeline or otherwise using any XAudio2 plumbing for the bit crusher effect here. The algorithm would be given a buffer of stereo samples to process. The buffer size will eventually be based on performance, but right now, I just have a looping sample buffer. If you want to learn how to create your own audio using XAudio2, please refer to my older post on this topic (take care to notice the comment at the bottom where I pointed out that I didn't initialize one of the values).

I'm not presenting a complete project here, but will post enough source for you to have context for what I'm doing.

The StereoSample Structure

Each sample is actually a stereo pair. Stereo oscillators? How cool :)

struct StereoSample

{

public:

    SAMPLE_t Left;

    SAMPLE_t Right;

};

SAMPLE_t is defined as float.
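As a quick sanity check on that layout, here is a minimal sketch, assuming SAMPLE_t is a plain typedef for float as stated above. The helper name BufferByteCount is hypothetical; it only mirrors the kind of byte count calculation needed when a buffer of stereo frames is handed to XAudio2.

```cpp
#include <cstddef>

typedef float SAMPLE_t;   // as stated in the post

struct StereoSample
{
    SAMPLE_t Left;
    SAMPLE_t Right;
};

// Two floats with identical alignment, so mainstream compilers add no padding.
// This interleaved left/right layout is what a two-channel float format expects.
static_assert(sizeof(StereoSample) == 2 * sizeof(SAMPLE_t),
              "StereoSample should be exactly one interleaved stereo frame");

// Hypothetical helper: size in bytes of a buffer holding frameCount stereo frames.
inline std::size_t BufferByteCount(std::size_t frameCount)
{
    return sizeof(StereoSample) * frameCount;
}
```

If the static_assert ever fires on some compiler, the struct could not be passed to the audio engine as raw interleaved samples without repacking.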

The Voice Class

The voice class represents a single voice in the synthesizer. Among other things, it includes a collection of oscillators. Right now, all three oscillators are configured to output exactly the same thing. The Render function handles that output as well as plugging in the bit crusher effect. Note that I apply the effect to the output from all three oscillators, but in the real synth, this is decoupled so I could apply it to a single oscillator in a single voice.

Voice::Voice(IXAudio2* audioEngine, int stereoBufferSize) :

    _audioEngine(audioEngine),

    _stereoBufferSize(stereoBufferSize)

{

    _bufferData = new StereoSample[stereoBufferSize];



    for (int i = 0; i < MAX_OSCILLATORS; i++)

    {

        _oscillators.push_back(shared_ptr<Oscillator>(new Oscillator()));

    }



    // set up wave format using my good friend WAVEFORMATEX

    WAVEFORMATEX wfx;

    wfx.wBitsPerSample = SAMPLE_BITS;

    wfx.nAvgBytesPerSec = SAMPLE_RATE * SAMPLE_CHANNELS * SAMPLE_BITS / 8;

    wfx.nChannels = SAMPLE_CHANNELS;

    wfx.nBlockAlign = SAMPLE_CHANNELS * SAMPLE_BITS / 8;

    wfx.wFormatTag = WAVE_FORMAT_IEEE_FLOAT; // or could use WAVE_FORMAT_PCM

    wfx.nSamplesPerSec = SAMPLE_RATE;

    wfx.cbSize = 0;    // set to zero for PCM or IEEE float



    DX::ThrowIfFailed(_audioEngine->CreateSourceVoice(

        &_xavoice,

        (WAVEFORMATEX*)&wfx,

        0,

        XAUDIO2_DEFAULT_FREQ_RATIO,

        reinterpret_cast<IXAudio2VoiceCallback*>(&_voiceCallbackHandler),

        nullptr,

        nullptr));

}
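The two derived WAVEFORMATEX fields in the constructor above (nBlockAlign and nAvgBytesPerSec) are simple arithmetic on the other three. A minimal sketch of that arithmetic, without any Windows headers; the constant values and the DerivedFormat struct are assumptions for illustration, since the actual definitions of SAMPLE_RATE, SAMPLE_CHANNELS and SAMPLE_BITS are not shown here.

```cpp
#include <cstdint>

// Assumed values; the real constants are defined elsewhere in the project.
const int SAMPLE_RATE     = 44100;  // frames per second
const int SAMPLE_CHANNELS = 2;      // stereo
const int SAMPLE_BITS     = 32;     // 32-bit IEEE float samples

// Hypothetical stand-in for the WAVEFORMATEX fields derived from the others.
struct DerivedFormat
{
    std::uint16_t blockAlign;       // bytes per frame, all channels together
    std::uint32_t avgBytesPerSec;   // bytes per second of audio
};

inline DerivedFormat ComputeDerivedFormat(int sampleRate, int channels, int bitsPerSample)
{
    DerivedFormat f;
    f.blockAlign = static_cast<std::uint16_t>(channels * bitsPerSample / 8);
    f.avgBytesPerSec = static_cast<std::uint32_t>(sampleRate) * f.blockAlign;
    return f;
}
```

With the assumed values, one stereo float frame is 8 bytes, which matches sizeof(StereoSample), and one second of audio is 352800 bytes.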











void Voice::Render(long phase, float noteFrequency)

{

    XAUDIO2_BUFFER buffer;



    //int byteCount = sizeof(SAMPLE_t) * 2 * _stereoBufferSize;

    int byteCount =  sizeof(StereoSample) * _stereoBufferSize;



    // zero all buffer data

    memset((byte*)_bufferData, 0, byteCount);



    //_oscillators[0]->Render(phase, noteFrequency, _bufferData, _stereoBufferSize);

    vector<shared_ptr<Oscillator>>::const_iterator cii;

    for (cii = _oscillators.begin(); cii < _oscillators.end(); cii++)

    {

        (*cii)->Render(phase, noteFrequency, _bufferData, _stereoBufferSize);

    }



    // TEMP! Bit crunching to try out audio processing

    BitCruncher cruncher;

    cruncher.BitDepth = 24;

    cruncher.BitRate = 2048;



    cruncher.ProcessSampleBuffer(phase, noteFrequency, _bufferData, _stereoBufferSize);



    // TEMP Looping

    // the buffer will be looped infinitely

    buffer.AudioBytes = byteCount;

    buffer.PlayBegin = 0;

    buffer.PlayLength = 0;    // play entire buffer

    buffer.LoopBegin = 0;

    buffer.LoopLength = 0;    // loop entire buffer

    buffer.LoopCount = XAUDIO2_LOOP_INFINITE;

    buffer.pAudioData = (const BYTE *)_bufferData;

    buffer.pContext = NULL;

    buffer.Flags = 0;         // this is the value I left out in the previous post



    // wire up the buffer

    DX::ThrowIfFailed(_xavoice->SubmitSourceBuffer(&buffer));



    // start playing sound.

    DX::ThrowIfFailed(_xavoice->Start(0));

}

My C++ isn't great yet, but I'm learning. I've even learned the shared_ptr and vector templates :)
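The mixing step in Render (zero the buffer, then let every oscillator add its output) can also be written with a range-based for loop. A minimal sketch, assuming a stripped-down oscillator: ToyOscillator and MixOscillators are hypothetical stand-ins so the example runs without XAudio2, and the constant output level is only there to make the result easy to check.

```cpp
#include <cstddef>
#include <memory>
#include <vector>

struct StereoSample { float Left; float Right; };

// Hypothetical stand-in for Oscillator: adds a constant level to every frame.
struct ToyOscillator
{
    float level;
    explicit ToyOscillator(float l) : level(l) {}

    void Render(StereoSample* buffer, std::size_t frameCount) const
    {
        for (std::size_t i = 0; i < frameCount; ++i)
        {
            buffer[i].Left  += level;   // += so oscillators mix rather than overwrite
            buffer[i].Right += level;
        }
    }
};

// Start from silence, then let each oscillator add its output on top.
inline std::vector<StereoSample> MixOscillators(
    const std::vector<std::shared_ptr<ToyOscillator>>& oscillators,
    std::size_t frameCount)
{
    std::vector<StereoSample> buffer(frameCount, StereoSample{0.0f, 0.0f});
    for (const auto& osc : oscillators)
    {
        osc->Render(buffer.data(), buffer.size());
    }
    return buffer;
}
```

The oscillator list can be filled with std::make_shared<ToyOscillator>(0.25f), which avoids the explicit new. Three oscillators at 0.25 sum to 0.75 per channel, so the per-oscillator volume has to leave headroom if the mix is to stay inside -1.0 to +1.0.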

The Oscillators

The oscillators have a lot more to them, but the render functions are the core:

void Oscillator::RenderSine(long initialPhaseIncrement, float noteFrequency, StereoSample* buffer, long bufferCount)

{

    // fill the buffer

    for (int i = 0; i < bufferCount; i++)

    {

        float sample = (sin((i + initialPhaseIncrement) * 2 * PI * noteFrequency / SAMPLE_RATE));



        // left audio or mono

        buffer->Left += sample * this->_volume;



        // right audio

        buffer->Right += sample * this->_volume;



        buffer++;

    }

}

Volume is per-oscillator. Panning etc. is not implemented in this listing. I have a different render function for each type of waveform. They are switched by re-pointing a std::function member at the current render function. Thanks to everyone on Twitter last night (especially Jeremiah Morrill) for helping me sort out how to use the std::function type.

RenderFunction = std::bind<void>(

    &Oscillator::RenderSine,    // function pointer

    this,                        // implicit this

    std::placeholders::_1,        // initialPhaseIncrement

    std::placeholders::_2,        // noteFrequency

    std::placeholders::_3,        // buffer

    std::placeholders::_4);        // bufferCount

The entries in the Oscillator's class definition therefore look like this:

void RenderSine(long phaseIncrement, float noteFrequency, StereoSample* buffer, long bufferCount);

void RenderPulse(long phaseIncrement, float noteFrequency, StereoSample* buffer, long bufferCount);

void RenderTriangle(long phaseIncrement, float noteFrequency, StereoSample* buffer, long bufferCount);

void RenderSawtooth(long phaseIncrement, float noteFrequency, StereoSample* buffer, long bufferCount);

void RenderRamp(long phaseIncrement, float noteFrequency, StereoSample* buffer, long bufferCount);

void RenderMultiSaw(long phaseIncrement, float noteFrequency, StereoSample* buffer, long bufferCount);

void RenderNoise(long phaseIncrement, float noteFrequency, StereoSample* buffer, long bufferCount);



typedef std::function<void(long, float, StereoSample*, const long)> RenderFunction_t;

RenderFunction_t RenderFunction;

The magic part is the std::function. (Aside: searching for "std::anything" will get you a nice selection of STD testing ads in the search engine sidebar).
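To see the switching mechanism in isolation, here is a minimal sketch with two trivial render functions. SwitchableOscillator, its Waveform enum and SetWaveform are hypothetical names, and the lambdas are a swapped-in alternative to std::bind with placeholders; both produce a callable that can be stored in the same std::function.

```cpp
#include <functional>

struct StereoSample { float Left; float Right; };

// Hypothetical cut-down oscillator showing only the function-switching mechanism.
class SwitchableOscillator
{
public:
    typedef std::function<void(StereoSample*, long)> RenderFunction_t;

    enum Waveform { Silence, FullScale };

    SwitchableOscillator() { SetWaveform(Silence); }

    // Re-point RenderFunction at a different member function.
    // A lambda capturing `this` does the same job as std::bind with placeholders.
    void SetWaveform(Waveform w)
    {
        if (w == FullScale)
            RenderFunction = [this](StereoSample* b, long n) { RenderFullScale(b, n); };
        else
            RenderFunction = [this](StereoSample* b, long n) { RenderSilence(b, n); };
    }

    // Callers never need to know which waveform is active.
    void Render(StereoSample* buffer, long count) { RenderFunction(buffer, count); }

private:
    RenderFunction_t RenderFunction;

    void RenderSilence(StereoSample* buffer, long count)
    {
        for (long i = 0; i < count; i++) { buffer[i].Left = 0.0f; buffer[i].Right = 0.0f; }
    }

    void RenderFullScale(StereoSample* buffer, long count)
    {
        for (long i = 0; i < count; i++) { buffer[i].Left = 1.0f; buffer[i].Right = 1.0f; }
    }
};
```

One thing to watch with either approach: the stored callable holds the object's this pointer, so a copied oscillator would still call back into the original. Holding oscillators through shared_ptr, as the Voice class does, sidesteps that.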

Now, just because it sounds so good with the bit crusher, here's my noise implementation:

void Oscillator::RenderNoise(long initialPhaseIncrement, float noteFrequency, StereoSample* buffer, long bufferCount)

{

    std::uniform_real_distribution<float> dist(-1.0f * _volume, 1.0f * _volume);



    for (int i = 0; i < bufferCount; i++)

    {

        float sample = dist(_randomGenerator);



        // accumulate (+=) like RenderSine, so output from other oscillators is kept

        buffer->Left += sample;

        buffer->Right += sample;

   

        buffer++;

    }

}

For that to work, you'll need to initialize the random number generator elsewhere in the code. I do it in the constructor:

Oscillator::Oscillator(void)

    : _randomGenerator(std::time(0))

{

    _volume = DEFAULT_OSCILLATOR_VOLUME;



    _pan = 0.0f;



    RenderFunction = std::bind<void>(

        &Oscillator::RenderSine,    // function pointer change to render function you want

        this,                        // implicit this

        std::placeholders::_1,        // initialPhaseIncrement

        std::placeholders::_2,        // noteFrequency

        std::placeholders::_3,        // buffer

        std::placeholders::_4);        // bufferCount

}
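One caveat with seeding from std::time(0): oscillators constructed within the same second get the same seed and therefore the same noise sequence. A minimal sketch of an alternative, assuming the generator is a std::mt19937 (the type of _randomGenerator is not shown above) and using a hypothetical MakeNoise helper:

```cpp
#include <cassert>
#include <cstddef>
#include <random>
#include <vector>

// Hypothetical helper: fills a vector with white noise scaled to the given volume.
// The generator is passed in so each oscillator can own and seed its own.
inline std::vector<float> MakeNoise(std::size_t count, float volume, std::mt19937& generator)
{
    assert(volume > 0.0f);   // the distribution requires lower bound < upper bound
    std::uniform_real_distribution<float> dist(-volume, volume);

    std::vector<float> samples(count);
    for (auto& s : samples)
    {
        s = dist(generator);
    }
    return samples;
}
```

A generator seeded with std::mt19937 gen(std::random_device{}()); gets a different seed per instance on typical platforms, while a fixed seed such as std::mt19937 gen(12345u); gives repeatable noise, which is handy when comparing scope traces.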

The Bit Crusher/Cruncher

This simple class is the real point of this post. Here's the class's header file. I call it "BitCruncher" because it does bit crushing plus sample rate reduction.

#pragma once



class BitCruncher

{

public:

    int BitRate;        // for reducing the sample rate

    int BitDepth;        // for quantizing the sample values, for example, to make them 8 bit



    BitCruncher(void);

    ~BitCruncher(void);



    void ProcessSampleBuffer(long initialPhaseIncrement, float noteFrequency, StereoSample* buffer, long bufferCount);

};

And here's the implementation. Details after the listing.

#include "pch.h"

#include "BitCruncher.h"

#include <cmath>



BitCruncher::BitCruncher(void) :

    BitRate(4096),    // initializers listed in member declaration order

    BitDepth(4)

{

}





BitCruncher::~BitCruncher(void)

{

}





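// round half away from zero; the argument is parenthesized so expressions can be passed safely
#define ROUND(f) ((float)(((f) > 0.0) ? floor((f) + 0.5) : ceil((f) - 0.5)))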



void BitCruncher::ProcessSampleBuffer(long initialPhaseIncrement, float noteFrequency, StereoSample* buffer, long bufferCount)

{

    int max = pow(2, BitDepth) - 1;

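    // hold length in samples; clamp to 1 so a BitRate of 0 or above SAMPLE_RATE cannot stall the loop below
    int step = (BitRate > 0 && BitRate < SAMPLE_RATE) ? SAMPLE_RATE / BitRate : 1;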



    int i = 0;

    while (i < bufferCount)

    {

        float leftFirstSample = ROUND((buffer->Left + 1.0) * max) / max - 1.0;

        float rightFirstSample = ROUND((buffer->Right + 1.0) * max) / max - 1.0;



        // this loop causes us to simulate a down-sample to a lower sample rate

        for (int j = 0; j < step && i < bufferCount; j++)

        {

            buffer->Left = leftFirstSample;

            buffer->Right = rightFirstSample;



            // move on

            buffer++;

            i++;

        }

    }

}

For reducing the sample rate (the BitRate setting), the current algorithm is like a sample and hold. It takes the first sample and holds it for however many steps it needs to. In fact, on my modular synth, if I patch an oscillator into the sample and hold unit, I can get pretty much the same result.
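The hold step on its own can be sketched on a mono signal as follows. This is a minimal illustration rather than the class's actual code: SampleAndHold is a hypothetical name, and it works on a std::vector<float> instead of a StereoSample buffer.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Sample and hold: keep the first sample of each group of `step` samples and
// repeat it across the group. A step of 1 leaves the input unchanged.
inline void SampleAndHold(std::vector<float>& samples, std::size_t step)
{
    assert(step >= 1);   // a step of 0 would never advance
    for (std::size_t start = 0; start < samples.size(); start += step)
    {
        const float held = samples[start];
        for (std::size_t j = start; j < start + step && j < samples.size(); ++j)
        {
            samples[j] = held;
        }
    }
}
```

With a step of 3, the input 0, 1, 2, 3, 4, 5, 6, 7 becomes 0, 0, 0, 3, 3, 3, 6, 6. The last group is simply shorter when the buffer length is not a multiple of the step.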

Note that this happens per-oscillator. Each voice has multiple oscillators, each of which may or may not be crushed, so I don't simply submit a low sample rate buffer to XAudio2 and let it do the expansion. I'm also working on a couple of other algorithms in addition to the current one.

The bit depth reduction is handled by the round statements. Floating point samples are in the range of -1.0 to +1.0. I first calculate the maximum number for the specified bit depth (first statement with pow). Then, I add 1.0 to the sample to get it into the range of 0..2.0. I then multiply by "max" and round, which gives a whole number in the range 0..2*max. Then, dividing by "max", I get back a now rounded value in the range of 0..2.0, with at most 2*max + 1 unique possible values (roughly one bit more resolution than BitDepth suggests; scaling by max / 2 instead would give exactly 2^BitDepth levels). Finally, I subtract 1.0 to get back in the range of -1.0 to +1.0.
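The quantization step can be sketched as a standalone function. This is a variant rather than the listing's exact formula: it halves the 0..2.0 range before scaling, so the output has exactly 2^bitDepth levels. Both function names are hypothetical, and the 24-bit upper limit is an assumption chosen to stay within float precision.

```cpp
#include <cassert>
#include <cmath>

// Quantize a sample in [-1, +1] to exactly 2^bitDepth evenly spaced levels.
inline float QuantizeToBitDepth(float sample, int bitDepth)
{
    assert(bitDepth >= 1 && bitDepth <= 24);
    const double max = static_cast<double>((1L << bitDepth) - 1);       // highest level index
    const double index = std::floor((sample + 1.0) * 0.5 * max + 0.5);  // rounded, 0..max
    return static_cast<float>(index / max * 2.0 - 1.0);                 // back to -1..+1
}

// Number of distinct levels produced by round((x + 1) * max) / max - 1, the
// formula in the listing above: the rounded value spans 0..2*max.
inline long LevelsInListingFormula(int bitDepth)
{
    return 2 * ((1L << bitDepth) - 1) + 1;
}
```

At a bit depth of 1 this variant produces only -1.0 and +1.0, which turns any input into a square wave. For comparison, the formula in the listing yields 31 levels at BitDepth 4, where a strict 4-bit quantizer would give 16.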

UPDATE: @c64_gio on Twitter sent me some really great tips for how I can improve this code. I especially like the iteration approach and use of vectors instead of raw bytes. You can see some of his comments here: http://pastebin.com/Q8HqHRDd.

This algorithm appears to work, so let's take a look at the results.

Results

I hooked up my Rigol to the main output of my sound card (a MOTU 828mk3) to take a look at the generated waveforms.

What follows are the configurations set on the BitCruncher class and a photo of the resulting wave forms.

BitDepth 24, BitRate 44100. This is about as good as it gets in terms of the scope's resolution.

BitDepth 24, BitRate 8192 (very slight stepping)

(note that the frequency is warbling a bit between 110 and 220. I think my Rigol is confused, possibly due to me not completing cycles in the buffer)

BitDepth 24, BitRate 4096 (more stepping)

BitDepth 24, BitRate 2048 (a whole lot of stepping). This sounds like a sine wave with whistling harmonics over it. It reminds me (only louder) of the overtones from some 8 bit synth chips from 80's computers.

I believe this one was BitDepth 4 and BitRate 44100. Notice the flats at the peaks. This is due to rounding and contributes a square-wave overtone to the sound.

BitDepth 2, BitRate 2048 (this sounds much more like a square wave). This one had a fair bit of jitter to it, presumably because 44100 is not evenly divisible by 2048, and for the final output, I'm simply looping a buffer of 44100 * 5 seconds.
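Because the hold length is an integer, the rate that actually comes out is not quite the requested BitRate. A minimal sketch of that arithmetic, assuming a 44100 Hz output rate; HoldLength and EffectiveRate are hypothetical helper names.

```cpp
#include <cassert>

// Hold length used by a sample-and-hold loop: integer division truncates,
// and the result is clamped so it is never less than one sample.
inline int HoldLength(int sampleRate, int targetRate)
{
    assert(sampleRate > 0 && targetRate > 0);
    const int step = sampleRate / targetRate;
    return step < 1 ? 1 : step;
}

// Rate actually heard once every sample is held for HoldLength frames.
inline int EffectiveRate(int sampleRate, int targetRate)
{
    return sampleRate / HoldLength(sampleRate, targetRate);
}
```

At 44100 Hz, a requested 2048 gives a hold length of 21 and an effective rate of 2100; 4096 gives 10 and 4410; 8192 gives 5 and 8820. A fractional hold length would be needed to hit the requested rates exactly.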

The sample rate reduction is especially interesting when applied to white noise. You totally get the Atari/C64 vibe from it.

As I do other interesting things with this synthesizer project, which will hopefully be in the Windows Store when I complete it, I'll continue to post about them here.

No downloadable source code for this project.

erix says:

Monday, January 14, 2013 at 5:10:53 AM

Thank you for sharing this.

Jeremiah Morrill says:

Monday, January 14, 2013 at 2:53:31 PM

Cool post man! Forgot about bind<>, which is much cleaner than the alternative of static methods!

I was actually looking for algo's to do this very same thing for a Win8 app I am working on, but for MediaFoundation. In my case MF was nice enough to (internally) down/up sample for me and allowed me to be lazy. :)

If you get into doing any channel mixing, I'd be interested in algos to help with audio artifacts, like clipping *nudge, nudge* ;)

Kenno says:

Tuesday, April 30, 2013 at 8:06:21 AM

Nice work Pete!

I think those ultra modern and sleek C64 machines weren't as classy as their predecessor, the VIC-20