Creating sound from raw bits is, believe it or not, slightly more involved than creating video from raw bits in Silverlight 3.
Getting samples to Silverlight is an interesting task. Your code will respond to Silverlight’s request for a sample with a buffer of samples of a size you determine. Rather than a push model, like generating video, it is a pull model.
Let’s start with the thing that defines our sample stream: the WaveFormatEx structure.
WaveFormatEx
In order to create sound, you’ll need to be able to populate the WaveFormatEx structure and serialize it out to a hex-encoded string. Luckily, there’s code out there to do that already (see attached project below).
Once you have that class in your project, you’ll need to populate the members:
_waveFormat = new WaveFormatEx();
_waveFormat.BitsPerSample = 16;
_waveFormat.AvgBytesPerSec = (int)ByteRate;
_waveFormat.Channels = ChannelCount;
_waveFormat.BlockAlign = ChannelCount * (BitsPerSample / 8);
_waveFormat.ext = null; // no extra format bytes for PCM
_waveFormat.FormatTag = WaveFormatEx.FormatPCM;
_waveFormat.SamplesPerSec = SampleRate;
_waveFormat.Size = 0; // must be zero
_waveFormat.ValidateWaveFormat();
BitsPerSample | This is going to be 16, for 16 bit (two byte) samples |
AvgBytesPerSec | SampleRate * ChannelCount * BitsPerSample / 8 |
Channels | The number of channels you have. Typically this is 1 for mono, 2 for stereo |
BlockAlign | Channels * (BitsPerSample / 8) |
ext | Extra format-specific bytes that follow the structure. PCM has none, so it needs to be null |
FormatTag | PCM or IEEE float format. I’ve only used PCM. In fact, in the WaveFormatEx example, if you use anything other than PCM, it will throw an error when you try to validate the data |
SamplesPerSec | The number of samples per second. Usually this is something like 44100 for CD quality. |
Size | must be 0 |
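To make the table concrete, here is a minimal sketch of how the derived fields and the hex string fit together. `PcmWaveFormat` is a hypothetical stand-in, not the WaveFormatEx class from the attached project. It assumes the string is simply each field of the Win32 WAVEFORMATEX layout written as little-endian hex, which may differ in detail from what the real helper does.

```csharp
using System;
using System.Text;

// Hypothetical stand-in for the WaveFormatEx helper (PCM only).
// Field order follows the Win32 WAVEFORMATEX layout.
public sealed class PcmWaveFormat
{
    public const short FormatPcm = 1; // WAVE_FORMAT_PCM

    private readonly short _channels;
    private readonly int _samplesPerSec;
    private readonly short _bitsPerSample;

    public PcmWaveFormat(short channels, int samplesPerSec, short bitsPerSample)
    {
        _channels = channels;
        _samplesPerSec = samplesPerSec;
        _bitsPerSample = bitsPerSample;
    }

    // Bytes in one frame (one sample for every channel).
    public short BlockAlign
    {
        get { return (short)(_channels * (_bitsPerSample / 8)); }
    }

    // Bytes consumed per second of audio.
    public int AvgBytesPerSec
    {
        get { return _samplesPerSec * BlockAlign; }
    }

    // Writes every field as little-endian hex, in structure order.
    public string ToHexString()
    {
        StringBuilder sb = new StringBuilder(36);
        AppendLittleEndian(sb, FormatPcm, 2);      // FormatTag
        AppendLittleEndian(sb, _channels, 2);      // Channels
        AppendLittleEndian(sb, _samplesPerSec, 4); // SamplesPerSec
        AppendLittleEndian(sb, AvgBytesPerSec, 4); // AvgBytesPerSec
        AppendLittleEndian(sb, BlockAlign, 2);     // BlockAlign
        AppendLittleEndian(sb, _bitsPerSample, 2); // BitsPerSample
        AppendLittleEndian(sb, 0, 2);              // Size (no extra bytes)
        return sb.ToString();
    }

    // Appends the lowest byteCount bytes of value, least significant first.
    private static void AppendLittleEndian(StringBuilder sb, long value, int byteCount)
    {
        for (int i = 0; i < byteCount; i++)
        {
            sb.Append(((value >> (8 * i)) & 0xFF).ToString("X2"));
        }
    }
}
```

For 16-bit stereo at 44100 Hz this gives a BlockAlign of 4 and an AvgBytesPerSec of 176400.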
Describing your Stream
The next thing you need to do is set up a few dictionaries full of options for the stream. You typically do this in the OpenMediaAsync method:
_startPosition = _currentPosition = 0;
// Init
Dictionary<MediaStreamAttributeKeys, string> streamAttributes =
new Dictionary<MediaStreamAttributeKeys, string>();
Dictionary<MediaSourceAttributesKeys, string> sourceAttributes =
new Dictionary<MediaSourceAttributesKeys, string>();
List<MediaStreamDescription> availableStreams =
new List<MediaStreamDescription>();
// Stream Description and WaveFormatEx
streamAttributes[MediaStreamAttributeKeys.CodecPrivateData] =
_waveFormat.ToHexString(); // wfx
MediaStreamDescription msd =
new MediaStreamDescription(MediaStreamType.Audio,
streamAttributes);
_audioDesc = msd;
// next, add the description so that Silverlight will
// actually request samples for it
availableStreams.Add(_audioDesc);
// Tell Silverlight we have an endless stream (a duration of zero)
sourceAttributes[MediaSourceAttributesKeys.Duration] =
TimeSpan.FromMinutes(0).Ticks.ToString(
CultureInfo.InvariantCulture);
// we don't support seeking on our stream
sourceAttributes[MediaSourceAttributesKeys.CanSeek] =
false.ToString();
// tell Silverlight we're done opening our media
ReportOpenMediaCompleted(sourceAttributes, availableStreams);
Reporting Samples
Next, we need to handle sample requests. In this example, I’m going to return a stereo noise sample, generated by creating random samples at each sample point. Since we have two channels, the effect will be in stereo. I do this in GetSampleAsync:
int numSamples = ChannelCount * 256;
int bufferByteCount = BitsPerSample / 8 * numSamples;
// fill the stream with noise
for (int i = 0; i < numSamples; i++)
{
// Random.Next excludes its upper bound, so add 1 to reach short.MaxValue
short sample = (short)_random.Next(
short.MinValue, short.MaxValue + 1);
_stream.Write(BitConverter.GetBytes(sample),
0,
sizeof(short));
}
// Send out the next sample
MediaStreamSample msSamp = new MediaStreamSample(
_audioDesc,
_stream,
_currentPosition,
bufferByteCount,
_currentTimeStamp,
_emptySampleDict);
// Move our timestamp and position forward
_currentTimeStamp += _waveFormat.AudioDurationFromBufferSize(
(uint)bufferByteCount);
_currentPosition += bufferByteCount;
ReportGetSampleCompleted(msSamp);
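The position and timestamp bookkeeping at the end of that listing can be sketched on its own, without any Silverlight types. `SampleCursor` below is a hypothetical helper, not part of Silverlight or the WaveFormatEx class, and it assumes timestamps are expressed in 100-nanosecond ticks (the same unit as TimeSpan.Ticks). It derives the timestamp from the running byte total instead of adding a per-buffer duration, which is one way to avoid accumulating rounding error if the per-buffer duration is computed with integer division.

```csharp
using System;

// Hypothetical helper that tracks how much audio has been handed out.
public sealed class SampleCursor
{
    private const long TicksPerSecond = 10000000; // 100 ns units

    private readonly int _avgBytesPerSec;
    private long _bytesDelivered;

    public SampleCursor(int avgBytesPerSec)
    {
        if (avgBytesPerSec <= 0)
        {
            throw new ArgumentOutOfRangeException("avgBytesPerSec");
        }
        _avgBytesPerSec = avgBytesPerSec;
    }

    // Byte offset of the next buffer within the stream.
    public long Position
    {
        get { return _bytesDelivered; }
    }

    // Timestamp of the next buffer, computed from the total so far
    // so that truncation never adds up from call to call.
    public long TimeStamp
    {
        get { return _bytesDelivered * TicksPerSecond / _avgBytesPerSec; }
    }

    // Call once per buffer, after the sample has been reported.
    public void Advance(int bufferByteCount)
    {
        _bytesDelivered += bufferByteCount;
    }
}
```

With the constants used in this post (176400 bytes per second, 1024 bytes per buffer), each call moves the timestamp forward by a little under 5.8 ms.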
The number of bytes you buffer will depend on what you can get away with. Ideally, you want a buffer equal to only one sample per channel for each call. In reality, you can’t get to that even with dedicated professional audio gear and on-the-metal sound generation. So experiment with some buffer sizes, and keep in mind that the more work you do in code, the larger your buffer will likely need to be. This is because you’ll likely be filling an internal sound buffer on a background thread while Silverlight pulls from that buffer on its own thread. You want to make sure Silverlight never gets ahead of you, but also that you don’t get more than about 10ms ahead of the actual audio output (10ms is roughly the smallest delay a human ear can typically discern).
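When experimenting with buffer sizes, it helps to convert between frames, bytes and milliseconds. The sketch below is plain arithmetic with hypothetical names (`BufferMath` is not part of any library). A "frame" here means one sample for every channel.

```csharp
using System;

// Hypothetical helper for reasoning about buffer sizes.
public static class BufferMath
{
    // How long a buffer of the given frame count plays for, in milliseconds.
    public static double BufferMilliseconds(int framesPerBuffer, int sampleRate)
    {
        return framesPerBuffer * 1000.0 / sampleRate;
    }

    // How many bytes that buffer occupies.
    public static int BufferBytes(int framesPerBuffer, int channels, int bitsPerSample)
    {
        return framesPerBuffer * channels * (bitsPerSample / 8);
    }
}
```

The 256-frame buffer used in GetSampleAsync above works out to 1024 bytes and about 5.8 ms of 16-bit stereo audio at 44100 Hz. A 441-frame buffer at the same rate is 10 ms.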
[detour]
FWIW, this is the sound card I use for my pro-audio (click to see the specs):
And for things other than code projects like what I’m doing here, this is my setup:
http://www.flickr.com/photos/psychlist1972/3313289838/
I’ve been playing with synthesizers since I was a teenager. My first real one was a Roland HS-60 (a Juno 106 in disguise), followed by an Alpha Juno and a Korg DW-6000. During the late 80s and early 90s in high school and college, I worked at a music store, so I was also able to play around with lots of cool Roland and Korg synthesizers, plus some fun old analog beaters that often came in on trade.
[/detour]
Note that we use some helper functions from WaveFormatEx in this call in order to set the time stamp for this sample set. Note also that I use the built-in BitConverter to get the two bytes from the 16 bit sample. BitConverter.GetBytes returns an array sized to contain the bytes of the value you pass in. Finally, notice the _emptySampleDict. That is, as it is named, an empty dictionary of MediaSampleAttributeKeys/strings:
// you only need sample attributes for video
private Dictionary<MediaSampleAttributeKeys, string> _emptySampleDict =
new Dictionary<MediaSampleAttributeKeys, string>();
To round it out, here are the other private variables in this example:
private WaveFormatEx _waveFormat;
private MediaStreamDescription _audioDesc;
private long _currentPosition;
private long _startPosition;
private long _currentTimeStamp;
private const int SampleRate = 44100;
private const int ChannelCount = 2;
private const int BitsPerSample = 16;
private const int ByteRate =
SampleRate * ChannelCount * BitsPerSample / 8;
private MemoryStream _stream;
private Random _random = new Random();
Other Functions
There are other functions you need to implement, even if you just throw an error or report them completed:
protected override void SeekAsync(long seekToTime)
{
ReportSeekCompleted(seekToTime);
}
protected override void SwitchMediaStreamAsync(
MediaStreamDescription mediaStreamDescription)
{
throw new NotImplementedException();
}
protected override void CloseMedia()
{
// Close the stream
_startPosition = _currentPosition = 0;
_audioDesc = null;
}
protected override void GetDiagnosticAsync(
MediaStreamSourceDiagnosticKind diagnosticKind)
{
throw new NotImplementedException();
}
Wiring up in Xaml
The next step is to wire our MediaStreamSource to a MediaElement in Silverlight. Declare the MediaElement in Xaml, then set its source from the code-behind:
<MediaElement x:Name="TestMediaElement" AutoPlay="True" />
TestMediaElement.SetSource(new MyMediaStreamSource());
One word of caution. The current beta bits have a delay from the time a first sample is requested until you first hear audio. No amount of configuration on your part is going to change that, so don’t bother playing with buffering time settings. The team knows this is an issue, so I hope to see a solution at RTW.
More Complex Uses
You can certainly take this much further. For example, I built a basic synthesizer using MediaStreamSource. The synthesizer has multiple oscillators, each of which generates samples which are all mixed together and then output as a single two-channel stereo stream. You can try out an unstable/buggy build of the synthesizer here:
http://www.irritatedvowel.com/Silverlight/sl3/Synth/Default.html [link no longer valid as it was built using a beta]
(try the arpeggiator on the top right of the keyboard)
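The mixing step described above can be sketched without any Silverlight types. This is not the code from my synthesizer: `Mixer` and its parameters are made up for illustration, the oscillators are plain sine waves, and the same mixed value is written to both channels. Scaling by the oscillator count and clamping before the conversion to 16 bits is one simple way to keep the sum from overflowing.

```csharp
using System;

// Hypothetical mixer: sums sine oscillators into interleaved
// 16-bit little-endian stereo PCM.
public static class Mixer
{
    public static byte[] RenderStereo(
        double[] frequenciesHz, int sampleRate, int frameCount, long startFrame)
    {
        const int BytesPerFrame = 2 * sizeof(short); // two channels
        byte[] buffer = new byte[frameCount * BytesPerFrame];

        // Scale so that the sum of all oscillators stays within [-1, 1].
        double gain = frequenciesHz.Length == 0 ? 0.0 : 1.0 / frequenciesHz.Length;

        for (int frame = 0; frame < frameCount; frame++)
        {
            double t = (startFrame + frame) / (double)sampleRate;

            double mixed = 0.0;
            foreach (double frequency in frequenciesHz)
            {
                mixed += Math.Sin(2.0 * Math.PI * frequency * t);
            }
            mixed *= gain;

            // Clamp as a safety net before converting to 16 bits.
            if (mixed > 1.0) mixed = 1.0;
            else if (mixed < -1.0) mixed = -1.0;

            short sample = (short)(mixed * short.MaxValue);
            byte low = (byte)(sample & 0xFF);
            byte high = (byte)((sample >> 8) & 0xFF);

            int offset = frame * BytesPerFrame;
            buffer[offset] = low;      // left channel
            buffer[offset + 1] = high;
            buffer[offset + 2] = low;  // right channel (same signal)
            buffer[offset + 3] = high;
        }
        return buffer;
    }
}
```

Passing the running frame count as `startFrame` on each call keeps the oscillators phase-continuous from one buffer to the next.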
There are a number of bugs in my code for that synthesizer including issues with distortion and getting all out of whack, so if you hit an issue, first lower the volume. If that doesn’t take care of it, refresh the browser and try again. Here’s a screenshot of the synthesizer in action:
Source
The source code for the example in this post may be downloaded here.
Enjoy!