23 Mar 2009

Creating Sound using MediaStreamSource in Silverlight 3 Beta

     

Creating sound from raw bits is, believe it or not, slightly more involved than creating video from raw bits in Silverlight 3.

Getting samples to Silverlight is an interesting task. Silverlight requests a sample, and your code responds with a buffer of samples of a size you determine. Unlike generating video, which is a push model, this is a pull model.

Let’s start with the thing that defines our sample stream: the WaveFormatEx structure.

WaveFormatEx

In order to create sound, you’ll need to be able to populate the WaveFormatEx structure and serialize it out to a hex-encoded string. Luckily, there’s code out there to do that already (see the attached project below).

Once you have that class in your project, you’ll need to populate the members:

_waveFormat = new WaveFormatEx();
_waveFormat.BitsPerSample = 16;
_waveFormat.AvgBytesPerSec = (int)ByteRate;
_waveFormat.Channels = ChannelCount;
_waveFormat.BlockAlign = ChannelCount * (BitsPerSample / 8);
_waveFormat.ext = null; // must be null; plain PCM carries no extra format data
_waveFormat.FormatTag = WaveFormatEx.FormatPCM;
_waveFormat.SamplesPerSec = SampleRate;
_waveFormat.Size = 0; // must be zero

_waveFormat.ValidateWaveFormat();
BitsPerSample: This is going to be 16, for 16-bit (two-byte) samples.
AvgBytesPerSec: SampleRate * ChannelCount * BitsPerSample / 8
Channels: The number of channels you have. Typically this is 1 for mono, 2 for stereo.
BlockAlign: Channels * (BitsPerSample / 8)
ext: I’m not sure what this is (most likely the extra format bytes whose length Size reports), but it needs to be null.
FormatTag: PCM or IEEE format. I’ve only used PCM. In fact, in the WaveFormatEx example, if you use anything other than PCM, it will throw an error when you try to validate the data.
SamplesPerSec: The number of samples per second. Usually this is something like 44100 for CD quality.
Size: Must be 0.
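To make the serialization step less of a black box, here is a minimal sketch of what turning these fields into a hex string can look like. It is not the helper class from the attached project: WaveFormatSketch and its methods are hypothetical names, and the layout is an assumption based on the Win32 WAVEFORMATEX structure (each field written in order, least significant byte first, two hex digits per byte).

```csharp
using System;
using System.Text;

// Hypothetical stand-in for the WaveFormatEx helper, PCM only.
// Assumes the Win32 WAVEFORMATEX field order, all fields little-endian.
public static class WaveFormatSketch
{
    public const short FormatPcm = 1; // WAVE_FORMAT_PCM

    public static string ToHexString(short channels, int samplesPerSec, short bitsPerSample)
    {
        // Derived fields, using the same formulas as the list above
        short blockAlign = (short)(channels * (bitsPerSample / 8));
        int avgBytesPerSec = samplesPerSec * blockAlign;

        StringBuilder sb = new StringBuilder();
        AppendLittleEndian(sb, FormatPcm, 2);      // FormatTag
        AppendLittleEndian(sb, channels, 2);       // Channels
        AppendLittleEndian(sb, samplesPerSec, 4);  // SamplesPerSec
        AppendLittleEndian(sb, avgBytesPerSec, 4); // AvgBytesPerSec
        AppendLittleEndian(sb, blockAlign, 2);     // BlockAlign
        AppendLittleEndian(sb, bitsPerSample, 2);  // BitsPerSample
        AppendLittleEndian(sb, 0, 2);              // Size: no extra bytes for PCM
        return sb.ToString();
    }

    // Appends the low byteCount bytes of value, least significant byte first,
    // as two uppercase hex digits per byte.
    private static void AppendLittleEndian(StringBuilder sb, int value, int byteCount)
    {
        for (int i = 0; i < byteCount; i++)
        {
            sb.Append(((value >> (8 * i)) & 0xFF).ToString("X2"));
        }
    }
}
```

For the format used in this post, WaveFormatSketch.ToHexString(2, 44100, 16) returns 0100020044AC000010B10200040010000000.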

Describing your Stream

The next thing you need to do is set up a few dictionaries full of options for the stream. You typically do this in the OpenMediaAsync method:

_startPosition = _currentPosition = 0;

// Init
Dictionary<MediaStreamAttributeKeys, string> streamAttributes = 
    new Dictionary<MediaStreamAttributeKeys, string>();
Dictionary<MediaSourceAttributesKeys, string> sourceAttributes = 
    new Dictionary<MediaSourceAttributesKeys, string>();
List<MediaStreamDescription> availableStreams = 
    new List<MediaStreamDescription>();

// Stream Description and WaveFormatEx
streamAttributes[MediaStreamAttributeKeys.CodecPrivateData] = 
    _waveFormat.ToHexString(); // wfx
MediaStreamDescription msd = 
    new MediaStreamDescription(MediaStreamType.Audio, 
                                streamAttributes);
_audioDesc = msd;

// next, add the description so that Silverlight will
// actually request samples for it
availableStreams.Add(_audioDesc);

// Tell Silverlight we have an endless stream (a duration of zero)
sourceAttributes[MediaSourceAttributesKeys.Duration] = 
    TimeSpan.FromMinutes(0).Ticks.ToString(
                        CultureInfo.InvariantCulture);

// we don't support seeking on our stream
sourceAttributes[MediaSourceAttributesKeys.CanSeek] = 
    false.ToString();

// tell Silverlight we're done opening our media
ReportOpenMediaCompleted(sourceAttributes, availableStreams);

Reporting Samples

Next, we need to handle sample requests. In this example, I’m going to return stereo noise, generated by picking a random value at each sample point. Since each of the two channels gets its own random values, the effect will be in stereo. I do this in GetSampleAsync:

int numSamples = ChannelCount * 256;
int bufferByteCount = BitsPerSample / 8 * numSamples;

// fill the stream with noise
for (int i = 0; i < numSamples; i++)
{
    // Random.Next excludes its upper bound, so add 1 to reach short.MaxValue
    short sample = (short)_random.Next(
        short.MinValue, short.MaxValue + 1);

    _stream.Write(BitConverter.GetBytes(sample), 
                  0, 
                  sizeof(short));
}


// Send out the next sample
MediaStreamSample msSamp = new MediaStreamSample(
    _audioDesc,
    _stream,
    _currentPosition,
    bufferByteCount,
    _currentTimeStamp,
    _emptySampleDict);

// Move our timestamp and position forward
_currentTimeStamp += _waveFormat.AudioDurationFromBufferSize(
                        (uint)bufferByteCount);
_currentPosition += bufferByteCount;

ReportGetSampleCompleted(msSamp);
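The time stamp arithmetic is worth spelling out. The sketch below shows what a helper like AudioDurationFromBufferSize is assumed to compute: the playing time of a buffer in 100-nanosecond units, which is the unit Silverlight uses for sample time stamps. AudioTiming and DurationFromBufferSize are hypothetical names, not the attached project's code.

```csharp
using System;

// Sketch only: the duration math assumed to sit behind a helper like
// AudioDurationFromBufferSize. Names here are hypothetical.
public static class AudioTiming
{
    // Number of 100-nanosecond units in one second (same unit as TimeSpan.Ticks)
    private const long TicksPerSecond = 10000000;

    // Playing time of bufferByteCount bytes of audio, in 100 ns units,
    // truncated toward zero.
    public static long DurationFromBufferSize(long bufferByteCount, int avgBytesPerSec)
    {
        if (avgBytesPerSec <= 0)
        {
            throw new ArgumentOutOfRangeException("avgBytesPerSec");
        }
        return bufferByteCount * TicksPerSecond / avgBytesPerSec;
    }
}
```

With the 1024-byte buffer and 176400 bytes per second used here, this gives 58049 units, or about 5.8ms per buffer. Because the division truncates, adding up per-buffer durations drifts slightly: two buffers accumulate to 116098, while computing from the running byte position (2048) gives 116099. Deriving the time stamp from the total bytes written so far avoids that drift.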

The number of bytes you buffer will depend on what you can get away with. Ideally, each call would return a buffer of just one sample per channel. In reality, you can’t get to that even with dedicated professional audio gear and on-the-metal sound generation. So experiment with some buffer sizes, and keep in mind that the more work you do in code, the larger your buffer will likely need to be. This is because you’ll likely be filling an internal sound buffer on a background thread while Silverlight pulls from that buffer on its own thread. You want to make sure Silverlight never gets ahead of you, but also that you don’t get more than about 10ms ahead of the actual audio output (roughly 10ms is a commonly cited figure for the smallest delay a human ear can typically discern).
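When experimenting with buffer sizes, it helps to convert between frames (one sample per channel), bytes, and milliseconds. This is a small sketch for that arithmetic only; BufferLatency and its methods are hypothetical names, and it says nothing about the extra delay Silverlight itself adds.

```csharp
using System;

// Sketch: conversions between buffer size and playing time.
// A "frame" here means one sample for every channel.
public static class BufferLatency
{
    // How long a buffer of the given number of frames plays, in milliseconds
    public static double MillisecondsForFrames(int frames, int sampleRate)
    {
        return frames * 1000.0 / sampleRate;
    }

    // The largest whole number of frames that fits in the given time budget
    public static int FramesForMilliseconds(double milliseconds, int sampleRate)
    {
        return (int)Math.Floor(milliseconds * sampleRate / 1000.0);
    }

    // Size in bytes of a buffer holding the given number of frames
    public static int BytesForFrames(int frames, int channels, int bitsPerSample)
    {
        return frames * channels * (bitsPerSample / 8);
    }
}
```

At 44100 samples per second, the 256-frame buffer from the example plays for about 5.8ms, and a 10ms budget works out to 441 frames, which is 1764 bytes of 16-bit stereo.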

[detour]

FWIW, this is the sound card I use for my pro-audio (click to see the specs):

EMU 1616 PCI Digital Audio System

And for things other than code projects like what I’m doing here, this is my setup:

http://www.flickr.com/photos/psychlist1972/3313289838/

Pete's EX-5, MT32 and SH-32

I’ve been playing with synthesizers since I was a teenager (my first real one was a Roland HS-60 (a Juno 106 in disguise), followed by an Alpha Juno and a Korg DW-6000). During the late 80s and early 90s in high school and college, I worked at a music store, so I was also able to play around with lots of cool Roland and Korg synthesizers, plus some fun old analog beaters that often came in on trade.

[/detour]

Note that we use some helper functions from WaveFormatEx in this call in order to set the time stamp for this sample set. Note also that I use the built-in BitConverter to get the two bytes from the 16-bit sample. BitConverter.GetBytes returns an array sized to contain the bytes of the value you pass in. Finally, notice the _emptySampleDict. That is, as it is named, an empty dictionary of MediaSampleAttributeKeys/strings:

// you only need sample attributes for video
private Dictionary<MediaSampleAttributeKeys, string> _emptySampleDict = 
    new Dictionary<MediaSampleAttributeKeys, string>();
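On the BitConverter point: GetBytes allocates a new two-byte array for every sample, and its byte order follows the machine (BitConverter.IsLittleEndian), while PCM data described by WaveFormatEx is little-endian. A hedged alternative, assuming you are willing to write into your own byte array, is to split the sample by hand. SampleBytes and WriteInt16LittleEndian are hypothetical names.

```csharp
using System;

// Sketch: write a 16-bit sample into an existing buffer, low byte first,
// without allocating a temporary array per sample.
public static class SampleBytes
{
    public static void WriteInt16LittleEndian(byte[] buffer, int offset, short sample)
    {
        buffer[offset] = (byte)(sample & 0xFF);            // low byte
        buffer[offset + 1] = (byte)((sample >> 8) & 0xFF); // high byte
    }
}
```

On a little-endian machine this produces the same two bytes as BitConverter.GetBytes: the value 0x1234 is stored as 0x34 followed by 0x12.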

To round it out, here are the other private variables in this example:

private WaveFormatEx _waveFormat;
private MediaStreamDescription _audioDesc;
private long _currentPosition;
private long _startPosition;
private long _currentTimeStamp;

private const int SampleRate = 44100;
private const int ChannelCount = 2;
private const int BitsPerSample = 16;
private const int ByteRate = 
    SampleRate * ChannelCount * BitsPerSample / 8;

private MemoryStream _stream;
private Random _random = new Random();
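The snippets above don’t show where _stream is created or whether it is ever reset, and a stream that only grows (or is recreated on every request) will cost memory over time. Here is a minimal sketch of one way to reuse a single MemoryStream, assuming each request overwrites the previous buffer from the start. NoiseBuffer and Fill are hypothetical names; with this approach the sample would presumably be reported with a stream offset of 0 instead of _currentPosition, which is an assumption I have not verified against the beta.

```csharp
using System;
using System.IO;

// Sketch: refill one reusable MemoryStream with 16-bit noise on each request.
public static class NoiseBuffer
{
    // Overwrites the stream with frames * channels random 16-bit samples
    // (little-endian) and returns the number of bytes written.
    public static int Fill(MemoryStream stream, Random random, int frames, int channels)
    {
        // Discard the previous buffer's contents instead of appending to them
        stream.SetLength(0);
        stream.Position = 0;

        byte[] bytes = new byte[2];
        int sampleCount = frames * channels;
        for (int i = 0; i < sampleCount; i++)
        {
            // Upper bound is exclusive, so add 1 to include short.MaxValue
            short sample = (short)random.Next(short.MinValue, short.MaxValue + 1);
            bytes[0] = (byte)(sample & 0xFF);
            bytes[1] = (byte)((sample >> 8) & 0xFF);
            stream.Write(bytes, 0, 2);
        }

        // Rewind so the data can be read from offset 0
        stream.Position = 0;
        return sampleCount * 2;
    }
}
```

Calling Fill repeatedly with 256 frames and 2 channels keeps the stream at 1024 bytes no matter how many samples have been requested.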

Other Functions

There are other functions you need to implement, even if you just throw an error or report them completed:

protected override void SeekAsync(long seekToTime)
{
    ReportSeekCompleted(seekToTime);
}

protected override void SwitchMediaStreamAsync(
    MediaStreamDescription mediaStreamDescription)
{
    throw new NotImplementedException();
}

protected override void CloseMedia()
{
    // Close the stream
    _startPosition = _currentPosition = 0;
    _audioDesc = null;
}

protected override void GetDiagnosticAsync(
    MediaStreamSourceDiagnosticKind diagnosticKind)
{
    throw new NotImplementedException();
}

Wiring up in Xaml

The next step is to wire our MediaStreamSource to a MediaElement in Silverlight. First, declare the MediaElement in XAML:

<MediaElement x:Name="TestMediaElement" AutoPlay="True" />

Then, in the code-behind, set its source:

TestMediaElement.SetSource(new MyMediaStreamSource());

One word of caution. The current beta bits have a delay from the time a first sample is requested until you first hear audio. No amount of configuration on your part is going to change that, so don’t bother playing with buffering time settings. The team knows this is an issue, so I hope to see a solution at RTW.

More Complex Uses

You can certainly take this much further. For example, I built a basic synthesizer using MediaStreamSource. The synthesizer has multiple oscillators, each of which generates samples which are all mixed together and then output as a single two-channel stereo stream. You can try out an unstable/buggy build of the synthesizer here:

http://www.irritatedvowel.com/Silverlight/sl3/Synth/Default.html [link no longer valid as it was built using a beta]

(try the arpeggiator on the top right of the keyboard)

There are a number of bugs in my code for that synthesizer including issues with distortion and getting all out of whack, so if you hit an issue, first lower the volume. If that doesn’t take care of it, refresh the browser and try again. Here’s a screenshot of the synthesizer in action:

[Image: screenshot of the synthesizer]
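The mixing step described above (several oscillators summed into one two-channel stream) can be sketched roughly as follows. This is not the synthesizer’s actual code: Mixer, MixSines, and ToInt16 are hypothetical names, the oscillators are plain sine waves, and averaging plus clamping is just one simple way to keep the sum inside the 16-bit range, since overflow there is a common source of the kind of distortion mentioned above.

```csharp
using System;

// Sketch: mix several sine oscillators into interleaved 16-bit stereo samples.
public static class Mixer
{
    // Returns frames * 2 samples (left, right, left, right, ...).
    // Both channels carry the same mixed signal in this simple version.
    public static short[] MixSines(double[] frequencies, int sampleRate, int frames, double gain)
    {
        short[] output = new short[frames * 2];
        for (int n = 0; n < frames; n++)
        {
            double sum = 0.0;
            foreach (double frequency in frequencies)
            {
                sum += Math.Sin(2.0 * Math.PI * frequency * n / sampleRate);
            }

            // Average so that adding oscillators does not raise the peak level
            double mixed = frequencies.Length > 0 ? gain * sum / frequencies.Length : 0.0;

            short sample = ToInt16(mixed);
            output[2 * n] = sample;     // left
            output[2 * n + 1] = sample; // right
        }
        return output;
    }

    // Clamps a value to [-1, 1] and scales it to the 16-bit sample range.
    public static short ToInt16(double value)
    {
        if (value > 1.0) value = 1.0;
        if (value < -1.0) value = -1.0;
        return (short)Math.Round(value * short.MaxValue);
    }
}
```

For example, Mixer.MixSines(new double[] { 440.0, 660.0 }, 44100, 256, 1.0) returns 512 interleaved samples, ready to be written to the stream two bytes at a time.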

Source

The source code for the example in this post may be downloaded here.

Enjoy!

posted by Pete Brown on Monday, March 23, 2009

11 comments for “Creating Sound using MediaStreamSource in Silverlight 3 Beta”

  1. Joe says:
    Amazing. Going to have to write a Silverlight version of the sonic wheel and the reactable now:
    http://www.youtube.com/watch?v=0h-RhyopUmc
  2. Alexandre [@lx] says:
    Nice article, thanks for sharing Pete!

    I hope the latency problem will be solved in the final release of SL3. It would open up a new kind of web application.
  3. Mark Heath says:
    Hi Pete,
    this looks really cool. I look forward to having a play with the code once I am set up for Silverlight 3 dev.
  4. Tom says:
    Nice, an MT-32! I always wanted one of those as a kid but had to settle for SB/GUS. Eventually I grew up and bought one from eBay. What a great piece of technology and history.
  5. Rob Burke says:
    Superb stuff, Pete. Thanks very much for taking the time to do this investigation and writing it up so thoroughly. This was one of the things I was secretly very happy about at MIX09. It's cool to be able to synthesize our own sounds. After all, what else is a richer client experience for, if not for stuff like this?
  6. mliebster says:
    Thanks for this post!

    I was thinking of opening up a new browser window and embedding the wav as an object in the HTML. In our current SL2 app, we're converting our WAV to WMA on the server on-demand with mixed results.

    So I'm very excited about being able to play the WAVs natively in SL3.
  7. mike hodnick says:
    Any possibility of getting your synth app upgraded to final SL3?
  8. Pete Brown says:
    @Mike

    Yes. I have a version here that was compiled with a not-quite-rtw build. I just need to make a couple minor changes and then post the app and code.

    Email me via the contact page if you need something immediately.

    Pete
  9. Mark Kestenbaum says:
    Hi Pete,

    I have been struggling with a problem which maybe you can help with. I installed your project and ran it and everything works well. Then I added a textblock to display the value of the TestMediaElement.Position and used a timer to update it every second.

    What I find is that the position is approximately 1/6 the value that it should be (the ratio is not constant and keeps getting asymptotically lower). This means that if you want to show the current position of your audio using a slider or textblock, you cannot rely on the MediaElement.Position property to be accurate.

    In other tests, I tried setting the above property and found that it also doesn't jump to the middle of the audio but instead begins playing from the beginning.

    Do you have any idea how I can work around this?

    Thanks,

    Mark
  10. Pete Brown says:
    @Mark

    I'm not sure what effect you're trying to achieve. The synthesizer sound, by nature, doesn't have a valid position. It's not like a media file you can scrub around.

    One thing you may be running into is me recreating the buffer each time (another reader pointed this out as a memory leak I need to fix). That, combined with the concept above, means the positioning info would be pretty useless.

    Let me know what you're trying to do and I'll see what I can think of.

    Thanks.

    Pete
  11. Mark Kestenbaum says:
    Thanks Pete, for the comment.
    I'm trying to get the position within the file in order to move a slider and display the elapsed time of the sound file. Theoretically, the position property should give me that value but it gives me 1/6th of the correct value.

    One thing I noticed is that when I multiply the timestamp by 6.15 it gives me a more accurate position (not completely since it varies). I can't for the life of me figure out why there is this discrepancy.

    When I use a regular media file that I've converted from a-law to pcm, it plays nicely but again the position is 1/6 of what it should be.

    I'd be happy to provide further information and would be very appreciative if you could help me figure out what is going on here.

    Mark
