Welcome to Pete Brown's 10rem.net

First time here? If you are a developer or are interested in Microsoft tools and technology, please consider subscribing to the latest posts.

You may also be interested in my blog archives, the articles section, or some of my lab projects such as the C64 emulator written in Silverlight.

(hide this)

The Commodore 64 Emulator: Emulating Pointers in a sandbox when the real thing is not allowed

Pete Brown - 06 February 2012

Late last week, I cracked open the Commodore 64 emulator code once again, in preparation to post it. However, I had to have a change made to the source control on CodePlex, so I had a few days to make some changes. So far, it's shaping up quite nicely:

imageimage

I went back to the latest version of the Frodo C64 emulator source code and decided to port some of their changes over to this version. Frodo is written in C and C++, and makes very heavy use of pointers (and not always safe use of them, as there was at least one logical overrun). In the previous version, I had replaced pointers with array manipulation, but I did it in a way that resulted in an awful lot of array copies floating around. The arrays were small, so this wasn't a big deal memory-wise, but the copies all took time in an otherwise time-critical application.

Silverlight doesn't support pointers or unsafe code. Of course, in elevated trust mode, with a not-really-supported hack, you can have pointers in Silverlight. However, I wanted to stay away from that for now as it only works in Silverlight 5, only on Windows, and won't port to any other sandboxed XAML platforms.

So, I decided to finish what I started and build out some decent almost-pointerish safe analogs in Silverlight. It was a fairly large amount of effort, but I thought it might also be interesting to read about here.

How Pointers Work

If you've never written any C/C++ code, there's a better than zero chance that you haven't done any explicit pointer manipulation in your own code. That's not a bad thing, really, as the typical business application (and many other types) simply doesn't need to do pointer arithmetic. It's not worth the inherent danger for the small performance increase. That said, an understanding of how pointers work is  fundamental computer science, and is important for all developers to know.

It's only when you have to do a lot of memory walking in a performance-critical application that you run into this. Programming for microcontrollers like the AVR is a great way to refresh your memory as to how pointers work and what they bring to the table.

A pointer is an address in memory and an associated type size. For example, if we have some fictional 6 byte memory chunk and declare a pointer to a byte, we end up with something like this (bonus points if you can figure out the significance of the 5 bytes starting at 0x01):

image

In this case, our byte pointer points to the 8 bits starting at the address 0x01. You can then manipulate memory using pointer arithmetic:

byte* p = 0x01;    // declare pointer to memory address 0x01

*p = 0x30; // changes value at 0x01 to 0x30 from 0x78

*(p+3) = 0x00; // changes value at 0x04 to 0x00 from 0x7A

while (p < 0x06) // clear rest of memory
*p++ = 0x00;

This is very fast because there really is no indirection. You're telling the compiler exactly what address to manipulate - it doesn't have to look anything up.

The size of the pointer variable has to do with the architecture of the system and its addressing scheme (8 bit, 16 bit, 32 bit, 64 bit), as well as compiler options you select. (In the case of the Commodore 64, it's all 8 bit.).

You're not limited to the smallest memory unit size, though. If you wanted to declare a pointer to a 16 bit integer, you'd get this (assuming "short" is 16 bits on your system):

image

Now, depending on the endian-ness of your system, the resulting 16 bit number may be 0x7879 (big endian) or 0x7978 (little endian).

It starts to get really powerful when you consider that a pointer to a structure could also be considered a pointer to a number of bytes. That allows you to, for example, read a bunch of bytes from a disk, look at the first couple bytes to figure out what you have, and then cast the remaining X bytes as the pointer a directory structure or something. The original C++ C64 emulator code does a ton of that.

How pointers are used in Frodo

In addition to pointers to structures in byte buffers, frodo does a lot of string parsing using pointers. In particular, the 1541 drive emulation code uses this for command parsing. In that code, the first character is often some command identifier, the next is a delimiter (like a colon) and then some number of comma-separated values.

In our normal pointer-less code, we'd typically parse the string and then make copies for each of the individual components. Another way to deal with it is to simply provide pointers to each of the sections therefore avoiding making copies of memory. That's typically how the Frodo code works.

In my first version of the code, I used array indexes to simulate this. However, that code got incredibly nasty to work with. I needed another solution.

The Simulated Pointer

You can simulate pointers by simply passing around arrays, making copies, or even using array indexes. But, as I mentioned, that code tends to get pretty messy as you have lots of dependent variables. Instead, I needed a way to encapsulate all of this and provide pointer-like functionality to handle the majority of the cases.

  • The primary goal of this effort is to make it easy to have code which is close to 1:1 with its C++ counterparts. For example, if the C++ code increments a pointer and then assigns a value, I want to be able to do something similar without having to worry about which array the pointer points to, or what index value is. The code won't be 1:1, but I want it as close as possible.
  • Another goal of this effort is to minimize the number of times I copy arrays around. Primarily this is for performance reasons as the emulator code needs to loop at least one million times a second (1MHz). Minimizing array copies and object instances helps keep GC under control as well.
  • A non-goal of this effort is efficient use of actual memory. In fact, my pointer structures use more memory than a pointer just to simulate a pointer. I'm ok with that.
  • Another non-goal is the ability to cast a pointer to a structure or other complex type. I'd love to do that, but that's not sandbox-friendly and it's not going to happen that way in this code.

The C64 code has a number of places where it has memory allocated in chunks. The most obvious is the main memory of the system (64K), but there's also 2K in the VIC-1541 drive. Beyond those, there are several chunks of ROM code which get loaded up and then mapped into various address locations (cartridges too). In the emulator code, these are just large arrays of bytes.

If you ever had to simulate an operating system in college, you've done this before.

Here's an example of two pointers "A" and "B". Pointer A points to address 0x01 in the memory array. Pointer B points to address 0x05.

image

The code to create these looks something like this:

SystemRam ram = new SystemRam();

var a = new RamBytePointer(ram, 0x00);
var b = new RamBytePointer(ram, 0x05);

The RamBytePointer structure represents the pointer. It has a member Value which can be used to get/set the value at that position. It also has a number of other helper methods to do things like copy values from arrays, comparisons with other pointers, etc. Note that the pointer structure takes a reference to the memory it is allowed to manipulate. A real pointer has pretty much free run of memory, so you need to keep the RAM pretty broadly defined if you need that flexibility.

Here's the current version of the structure. It doesn't do quite everything I want yet (for example, nothing with comparisons between pointers, for example), but it does a fair bit.

using System.Diagnostics;
using System.Text;
using System;

namespace PeteBrown.C64.Core.Memory
{
public struct RamBytePointer
{
public RamBytePointer(RamBase memory, int address)
: this(memory)
{
_address = address;
}

public RamBytePointer(RamBase memory)
{
_memory = memory;
_address = 0;
}


private RamBase _memory;
private RamBase Memory
{
[DebuggerStepThrough]
get { return _memory; }
[DebuggerStepThrough]
set { _memory = value; }
}

private int _address;
public int Address
{
[DebuggerStepThrough]
get { return _address; }
[DebuggerStepThrough]
set { _address = value; }
}

public byte this[int offset]
{
[DebuggerStepThrough]
get { return _memory[_address + offset]; }
[DebuggerStepThrough]
set { _memory[_address + offset] = value; }
}

public byte Value
{
[DebuggerStepThrough]
get { return _memory[_address]; }
[DebuggerStepThrough]
set { _memory[_address] = value; }
}

public byte[] GetValues(int length)
{
return _memory.Read(_address, length);
}

public void SetValues(byte[] values)
{
_memory.Write(_address, values);
}


public void CopyFrom(byte[] values, bool incrementPointer)
{
for (int i = 0; i < values.Length; i++)
{
_memory[_address + i] = values[i];
}

if (incrementPointer)
_address += values.Length;
}

public void CopyFrom(char[] characters, bool incrementPointer)
{
for (int i = 0; i < characters.Length; i++)
{
_memory[_address + i] = (byte)characters[i];
}

if (incrementPointer)
_address += characters.Length;
}

public void CopyFrom(string characters, bool incrementPointer)
{
for (int i = 0; i < characters.Length; i++)
{
_memory[_address + i] = (byte)characters[i];
}

if (incrementPointer)
_address += characters.Length;
}

public RamBytePointer NewPointerAtOffset(int offset)
{
return new RamBytePointer(_memory, _address + offset);
}


public static RamBytePointer operator +(RamBytePointer p, int offset)
{
return new RamBytePointer(p.Memory, p.Address + offset);
}


public static RamBytePointer operator -(RamBytePointer p, int offset)
{
return new RamBytePointer(p.Memory, p.Address - offset);
}


public static RamBytePointer operator ++(RamBytePointer p)
{
p.Address += 1;

return p;
}

public static RamBytePointer operator --(RamBytePointer p)
{
p.Address -= 1;

return p;
}


public int IndexOf(byte value, int count)
{
return _memory.IndexOf(value, 0, count);
}

public int IndexOf(char value, int count)
{
return _memory.IndexOf(value, 0, count);
}

public bool Contains(char value, int count)
{
return _memory.Contains(value, 0, count);
}

public bool Contains(byte value, int count)
{
return _memory.Contains(value, 0, count);
}


public string ToString(int startIndex, int length)
{
StringBuilder builder = new StringBuilder();

for (int i = startIndex; i < length; i++)
{
builder.Append((char)_memory[i]);
}

return builder.ToString();
}

}
}

The memory array itself is always created and maintained outside of this class. It must be valid when the pointer is created.

Operator overloading and Offsets

Of interest is the operator overloading. This allows me to do things like p++ or p += 1 in order to modify the address that the pointer is pointing to. That's an important part of keeping the C# code as similar as possible to the C++ code. However, it doesn't allow everything the C++ code allows. For example, I can't do the exact equivalent of this code:

*(p + 5) = 0x05;
*(p - 7) = 0x0A;

If I try to do the equivalent of that, the compiler will barf at me as it needs an actual LValue on the left side. Instead, code like that must use something like this:

p[5] = 0x05;
p[-7] = 0x0A;

The code is similar enough that I'm happy with it, although developers new to the code may raise a few eyebrows at the p[-7] version.

Converting Strings

Because much of the C++ code treats characters and bytes as interchangeable (something you can't really do with arrays in C# on Windows), I also have a few overloads that can take characters or bytes. This helps clean up the consuming code so it doesn't have so many casts.

I have helper functions in there to copy from strings. This helps me easily translate code such as this:

*p++ = 'B';
*p++ = 'L';
*p++ = 'O';
*p++ = 'C';
*p++ = 'K';
*p++ = 'S';
*p++ = ' ';
*p++ = 'F';
*p++ = 'R';
*p++ = 'E';
*p++ = 'E';
*p++ = '.';

Into a nice single-liner like p.CopyFrom("BLOCKS FREE.", true);

A Structure, not a Class

Note that this is a structure, not a class. Why? Having it as a structure lets me make copies easily. It's common practice in C/C++ code to have a pointer passed into a function, and then make a copy of it to increment yourself. For example:

void DoSomething (byte * baseAddress)
{
byte * p = baseAddress;

while (p++ != 0x0A)
...
}

In this example C++ function, p starts off at the same address as "baseAddress", but then increments it. The value for baseAddress must be left alone.

Reference types like class in C# are themselves pointers. So, due to the level of indirection this has, we'd end up modifying the original value:

class RamBytePointer { ... }

void DoSomething (RamBytePointer baseAddress)
{
RamBytePointer p = baseAddress;

p += 5;

Debug.WriteLine(p.Address);
Debug.WriteLine(baseAddress.Address);
}

// ---------------------------------------

struct RamBytePointer { ... }

void DoSomething (RamBytePointer baseAddress)
{
RamBytePointer p = baseAddress;

p += 5;

Debug.WriteLine(p.Address);
Debug.WriteLine(baseAddress.Address);
}

The first example doesn't work like pointers in C++ at all. Incrementing p also increments baseAddress. The second example, however, works as we would expect because p is a copy of baseAddress, not a reference to it.

The downside of using a struct is there's no inheritance. So, I instead put a fair bit of the potentially reusable code into the memory classes instead, and simply call that from the pointer class.

Summary

Simulating pointers in pointer-free sandboxed platforms is a bit unorthodox, but can make translating code easier.

This isn't something you're likely to need to do in your own code, but it an interesting exercise in any case. There's no downloadable source code yet, as I'm still refining this implementation. You'll be able to get a version of these classes in the SilverlightC64 code when I post it in the next week or less. It's taking longer to swap out all this code than I had intended :)

If you've ever had to do anything like this in your own code, I'd love to hear about it.

         
posted by Pete Brown on Monday, February 6, 2012
filed under:          

5 comments for “The Commodore 64 Emulator: Emulating Pointers in a sandbox when the real thing is not allowed”

  1. Mark Heathsays:
    Nice idea. I do a lot of converting of audio DSP code from C/C++ to C# and my usual approach is to simply turn each pointer into an int which is an index into an array. If a function takes a pointer as a parameter, the converted function takes an array plus an offset integer.

    Your approach has some interesting possibilities, such as hiding the complexity of swapping an algorithm that expects left and right audio samples in separae arrays to one that has them interleaved left right.
  2. Daniel Cormiersays:
    What about using IntPtr?

    http://msdn.microsoft.com/en-us/library/system.intptr(v=vs.95).aspx

    http://blogs.msdn.com/b/jaredpar/archive/2008/11/11/properly-incrementing-an-intptr.aspx
  3. Petesays:
    @Daniel

    Intptr is nice, but really meant for API calls and whatnot. It doesn't (without a bunch of additional work like that shown in the second blog post) actually fill the need in this situation. In addition, I believe some of its methods may not work in normal sandboxed Silverlight (I haven't checked).

    It may certainly work for other situations though. Thanks.

    Pete
  4. Petesays:
    @Jay

    Sorry, every time I plan to work on it, something else comes up. At this point, with interest in Silverlight waning (sales of my SL5 version of my book confirm that trend), I think I'll take another path with this specific code.

    I'm seriously considering a Windows 8 port using native code and DirectX. TBD.

    Pete

Comment on this Post

Remember me