Kermit Alexander II's blog - Ramblings about FOSS, Linux, gaming, and other stuff<br />
<br />
<b>Our Augustine - a serial</b> (2019-03-06)<br />
<br />
For about a year or so, I've been intending to publish a serial I've been writing, which I've dubbed "Our Augustine." I think the prologue - which is what I'm releasing in conjunction with this blog post - should mostly explain itself, but to make a (mostly-unpublished!) not-so-long story short, I set out to answer this question: How would a society that truly does not believe in the concept of free will structure itself? How would it come to make decisions it considered ethical? I've tried to wrap up what I believe to be the answer in a reasonably-entertaining sci-fi war story, but you can be the judge of that yourself by reading the prologue <a href="https://drive.google.com/file/d/1nT3TZIHWKV-Wbx-C6XTMdA0lhg5sAA_W/view?usp=sharing">here</a>.<br />
<br />
I'm licensing the text of the serial under the <a href="https://creativecommons.org/licenses/by-sa/4.0/">CC BY-SA 4.0 license</a>; in the future, I'll try to embed the text of successive releases in the blog itself.<br />
<br />
<hr />
<br />
<b>Homogenization of Contexts, or The Right to a Hot Take</b> (2018-11-17)<br />
<br />
<i>A PDF copy of this essay can be found <a href="https://drive.google.com/file/d/1PKKL_vURnRuXQ4nAGQGsNVKxRW8ETxCn/view?usp=sharing">here.</a></i> <br />
<br />
This is something that's a little bit out of the ordinary for my blog - something more philosophical than technical - but it's something that's been weighing on my mind for some time now. I've discussed the ideas I present here with many of my friends, and they understood (though not all necessarily agreed!), so maybe the internet at large would be interested as well. I think that the growing homogenization of the internet through sites like Facebook and YouTube, as well as the growing use of SSO (Single Sign-On) solutions to log in to a variety of smaller sites with a single account, has had a chilling effect on the honesty of internet-based discourse; more importantly, it threatens to prevent our complete selves from being expressed to the world, which could prove extraordinarily harmful in the long run.<br />
<br />
To begin with, "our complete selves" could bear some more explanation. I believe that people have multiple faces - or windows - that they switch out as appropriate for different people, gatherings, and so on. I don't think that they are completely separate; each is a window onto the same base identity, and the union of these windows forms "you." I also believe that if there is a portion of the "you" which is not currently expressed through one of the windows, then "you" will make an effort to create a situation where the unexpressed portions can be broadcast. With this idea in mind, a couple of other things fall into place.<br />
<br />
The matter of confidants makes perfect sense, for instance. Parts of us that we couldn't otherwise express - our fears, insecurities, truly honest thoughts - we make available to a chosen few due to that base need for expression. We also see how trust factors into this; it is not always in our best interest to make others aware of more than one window onto ourselves, yet for certain people - significant others, close friends - we provide them with that privilege. Intimate relationships are then defined as the process of making yourself more of a known quantity to certain others.<br />
<br />
The general aversion most of us have to snitching also makes sense when viewed in this light. If we know someone who normally presents themselves as a strait-laced type, but they also have a rebellious punk side that they allow us to see - maybe they smoke or do drugs or something like that - then the hesitance of most of us to alert their employer or some other authority is understandable. Firstly, there's the matter of trust here; we've been entrusted with the knowledge of a separate window, knowledge that our friend believes we will not use to bring them to harm. Secondly, why must the employer know? What they hired was the calm and mild-mannered employee window, not the party animal window. Are we harming the employer by not alerting them to a separate window that has no bearing on the one they hired? The obvious counter-argument here is that the two windows <i>are</i> connected and our friend's ability as an employee is being harmed. I would counter that by saying that if that were so, the image presented by their employee window would gradually deteriorate to the point that they were fired. Why not let that happen naturally?<br />
<br />
Notice how the examples I gave all had the fear of consequences as the root motivation; the threat of the world at large seeing us as more of a known quantity (thus eroding our ability to make intimate relationships) [1], and the fear of backlash from someone expressly interested not in our whole selves but merely one or two windows, prevents us from ever entirely expressing ourselves all at once. It's not necessarily a bad thing that we're unable to do so; in most cases only certain aspects of ourselves are really necessary in a given context - this is why we have the phrase "too much information," after all!<br />
<br />
But where does the Internet factor into all this? The Internet is perhaps the best thing that has ever happened for the need for expression, which brings me to the main idea propelling this essay, the homogenization of Internet contexts. It used to be that the web was reasonably decentralized, at least in terms of who (on average) went where. Exposing little slivers of windows of yourself, and properly separating them so as not to unintentionally become more of a known quantity, was relatively simple. Not all discussion was appropriate on all sites - or contexts, as I call them here - but there <i>was</i> still a context in which your discussion was considered appropriate. But this has become less true as time has passed.<br />
<br />
Nowadays, Internet traffic and discussion are heavily dominated by a small handful of sites. Proper separation of our own windows has become extremely difficult. Making comments and content under a pseudonym is not enough; for sites like Reddit, Twitter, or YouTube that keep a history of everything published by a specific user, it often takes just one person malicious or bored enough to make our carefully-constructed house of cards come crashing down around our ears, often with serious real-life consequences. The problem is compounded by the growing use of SSO, where our identities across multiple sites are explicitly tied together by using a single username to log in to a variety of places; consider all the sites you see today with phrases such as "Log on with Facebook" or "Log in with Google." In effect, the majority of the modern web functions as a single window onto ourselves, acting as a giant soapbox for our public selves, with all the risks and potential consequences that entails.<br />
<br />
But - perhaps directly due to our base need for expression! - some sites have been made that allow for total anonymity, allowing each of our individual comments to be its own distinct window if we choose, with the rest of the userbase being none the wiser. Most of these places carry some infamy, like 4chan, due to the layperson perceiving them as the absolute worst of what humanity has to offer in terms of dialogue. I think this is due to a couple of reasons; I believe that the average discussion quality of any Internet group becomes markedly worse as the size of that group grows, though that is an argument worth its own blog post [2]. For a site like 4chan that roughly 1 in 25 Americans visit on a monthly basis [3], discussion quality can get quite poor, especially when there are intentionally few rules on what can be discussed and how, and people are used to that not being the case. I think heavily compartmentalizing such a large place can mitigate the effect somewhat (Reddit is roughly on the right track but makes some serious missteps [4]; 8chan's use of user-created boards is the closest thing out there to what I feel is the right solution), but that's a topic for another time.<br />
<br />
Getting a bit back on topic, another byproduct of the homogenization of contexts in the modern web has been the polarization of what is and is not allowable in terms of speech. Everything is either carefully curated to fit a safe, milquetoast public image - as in the case of the Facebook/Google iceberg - or we are flung into the deep end of the pool, where we are bombarded with the consequences of a large group of people being given the double-edged sword of total anonymity. In an ideal world - similar to how the web used to be - there would be an archipelago of sites one could visit to discuss different matters; in other words, a variety of separate contexts. Note that I've never argued that <i>all speech</i> should be considered acceptable in <i>all contexts</i>, but rather that for a given thought, there should exist a context within which it is acceptable to express it [5]. In this light, the existence of sites with total anonymity is more of a necessary evil than anything else, though I believe them to be the best short-term band-aid and am <a href="https://github.com/DangerOnTheRanger/maniwani">actively attempting to improve that landscape</a>.<br />
<br />
But why is this all important? Why should we take such pains to ensure that there is an avenue (with a decently-sized audience) for voicing our every thought and idea? This ties into the alternate title for this essay, the so-called "right to a hot take." [6] I think it's plain to see that the progress of ethics and human decency is built upon the long-suffering shoulders of the iconoclast - that ideas and concepts now considered obvious by the majority of the population were once the sole opinion of a select few radicals. This has always been the case, and it would be silly to think it would not be so in the future; there will probably be many contemporary conventions (some of which I probably hold!) that will be seen as backwards or even morally wrong by those not yet living.<br />
<br />
This is why the diversity of contexts on the Internet is so important to me, and why I have endeavored to show that it should be important to you, as well. Never before in the history of the world have we been collectively blessed with a platform that potentially allows us to put forward and voice all kinds of ideas to countless others without directly putting ourselves at risk of discrimination or physical harm. Yet, at the same time, we are currently in grave danger of losing that blessing.<br />
<br />
Some would say that some thoughts are "better left unexpressed." In a select few instances, I would be inclined to agree. Emotionally-charged outbursts - like how Alice hung you out to dry at work today - are maybe better left on simmer. But core concepts and beliefs that form who we are must have an outlet, and I believe that outlet will eventually be forcibly made if no means of expressing those windows is allowed to exist; I would prefer that such a reality never come to pass. I think it's possible to express every part of ourselves in a respectful and safe manner thanks to the Internet, if only we separate the relevant windows well enough.<br />
<br />
Finally, I would like to address the fear some have that a means of total self-expression, even if compartmentalized as I've described throughout this essay, could lead humanity down all kinds of dark roads. This idea has a couple problems; firstly, the assumption that humanity at large will throw away a concept if they are threatened into not discussing it in broad daylight has no historical basis. If anything, I believe the opposite is true. Secondly, it falls into the same trap I talked about earlier about believing no further moral progress will be made. We must all not be so prideful as to believe we are perfectly-enlightened bastions of morality; it would certainly be a sin of some sort to silence the voice of one who would have otherwise brought about a moral or ethical revolution [7]. Lastly, I think the idea that we can avoid every kind of misstep fails to take into account basic human nature. We trip, stumble, and fall. We proceed down paths and trains of thought we perceive as perfectly logical that are completely nonsensical in hindsight. At times, we even commit atrocities against our fellow human beings, justifying it with some convoluted ideal or another.<br />
<br />
Yet, at the end of the day, when all is said and done, we can always count on that one small voice. Who, despite risk to their public image, their social status, and perhaps even their life, dares to speak out: "My friends, this is not right."<br />
<br />
I wish to do everything in my power to protect that one small voice.<br />
<br />
<hr />
<br />
<br />
<b>Footnotes</b> <br />
<br />
[1] The importance of the ability - I would go so far as to call it a right, actually - to withhold information about ourselves from others cannot be overstated. I would argue that it is one of the primary currencies of social capital, and furthermore is a kind of finite resource, as it is not generally possible to easily generate new sensitive information about yourself. This is why trust is such an innately personal, intimate thing - it is the act of providing collective ownership to a highly valuable finite resource - and why it feels so painful to have one's trust be violated.<br />
[2] I make this claim based largely on anecdotal evidence stemming from observing various Internet communities grow over time, some totally anonymous, some with pseudonyms. I think somehow compartmentalizing or breaking down large communities into lots of little largely-separate ones is the right path forward, but I'm admittedly not sure how to go about this.<br />
[3] This is the result of some back-of-the-napkin math I did with 4chan's advertising demographics. The info is <a href="https://www.4chan.org/advertise">publicly available</a> if you'd like to do the numbers yourself.<br />
[4] The concept of upvotes incentivizes people to say what they believe will be the most palatable/agreeable/funny thing in the moment, in order to garner as much attention as possible. Additionally, opinions can be downvoted and consequently hidden from view, and often are if they disagree with the majority opinion, collectively creating a kind of chilling effect on expression. Places like 4chan, which lack upvotes but list replies to each individual post, cultivate a culture where people often tend to disagree wherever possible. When is someone most likely to respond to you? When they believe you to be wrong. It's funny to see how the inherent thirst for attention plays out on various sites.<br />
[5] It is for this reason that I believe web hosting providers have a special responsibility to not limit what is hosted on their platforms inasmuch as the hosted content does not fall afoul of the law. I do not think this responsibility should necessarily be enshrined in law, as I also think it important for businesses to generally serve whoever they wish for whatever reasons. There's plenty of room for discussion on the matter, though - should political affiliations be a protected class, for instance? How would political affiliations or opinions be defined in the eyes of the law? If some sites are removed, is the hosting provider now a curator and responsible for all (potentially illegal!) content and activity that they provide a platform for?<br />
[6] "Right" is admittedly a fairly strong term here. It could imply that we are owed the ability to speak whatever we want, whenever we want. In the context of this essay, I don't necessarily take that stance, but rather believe that for every potential window - every aspect of ourselves - there should exist an appropriate platform for voicing that thought, with ideally as few ties to our general public image as possible. This is largely a self-correcting issue, what with things such as decentralized communication protocols and all, but I think this is a matter the Internet at large should take more note of to bring as many people into the fold as possible - not everyone is going to know how to set up something like GNUNet or IPFS. That, and "right to a hot take" just sounds cool.<br />
[7] I believe that it is because of this that countries with more relaxed laws and cultures surrounding acceptable speech tend to progress faster in the realm of human rights than ones that do not. There can be regressions (e.g., Jim Crow), but that is the nature of human progress, and such mistakes correct themselves over time.<br />
<br />
<hr />
<br />
<b>Schego Part 4, Or The Heap, Linked Lists, and Other Scary Subsystems</b> (2018-04-30)<br />
<br />
This is the second half (see the first half <a href="http://dangerontheranger.blogspot.com/2018/04/schego-part-3-or-virtual-machines-arent.html">here</a>) of the giant blog update on Schego I wrote. I concentrated so hard on actually coding, I forgot to update the blog as I went! Hopefully I'll be able to split my time more evenly between documentation and coding, but on with the show; we're going to talk about a couple of different new additions to Schego that most interpreted programming languages will have to implement in one way or another.<br />
<br />
First up - and the subsystem that everything else in this blog post depends on in one way or another - is the new heap allocator based on the <a href="https://en.wikipedia.org/wiki/Buddy_memory_allocation">buddy memory allocation technique</a>. It's fairly venerable (I believe the Linux kernel uses a modified version for kmalloc) and also pretty easy to implement, with Schego's implementation spanning a hair under 200 lines as of this writing.<br />
<br />
I'm gonna skimp on justifying the need for a heap allocator since I feel it's kinda obvious (how many non-embedded programs have you written that just used the stack?), but I do think it would be a good idea to examine why I'd use the buddy allocator in particular. It's certainly not the only allocator on the block, but it's definitely the most popular one; when looking around for preexisting allocators to study, the buddy allocator was far and away the most covered. I think the advantage of having one or more reference implementations is difficult to oversell; when you're unsure if you're "doing it right," having something beyond a white paper is pretty handy. You could argue that if you're about to write a memory allocator you should be able to implement it based solely on a white paper, but the whole point of this series is to teach, right? So let's do some teaching, especially since prose explanations of any memory allocation algorithm are basically non-existent at this point.<br />
<br />
The buddy system at its core is pretty intuitive: The basic idea involves simply splitting up the heap into these stretches of memory called blocks that we then can place our variables into. Since we want to handle differently-sized objects, say, a single 8-bit value versus a 32-bit array of 8-bit values, we can have differently-sized blocks. Also, because referring to blocks by how many bits or bytes they take up is kind of unwieldy, let's use the idea of an order system to specify how large our blocks are. So at first, we have some base size that we'll define as order 0 (an order 0 block in Schego is currently 32 bytes, for instance), and then each order beyond that is twice as big as the one preceding it, or in other words, order n has a length of (order 0 length) * 2 ^ n. In Schego, the order calculation looks something like this right now:<br />
<br />
<pre class="prettyprint">func (h *VMHeap) OrderFor(requestedBytes uint64) uint8 {
	if requestedBytes <= blockSize {
		return 0
	}
	// TODO: handle requests past maxOrder gracefully
	// this is ugly, but kinda fast, and less stupid than the old solution;
	// note the Ceil - without it, a request just past a block boundary
	// (say, 33 bytes with 32-byte order-0 blocks) would get a block
	// one order too small to hold it
	order := uint8(math.Ceil(math.Log2(float64(requestedBytes)) - math.Log2(float64(blockSize))))
	return order
}</pre>
<br />
By the way, the math in the OrderFor function is just a simplification of the order calculation I mentioned earlier so we can solve for n directly - don't worry that it looks a bit different.<br />
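If you'd like to convince yourself the order math works out, here's a standalone sketch of the same calculation - not Schego's actual code, just an illustration assuming the 32-byte order-0 block size mentioned above, with <code>orderFor</code> and <code>sizeOf</code> as made-up names:<br />

```go
package main

import (
	"fmt"
	"math"
)

const blockSize = 32 // order-0 block size, matching Schego's current setting

// orderFor returns the smallest order n such that an order-n block
// (blockSize * 2^n bytes) can hold the requested number of bytes.
// The Ceil rounds up, so a request just past a boundary gets the next order.
func orderFor(requestedBytes uint64) uint8 {
	if requestedBytes <= blockSize {
		return 0
	}
	return uint8(math.Ceil(math.Log2(float64(requestedBytes)) - math.Log2(float64(blockSize))))
}

// sizeOf inverts the calculation: an order-n block spans blockSize * 2^n bytes.
func sizeOf(order uint8) uint64 {
	return uint64(blockSize) << order
}

func main() {
	fmt.Println(orderFor(32))  // 0 - fits in a single order-0 block
	fmt.Println(orderFor(64))  // 1 - exactly one order-1 block
	fmt.Println(orderFor(100)) // 2 - a 128-byte order-2 block is the smallest fit
	fmt.Println(sizeOf(3))     // 256
}
```

Run it and you can see a 100-byte request lands in a 128-byte block rather than a 64-byte one - exactly the "smallest block that fits" behavior allocation relies on.<br />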
<br />
Anyway, now we have a way to refer to chunks of memory, but how do we actually allocate objects? First, we use OrderFor to figure out the smallest order of block capable of holding the requested number of bytes. If there's already a free block in the heap of that order, we're done - we simply return the address of that block. But what if only bigger blocks are available? This will have to happen at least once, too, as the heap starts its life as one giant block.<br />
<br />
In that case, we have to subdivide one of the bigger blocks down the middle, and repeat the process on one of the halves until we have a block of the order we want. It requires a bit of bookkeeping in practice, but on paper it's fairly straightforward. As a nice bonus, we've created a bunch of other free blocks in the process that should fit the bill if the caller requests similarly-sized allocations in the future, saving us a bit of work. And if only smaller blocks are available, we've run out of memory and need to tell the caller so; here we only have to handle the case where a larger block exists. Schego implements the splitting process like so:<br />
<br />
<pre class="prettyprint">func (h *VMHeap) CreateBlock(order uint8) {
	// find the smallest larger order that has a free block to pull from
	freeOrder := order + 1
	for h.NoFreeBlocksFor(freeOrder) {
		freeOrder++
	}
	// repeatedly split blocks until we get one (technically, two) of the order we originally wanted
	for freeOrder > order {
		blockAddress := h.GetFreeBlock(freeOrder)
		h.SplitBlock(blockAddress, freeOrder)
		freeOrder--
	}
}
</pre>
<br />
So that's allocation handled, but how do we deallocate objects? If you followed the block subdivision part of allocation, you might notice blocks of the same size get created in pairs. Sometimes one of those blocks then gets subdivided further, but this idea of pairs - or buddies - is where the buddy allocation system gets its name. After freeing up a single block (which is as easy as putting its address back on the list of free blocks), we check if its buddy exists and is free, too. If so, we merge the two blocks back together into a single larger block, and do the same buddy check/merge process until no buddy can be found. This keeps the heap from fragmenting and wasting space, and generally makes it such that there are only barely more blocks (each time we request a block, we're potentially creating 2 blocks with that size, remember?) than necessary to cover all objects in the heap. In Schego, the merge process is done like this:<br />
<br />
<pre class="prettyprint">func (h *VMHeap) MergeWithBuddy(address uint64, order uint8) {
buddyAddress := h.GetBuddyAddress(address, order)
// figure out which address is lower and delete the other block
// take the lower address for the new merged block
var newAddress uint64
if buddyAddress < address {
newAddress = buddyAddress
delete(h.blockMap, address)
} else {
newAddress = address
delete(h.blockMap, buddyAddress)
}
buddyIndex := h.GetUnusedBlockIndex(buddyAddress, order)
h.RemoveBlockFromUnused(buddyIndex, order)
blockIndex := h.GetUnusedBlockIndex(address, order)
h.RemoveBlockFromUnused(blockIndex, order)
h.blockMap[newAddress] = order + 1
h.unusedBlocks[order+1] = append(h.unusedBlocks[order+1], newAddress)
// recurse if we still have potential merging left undone
if h.HasBuddy(newAddress, order+1) {
h.MergeWithBuddy(newAddress, order+1)
}
}
</pre>
<br />
After all that, our top-level Allocate() and Free() functions are as simple as this:<br />
<br />
<pre class="prettyprint">func (h *VMHeap) Allocate(numBytes uint64) uint64 {
order := h.OrderFor(numBytes)
if h.NoFreeBlocksFor(order) {
h.CreateBlock(order)
}
blockAddress := h.GetFreeBlock(order)
// GetFreeBlock always returns the first/0th free block,
// so remove that one
h.RemoveBlockFromUnused(0, order)
return blockAddress
}
func (h *VMHeap) Free(address uint64) {
order := h.blockMap[address]
// add the newly freed block back to the list of unused blocks
// MergeWithBuddy will take care of removing it if need be due to merging
h.unusedBlocks[order] = append(h.unusedBlocks[order], address)
if h.HasBuddy(address, order) {
h.MergeWithBuddy(address, order)
}
}
</pre>
<br />
The only real downside that I'm aware of with the buddy allocator is that there's a bit of fine-tuning involved with block sizes; if blocks are too big, there will be a lot of wasted space. Going back to Schego, every time an 8-byte integer is allocated, 24 bytes of heap memory are currently wasted. On the other hand, smaller block sizes mean more merging upon deallocation, which is a performance hit and time wasted. I haven't looked yet to see how heap allocators in other virtual machines handle this, but I plan on doing so shortly.<br />
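To put a rough number on that tradeoff, here's a quick back-of-the-envelope sketch - again assuming a 32-byte order-0 block, with <code>wastedBytes</code> being an illustrative helper rather than anything in Schego - of how much internal fragmentation a given allocation costs:<br />

```go
package main

import "fmt"

const blockSize = 32 // hypothetical order-0 size, matching Schego's current setting

// wastedBytes returns the internal fragmentation for an allocation:
// the capacity of the smallest block that fits, minus the bytes requested.
func wastedBytes(requestedBytes uint64) uint64 {
	capacity := uint64(blockSize)
	for capacity < requestedBytes {
		capacity *= 2 // climb one order at a time
	}
	return capacity - requestedBytes
}

func main() {
	fmt.Println(wastedBytes(8))  // 24 - an 8-byte int burns most of its block
	fmt.Println(wastedBytes(33)) // 31 - just past a boundary is close to worst-case
	fmt.Println(wastedBytes(64)) // 0  - power-of-two sizes fit exactly
}
```

The worst case is always an allocation just past a power-of-two boundary, where nearly half the block (minus one byte) goes unused - which is exactly why the choice of order-0 size matters so much.<br />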
<br />
That was a lot to take in, I know, but once we have our heap built and tested, the most we'll have to interact with it is to use its Allocate() and Free() functions - the rest is all under the hood. So all that said, let's examine the next piece of the puzzle - strings.<br />
<br />
Strings aren't a difficult concept. They're just a bunch of bytes in memory all lined up in a row that get treated as letters and stuff - most of you reading this now could've probably implemented basic string functionality in Schego given a heap allocator pretty easily. The R7RS Scheme spec throws a mild curveball at us, however, in that proper string support also involves supporting <a href="https://en.wikipedia.org/wiki/UTF-8">UTF-8</a> as well.<br />
<br />
Pretty much all of you are familiar with the ASCII character encoding that older languages like C support out of the box - one byte per character, all that good stuff. Most of you have probably heard of UTF-8 (and definitely used it - you're reading UTF-8-encoded text right now), but maybe not had to ever think about writing explicit support for it in your applications - many programming languages today like Python 3 offer built-in support for UTF-8 in their strings.<br />
<br />
Going over the UTF-8 specification, it's fairly simple, but for those of you coming from ASCII, you now have to deal with the fact that you're not guaranteed that every character is going to be some fixed number of bytes; for instance, all ASCII characters have the same exact length and value in UTF-8, but all non-Latin characters take up more than one byte. In technical terms, UTF-8 is what is known as a variable-width character encoding.<br />
<br />
This requires only a little bit of planning on our end, though, since UTF-8 makes it pretty easy to figure out how large each character is. All we have to do is check the leading bits of the first byte. If the most significant bit is 0, we've got a 1-byte ASCII character (or the null character) on our hands. If the byte starts with the bits 110 - an upper nibble of 0xC or 0xD - our character will be 2 bytes long, and if it starts with 1110 (an upper nibble of 0xE), 3 bytes. Currently Schego assumes that any other value means the character is 4 bytes long, which will always be the case in properly-formed UTF-8 text, but could pose a problem when ensuring no invalid characters slip through the cracks. Anyway, here's how Schego currently reads in UTF-8 text off the stack:<br />
<br />
<pre class="prettyprint"> utfBytes := make([]byte, 0)
for {
firstByte := v.ReadBytes(1)[0]
if firstByte&0x80 == 0 || firstByte == 0 {
// ASCII or null byte
utfBytes = append(utfBytes, firstByte)
// null?
if firstByte == 0 {
break
} else {
continue
}
}
// with UTF-8, the most significant bits tell us
// how many bytes to read, so construct some simple bitmasks
// to check
// see: https://en.wikipedia.org/wiki/UTF-8#Description
// essentially, check the upper bits of the first byte,
// with 11110 meaning read 4 bytes total, 1110 3 bytes, and
// 110 (an upper nibble of 0xC or 0xD), 2 bytes
var codepointLength int
upperFourBits := firstByte >> 4
if upperFourBits == 0xC || upperFourBits == 0xD {
codepointLength = 2
} else if upperFourBits == 0xE {
codepointLength = 3
} else {
// naively assume 4-byte codepoint length,
// could be dangerous, should probably be replaced
// at some point in the future
codepointLength = 4
}
// we've already read one byte (firstByte)
numBytes := codepointLength - 1
extraBytes := v.ReadBytes(numBytes)
utfRune := append([]byte{firstByte}, extraBytes...)
// append the finished character
utfBytes = append(utfBytes, utfRune...)
}
v.Stack.PushString(utfBytes)
</pre>
<br />
That's actually the entire current implementation of the opcode to push a string onto the stack. Not that bad, huh? Putting strings on the heap isn't too much extra work. All we really have to do is ensure there's enough space to store the string in the heap, store the length along with it, and maybe reallocate space if what we currently allocated isn't enough:<br />
<br />
<pre class="prettyprint"> mnemonic := string(v.ReadBytes(2))
address := v.mnemonicMap[mnemonic]
strBytes := v.Stack.PopString()
var strBuffer bytes.Buffer
numBytes, _ := strBuffer.Write(strBytes)
numBytes64 := uint64(numBytes)
var allocatedBytes uint64
binary.Read(v.Heap.Read(8, address), binary.LittleEndian, &allocatedBytes)
if numBytes64 > allocatedBytes {
v.Heap.Free(address)
newAddress := v.Heap.Allocate(8 + numBytes64)
v.mnemonicMap[mnemonic] = newAddress
intBuffer := bytes.NewBuffer(make([]byte, 0))
binary.Write(intBuffer, binary.LittleEndian, &numBytes64)
v.Heap.Write(intBuffer, newAddress)
// offset by 8 to avoid writing over the int we just wrote
v.Heap.Write(&strBuffer, newAddress+8)
} else {
v.Heap.Write(&strBuffer, address+8)
}
</pre>
<br />
This is also a good time to introduce Schego's reference mnemonics into the picture, which are just 2 arbitrary bytes that can refer to some point in memory. You can think of them like variable names or pointers - the heap allocator decides what specific address in memory they point to, but you can always refer to that address with the name.<br />
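The idea is easy to see in miniature. This sketch isn't Schego's actual internals - the map and the addresses are made up for illustration - but it shows how 2 arbitrary bytes can act as a stable name even when the underlying address changes:<br />

```go
package main

import "fmt"

func main() {
	// a mnemonic is just 2 arbitrary bytes; keying a map with them
	// (as a string) lets us look up whatever address the allocator
	// most recently bound to that name
	mnemonicMap := make(map[string]uint64)

	// "allocate" a block and bind it to the mnemonic 0xBEEF
	mnemonic := string([]byte{0xBE, 0xEF})
	mnemonicMap[mnemonic] = 128 // pretend the allocator handed back address 128
	fmt.Println(mnemonicMap[mnemonic]) // 128

	// later, a reallocation can rebind the same name to a new block,
	// and code referring to 0xBEEF picks up the new address automatically
	mnemonicMap[mnemonic] = 256
	fmt.Println(mnemonicMap[mnemonic]) // 256
}
```

This indirection is what lets the string-store opcode above free and reallocate a too-small block without anything else having to know the address moved.<br />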
<br />
That about does it for strings. There are some test cases showing off the UTF-8 support, but you can go <a href="https://github.com/DangerOnTheRanger/schego">look at the code</a> yourself if you're curious - they don't exactly tell you anything you don't already know at this point. So let's keep moving to linked lists.<br />
<br />
Schego, like a good little Scheme implementation, provides bytecode-level support for linked lists. Under the hood, a single linked list node in Schego is pretty straightforward, being composed of just 3 unsigned integers: the first stores the number of bytes allocated for the data held in the node, the next stores the address of the data itself, and the last stores the address of the next node in the list. This basically means we just pop 3 integers at a time inside the linked list-related opcodes, so as long as our underlying code is solid (implementing one of the opcodes revealed a bug in the addi instruction that caused an extra byte to be prepended to every sum - ouch) and we always push/pop in the correct order, the resulting implementation is pretty boring. For instance, this is the implementation for what's known as car (Contents of Address Register - long story), or in other words, the function to retrieve the data stored at a linked list node and put it onto the stack:<br />
<br />
<br />
<pre class="prettyprint"> cell := v.Stack.PopCell()
numBytes := cell[0]
dataAddress := cell[1]
cellData := v.Heap.Read(numBytes, dataAddress).Bytes()
for i := uint64(0); i < numBytes; i++ {
v.Stack.PushByte(cellData[numBytes-i-1])
}
// hack to get VMStack.lenLastPushed to play nicely with hcar
// should probably be replaced with a dedicated method on
// VMStack in the future (PushCellData?)
v.Stack.lenLastPushed = numBytes
</pre>
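For intuition, here's the same three-integer cell layout and the car/cdr operations sketched as ordinary Go - purely illustrative, since Schego really stores these fields as raw integers on its heap rather than as a struct with slices and pointers:<br />

```go
package main

import "fmt"

// cell mirrors Schego's node layout: size of the data, "address" of the
// data, and "address" of the next node - with a Go slice and pointer
// standing in for raw heap addresses.
type cell struct {
	numBytes uint64
	data     []byte
	next     *cell
}

// car fetches a node's data; cdr fetches the rest of the list - the
// same two operations the opcodes here implement at the byte level.
func car(c *cell) []byte { return c.data }
func cdr(c *cell) *cell  { return c.next }

// sumList builds the list (1 2 3) cons-style and sums it, which is
// exactly what the bytecode test later in this article does.
func sumList() int {
	var list *cell
	for _, b := range []byte{3, 2, 1} {
		list = &cell{numBytes: 1, data: []byte{b}, next: list}
	}
	sum := 0
	for c := list; c != nil; c = cdr(c) {
		sum += int(car(c)[0])
	}
	return sum
}

func main() {
	fmt.Println(sumList()) // 6
}
```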
<br />
Not that involved, right? Let's close out the article with the test case that shows how all the list opcodes are meant to be used together:<br />
<br />
<pre class="prettyprint">func TestListSum(t *testing.T) {
// create a list containing the values 1, 2, and 3,
// and sum them up
// mnemonics:
// 0xBEEF - 1st list cell/list head
// 0xDEAD - 2nd list cell
// 0xFACE - 3rd list cell
// 0xACED - sum counter
opcodes := []byte{
0x25, // hnewl
0xBE,
0xEF, // reference mnemonic - 0xBEEF
0x06, // cons
0x03, // pushi
0x01,
0x00,
0x00,
0x00,
0x00,
0x00,
0x00,
0x00, // 1
0x4B, // hscar
0x0D, // hstorel
0xBE,
0xEF, // reference mnemonic - 0xBEEF
0x25, // hnewl
0xDE,
0xAD, // reference mnemonic - 0xDEAD
0x06, // cons
0x03, // pushi
0x02,
0x00,
0x00,
0x00,
0x00,
0x00,
0x00,
0x00, // 2
0x4B, // hscar
0x0D, // hstorel
0xDE,
0xAD, // reference mnemonic - 0xDEAD
0x19, // hloadl
0xBE,
0xEF, // reference mnemonic - 0xBEEF
0x4D, // hscdr
0xDE,
0xAD, // reference mnemonic - 0xDEAD
0x0D, // hstorel
0xBE,
0xEF, // reference mnemonic - 0xBEEF
0x25, // hnewl
0xFA,
0xCE, // reference mnemonic - 0xFACE
0x06, // cons
0x03, // pushi
0x03,
0x00,
0x00,
0x00,
0x00,
0x00,
0x00,
0x00, // 3
0x4B, // hscar
0x0D, // hstorel
0xFA,
0xCE, // reference mnemonic - 0xFACE
0x19, // hloadl
0xDE,
0xAD, // reference mnemonic - 0xDEAD
0x4D, // hscdr
0xFA,
0xCE, // reference mnemonic - 0xFACE
0x0D, // hstorel
0xDE,
0xAD, // reference mnemonic - 0xDEAD
0x22, // hnewi
0xAC,
0xED, // reference mnemonic - 0xACED
// zero out our allocated memory, as we are not
// currently guaranteed any new memory will be zeroed
0x03, // pushi
0x00,
0x00,
0x00,
0x00,
0x00,
0x00,
0x00,
0x00, // 0
0x0A, // hstorei
0xAC,
0xED, // reference mnemonic - 0xACED
0x19, // hloadl
0xBE,
0xEF, // reference mnemonic - 0xBEEF
// use dup/hcdr to leave the list cells on the stack
// in reverse sequential order (3-2-1 from top to bottom, and not 1-2-3)
0x07, // dup
0x49, // hcdr
0x07, // dup
0x49, // hcdr
0x47, // hcar
0x16, // hloadi
0xAC,
0xED, // reference mnemonic - 0xACED
0x36, // addi
0x0A, // hstorei
0xAC,
0xED, // reference mnemonic - 0xACED
0x47, // hcar
0x16, // hloadi
0xAC,
0xED, // reference mnemonic - 0xACED
0x36, // addi
0x0A, // hstorei
0xAC,
0xED, // reference mnemonic - 0xACED
0x47, // hcar
0x16, // hloadi
0xAC,
0xED, // reference mnemonic - 0xACED
0x36, // addi
0x43, // syscall
0x03, // print integer
0x03, // pushi
0x00,
0x00,
0x00,
0x00,
0x00,
0x00,
0x00,
0x00, // 0
0x43, // syscall
0x06, // exit
}
console := DummyConsole{}
RunVM(opcodes, &console)
if console.consoleOutput != "6" {
t.Error("Incorrect output, got: ", console.consoleOutput)
}
}
</pre>
<br />
Pretty lengthy, I know, but only because of the tedium involved at writing out a linked list at the bytecode level. If you're already familiar with Scheme/LISP, or just functional programming in general, the general flow of the program should make a fair amount of sense. But that about does it for this installment. As usual, the code is on <a href="https://github.com/DangerOnTheRanger/schego">GitHub</a> if you want to take a deeper look.Kermit Alexander IIhttp://www.blogger.com/profile/02557138366170479827noreply@blogger.com0tag:blogger.com,1999:blog-8516266397528346891.post-5450048152140765242018-04-25T15:12:00.001-05:002018-04-25T15:12:50.502-05:00Schego Part 3, Or Virtual Machines Aren't Really That HardSince the <a href="http://dangerontheranger.blogspot.com/2017/10/schego-part-2-or-on-parsing-and-other.html">previous installment</a> in the Schego saga, a lot's happened. So much, in fact, that I've split up the next update into 2 separate parts - the one you're reading now, and one that will follow very shortly. Life has a way of getting in the way; you get so absorbed with "important" stuff you forget the blog series you promised to update on a reasonably-sane basis. On the upside, those of you paying attention will notice that I've still been updating the actual codebase itself with some pretty exciting features, and I'll do my best to go over some of them with this entry, in which we'll be looking at stack-based virtual machines. I'm thinking most of you will be familiar with the concept of a virtual machine, but perhaps haven't coded one before, so I'm not going to delve into what a virtual machine is. You might not have heard the term "stack-based virtual machine," so I'll take a slight detour to cover that first.<br />
<br />
Essentially, there are - at a very high level - two major types of virtual machines: stack-based and register-based. A stack-based virtual machine requires that we pull any data we want to manipulate or inspect onto a stack the virtual machine makes available before doing anything with it. A register-based virtual machine, on the other hand, lets you store the result of any calculation directly into a register (hence the name) without any stack usage. I personally find stack-based VMs a bit easier to understand, and they're arguably more widespread; both the .NET CLR and the JVM are implemented as stack-based VMs, though as a notable exception, Lua's VM is a register-based implementation. In any case, Schego's VM is a stack-based one.<br />
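To make the distinction concrete, here's a quick sketch - in ordinary Go rather than bytecode, with made-up instruction names in the comments - of how each kind of machine might evaluate (2+3)*4:<br />

```go
package main

import "fmt"

// stackEval computes (2+3)*4 the way a stack machine would: operands
// are pushed, and each operation pops its inputs and pushes its result.
func stackEval() int {
	var stack []int
	push := func(v int) { stack = append(stack, v) }
	pop := func() int {
		v := stack[len(stack)-1]
		stack = stack[:len(stack)-1]
		return v
	}
	push(2)             // pushi 2
	push(3)             // pushi 3
	push(pop() + pop()) // addi: pops 3 and 2, pushes 5
	push(4)             // pushi 4
	push(pop() * pop()) // muli: pops 4 and 5, pushes 20
	return pop()
}

// registerEval computes the same expression the way a register machine
// would: every instruction names its operands and destination directly.
func registerEval() int {
	var r0, r1, r2 int
	r0 = 2       // load r0, 2
	r1 = 3       // load r1, 3
	r0 = r0 + r1 // add r0, r0, r1
	r2 = 4       // load r2, 4
	r0 = r0 * r2 // mul r0, r0, r2
	return r0
}

func main() {
	fmt.Println(stackEval(), registerEval()) // 20 20
}
```

Same arithmetic either way - the difference is purely in where intermediate values live.<br />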
<br />
With some terminology out of the way, how do we go about implementing a VM? They're actually fairly straightforward things - we grab the next instruction to execute and simply run a different function depending on the one we grab. Slightly over-simplified, we create a giant switch-case structure and switch on the instruction. The beginning of the VMState.Step() function in Schego, which executes a single opcode, actually starts like this:<br />
<br />
<pre class="prettyprint">func (v *VMState) Step() {
if v.CanStep() == false {
// TODO: properly handle finished VM
return
}
currentOpcode := v.NextOpcode()
switch currentOpcode {
// [opcode implementations here]
</pre>
<br />
You could also handle this as a hashtable if you wanted, with the keys being the instructions and the values being function pointers to their respective implementations, but Lua uses a switch-case structure for this part of its VM, and for me, that's a sign that I probably don't stand to gain enough from switching.<br />
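For the curious, the hashtable version would look something along these lines - a toy VM with invented internals, not Schego's actual code, just to show the shape of table-driven dispatch:<br />

```go
package main

import "fmt"

// toyVM is a stripped-down stand-in for Schego's VMState - the names
// and layout here are made up for illustration.
type toyVM struct {
	code  []byte
	pc    int
	stack []int64
}

// handlers plays the role of the switch-case: each opcode byte maps
// straight to the function implementing it.
var handlers = map[byte]func(*toyVM){
	0x03: func(v *toyVM) { // pushi (toy version with a 1-byte operand)
		v.stack = append(v.stack, int64(v.code[v.pc]))
		v.pc++
	},
	0x36: func(v *toyVM) { // addi
		n := len(v.stack)
		sum := v.stack[n-2] + v.stack[n-1]
		v.stack = append(v.stack[:n-2], sum)
	},
}

// run is the whole dispatch loop: fetch an opcode, look up its handler,
// call it, repeat.
func run(code []byte) int64 {
	v := &toyVM{code: code}
	for v.pc < len(v.code) {
		op := v.code[v.pc]
		v.pc++
		handlers[op](v)
	}
	return v.stack[len(v.stack)-1]
}

func main() {
	// pushi 2, pushi 3, addi
	fmt.Println(run([]byte{0x03, 2, 0x03, 3, 0x36})) // 5
}
```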
<br />
What does an instruction look like, though? In Schego, as in most other VMs, an instruction (or opcode) is just a single byte that signifies what the following bytes in memory should be taken as. For an opcode to push an integer onto the stack, for instance, we'd expect the following bytes in memory to be the integer to push, like we have here:<br />
<br />
<pre class="prettyprint">case 0x03:
// pushi
// simply grab the next 8 bytes and push them
intBytes := v.ReadBytes(8)
v.Stack.PushInt(intBytes)
</pre>
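For concreteness, here's how those 8 bytes get laid out. Integers in the bytecode are little-endian (the jne implementation below decodes its operand with binary.LittleEndian, and the test programs later on spell integers out low byte first), so a hypothetical assembler helper for pushi might look like this:<br />

```go
package main

import (
	"encoding/binary"
	"fmt"
)

// encodePushi emits a pushi instruction: the 0x03 opcode followed by
// the operand as 8 little-endian bytes. (Hypothetical helper - Schego's
// test programs currently write these bytes out by hand.)
func encodePushi(n uint64) []byte {
	buf := make([]byte, 9)
	buf[0] = 0x03
	binary.LittleEndian.PutUint64(buf[1:], n)
	return buf
}

func main() {
	fmt.Printf("% x\n", encodePushi(1)) // 03 01 00 00 00 00 00 00 00
	// round-tripping the operand recovers the original integer
	fmt.Println(binary.LittleEndian.Uint64(encodePushi(1)[1:])) // 1
}
```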
<br />
Not that hard to follow, right? Just grab the next 8 bytes in the program and treat them as a 64-bit integer to be pushed onto our stack for the rest of our program to manipulate. After that, we know the next byte should be treated as another opcode, and we can run our switch-case statement again. We can interact with the stack with a pop too, as in the case of comparing two integers:<br />
<br />
<pre class="prettyprint">case 0x40:
// cmpi
y := v.Stack.PopInt()
x := v.Stack.PopInt()
if x == y {
v.Stack.PushByte(0)
} else if x > y {
v.Stack.PushByte(1)
} else {
v.Stack.PushByte(2)
}
</pre>
<br />
And here's an instruction we might use after comparing two integers - jne (jump if not equal):<br />
<br />
<pre class="prettyprint">func (v *VMState) jump() {
addressBytes := v.ReadBytes(8)
var address int64
binary.Read(bytes.NewBuffer(addressBytes), binary.LittleEndian, &address)
v.opcodeBuffer.Seek(address, io.SeekCurrent)
}
//----------------------
case 0x2D:
// jne
cmpResult := v.Stack.PopByte()
if cmpResult != 0 {
v.jump()
} else {
// skip the jump address
v.opcodeBuffer.Seek(8, io.SeekCurrent)
}
</pre>
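Since jump() seeks relative to the current position, jne's 8-byte operand is a signed offset measured from the byte just past the operand itself. This little standalone sketch - not Schego code, though it reuses the same binary.Read/Seek calls - replays that arithmetic with a bytes.Reader standing in for v.opcodeBuffer:<br />

```go
package main

import (
	"bytes"
	"encoding/binary"
	"fmt"
	"io"
)

// jneDemo walks through one taken jne by hand.
func jneDemo() byte {
	code := []byte{
		0x2D,                      // jne
		0x02, 0, 0, 0, 0, 0, 0, 0, // offset of +2, little-endian
		0xAA, 0xBB, // two bytes the jump hops over
		0x99, // jump target
	}
	r := bytes.NewReader(code)
	r.ReadByte() // fetch the jne opcode itself
	var offset int64
	binary.Read(r, binary.LittleEndian, &offset) // read the operand...
	r.Seek(offset, io.SeekCurrent)               // ...and take the jump
	target, _ := r.ReadByte()
	return target
}

func main() {
	fmt.Printf("jumped to byte 0x%X\n", jneDemo()) // jumped to byte 0x99
}
```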
<br />
See how we're simply reimplementing the same logic our physical CPU might otherwise perform for us? That's all a virtual machine is - no black magic or anything else crazy. At least for now; things will get slightly hairy later on down the line. But the basic switch-case logic detailed above is something all VMs share.<br />
<br />
Now that we've seen how individual opcodes are implemented, what might a full-blown program in our little language look like? Here's the very first test I wrote for Schego's VM - a small program that just pushes an ASCII string of "Hello, World!" onto the stack before printing it and exiting with status code 0:<br />
<br />
<pre class="prettyprint">type DummyConsole struct {
consoleOutput string
}
func (d *DummyConsole) Write(line string) {
// trim null
d.consoleOutput = strings.TrimRight(line, "\x00")
}
func TestHelloWorld(t *testing.T) {
opcodes := []byte{
0x05, // pushs
0x48, // H
0x65, // e
0x6C, // l
0x6C, // l
0x6F, // o
0x2C, // ,
0x20, // space
0x57, // W
0x6F, // o
0x72, // r
0x6C, // l
0x64, // d
0x21, // !
0x0A, // \n
0x00, // null
0x43, // syscall
0x05, // print string
0x03, // pushi
0x00,
0x00,
0x00,
0x00,
0x00,
0x00,
0x00,
0x00, // 0
0x43, // syscall
0x06, // exit
}
console := DummyConsole{}
retcode := RunVM(opcodes, &console)
if retcode != 0 {
t.Error("Expected return code of 0, got:\n", retcode)
}
if console.consoleOutput != "Hello, World!\n" {
t.Error("Incorrect output, got: ", console.consoleOutput)
}
}
</pre>
<br />
The opcodes in the test are fairly well annotated, so it should be pretty obvious what's going on. I've skipped some uninteresting stuff (like the specific implementation of Schego's stack) for the sake of keeping this article brief and interesting, and also some potentially interesting stuff (how do we push a string onto the stack?), since I intend on covering that in a later installment - especially considering how technically dense the next article will be.<br />
<br />
But you guys will have to wait just a little bit longer for that one. As always, though, the code is up on <a href="https://github.com/DangerOnTheRanger/schego">GitHub</a> if you want to take a further look.Kermit Alexander IIhttp://www.blogger.com/profile/02557138366170479827noreply@blogger.com0tag:blogger.com,1999:blog-8516266397528346891.post-11873011445947490962018-04-23T15:06:00.000-05:002018-04-23T15:06:01.471-05:00Sophomore year: a (rambling) retrospective<div style="text-align: center;">
<i>I wish there was a way to know you're in the good old days before you've actually left them.</i></div>
<div style="text-align: left;">
<br /></div>
<div style="text-align: left;">
"Gone in a flash" would be the phrase that first came to mind if you asked me to describe how my sophomore year of college went. Honestly, college in general has seemingly flown by; it doesn't seem like all that long ago I was a high school graduate, thinking of college as the "next big phase" of his life. And yet somehow, I'm on the tail end of that journey already. You've probably heard the old bit about how life seems to move faster as you get older - it certainly seems that way to me, at least.</div>
<div style="text-align: left;">
<br /></div>
<div style="text-align: left;">
The scenery is different compared to freshman year, if nothing else. I've traded the empty expanse of the dorm lobby for the living room of my apartment. I've got working internet without having to leave my own abode - though it was only a few hours prior to typing this that I got the wireless drivers working on the Debian installation on my Thinkpad. Unlike last time, I'm not sick, and I'm pretty grateful for that, considering finals week is looming. It's a lot quieter here in the living room than it was in the lobby, except for the unerring tick-tock of the wall clock mounted above the utility room door. The home theater setup my roommates and I collectively built sits against the wall opposite me - a gorgeous 4K TV and subwoofer in addition to a Blu-ray player and my trusty PlayStation 2 and PlayStation 4. I wish someone would've told me how little time for gaming I would have here, though.</div>
<div style="text-align: left;">
<br /></div>
<div style="text-align: left;">
The outside world isn't all that's changed. Skipping class has gone from a moral affront of the worst degree to a last resort if I can't keep myself awake during prior lectures. I've noticed myself becoming much more proactive and confident when it comes to getting involved in campus activities. I'm an undergraduate researcher now - not something I ever saw myself becoming during high school or even freshman year. I got a job offer from Cisco to come work with them in Silicon Valley over the summer. I've volunteered to a great degree to get the good word out about STEM, probably reaching as many grade school children as I did my entire time in high school. There's still room for improvement, though; I told myself I would get involved in hockey but have failed to do so yet, and I've been on the fence about a workout routine for some time now. Maybe putting all that into words will finally shame me into action, or so I hope.</div>
<div style="text-align: left;">
<br /></div>
<div style="text-align: left;">
Honestly, with the whirlwind that sophomore year has been, I wonder what I was doing at all during my freshman year. On paper, freshman year was pretty quiet for the most part; classes, some volunteering activities here and there, not much else. Ultimately, I just think I was simply unaware of how far I could truly push myself.</div>
<div style="text-align: left;">
<br /></div>
<div style="text-align: left;">
There's a balance to be found between work and pleasure. Some of my fondest memories from freshman year are of me doing inane things with friends, or lazing around doing nothing in particular. I remember a particular afternoon when a copy of Watchmen caught my eye in the university bookstore, and I spent several subsequent afternoons reading it on the steps in front of the plinth at the center of campus. It's a really pretty spot, still one of my favorites on campus. The pool surrounding the plinth, the row of trees adorning the linear pools further on that you can just make out if you crane your neck from the top of the steps. It wasn't there my freshman year, but immediately behind where I once read now stands the largest indoor Starbucks in the state, so there's that too.</div>
<div style="text-align: left;">
<br /></div>
<div style="text-align: left;">
I wouldn't trade any of those memories for the world, but I wouldn't part with any of my newfound achievements, either; both have their place. I think it does one no good to be on top of the world, but have nobody to celebrate it with. Conversely, I would almost consider my life wasted if I spent it in a purely cavalier manner - there is far too much I want to do, to see, to contribute, to burn it all away on pleasure. I don't claim to know how much of both is right for me or anyone else, mind you; I wouldn't be surprised if I found myself dissatisfied with sophomore year come junior year.</div>
<div style="text-align: left;">
<br /></div>
<div style="text-align: left;">
Yet, to a certain extent, I welcome that sense of discontent. I take it to mean that I am holding myself to a higher standard than I did before. At the end of the day, how much more can you ask from someone aside from steady improvement? As long as you know where you stand and where you're headed, I don't think you can go wrong.</div>
<div style="text-align: left;">
<br /></div>
<div style="text-align: left;">
So here's to the latter half of undergraduate life, with even more joy and achievements to be had. I hope one day I'll be able to look back on sophomore year with dissatisfaction.</div>
Kermit Alexander IIhttp://www.blogger.com/profile/02557138366170479827noreply@blogger.com0tag:blogger.com,1999:blog-8516266397528346891.post-26091564706317262017-10-01T16:45:00.002-05:002017-10-01T16:45:57.429-05:00Schego Part 2, Or On Parsing and Other Related StuffAlright, part 2. And only a couple weeks later than planned to boot!<br />
<br />
<a href="http://dangerontheranger.blogspot.com/2017/08/schego-part-1-or-why-implementing.html">Last time</a>, we discussed the basics of lexing - turning a string into a list of tokens - and the ins and outs of how that's done. So now, we'll move onto parsing, which is the next step in the process of turning our original source code into something the computer can understand.<br />
The rationale behind the need for the parsing process is very simple, and mirrors that of the need for lexing: It's a lot easier to reason about generating code if you're working at the statement/expression level instead of at the token level. Think about it. If you were trying to write a program using a description of what that program is supposed to do, wouldn't you rather the description say "put an if statement here" instead of "stick a semicolon here and a number there"?<br />
Additionally - and this is probably at least two future blog posts away - it's a lot easier to perform optimizations when thinking at the statement level as well. If you have a general idea of what the original code is attempting to do, you can potentially replace it with something faster. We'll need other data structures besides what our parsing process implements, but we'll cross that bridge when we get to it.<br />
So, how is Schego's parser built? At a high level, much the same way as any parser: We represent the program's structure with a tree, known as an AST (Abstract Syntax Tree), with each individual node adhering to this interface:
<br />
<br />
<pre class="prettyprint">// base interface for functions needing to accept any kind of AST node
type AstNode interface {
GetSubNodes() []AstNode
AddSubNode(AstNode)
GetType() AstNodeType
DebugString() string
}
</pre>
<br />
If you recall the first article, I didn't create any interfaces there, so why the change? There will be functions - as we'll see in a moment - that should ideally work with/return any potential combination of AST nodes; when you're parsing a language where literally everything evaluates to some kind of expression, you need flexibility without repeating yourself. Since Go doesn't have anything like partially abstract classes, I had to split the definition and implementation into two separate places, with the definition living in AstNode and the baseline implementation in another struct, SExp - short for S-expression, which is what the structure of LISP/Scheme is based off of:<br />
<br />
<pre class="prettyprint">// base struct that all AST node implementations build off of
type SExp struct {
subNodes []AstNode
}
func (s *SExp) GetSubNodes() []AstNode {
return s.subNodes
}
func (s *SExp) AddSubNode(node AstNode) {
s.subNodes = append(s.subNodes, node)
}
</pre>
<br />
Still pretty straightforward, right? Specific nodes implement GetType() and DebugString() to return their proper values, and the basic stuff that should only be written once gets written once. By the way, GetType() returns a value from an enum describing what sort of node the struct is. This isn't normally something we'll need, but for certain optimizations, including the tail-call optimization technique required for all Scheme implementations, it'll be handy to see if some node has other sub-nodes with a certain type.<br />
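The AstNodeType enum itself isn't shown in this article; here's a plausible sketch of it, following the same iota pattern the TokenType enum uses in part 1. The exact names and ordering are guesses on my part, apart from IfNode and StringNode, which appear in the node implementations below:<br />

```go
package main

import "fmt"

// AstNodeType tags each AST node with its kind, so code walking the
// tree can check node types without type assertions.
type AstNodeType int

const (
	ProgramNode AstNodeType = iota
	IntNode
	FloatNode
	StringNode
	BoolNode
	IfNode
)

func main() {
	// GetType() comparisons - like the tail-call check mentioned above -
	// boil down to simple integer equality against these constants
	nodeType := IfNode
	fmt.Println(nodeType == IfNode, nodeType == StringNode) // true false
}
```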
For the actual implementation of one of the nodes, here's the current implementation of if expressions:<br />
<br />
<pre class="prettyprint">type IfExp struct {
SExp
}
func NewIfExp(cond AstNode, onTrue AstNode, onFalse AstNode) *IfExp {
node := new(IfExp)
node.AddSubNode(cond)
node.AddSubNode(onTrue)
node.AddSubNode(onFalse)
return node
}
func (i IfExp) GetType() AstNodeType {
return IfNode
}
func (i IfExp) DebugString() string {
return "IfExp(" + i.subNodes[0].DebugString() + ", " + i.subNodes[1].DebugString() + ", " + i.subNodes[2].DebugString() + ")"
}
</pre>
<br />
Nothing too out of the ordinary here. But what about literals?<br />
<br />
<pre class="prettyprint">type StringLiteral struct {
SExp
Value string
}
func NewStringLiteral(value string) *StringLiteral {
node := new(StringLiteral)
node.Value = value
return node
}
func (s StringLiteral) GetType() AstNodeType {
return StringNode
}
func (s StringLiteral) DebugString() string {
return "\"" + s.Value + "\""
}
</pre>
<br />
You might think that there's really no need for a string literal - or any other sort of literal, really - to be able to contain sub-nodes, an ability brought in by the SExp struct. And you'd be right; this is something of a leaky abstraction. For now, I think it's fine. Any code that actually messes with the AST after the fact should be calling GetType() in the first place, and due to the way parsing works in Schego, nothing modifies a literal after construction except the same code that created it from the original tokens. I'll also admit that I don't know for sure whether this is an idiomatic Go solution to the problem, but it seemed fairly intuitive to me. If there's something more elegant out there, do let me know - I'm all in favor of doing no more work than necessary!<br />
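If the leak ever does become a real problem, one hypothetical fix - not something Schego currently does - is to give literals a childless base struct, so the sub-node machinery simply isn't present to misuse:<br />

```go
package main

import "fmt"

// AstNode stays the common currency of the parser (simplified here to
// one method), but literals now build on Atom instead of SExp.
type AstNode interface {
	DebugString() string
}

// Atom is a base for nodes that can never have children - it
// deliberately carries no subNodes slice and no AddSubNode method.
type Atom struct{}

type IntLiteral struct {
	Atom
	Value int64
}

func (i IntLiteral) DebugString() string {
	return fmt.Sprint(i.Value)
}

func main() {
	var n AstNode = IntLiteral{Value: 42}
	fmt.Println(n.DebugString()) // 42
	// n.AddSubNode(...) would now be a compile-time error rather than
	// a silently-tolerated call
}
```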
On a side note, structuring our data like this makes our tests fairly easy to write. All we have to do is compare the AST our parser produced against one constructed by hand in the test, like this:<br />
<br />
<pre class="prettyprint">func TestParseSingleExp(t *testing.T) {
tokens := LexExp("(+ 5 3)")
program := ParseTokens(tokens)
expectedProgram := NewProgram(NewAddExp(NewIntLiteral(5), NewIntLiteral(3)))
checkProgram(program, expectedProgram, t)
}
</pre>
<br />
But this article was about parsing, right? How do we actually do the parsing bit? We've covered the data structures behind the process, but haven't detailed the actual parsing algorithm itself. Schego's parser is what's known as a "recursive descent" parser; essentially, we define our "grammar" - the set of rules that describes our language - recursively. Take the if expression, for instance. We might say it's defined by the word "if," followed by a conditional expression, followed by two more expressions that will be evaluated if the conditional evaluates to true or false.<br />
What do we define as an expression? It could be a lot of different things - a literal, a math formula, a function, another if expression (!), lots of stuff. What's important is that when we're constructing an if expression, we ask the parser to run its general expression-parsing function for the if's expressions. As long as we handle the base cases (literals/identifiers) properly, we can throw all sorts of wonky expressions at our if-parsing routine and it'll handle them just fine due to the recursion.<br />
Additionally, the resultant code is fairly reusable inside the parser itself. If we define a function to parse, say, a list literal, we can simply call that function any time we expect a list literal, and it handles all the nitty-gritty details of validation and list-literal-specific parsing for us. Pretty much a win-win no matter how you look at it. But enough justification. What does Schego's parsing code look like?<br />
<br />
<pre class="prettyprint">// ParseTokens takes tokens and returns an AST (Abstract Syntax Tree) representation
func ParseTokens(tokens []*Token) *Program {
program := NewProgram()
currentIndex := 0
for len(tokens)-1 >= currentIndex {
node, _ := parseExpression(tokens, &currentIndex)
program.AddSubNode(node)
}
return program
}
// accept checks to see if the current token matches a given token type, and advances if so
func accept(tokens []*Token, expectedType TokenType, currentIndex *int) bool {
if tokens[*currentIndex].Type == expectedType {
*currentIndex++
return true
}
return false
}
// grabAccepted returns the token just before current, useful for grabbing the value of an accepted token
func grabAccepted(tokens []*Token, currentIndex *int) *Token {
return tokens[*currentIndex-1]
}
// expect returns an error if the current token doesn't match the given type
func expect(tokens []*Token, expectedType TokenType, currentIndex *int) error {
if len(tokens)-1 < *currentIndex {
return errors.New("Unexpected EOF")
} else if tokens[*currentIndex].Type != expectedType {
return errors.New("Unexpected token")
}
return nil
}
func parseExpression(tokens []*Token, currentIndex *int) (AstNode, error) {
// try literals first
if accept(tokens, TokenIntLiteral, currentIndex) {
literal := grabAccepted(tokens, currentIndex)
return NewIntLiteral(bufferToInt(literal.Value)), nil
} else if accept(tokens, TokenFloatLiteral, currentIndex) {
literal := grabAccepted(tokens, currentIndex)
return NewFloatLiteral(bufferToFloat(literal.Value)), nil
} else if accept(tokens, TokenStringLiteral, currentIndex) {
literal := grabAccepted(tokens, currentIndex)
return NewStringLiteral(literal.Value.String()), nil
} else if accept(tokens, TokenBoolLiteral, currentIndex) {
literal := grabAccepted(tokens, currentIndex)
return NewBoolLiteral(literal.Value.Bytes()[0] == 1), nil
}
// not a literal, attempt to parse an expression
lparenError := expect(tokens, TokenLParen, currentIndex)
if lparenError != nil {
return nil, lparenError
}
// jump past the lparen
*currentIndex++
if accept(tokens, TokenOp, currentIndex) {
// operator-parsing code here
}
if accept(tokens, TokenIdent, currentIndex) {
identToken := grabAccepted(tokens, currentIndex)
switch identToken.Value.String() {
case "if":
// TODO: error-handling here (and throughout the parser!)
cond, _ := parseExpression(tokens, currentIndex)
ifTrue, _ := parseExpression(tokens, currentIndex)
ifFalse, _ := parseExpression(tokens, currentIndex)
ifNode := NewIfExp(cond, ifTrue, ifFalse)
expError := closeExp(tokens, currentIndex)
if expError != nil {
return nil, expError
}
return ifNode, nil
}
}
// no matches?
return nil, errors.New("Unexpected token")
}
// convenience function to ensure an expression is properly closed
func closeExp(tokens []*Token, currentIndex *int) error {
rparenError := expect(tokens, TokenRParen, currentIndex)
if rparenError != nil {
return rparenError
}
*currentIndex += 1
return nil
}
func bufferToInt(buffer bytes.Buffer) int64 {
num, _ := binary.Varint(buffer.Bytes())
return num
}
func bufferToFloat(buffer bytes.Buffer) float64 {
bits := binary.LittleEndian.Uint64(buffer.Bytes())
return math.Float64frombits(bits)
}
</pre>
I've gutted some case-specific stuff like the details of parsing operators in an attempt to get the point across as succinctly as possible, but the overarching algorithm is still there. You can see the recursive calls to parseExpression present in the if-parsing code, and the literal-parsing code that forms the base cases. This basic setup is enough to get this tricky-looking test to pass:<br />
<br />
<pre class="prettyprint">func TestIfExp(t *testing.T) {
tokens := LexExp("(if (> 6 5) \"true\" \"false\")")
program := ParseTokens(tokens)
expectedProgram := NewProgram(NewIfExp(NewGtExp(NewIntLiteral(6), NewIntLiteral(5)), NewStringLiteral("true"), NewStringLiteral("false")))
checkProgram(program, expectedProgram, t)
}
</pre>
<br />
All the if-parsing code knows is that an if node takes three expressions. It doesn't know anything about what constitutes a valid expression, or how to parse one; it leaves that up to the parseExpression() function, and it all Just Works.<br />
<br />
As always, the code is <a href="https://github.com/DangerOnTheRanger/schego">up on GitHub</a> if you want to take a look. We'll be looking at stack-based virtual machines next time, so we can actually do something with the AST we've generated. As I alluded to earlier, at some point in the (near? distant?) future we'll also take a look at optimization, which is something I haven't really found in any similar how-to-make-a-language series. So that'll be something to look forward to.Kermit Alexander IIhttp://www.blogger.com/profile/02557138366170479827noreply@blogger.com1tag:blogger.com,1999:blog-8516266397528346891.post-12023097067338905392017-08-24T13:34:00.000-05:002017-08-24T13:34:06.340-05:00Schego Part 1, or Why Implementing Scheme in Go Isn't All That CrazySo for anyone who has been following me on GitHub closely, this article is a week or two late - firstly, I want to apologize for that. I've been busy getting back up to speed with college and such. But on to the main event.<br />
<br />
So, you may be wondering, what exactly is a Schego? It's my current baby, my main pet project, the first on the list of bullet points that I'd point to when people ask about what I've done. It's an R7RS Scheme implementation written in the Go programming language, with some nifty technical features planned for it down the road.<br />
<br />
I imagine the first thing you'd ask is, why implement yet another Scheme interpreter, let alone one in Go of all languages? The answer is pretty simple and self-centered - I've been planning on learning Scheme since at least 2015, and I've taken an interest in Go as of late. Implementing a language won't familiarize you with its 3rd-party libraries, but it will give you a reasonably in-depth look at the way the language's syntax is built, and its various ins and outs. Even for a minimalist language such as Scheme, building an interpreter is no small undertaking, so the project stands to give me some reasonable experience in Go as well. Additionally, most of the techniques and materials required to undertake such a project are usually taught in senior-level classes, so it would look fairly good on a sophomore's resume - at least if I did it well.<br />
<br />
As far as how to differentiate it from the other Scheme implementations out there, I have two main features in mind: The first is R7RS compatibility. This is the newest version of the official Scheme specification, ratified only about 4 years ago as of this writing. Only about 7 or 8 other Scheme interpreters currently implement the full R7RS spec to my knowledge, so in the Scheme world, that's reasonably rare air.<br />
<br />
The second feature is a much more notable one, even though it arguably requires less work: Software transactional memory, or STM for short. It's a language implementation-level replacement for the lock-based multithreading paradigm that's usually implemented at the OS-level, unless we're talking about lightweight threads that are often cooperatively scheduled. As the name might imply, it works very similarly to the transactions you might find operating in a database or at a bank: Critical stretches of code are marked as atomic, and are run without actually committing any of their memory writes to RAM. At the end of the block, the first atomic block to write to a shared space of memory "wins," and its changes are committed. The rest are "rolled back" and made to run again. There's a little bit more to it than that, but that's the gist of it, and what will eventually end up in Schego. Unless someone with more time and/or motivation sees this and decides to one-up me, I think I'll have the first ever STM-capable Scheme implementation, too. So that's nice.<br />
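To make the scheme just described concrete, here's a toy single-file sketch of the version-check-and-retry loop. None of this is Schego code - the TVar/Atomically names are borrowed from Haskell's STM - and a production implementation would lock variables in a global order to dodge deadlock, which this sketch glosses over:<br />

```go
package main

import (
	"fmt"
	"sync"
)

// TVar is a transactional variable: a value plus a version counter
// that bumps on every committed write.
type TVar struct {
	mu      sync.Mutex
	val     int
	version int
}

// Tx buffers a transaction's reads and writes until commit time.
type Tx struct {
	reads  map[*TVar]int // version observed at first read
	writes map[*TVar]int // buffered writes, invisible to other threads
}

func (tx *Tx) Read(v *TVar) int {
	if val, ok := tx.writes[v]; ok {
		return val // read our own uncommitted write
	}
	v.mu.Lock()
	defer v.mu.Unlock()
	if _, seen := tx.reads[v]; !seen {
		tx.reads[v] = v.version
	}
	return v.val
}

func (tx *Tx) Write(v *TVar, val int) { tx.writes[v] = val }

// commit locks every TVar this transaction read, re-checks versions,
// and only then makes the buffered writes visible. A version mismatch
// means another transaction "won," so this one rolls back by simply
// reporting failure. (In this sketch, every written TVar is assumed to
// have been read first, as in the demo below.)
func (tx *Tx) commit() bool {
	var locked []*TVar
	for v := range tx.reads {
		v.mu.Lock()
		locked = append(locked, v)
	}
	defer func() {
		for _, v := range locked {
			v.mu.Unlock()
		}
	}()
	for v, ver := range tx.reads {
		if v.version != ver {
			return false
		}
	}
	for v, val := range tx.writes {
		v.val = val
		v.version++
	}
	return true
}

// Atomically re-runs fn until its commit succeeds - the "made to run
// again" part of the description above.
func Atomically(fn func(*Tx)) {
	for {
		tx := &Tx{reads: map[*TVar]int{}, writes: map[*TVar]int{}}
		fn(tx)
		if tx.commit() {
			return
		}
	}
}

// stmDemo increments a shared counter from 100 goroutines with no
// explicit user-level locking; retries ensure no update is lost.
func stmDemo() int {
	counter := &TVar{}
	var wg sync.WaitGroup
	for i := 0; i < 100; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			Atomically(func(tx *Tx) {
				tx.Write(counter, tx.Read(counter)+1)
			})
		}()
	}
	wg.Wait()
	return counter.val
}

func main() {
	fmt.Println(stmDemo()) // every increment survives: 100
}
```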
<br />
On top of doing all this, I've planned on writing a blog post every once in a while detailing the design decisions I've made in Schego, both as a documentation/teaching tool and as a justification method - if I can't defend it in writing, it needs to be reworked. At the time of this post, the only major Schego subsystem that's currently built is the lexer, and it's almost (but not quite) finished. For the uninitiated, a lexer is a system that takes in a stream of characters and turns it into a list of tokens, which are simply small bits of data that other code later on can use. For instance, a function or variable name would be a token, a mathematical operator would be a token, and so on and so forth. For the code that needs to figure out how to actually run our original Scheme program, it's much easier to think in terms of tokens than raw characters. So let's get on with that lexer, shall we?<br />
<br />
I think usability is important. It's better to have software that's easy for the end user to use but tough to implement than the other way around. Even if your end user is another programmer - if you're developing a library, for instance - or even if your end user is yourself, usability still matters. And there's no easier way to create good library-level usability than to write software with an interface you'd want to use yourself, which in turn means writing the unit tests for your code before any of the underlying code exists. Test-driven development has some other advantages (such as reasonable code coverage, if you genuinely do write all your tests before the code that makes them pass), but I'm mostly concerned with this one facet. So how would I want a lexer to give me tokens back?<br />
<pre class="prettyprint">// test lexing a single s-expression
func TestLexSingleExp(t *testing.T) {
	tokens := LexExp("(abc def ghi)")
	expectedTokens := []*Token{
		NewTokenString(TokenLParen, "("),
		NewTokenString(TokenIdent, "abc"),
		NewTokenString(TokenIdent, "def"),
		NewTokenString(TokenIdent, "ghi"),
		NewTokenString(TokenRParen, ")")}
	checkTokens(tokens, expectedTokens, t)
}
</pre>
This is what I came up with - the first unit test I wrote for Schego. It's really very simplistic - a LexExp function that takes a string to be lexed for tokens, and then returns a list of Token pointers containing their type and value. A single Token looks like this, by the way:<br />
<pre class="prettyprint">type TokenType int

const (
	TokenNone TokenType = iota
	TokenRParen
	TokenLParen
	TokenIdent
	TokenIntLiteral
	TokenFloatLiteral
	TokenDot
	TokenOp
	TokenChar
)

type Token struct {
	Type  TokenType
	Value bytes.Buffer
}
</pre>
Pretty straightforward, right? I chose a bytes.Buffer to hold the token's value since, to the best of my knowledge, there's no easier way in Go to store a generic run of bytes whose eventual interpretation is uncertain. It could be my inexperience with the language showing, but this is the best method I've found.<br />
So how does LexExp work?<br />
<pre class="prettyprint">// LexExp lexes an input string into Token objects. There are no possible user-facing
// errors from this process.
func LexExp(input string) []*Token {
	var tokens []*Token
	// accumulation variables for multi-character tokens such as idents and literals
	accumulating := false
	var accumulatingType TokenType
	var accumulatorBuffer bytes.Buffer
	for index, glyphRune := range input {
		glyph := string(glyphRune)
		if glyph == "(" {
			if accumulating == true {
				flushAccumulator(&accumulatingType, &accumulatorBuffer, &tokens)
				accumulating = false
			}
			tokens = append(tokens, NewTokenString(TokenLParen, glyph))
			// rparen
		} else if glyph == ")" {
			if accumulating == true {
				flushAccumulator(&accumulatingType, &accumulatorBuffer, &tokens)
				accumulating = false
			}
			tokens = append(tokens, NewTokenString(TokenRParen, glyph))
			// ident
		} else if unicode.IsLetter(glyphRune) {
			// were we building a number literal beforehand?
			if accumulating == true && accumulatingType != TokenIdent {
				flushAccumulator(&accumulatingType, &accumulatorBuffer, &tokens)
			}
			accumulating = true
			accumulatingType = TokenIdent
			accumulatorBuffer.WriteString(glyph)
			// we're not sure what this character is, let the parser deal with it
		} else {
			tokens = append(tokens, NewTokenString(TokenChar, glyph))
		}
	}
	// corner case if the input string ends while we're still accumulating -
	// should never happen in proper Scheme, but still...
	if accumulating == true {
		tokens = append(tokens, NewTokenRaw(accumulatingType, accumulatorBuffer))
		accumulatorBuffer.Reset()
		accumulating = false
	}
	return tokens
}
</pre>
If this isn't completely valid Go code, please forgive me. This is a rough recreation of what the original function looked like after implementing the first test case. Let's go over its internals all the same, though.
You'll notice that an extended if/else is used instead of a switch block. Those of you who have read Lua's source code (a delightful read, by the way - it's a very clear and well-constructed codebase) might be wondering why I went with an if/else and not a switch. The main reason is that R7RS is, in my opinion, a little more complicated to lex - not parse! - than Lua; there's stuff like Unicode identifiers that really cannot be easily handled in a switch statement. I will admit that, similar to how Lua does things, a switch/case block with a default case that catches the situations the switch couldn't handle would be worth a look, though; it's very possible I'll switch LexExp over at a later date.<br />
<br />
For multi-character tokens - identifiers, numerical/string literals, and so on - we have an accumulator buffer. Essentially, once we recognize that we've started lexing what could end up being a multi-character token, we write characters into the accumulator buffer instead of creating tokens immediately. We then go on about our merry way accumulating (hence the name) characters in the accumulator buffer up until we find a character that cannot belong to a multi-character token, like a parenthesis. When that happens, we "flush" the accumulator, adding a new token to the token list with the accumulator buffer's contents as its value, before continuing to lex our non-multi-char token.<br />
Lua does things a little differently. If it encounters a character that's (potentially) the start of a multi-character token, it lets each individual lexing function take care of proceeding further on down the line until it finds a suitable place to stop. For example, this is how Lua 5.3 handles numbers:<br />
<pre class="prettyprint">/* LUA_NUMBER */
/*
** this function is quite liberal in what it accepts, as 'luaO_str2num'
** will reject ill-formed numerals.
*/
static int read_numeral (LexState *ls, SemInfo *seminfo) {
  TValue obj;
  const char *expo = "Ee";
  int first = ls->current;
  lua_assert(lisdigit(ls->current));
  save_and_next(ls);
  if (first == '0' && check_next2(ls, "xX"))  /* hexadecimal? */
    expo = "Pp";
  for (;;) {
    if (check_next2(ls, expo))  /* exponent part? */
      check_next2(ls, "-+");  /* optional exponent sign */
    if (lisxdigit(ls->current))
      save_and_next(ls);
    else if (ls->current == '.')
      save_and_next(ls);
    else break;
  }
  save(ls, '\0');
  if (luaO_str2num(luaZ_buffer(ls->buff), &obj) == 0)  /* format error? */
    lexerror(ls, "malformed number", TK_FLT);
  if (ttisinteger(&obj)) {
    seminfo->i = ivalue(&obj);
    return TK_INT;
  }
  else {
    lua_assert(ttisfloat(&obj));
    seminfo->r = fltvalue(&obj);
    return TK_FLT;
  }
}</pre>
See the loop there? For comparison, this is what Schego's number-lexing code looks like, which admittedly doesn't handle things like exponents:<br />
<pre class="prettyprint">	} else if glyph == "." {
		// . is a valid character in an ident - add it to the accumulator
		// if we were building an ident
		if accumulating == true && accumulatingType == TokenIdent {
			accumulatorBuffer.WriteString(glyph)
			// we can't start an ident with . - are we building a floating point literal?
		} else if chr := peek(input, index); !unicode.IsSpace(chr) && unicode.IsNumber(chr) {
			accumulating = true
			accumulatingType = TokenFloatLiteral
			accumulatorBuffer.WriteString(glyph)
			// there's situations where a standalone . is valid
		} else {
			tokens = append(tokens, NewTokenString(TokenDot, glyph))
		}
		// number literal
	} else if unicode.IsNumber(glyphRune) {
		if accumulating == true && accumulatingType == TokenIdent {
			flushAccumulator(&accumulatingType, &accumulatorBuffer, &tokens)
		}
		accumulating = true
		// only declare that we are accumulating an int if we didn't see a . already
		if accumulatingType != TokenFloatLiteral {
			accumulatingType = TokenIntLiteral
		}
		accumulatorBuffer.WriteString(glyph)
		// we're not sure what this character is, let the parser deal with it
	}
</pre>
Something you'll notice comparing the two - or something you might've figured out from reading the leading comment on LexExp - is that I intend on not having any errors occur during the lexing process. I think malformed code is easier to catch during parsing, and that the resulting error messages are still informative enough that the culprit has what they need to correct their mistake. Is this actually the case? This is simply me theorycrafting, so we'll have to wait and see.<br />
But getting back to lexing numbers: personally, I feel my method leads to less overall boilerplate, though you do have to check the accumulator essentially every time you grab a new token. You also need functions to manipulate the accumulator, which look like this:<br />
<pre class="prettyprint">// bufferStringToNum takes an input buffer and converts it from a string of
// character bytes to a float/int
func bufferStringToNum(tokenType TokenType, inputBuffer bytes.Buffer) *bytes.Buffer {
	bufferString := inputBuffer.String()
	var byteBuffer [binary.MaxVarintLen64]byte
	if tokenType == TokenFloatLiteral {
		num, _ := strconv.ParseFloat(bufferString, 64)
		binary.LittleEndian.PutUint64(byteBuffer[:], math.Float64bits(num))
	} else {
		num, _ := strconv.ParseInt(bufferString, 10, 64)
		binary.PutVarint(byteBuffer[:], num)
	}
	returnBuffer := bytes.NewBuffer(byteBuffer[:])
	return returnBuffer
}

// flushAccumulator empties the contents of the given Buffer into a new Token
// and resets it and the accumulator token type. A convenience function for LexExp.
func flushAccumulator(
	accumulatorType *TokenType,
	accumulatorBuffer *bytes.Buffer,
	tokenBuffer *[]*Token) {
	if *accumulatorType == TokenFloatLiteral || *accumulatorType == TokenIntLiteral {
		convertedBuffer := bufferStringToNum(*accumulatorType, *accumulatorBuffer)
		*tokenBuffer = append(*tokenBuffer, NewTokenRaw(*accumulatorType, *convertedBuffer))
	} else {
		*tokenBuffer = append(*tokenBuffer, NewTokenRaw(*accumulatorType, *accumulatorBuffer))
	}
	accumulatorBuffer.Reset()
	*accumulatorType = TokenNone
}
</pre>
Not that bad though, right? Most of this code will need to be written only the one time; the checks for the accumulator are the only stretches that have to be done repeatedly throughout LexExp.<br />
<br />
That's all for this first installment. The actual lexer is a bit longer in order to handle operators and a few other things, but the basic accumulator-based framework is still there. If you want to see what the lexer currently looks like, the code is <a href="https://github.com/DangerOnTheRanger/schego/blob/master/lexer.go">up on GitHub</a>, and the complete lexer test suite can <a href="https://github.com/DangerOnTheRanger/schego/blob/master/lexer_test.go">also be found</a> there. Next time we'll be taking a look at parsing and a little bit of parser theory!Kermit Alexander IIhttp://www.blogger.com/profile/02557138366170479827noreply@blogger.com0tag:blogger.com,1999:blog-8516266397528346891.post-13359285807605425072016-08-23T12:16:00.000-05:002016-09-17T22:52:59.821-05:00An end, and a beginning<blockquote class="tr_bq">
<i>How lucky I am to have something that makes saying goodbye so hard.</i></blockquote>
If you've been paying attention to my blog, you'll know I don't delve into personal stuff often. Or at all, really. This is a reasonably big time in my life, though, so I thought I'd break with "tradition" - as much of one as a barely-active 5-year-old blog can have, at least - and talk a little bit about myself.<br />
<br />
For starters, I'm now a student of Computer Science at the University of Texas at Dallas. As I type this, the second day of class has come to an end, and I'm sitting in an empty residence hall lobby late at night; the only sounds I can hear are the air conditioning, my fingers on the keyboard, and the celebratory welcome banners brushing against the walls in time with the A/C. I'm still suffering from the stubborn vestiges of a cold I came down with over the past weekend that also took out several of my newfound friends and colleagues. Additionally, the internet has been out in my dorm room since I've moved in, so I've had to use the lobby Wifi for email and the like instead. Regardless, college life has been fun so far. Plenty of friends to hang out with, lots of campus events to go to, friendly faculty to network with - you get the idea. Contrary to popular opinion, there's more to life than what's on the internet. I've barely noticed I don't have a dorm connection except when I have to do something I can't do from my phone.<br />
<br />
From a high school student's limited perspective, college is an interesting place. You're given enough freedom that you're basically in a loosely-padded training environment for being an adult. You're given opportunities to seriously succeed, and to seriously fail, both by your own hands. You're given the freedom to stay up at 2AM to go to Walmart to get milk, or to sit in a lobby blogging, if you want. You're free to make your own choices, and live with the consequences; there are relatively few instances where someone is looking over your shoulder, telling you what you should and shouldn't do. For someone who has potentially lived their entire life with someone else at the wheel, that kind of freedom can be a very scary feeling, and a daunting adjustment. But enough about that.<br />
<br />
For someone who lived at home with family their whole life - like me, for example - being away from family and old friends can be a bitter pill to swallow. For almost 2 decades, I grew used to being able to see parents and siblings when I got up, to them being there when I went to bed. I grew used to having conversational partners in my siblings that I could engage whenever I wanted, who held interests similar to mine. Being away from people that you've built such powerful connections with can be a difficult ordeal, indeed. But, it's something you have to learn to cope with. You call friends and family as often as you can, and visit when possible. You find new friends that you can trust and open up to. Maybe even discover that you have family members living nearby and visit them.<br />
Eventually, you might lose these people too. If I get a Masters out of state, and decide not to move back to the Dallas/Richardson area, this whole process of loss will repeat. And I'll start from square 0, again. I wouldn't blame anyone in my shoes for not feeling a little depressed over the whole situation, for wondering whether or not it was worth it to build all those precious connections in the first place, only to painfully tear them all down again.<br />
However, I know that my feelings at the time were real. I know that the memories of elated moments with friends and family were genuine, were not fake; that they were caused by me being sincerely happy to be with them in that instant. And those memories - those fleeting, ephemeral instants in time - stick with you forever, regardless of how long your friendships last. And nobody can take them away from you. I don't want to stop making them just because the people involved might eventually leave me. It's a common occurrence, with an element of certainty to it: People drift away over time, and finally, die. But until the day that you yourself leave this earth, those memories will faithfully serve as a living testament that the people you cherished so dearly ever existed. And I think that's a beautiful thing.<br />
<br />
So here's to making many more memories, with the many people I've met and will meet. To my parents, thank you. To my dear old group of friends, goodbye, I wish you all the best. And to all my fellow incoming freshmen: Congratulations.<br />
<blockquote class="tr_bq">
<i>Life is a continuous cycle of losses, you see. But in order to lose, you must first gain. And the greater the joy at your gain, the greater the sorrow will be at your loss. But as long as we have the memories of those happy times - indeed, because we have those memories - people are able to go on.</i> </blockquote>
Kermit Alexander IIhttp://www.blogger.com/profile/02557138366170479827noreply@blogger.com0tag:blogger.com,1999:blog-8516266397528346891.post-71599782515339021362016-05-19T21:35:00.002-05:002016-05-20T20:34:36.645-05:00FuzzBuzz and the Art of ProgrammingProgramming: It's as much an art as it is a science. The general distinction between the two lies in the realm of subjectivity - in other words, while which solution is "correct" may still be debated in science, nobody is under the illusion that there is more than one "correct" solution. Art, on the other hand, has no single "correct" solution, though at the same time, the merits of each solution are generally weighed against those of other solutions, debated, and so on.<br />
<br />
Programming is a mix of both: There are certainly "solutions" - algorithms, programs, etc. - that can be objectively proven without a shadow of a doubt to be ineffective or outright incorrect. There is also the possibility of an algorithm that solves the problem, but is objectively inferior to another, such as an algorithm that runs in O(n) time versus one that runs in O(1) time. On the other hand, there is a fair amount of subjective nuance involved in arriving at a correct solution. Design methodologies, languages, and APIs all come and go in a technological blink of an eye, and each will have its pros and cons, its proponents and its skeptics.<br />
<br />
Nobody - sane, at least - involved in all this is under the impression that there is One True Way to do things. You could lock several seasoned programmers up in separate rooms, all of whom subscribe to the exact same design methodologies and use the exact same programming languages, ask them to design a program, and they would all likely emerge with completely separate, and equally or near-equally satisfactory, solutions. So, without further ado, let's talk about a simple programming problem, and some possible solutions to it. At the end, we'll discuss for a quick minute which of these solutions someone might actually end up using.<br />
<br />
The problem in question is one I've named FuzzBuzz, which was definitely not a reference to an overused programming interview question. To solve FuzzBuzz, one must write a program that considers the numbers from 1 to 100, inclusive. For each number, if the number in question is even, the program must print "Fuzz." If the number is odd, the program must print "Buzz." If the number is a multiple of 3, the program should ignore its even or odd nature and print "FuzzBuzz" instead.<br />
<br />
This is an intentionally simplified problem, both to make it easy to make multiple solutions, and to make the solutions themselves clearer. By the time you finish reading this sentence, I bet you'll probably have thought of at least one way to solve this problem, but let me go over a couple first.<br />
<br />
First up is probably the one that came to everyone else's mind, just a simple restating of the problem's definition in code.<br />
<br />
<pre class="prettyprint">for x in xrange(1, 101): # range/xrange stops at the integer before the 2nd argument
    if x % 3 == 0:
        print "FuzzBuzz"
    elif x % 2 == 0:
        print "Fuzz"
    else:
        print "Buzz"</pre>
<br />
Not really a whole lot to talk about with this one; like I said earlier, it's probably the first one you thought up yourself - it was the first one I came up with, as well. Honestly, if this were a real-world problem, most people would (and probably should) stop here. But innovation never sprung from a contented mind, right?<br />
<br />
So, for our next solution, let's limit ourselves in some way. The modulo operator is pretty central to the prior solution, so let's take that out next.<br />
<br />
<pre class="prettyprint">def is_multiple_of(number, mul):
    """Return True if number is multiple of mul, False otherwise.

    Okay, so here's how this works: Normally, Python - and most
    other programming languages - chops off the decimal part of the quotient
    if both terms are integers, i.e, integer division is performed. However,
    if one of the terms is a float, then the decimal part will be kept.
    Any number plus a decimal part is going to be greater than simply the number itself,
    so we compare the two. If number is in fact a multiple of mul, then there will be
    no decimal part, and both sides of the expression will be equal."""
    if float(number) / mul == number / mul:
        return True
    else:
        return False


def multiple_of_three(number):
    return is_multiple_of(number, 3)


def multiple_of_two(number):
    return is_multiple_of(number, 2)


for x in xrange(1, 101):
    if multiple_of_three(x):
        print "FuzzBuzz"
    elif multiple_of_two(x):
        print "Fuzz"
    else:
        print "Buzz"
</pre>
<br />
Still pretty straightforward. Not having access to the modulo operator means we had to basically reinvent one ourselves, though. But we could go further still, couldn't we? How about no floating-point math?<br />
<br />
<pre class="prettyprint">def is_multiple_of(number, mul):
    """Return True if number is multiple of mul, False otherwise.

    So again we're exploiting the properties of integer division.
    For any divisor, there are at least two pairs of two consecutive dividends that will return
    the exact same quotient under integer division, and the smaller number in each
    is a multiple of the divisor. All we have to do is check that number
    is the smaller of the pair. We do that by seeing if the result changes
    if we subtract 1. If so, number is the smaller of the pair (i.e, number - 1
    is part of some other pair), so return True.
    Otherwise, we were given the larger of the pair, or alternatively,
    some completely different number, so return False.
    """
    if number / mul > (number - 1) / mul:
        return True
    else:
        return False


def multiple_of_three(number):
    return is_multiple_of(number, 3)


def multiple_of_two(number):
    return is_multiple_of(number, 2)


for x in xrange(1, 101):
    if multiple_of_three(x):
        print "FuzzBuzz"
    elif multiple_of_two(x):
        print "Fuzz"
    else:
        print "Buzz"
</pre>
<br />
Now things are getting ever-so-slightly hairy. But let's get dangerous now. How about no Boolean logic whatsoever? <br />
<br />
<pre class="prettyprint">from collections import deque

sequence = deque(["Buzz", "Fuzz", "FuzzBuzz", "Fuzz", "Buzz", "FuzzBuzz"])


def next_in_sequence():
    """Return next string in FuzzBuzz sequence.

    The FuzzBuzz problem follows a pattern:
    * Odd
    * Even
    * Multiple of 3
    * Even
    * Odd
    * Multiple of 3
    All we have to do is keep track of where we are in the sequence.
    To accomplish this, we use a deque and a little bit of pushing and popping
    to keep from having to either pre-generate the entire sequence beforehand,
    or from having to use an iterator.
    """
    next = sequence.popleft()
    sequence.append(next)
    return next


for x in xrange(1, 101):
    print next_in_sequence()
</pre>
<br />
Alright, now we're talking. Admittedly, you probably saw this one coming if you were running the programs alongside reading this article - once you recognize the pattern, this solution is pretty obvious. But let's make one final push: No modulo operator, no math, no Boolean logic, and no loops.<br />
<br />
<pre class="prettyprint">from collections import deque
import sys

# Python allows the user to manually set the recursion limit.
# So, we exploit this by setting it to 101, which simulates the number of
# loop iterations we'd execute if we were using a loop.
# Eventually, we'll exceed this number, and Python will throw
# a RuntimeError, essentially accomplishing the same thing as
# a regular for/while loop.
sys.setrecursionlimit(101)
sequence = deque(["Buzz", "Fuzz", "FuzzBuzz", "Fuzz", "Buzz", "FuzzBuzz"])


def next_in_sequence():
    """Print next string in FuzzBuzz sequence.

    This is identical to the deque solution in sequence.py, save for
    printing out the next string instead of returning it, and the recursion bit,
    so go see sequence.py for an explanation of what's going on here.
    """
    next = sequence.popleft()
    sequence.append(next)
    print next
    # recurse instead of looping; the recursion limit acts as our loop bound
    next_in_sequence()


try:
    next_in_sequence()
except RuntimeError:
    # swallow the eventual exception due to recursion taking one step too many
    pass
</pre>
<br />
I feel now is a good time to stop and discuss the various solutions, now that we have an ample amount. Like I mentioned earlier on, if this were a real-world problem, I imagine most people using the first solution. It's clear, concise, and gets its point across without needing much in the way of hand-holding. It's easy to extend if we need to tackle additional multiples, as well.<br />
<br />
Most of the rest of the solutions are pretty silly if we examine them without the restrictions we came up with. The modulo operator is a pretty useful tool for this problem, so it makes no sense to try to work around it, or reimplement it with our own code. I could see an argument being made for the non-recursive sequence solution - some might consider it more "elegant" than the original solution, though obviously that's up for debate.<br />
<br />
Some might also think the sequence version is faster, considering the O(1) performance time of deque.popleft() and deque.append() (collections.deque is implemented as a doubly-linked list under the hood), but at least on my machine, the initial solution is faster when testing 10,000,000 numbers, completing in about 3.1 seconds versus 3.3 seconds for the sequence version. After some fiddling around, I was able to produce a variant on the sequence solution that ran in about 2.9 seconds on the same set:<br />
<br />
<pre class="prettyprint">from collections import deque

sequence = deque(["Buzz", "Fuzz", "FuzzBuzz", "Fuzz", "Buzz", "FuzzBuzz"])
# Binding oft-used functions to local scope results in a speed increase
# due to decreased variable lookup time
popleft = sequence.popleft
append = sequence.append

for x in xrange(1, 101):
    # Moving the function's code to the loop body removes the function call,
    # which actually increases performance slightly
    next = popleft()
    append(next)
    print next</pre>
<br />
By this point though, we've lost a fair amount of readability, which was one of the main reasons to use the sequence solution. On the upside, we have a reasonable increase in speed, which might be useful in a performance-critical application. On a side note, the performance of the other solutions is pretty abysmal in comparison; the modulo-less solution ran in about 8.5 seconds, whereas the float-less solution ran in about 6 seconds. As for the recursion-based solution, it segfaulted on my machine before it could actually complete the 10,000,000-iteration loop.<br />
<br />
So, to summarize: Programming is about 50% art and 50% science; there's usually more than one demonstrably correct solution to the problem, and on top of that, there's often some fair room for debate about which of the solutions one should proceed with. We presented a simple programming problem, came up with some possible solutions for it, and discussed their pros and cons.<br />
<br />
As an added bonus, I've set up a Github repository at <a href="http://github.com/DangerOnTheRanger/fuzzbuzz">http://github.com/DangerOnTheRanger/fuzzbuzz</a> that contains all the solutions in this article, as well as reference output to test your own solutions with. If you come up with a solution you like, regardless of its performance or what language it's written in, send me a pull request, and I'll add it to the repo.Kermit Alexander IIhttp://www.blogger.com/profile/02557138366170479827noreply@blogger.com6tag:blogger.com,1999:blog-8516266397528346891.post-60711627356851827052013-09-05T14:42:00.000-05:002013-09-05T14:42:35.126-05:00Ambiguous developer intention in fighting gamesThis is a topic that really grinds my gears: ambiguous intention from developers in fighting games. I believe this sort of thing is very harmful to fighting games, and I'll be using this article to explain what in particular I have against ambiguous intention, provide some examples of it in several fighting games, and explain how I think those games would be better off without such things.<br />
First off, let me give a solid, simple definition of ambiguous intention. Basically, it is an instance in a fighting game where a restriction the game's mechanics impose on the players can be avoided via use of one or more mechanics/techniques the game provides. There are several issues with such a thing, in my opinion:<br />
<ol>
<li>It adds an artificial execution barrier to the game. Anyone playing the game at a competitive level will have to manually perform something that would otherwise (and should) be done automatically or not at all.</li>
<li>It makes the developer's intent unclear - do they want the restriction to be respected or not?</li>
<li>It makes the game seem harder or "deeper" than it actually is.</li>
</ol>
<div>
The first example of ambiguous developer intent that comes to my mind, not to mention one of the most generally-accepted in the fighting game community, is nicknamed "chicken guarding" in the vs. Capcom series. Normally in those games, during a short period after you initiate a dash, regardless of what direction it's in or whether it's in the air or not, you are unable to block. Clearly, the developers' intent is to make dashing punishable if predicted.</div>
<div>
However, it is possible to instantly cancel a dash into a jump in most games in the vs. Capcom series, and after you do so, to instantly block. This overrides the previously-implied restriction about punishable dashes. Now, I have absolutely nothing against making vs. Capcom dashes punishable or unpunishable; what I <i>do</i> have a problem with is the ambiguous developer intent: Do the developers want dashes to be punishable or not?</div>
<div>
If the developers do want them to be punishable, then disallow blocking even after the dash-canceled jump, or if they do not want them to be punishable, then allow blocking directly during a dash, but by no means should they require the players to go through an unnecessarily-difficult input to accomplish a competitively mandatory technique which may or may not have been intended in the first place!</div>
<div>
<br /></div>
<div>
The second example, and one that I think is particularly egregious, is L-canceling in Super Smash Bros. Melee. L-canceling is a "mechanic" - I am not entirely certain it is deserving of the title - that allows you to cut the period of time your character cannot move or attack following an aerial attack in half by pressing the shield/block button a few frames before you hit the ground.</div>
<div>
With the previous example of chicken guarding in addition to the 3 problems with ambiguous developer intent I listed earlier in mind, it should be immediately obvious what the problem with L-canceling is. If the developers want aerial attacks to have <b>x</b> amount of landing lag, then they should leave it at that; if they want aerial attacks to have <b>0.5x</b> amount of landing lag, then they should do so, but they should not require players to press an extra button for something that could easily be done automatically. Artificial "difficulty" like this makes a fighting game seem more difficult and technical than it really is, and oftentimes hides serious problems behind a fake execution barrier.</div>
<div>
<br /></div>
<div>
The last example I'm going to give is of "kara throwing" in the Street Fighter series. Kara throwing essentially involves using an attack that moves your character forward, but before the part of the attack that actually hurts the opponent is activated, you cancel the attack into a throw, giving the throw more range than it otherwise would have. The window of opportunity to cancel the attack into a throw is usually very small - normally about 1 or 2 frames (1/60th to 1/30th of a second).</div>
<div>
Again, it should be obvious what the issue with kara throwing is. If the developers want certain characters' throw ranges to be longer, then they should be longer by default; if they want throws to stay short, then they should prevent kara canceling. But they should not require players to perform a difficult input for something that was not only unintended in the first place (kara throwing was originally a glitch in Street Fighter 3, but is intentionally present in Street Fighter 4), but that could obviously be solved with a much simpler solution. </div>
<div>
<br /></div>
<div>
I've seen several arguments stating that mechanics like the ones I've listed are good things, the majority of which basically boil down to the opinion that the more buttons you have to press and the more complicated things are, the better. Personally, I believe the simpler things are, the more time and energy you can focus on the mental aspect of the game, which, in my opinion, is far deeper and more satisfying than any arbitrarily difficult "mechanic" could ever be.<br />
It could also be argued that since no-one really complains about arbitrary difficulty and ambiguous intent, it is not a problem that needs to be solved. I believe there are a number of reasons this is the case, the largest being that practically all fighting games ever made have had some sort of arbitrary difficulty - the recent ones intentionally, the older ones unintentionally - so anyone who has played fighting games for any decent amount of time is quite used to this sort of thing by now. So yes, there is some complacency, but I think many fighting game players have merely failed to realize how deleterious ambiguous intent and arbitrary difficulty can really be, since they have had to live with it for as long as they have played fighting games. I think the more players realize how much simpler and easier to play fighting games would be without this sort of thing, the more they will warm up to the idea and let developers know about it.<br />
<br />
I should note that I think there is a definite place for this sort of mechanic in video games; intentionally difficult single-player games like Devil May Cry and I Wanna Be The Guy thrive off this sort of thing, and many people play those sorts of games precisely because of all the arbitrary difficulty - I'm one of them. However, I don't think arbitrary difficulty has a place in fighting games; I'm trying to fight my opponent, not the game, and the easier it is to do whatever it is I should do in a given situation, the better.</div>
<div>
I think I should also make it clear that the presence of an arbitrary mechanic does not automatically ruin a game. I enjoy playing and watching many games that have a number of arbitrary mechanics, including the games I used as examples. All I'm saying is that those games would be even <i>better</i> than they already are without the arbitrarily difficult mechanics - and they would likely have a larger userbase, too, since a (somewhat correct) public notion of fighting games is that you have to spend a lot of time practicing arbitrary stuff in training mode before you can become good at them.<br />
<br />
And you know what? Eliminating ambiguous developer intent and arbitrary difficulty would go a long way towards eliminating that notion.</div>
Kermit Alexander IIhttp://www.blogger.com/profile/02557138366170479827noreply@blogger.com2tag:blogger.com,1999:blog-8516266397528346891.post-22495615619431833322012-10-23T11:55:00.000-05:002013-09-05T14:42:53.608-05:00Super Smash Bros., Combos, and You - an article on a theoretical combo system for SSBSuper Smash Bros. has something of a mixed history as far as combos go. The original SSB64 had combos galore - most of the characters had combos that could take an as-of-yet untouched opponent and KO them without a single chance to dodge or counter - the definition of a "true" combo. Super Smash Bros. Melee significantly reduced the stun the average move does to your opponent, so combos (true combos anyway) became weaker and shorter compared to SSB64.<br />
<br />
Now, with the latest release in the SSB series - Super Smash Bros. Brawl - true combos are practically non-existent, meaning the guaranteed damage the average character in the cast can do is typically limited to one or two moves in a row. This may not sound important, but it has several ramifications:<br />
<ul>
<li>Lack of combos means mistakes aren't as costly. In the typical fighting game (SSBM and SSB64 included), 2-5 mistakes might mean a stock (or the game). SSBB, on the other hand, typically lets you get away with more than a dozen mistakes before you lose.</li>
<li>The skill ceiling is significantly lowered. SSBB pros can punish mistakes no harder than intermediate players, with the possible exception of what's called "gimping" - KO'ing your opponent with a relatively weak move, usually used away from the stage edge.</li>
<li>Comebacks become a lot harder, since you'll typically have to repeatedly use one-off attacks, while only a few attacks (many times just one well-placed one) from your opponent will cost you the game.</li>
</ul>
<div>
What I suggest is to give SSB a combo system/mechanic. I'm not going to describe what I have in mind for the system upfront - rather, I'll evolve it over the course of this article, which I think will lead to a better understanding of how a "true" combo system would significantly help SSB and alleviate the aforementioned issues Brawl has. So let's get started!</div>
<div>
<br /></div>
<div>
The first step is to see what other fighting games use. The combo mechanic you most often see in other games is this thing called <i>canceling</i>. Basically, canceling lets you cancel one move's animation frames into an entirely different move altogether. Depending upon the game, you can only cancel a move if it touches the opponent (regardless of whether or not it does damage), and for the purposes of this article, we'll assume that's the case for our combo system. Additionally, you can typically only cancel regular attacks into heavier regular attacks (think SSB's standard attack/auto combo) or special attacks.</div>
<div>
<br /></div>
<div>
So now we have our base mechanic. How do we apply it to SSB? Let's assume we can cancel a standard attack into a tilt attack or special attack, we can cancel a tilt attack into a smash attack or special attack, and we can cancel a smash attack into a special attack. So now, every character has this basic combo: Standard attack > tilt attack > smash attack > special attack.</div>
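That hierarchy is simple enough to write down as data. Here's a hypothetical sketch in Python (move categories simplified to strings):

```python
# Hypothetical encoding of the cancel hierarchy described above: each
# move category maps to the set of categories it may cancel into.

CANCEL_TABLE = {
    'standard': {'tilt', 'special'},
    'tilt': {'smash', 'special'},
    'smash': {'special'},
    'special': set(),  # specials end the chain
}


def can_cancel(current_move, next_move):
    """True if `current_move` may be canceled into `next_move`,
    assuming the current move has touched the opponent."""
    return next_move in CANCEL_TABLE.get(current_move, set())


def is_valid_combo(moves):
    """Check that every adjacent pair of moves respects the hierarchy."""
    return all(can_cancel(a, b) for a, b in zip(moves, moves[1:]))
```

Under this table, `is_valid_combo(['standard', 'tilt', 'smash', 'special'])` holds, while going "backwards" down the hierarchy (say, a tilt into a standard attack) does not - which is exactly the one-way structure of the basic combo above.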
<div>
<br /></div>
<div>
This is pretty good already, but this system suffers from the same issue Brawl has: There's little - if any - room for improvement over the simple combo I just mentioned. The skill ceiling issue remains basically untouched. We need to figure out a way to allow creativity within our little combo system. What can we do to fix this problem?</div>
<div>
<br /></div>
<div>
We have two options: We could further extend the boundary of what can or cannot be canceled, or we could come up with another method of extending combos. The first method would simply lengthen our previous combo, while the latter's benefit depends upon how versatile the introduced sub-mechanic is. We'll try the latter approach, since it helps alleviate the skill ceiling issue Brawl suffers from. So without further ado, I give you: the launcher.</div>
<div>
<br /></div>
<div>
Simply put, a launcher is a move that, like the name implies, launches the opponent up into the air. You can also typically cancel the launcher's animation with a jump if the launcher hurts the opponent, and for our combo system, we'll also allow launchers to be canceled into a dash, which should allow for a bit more creativity as well.</div>
<div>
<br /></div>
<div>
Now we have to integrate launchers into the little canceling hierarchy we defined earlier. Let's insert launchers between smash attacks and special attacks, so that our standard combo now looks like: Standard attack > tilt attack > smash attack > launcher + aerial attack or special attack.</div>
<div>
<br /></div>
<div>
It's not much, but this combo system is already shaping up to be relatively open-ended. However, since we've been focusing on adding new features to it, we've been left with a lot of loose ends to tie up:</div>
<div>
<ul>
<li>Knockback (not to mention DI) will affect the viability of a combo - i.e., an opponent might be knocked back far enough that even if you cancel, the new move will not connect. Something needs to be done about this.</li>
<li>Hitstun - the amount of time the opponent is stunned and unable to do anything after being hit by one of your moves - will probably need to be increased on average, at least relative to Brawl. Otherwise, even if you perfectly cancel a move into another move, your opponent might recover before the second move's active frames begin.</li>
<li>It's conceivable that the combo system could be abused to the point of infinite combos. We obviously can't think of all possible combos in advance, so we should put in some sort of mechanic that will prevent combos from becoming excessively long.</li>
</ul>
<div>
Let's tackle knockback first. We obviously can't take knockback completely out of the system - it's too much a part of SSB to remove it now. So I propose that the hitlag system normally reserved for dramatic, powerful attacks be applied to all moves in the game, with about 5-10 frames of hitlag per hit. Those frames will be the "cancel window" during which you can cancel into another move and be guaranteed that it will connect - no knockback will be applied until after the cancel window has passed. Launchers will have set vertical knockback to make comboing easier, similar to Falco's shine in SSBM.<br />
<br />
For those not familiar with the hitlag (also known as "freeze frame") system, hitlag is basically a period of time after a move connects with an opponent during which the game is, for all intents and purposes, frozen - neither the attacker nor the defender can move during the hitlag period.<br />
<br />
This doesn't address the tricky problem that is DI (Directional Influence), however. It's possible in SSBM and SSBB (SSB64 didn't have DI) to use DI to shift the direction in which you get launched and thus escape combos. Whether or not that's a good thing probably comes down to personal opinion, but DI is here to stay in all likelihood, so I'd prefer to design our theoretical combo system to incorporate DI.<br />
<br />
The way I see it, the easiest way to incorporate DI into our combo system is to allow DI <i>only after</i> the cancel window on each hit has passed. This brings up a dilemma: What about multi-hit moves? I want to keep comboing as offense-oriented as possible, leaving relatively few options - if any - to break out of a combo, so I think having a cancel window after each hit, but no knockback or DI until after the cancel window for the final hit has passed, is a good idea. Launchers probably shouldn't be DI-able for similar reasons, so let's assume that we can't DI out of a launcher as well.<br />
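Putting those rules together, the state of a single hit could be resolved roughly like this (a hypothetical sketch; the 8-frame cancel window is an assumed value within the 5-10 frame range proposed earlier):

```python
# Hypothetical per-hit resolution under the proposed rules; the 8-frame
# window is an assumed value, not tuned data.

def resolve_hit(is_final_hit, frames_since_contact, cancel_window=8):
    """Return which options are available `frames_since_contact` frames
    after a hit connects. During the cancel window the game is frozen
    (hitlag): the attacker may cancel, the defender can do nothing.
    Knockback and DI apply only once the final hit's window has passed."""
    in_window = frames_since_contact < cancel_window
    return {
        'attacker_may_cancel': in_window,
        'knockback_applies': is_final_hit and not in_window,
        'defender_may_di': is_final_hit and not in_window,
    }
```

During the window the attacker can cancel while the defender is frozen; knockback and DI only become possible once the final hit's window expires, which keeps the system offense-oriented as described above.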
<br />
Now we only have to deal with infinite combos, something a lot of fighting games don't get right. Even the borderline combo-less SSBB had a few infinites here and there, so we'd better take extra caution with our combo system, since it's a lot easier to chain multiple moves back-to-back. Again, taking a look at what other fighting games typically do to alleviate this, I now introduce: <i>hitstun deterioration.</i><br />
<br />
In a nutshell, hitstun deterioration gradually lowers the amount of hitstun of each successive hit in a combo. Eventually, the hitstun of each hit will be practically 0, and the opponent will be able to easily escape the combo. Hitstun deterioration could limit legitimate non-infinite combos, as well as put characters with lots of multi-hit moves at a disadvantage, but it's probably the best defense against infinite combos out there right now.<br />
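A minimal sketch of hitstun deterioration, assuming a geometric decay (the 0.85 decay rate is an invented tuning constant, not taken from any actual game):

```python
# Hypothetical hitstun-deterioration formula; the decay rate below is
# an invented tuning constant.

DECAY_PER_HIT = 0.85


def hitstun_for_hit(base_hitstun, hit_number):
    """Hitstun (in frames) applied by the `hit_number`-th hit of a
    combo (0 for the first hit), shrinking geometrically toward 0 as
    the combo goes on."""
    return int(base_hitstun * DECAY_PER_HIT ** hit_number)
```

A move with 20 frames of base hitstun still stuns for 17 frames as a combo's second hit, but after fifteen or so hits it stuns for barely a frame - at which point the opponent escapes long before the next cancel could connect.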
<br />
Let's quickly review our combo system's main mechanics (this is a good <b>TL;DR</b> list if you skimmed over the other parts of the article):<br />
<br />
<ul>
<li><i>Canceling</i>. We can cancel moves into certain other moves in this order: Standard attack > tilt attack > smash attack > launcher + aerial attack or special attack.</li>
<li><i>Cancel windows</i>. There's a small window of opportunity after each hit of a move (about 5-10 frames), during which you can cancel the current move into another move, as per our canceling mechanic defined earlier. No knockback is applied (and DI is impossible) until after the cancel window is over.</li>
<li><i>Launchers</i>. We can perform a launcher, which is basically a move which launches the opponent into the air with set vertical knockback, to perform an aerial combo. If the launcher connects with the opponent, you can cancel the launcher into a jump or a dash. You cannot DI out of a launcher.</li>
<li><i>Hitstun deterioration</i>. The hitstun of each successive hit in a combo gradually decreases to almost 0, thus effectively preventing infinite combos. Note that this doesn't prevent a zero-to-death combo, just infinite combos.</li>
</ul>
<br />
So there you have it. We evolved a theoretical combo system around SSB's unique mechanics, ironed out some minor kinks, and even introduced a few unique mechanics of our own along the way. It's probably not perfect, and might have some errors I overlooked while writing this article, but it's a simple, solid combo system that should be enjoyable for newbies and professionals alike. So that said, it's time for closing statements. :)<br />
<br />
<ul>
<li><a href="http://projectm.dantarion.com/">Project M</a>, a very popular mod for SSBB, includes a combo system for Lucario very similar to the one described here. The most notable differences are that the cancel window system described here is missing, and that Lucario's launcher (his up smash) does not have set knockback, meaning it's better as a standalone kill move, but somewhat hard to use at times as a launcher. Hitstun deterioration is also missing, and all his combos are DI-able as far as I know, but the system is still quite similar to the one in this article.</li>
<li>It would be feasible to make a mod for SSBB that implements our theoretical combo system, but although I'm not lacking in programming skills, I'm still quite unfamiliar with modding SSBB in general, unfortunately.</li>
<li>SSB4 will be coming out sometime next year (2013), and Masahiro Sakurai (SSB's creator) has stated that the game will be taking a "new direction" compared to previous SSB titles. Namco - which develops the popular fighting game series Tekken - is heading development of SSB4, so who knows - you might see something like this in SSB4.</li>
</ul>
<div>
Thanks for taking the time to read through this giant wall of text!</div>
</div>
</div>
Kermit Alexander IIhttp://www.blogger.com/profile/02557138366170479827noreply@blogger.com1tag:blogger.com,1999:blog-8516266397528346891.post-73413281355409436012012-07-19T15:24:00.001-05:002018-12-26T08:29:47.525-06:00How to use sys.meta_path with Python<i>Update: See <a href="https://dev.to/dangerontheranger/dependency-injection-with-import-hooks-in-python-3-5hap">this article</a> for a Python 3 take on import hooks. </i> <br />
<br />
I was asked on Reddit <a href="http://www.reddit.com/r/Python/comments/wpl1h/what_are_some_little_known_features_in_python/#c5fh7y8">a short while ago</a> how to use <code>sys.meta_path</code>, an extremely valuable tool that can be used to implement import hooks. Since there seems to be little documentation on how to use it properly, I decided to write an example that shows basic usage of <code>sys.meta_path</code>. I posted the example on Reddit in the thread I linked to, but I've put it here for posterity - enjoy. :)<br />
<br />
<pre class="prettyprint">import sys


class VirtualModule(object):

    def hello(self):
        return 'Hello World!'


class CustomImporter(object):

    virtual_name = 'my_virtual_module'

    def find_module(self, fullname, path=None):
        """This method is called by Python if this class
        is on sys.meta_path. fullname is the fully-qualified
        name of the module to look for, and path is either
        __path__ (for submodules and subpackages) or None (for
        a top-level module/package).

        Note that this method will be called every time an import
        statement is detected (or __import__ is called), before
        Python's built-in package/module-finding code kicks in.

        Also note that if this method is called via pkgutil, it is possible
        that path will not be passed as an argument, hence the default value.
        Thanks to Damien Ayers for pointing this out!"""
        if fullname == self.virtual_name:
            # As per PEP #302 (which implemented the sys.meta_path protocol),
            # if fullname is the name of a module/package that we want to
            # report as found, then we need to return a loader object.
            # In this simple example, that will just be self.
            return self
        # If we don't provide the requested module, return None, as per
        # PEP #302.
        return None

    def load_module(self, fullname):
        """This method is called by Python if CustomImporter.find_module
        does not return None. fullname is the fully-qualified name
        of the module/package that was requested."""
        if fullname != self.virtual_name:
            # Raise ImportError as per PEP #302 if the requested module/package
            # couldn't be loaded. This should never be reached in this
            # simple example, but it's included here for completeness. :)
            raise ImportError(fullname)
        # PEP #302 says to return the module if the loader object (i.e.,
        # this class) successfully loaded the module.
        # Note that a regular class instance works just fine as a module.
        return VirtualModule()


if __name__ == '__main__':
    # Add our import hook to sys.meta_path
    sys.meta_path.append(CustomImporter())
    # Let's use our import hook
    import my_virtual_module
    print my_virtual_module.hello()
</pre>
<br />
<i>Note:</i> The original PEP that defined the <span style="font-family: monospace; white-space: normal;">sys.meta_path </span><span style="font-family: "times new roman"; white-space: normal;">protocol is PEP #302, which you can read <a href="http://www.python.org/dev/peps/pep-0302/">here</a>.
</span>Kermit Alexander IIhttp://www.blogger.com/profile/02557138366170479827noreply@blogger.com6tag:blogger.com,1999:blog-8516266397528346891.post-34642291052566638312012-07-14T23:21:00.000-05:002012-07-14T23:21:22.325-05:00Reverse-engineering LEGO Racers 1 - part 1<div class="separator" style="clear: both; text-align: left;">
<span style="background-color: white;">First off, a brief description of LEGO Racers 1: It was basically a racing game released back in 1999 (and then re-released in 2001) where you could build your own car (and driver) out of LEGO bricks and then race it against half-a-dozen or so AI cars or a friend, on one of about two dozen tracks. Sounds fun, right? It is.</span></div>
<div class="separator" style="clear: both; text-align: center;">
<a href="http://gamerzdb.com/static/6472bddeb7eced6f31b1c4138713e378.jpg" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" height="239" src="http://gamerzdb.com/static/6472bddeb7eced6f31b1c4138713e378.jpg" width="320" /></a></div>
<div class="separator" style="clear: both; text-align: left;">
<span style="background-color: white;"><br /></span></div>
<div class="separator" style="clear: both; text-align: center;">
<a href="http://www.cheatcodesgalore.com/gameboy/games/Lego_Racers/Lego_Racers-s1.jpg" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" height="240" src="http://www.cheatcodesgalore.com/gameboy/games/Lego_Racers/Lego_Racers-s1.jpg" width="320" /></a></div>
<div class="separator" style="clear: both; text-align: center;">
<br /></div>
<div class="separator" style="clear: both; text-align: center;">
<a href="http://i.ytimg.com/vi/2fQy4alu2vc/0.jpg" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" height="240" src="http://i.ytimg.com/vi/2fQy4alu2vc/0.jpg" width="320" /></a></div>
<div class="separator" style="clear: both; text-align: left;">
<br /></div>
<div class="separator" style="clear: both; text-align: left;">
Sadly, since the game was released so long ago, it can be difficult to run on a modern OS - the 1999 release of the game had DRM that required the disc, for example - not to mention there's quite a bit of dummied-out content (several shortcuts are missing, most famously from the Knightmare-athon track) that I and probably several others would love to play around with and modify. And that's why I've decided to write a drop-in replacement for the LEGO Racers 1 game engine.</div>
<div class="separator" style="clear: both; text-align: left;">
<br /></div>
<div class="separator" style="clear: both; text-align: left;">
It's not a new idea - I can bring up forum threads <a href="http://www.rockraidersunited.org/topic/1214-srf-file/">from up to 2 years ago</a> of people attempting to mod LEGO Racers 1. That being said, these people are trying to mod the game itself, <i>not</i> create a replacement for LEGO Racers 1's game engine. I want to create a game engine instead of attempting to mod the game itself for several reasons:</div>
<div class="separator" style="clear: both; text-align: left;">
</div>
<ul>
<li>A mod of the game still would have issues with the DRM and compatibility issues I previously mentioned; one would still need the original game to play the mod.</li>
<li>A mod of the game would not be able to add new content without removing existing content; I doubt, for instance, you could add new circuits, bricks, chassis, or piece sets (you unlocked the pieces of each boss's car when you beat them) without removing something else first.</li>
<li>A mod of the game would still be subject to the limitations of the original game engine - AI cars still wouldn't follow shortcuts, the AI behavior itself couldn't be changed, no new racing modes could be added, racing with more than 2 human players wouldn't be possible (and even then, you'd need a joystick), etc. In other words, only content would be moddable, not game logic.</li>
</ul>
<div>
There would be many advantages to making a game engine as well:</div>
<div>
<ul>
<li>Only the data files from the original game would be required to play - and it would even be feasible to create a replacement for those data files as well, so you wouldn't need the original game at all (à la PrBoom + Freedoom).</li>
<li>A replacement for LEGO Racers 1's game engine would allow <i>everything</i> to be moddable - powerups, tracks, circuits, AI behavior (imagine the AI being a script the game engine runs instead of being part of the engine itself), bricks, piece sets, cutscenes, animations - everything.</li>
<li>Making a replacement game engine instead of modding the original game makes modding much easier and more approachable to a new modder.</li>
</ul>
<div>
Right now, I intend to make the engine open-source (under the Apache License 2.0, probably), and host it on <a href="http://github.com/">GitHub</a>, to make it more accessible to potential developers. If you'd like to help out (that said, I haven't started yet), let me know. :) I'll try to keep updating my blog as progress on reverse-engineering LEGO Racers 1's data formats is made.</div>
</div>Kermit Alexander IIhttp://www.blogger.com/profile/02557138366170479827noreply@blogger.com4tag:blogger.com,1999:blog-8516266397528346891.post-55600119518700525882012-07-13T18:47:00.001-05:002012-07-13T18:49:14.965-05:00Minetest TriforceIn case you're not familiar with it, <a href="http://minetest.net/">Minetest</a> is basically a free, open-source clone of Minecraft, plus some new stuff like Lua mod support. Anyway, I made a very large Triforce on <a href="http://gameboom.net/">gameboom</a>'s mt1.gameboom.net (port 30000) Minetest server some time ago, and this is the result:<br />
<br />
<b>Daytime</b><br />
<table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"><tbody>
<tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhMloxmWNCy9jTRVDDtu__T_lptQDI6zg3b6ffZtjs7nTpafobEQbxOFNRb7fwV-yltICCwZjWqfLI-NN3nR08k4p_0e5pGXfgypmMzpshyphenhyphenJ6yZ9WfrTo3qAeCRP8SXAB8J-uJff1e0njc/s1600/screenshot_2188532374.png" imageanchor="1" style="margin-left: auto; margin-right: auto;"><img alt="" border="0" height="366" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhMloxmWNCy9jTRVDDtu__T_lptQDI6zg3b6ffZtjs7nTpafobEQbxOFNRb7fwV-yltICCwZjWqfLI-NN3nR08k4p_0e5pGXfgypmMzpshyphenhyphenJ6yZ9WfrTo3qAeCRP8SXAB8J-uJff1e0njc/s640/screenshot_2188532374.png" title="Daytime" width="640" /></a></td></tr>
<tr><td class="tr-caption" style="text-align: center;"><br /></td></tr>
</tbody></table>
<div class="separator" style="clear: both; text-align: center;">
<br /></div>
<div class="separator" style="clear: both; text-align: left;">
<b>Nighttime</b></div>
<div class="separator" style="clear: both; text-align: center;">
</div>
<div class="separator" style="clear: both; text-align: left;">
<b><br /></b></div>
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiLKhJJuI9VJjj8t1u2Gzy4qEUdGqVUKrCSxde69ufQrYx6LoK0TtYNM0vfCTrYrG4qZ6Vj2wgWpy1b2e7bQsdJ2va0NoUw9IVRoiScGwXEJNWtyWT98ZHDbSqBg_aecPr8e6ZCLpxPdj4/s1600/screenshot_2192533575.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img alt="" border="0" height="366" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiLKhJJuI9VJjj8t1u2Gzy4qEUdGqVUKrCSxde69ufQrYx6LoK0TtYNM0vfCTrYrG4qZ6Vj2wgWpy1b2e7bQsdJ2va0NoUw9IVRoiScGwXEJNWtyWT98ZHDbSqBg_aecPr8e6ZCLpxPdj4/s640/screenshot_2192533575.png" title="Nighttime" width="640" /></a></div>
<b><br /></b>Kermit Alexander IIhttp://www.blogger.com/profile/02557138366170479827noreply@blogger.com0tag:blogger.com,1999:blog-8516266397528346891.post-41305444918636943782012-07-09T23:45:00.001-05:002012-07-09T23:45:53.038-05:00Evo 2012 - one hype event!In case you're not familiar with it, <a href="http://evo.shoryuken.com/">Evo</a> is a international fighting game tournament (though Mario Kart was on the roster one year a few years back) held every year in Las Vegas, Nevada. It attracts thousands (if not tens of thousands) of participants and spectators every year from places like Taiwan, Japan, and South Korea. Indeed, many of the top players come from those countries. The l<span style="background-color: white;">ive video stream of the tournament attracted almost a hundred</span><span style="background-color: white;"> </span><i>thousand</i><span style="background-color: white;"> viewers from all over the world at its peak, which is an amazing thing.</span><br />
<span style="background-color: white;"><br /></span><br />
<span style="background-color: white;">I wasn't able to watch the tournament in person, but I was able to watch the livestream for the entirety of the Super Street Fighter IV AE 2012 and Ultimate Marvel vs. Capcom 3 events. There were quite a few exciting matches - for me, here are the highlights of the tournament:</span><br />
<span style="background-color: white;"><br /></span><br />
<div class="separator" style="clear: both; text-align: center;">
<iframe allowfullscreen='allowfullscreen' webkitallowfullscreen='webkitallowfullscreen' mozallowfullscreen='mozallowfullscreen' width='320' height='266' src='https://www.youtube.com/embed/oDqBN7JhVBE?feature=player_embedded' frameborder='0'></iframe></div>
<span style="background-color: white;"><br /></span><br />
<div class="separator" style="clear: both; text-align: center;">
<iframe allowfullscreen='allowfullscreen' webkitallowfullscreen='webkitallowfullscreen' mozallowfullscreen='mozallowfullscreen' width='320' height='266' src='https://www.youtube.com/embed/PYBLaUdAUZI?feature=player_embedded' frameborder='0'></iframe></div>
<span style="background-color: white;"><br /></span><br />
Hopefully, I'll be able to watch Evo in person next year, and possibly compete in the <span style="background-color: white;">Super Street Fighter IV AE 2012 section if I ever buy the game.</span>Kermit Alexander IIhttp://www.blogger.com/profile/02557138366170479827noreply@blogger.com0tag:blogger.com,1999:blog-8516266397528346891.post-86096447460715827032012-07-09T22:37:00.004-05:002012-07-09T22:37:37.091-05:00Switching to BloggerI exported my WordPress blog to Blogger - I think I'm going to enjoy the change. We'll see about that in the coming months, though. :)Kermit Alexander IIhttp://www.blogger.com/profile/02557138366170479827noreply@blogger.com0tag:blogger.com,1999:blog-8516266397528346891.post-50174235942610855782012-06-10T09:01:00.000-05:002012-07-09T22:06:49.384-05:00Switching blog licenses.I've given it some thought, and I've made a decision: I'm switching the license for the articles I write on my blog from the CC-BY-NC-SA to CC-BY-SA. I'm switching for two reasons:<br/><ul><br/> <li>It lets my articles be re-used/re-posted in far more places where I <em>do</em> want them used (Wikipedia, other places using the CC-BY-SA, etc...)</li><br/> <li>I don't think anyone's going to plagiarize my work anytime soon - besides, at the very least, they have to give me credit (the BY clause).</li><br/></ul><br/><div>Besides, I think the more liberal licensing terms make the blog more attractive to FOSS enthusiasts, which is a big plus for me since I need all the publicity I can get. Hopefully I'll see the rewards of switching to the CC-BY-SA license in a year or so. :)</div>Kermit Alexander IIhttp://www.blogger.com/profile/02557138366170479827noreply@blogger.com0tag:blogger.com,1999:blog-8516266397528346891.post-91993634145424103812012-05-29T12:32:00.000-05:002012-07-09T22:06:49.400-05:00Adventures in optimizationYesterday was the first day I ever had to seriously optimize some code I had written. The code in question? 
A point cloud implementation.<br/><br/>When I started work on said implementation, I coded the library in pure Python, and tried it out on a 50,000-point volume. It took 15 seconds just to initialize and move the volume - completely unacceptable. I had a lot of optimization work to do.<br/><br/>The first technique I tried was just a faster algorithm. I switched from using a SQLite database to flat dictionaries, which gave a speed boost of around 2 seconds. Next I tried binding expensive non-local variables to local scope - an extra second or so. Now I was out of ideas (most of the performance-intensive code was just simple for loops) - I decided to re-write the code in Cython.<br/><br/>Simply compiling my library brought the time down to about 7 seconds; adding static typing in strategic spots brought it down even further - to 2 seconds. A few Cython-specific optimizations later (binding class member variables to local scope, for one), and my library could process a 50,000-point volume in about 0.35 seconds.<br/><br/>Unfortunately, that's still <em>way</em> too slow for my needs - I need 0.1 seconds or less (preferably 0.05 or less) per 50,000 points, so I still have a ways to go. What I'll probably do is re-code some of the library in flat C++, and then access that from Cython. I'll try to keep my blog updated on whether or not I'm successful in this venture. :)Kermit Alexander IIhttp://www.blogger.com/profile/02557138366170479827noreply@blogger.com0tag:blogger.com,1999:blog-8516266397528346891.post-27435399159055305192012-04-13T05:31:00.000-05:002012-07-09T22:06:49.377-05:00My toy scripts - now available on SourceForge!I've decided to make good use of SourceForge's per-developer project space - a really nice feature, by the way - and upload all the toy scripts I've made over the years to <a href="https://sourceforge.net/u/openblocks/code/">https://sourceforge.net/u/openblocks/code/</a>. 
All the scripts are licensed under the 3-clause BSD license, so feel free to give them a look and make your own modifications. :)<br/><br/>Keep in mind that since I use Linux as my main operating system, some of the toys that are shell scripts won't work on Windows, unfortunately.Kermit Alexander IIhttp://www.blogger.com/profile/02557138366170479827noreply@blogger.com0tag:blogger.com,1999:blog-8516266397528346891.post-66423315795779952412012-04-11T06:15:00.000-05:002012-07-09T22:06:49.395-05:00The hashcash algorithmYeah, it has a weird name, but it's one of the most interesting (and effective) anti-spam algorithms out there: Hashcash. How does it defeat spam without the end-user even knowing a spam check is taking place? Read on.<br/><br/>No, this isn't a Bayesian filter-like algorithm; this is something completely different. Hashcash involves inserting a piece of Javascript code into your site's comment form that sends a server-generated key as part of the comment form's data. If the sent key doesn't match the one the server sent as part of the comment form, the comment is marked as spam and not posted. Amazingly simple. And this all happens behind the scenes, without your users even knowing a spam check is going on!<br/><br/>Now, this algorithm works solely on the premise that spambots are incapable of running Javascript, so if spammers come up with a spambot that <em>is</em> capable of running Javascript, Hashcash would be in hot water. But again, no such spambot currently exists, so Hashcash is currently a viable spam-blocking algorithm.<br/><br/>I think Hashcash is such a good idea, I'm going to give it a test run on this site using the <a href="http://wordpress.org/extend/plugins/wp-hashcash/">WP Hashcash</a> plugin for WordPress. Hopefully I won't have to manually deal with spam for a long time to come. 
:)Kermit Alexander IIhttp://www.blogger.com/profile/02557138366170479827noreply@blogger.com1tag:blogger.com,1999:blog-8516266397528346891.post-28975496311770737142012-03-28T05:22:00.000-05:002012-07-09T22:06:49.379-05:00Do we need a new FOSS license?Come on, admit it: You've been involved in at least one (probably 3 or more) licensing "discussions" with your fellow FOSS peers. :P As you probably already know (and as I covered in one of my previous posts), these "discussions" revolve mainly around whether or not the GPL is, in fact, a truly free license. I'm not going to discuss that here (for brevity), but what I <em>will</em> discuss is whether a "compromise" license that sits right between the GPL and a permissive license in terms of restrictions is possible.<br/><br/><span class="Apple-style-span" style="font-family: 'Helvetica Neue', Helvetica, Arial, sans-serif;">First, we need to know what our new license can and cannot do. According to the Open Source Definition (which you can read for yourself <a title="here" href="http://opensource.org/docs/osd">here</a>), here are some things our new license couldn't do:</span><br/><ul><br/> <li>Prevent commercial re-distribution. Commercial re-distribution is <em>very</em> different from proprietary re-distribution, by the way</li><br/> <li>Prevent (possibly derivative) products from being issued under the same license as ours</li><br/> <li>Restrict other software distributed on the same medium as ours</li><br/></ul><br/><div>Okay, now that we know what we can't do, let's figure out what we <em>can</em> do.</div><br/><div><br/><ul><br/> <li>The biggest problem most people have with the GPL is that it prevents other, non-GPL, FOSS software from using GPL'ed products (because the licenses are incompatible). We need to figure out a way to solve this issue in order to come up with a true "compromise" license.</li><br/> <li>The counter-argument against permissive licenses is that they allow proprietary derivatives of your project. 
Whether or not this is an issue is completely up to you, but for the purposes of this post, I'll assume the answer is "yes."</li><br/> <li>The Open Source Definition <em>does </em>allow open-source licenses to "pick and choose" what derivative projects can be licensed under; the most notable open-source license that does this is the GPL - you can derive from a GPL product and use the AGPL for your code if you so choose</li><br/></ul><br/><div>What if we could allow anyone to derive from a product licensed under our hypothetical license as long as they used our new license, or an OSI-approved license? While this has the obvious disadvantage of preventing a BSD-licensed project which uses our hypothetical project from being used in a proprietary product, it's a pretty big step towards an actual "compromise" license.</div><br/><div>That's as good a compromise as I can think of for now. Hopefully I can think of a better one in the future. :)</div><br/></div>Kermit Alexander IIhttp://www.blogger.com/profile/02557138366170479827noreply@blogger.com0tag:blogger.com,1999:blog-8516266397528346891.post-22972448132420645112012-03-11T03:56:00.000-05:002012-07-09T22:06:49.371-05:00Setting up MinGW32 on UbuntuFinally, a non-controversial topic :)<br/><br/>Having a cross-compiler that can compile Windows executables on a Linux machine can be an extremely useful tool to have, since it prevents you from having to install WINE or a virtual machine just to generate Windows executables for your software project. I find myself often in need of such a tool, and since there doesn't seem to be any up-to-date tutorial on how to set up a MinGW32 cross-compiler, I decided to write one. :)<br/><br/>Okay, so here we go: First, install the <code>gcc-mingw32</code> package with the following command:<br/><blockquote><code>sudo apt-get install gcc-mingw32</code></blockquote><br/>And we're basically finished. 
To invoke your newly-installed cross-compiler, use the following command:<br/><blockquote><code>i586-mingw32msvc-gcc</code></blockquote><br/>As a finishing touch, I add a soft link that allows me to access my new cross-compiler with a shorter name, like so:<br/><blockquote><code>ln -s $(which i586-mingw32msvc-gcc) gcc-mingw32</code></blockquote><br/>If you execute that command in your <code>$HOME/bin</code> directory (assuming, of course, that that directory is on your <code>$PATH</code>), you can access your cross-compiler via <code>gcc-mingw32</code> instead of the longer <code>i586-mingw32msvc-gcc</code>.<br/><br/>Let's test our cross-compiler on a simple piece of software: Lua 5.1. I'll assume you've already unpacked Lua into some convenient directory, and have opened a terminal there. I'll also assume you made the soft link I talked about earlier, but that's no big deal - if you didn't, just replace <code>gcc-mingw32</code> with <code>i586-mingw32msvc-gcc</code>.<br/><br/>Lua makes things easy on us by having 99% of the compiler options already set up for us; we just have to tell it where to find our cross-compiler - otherwise, it will find the non-cross-compiling installation of GCC:<br/><blockquote><code>make mingw CC=gcc-mingw32</code></blockquote><br/>That wasn't too hard either. 
:) Here's what that command outputs on my system:<br/><blockquote><code>cd src && make mingw<br/>make[1]: Entering directory `/home/(my username)/Downloads/lua-5.1/src'<br/>make "LUA_A=lua51.dll" "LUA_T=lua.exe" \<br/>"AR=gcc-mingw32 -shared -o" "RANLIB=strip --strip-unneeded" \<br/>"MYCFLAGS=-DLUA_BUILD_AS_DLL" "MYLIBS=" "MYLDFLAGS=-s" lua.exe<br/>make[2]: Entering directory `/home/(my username)/Downloads/lua-5.1/src'<br/>gcc-mingw32 -O2 -Wall -DLUA_BUILD_AS_DLL -c -o lua.o lua.c<br/>gcc-mingw32 -O2 -Wall -DLUA_BUILD_AS_DLL -c -o lapi.o lapi.c<br/>gcc-mingw32 -O2 -Wall -DLUA_BUILD_AS_DLL -c -o lcode.o lcode.c<br/>gcc-mingw32 -O2 -Wall -DLUA_BUILD_AS_DLL -c -o ldebug.o ldebug.c<br/>gcc-mingw32 -O2 -Wall -DLUA_BUILD_AS_DLL -c -o ldo.o ldo.c<br/>gcc-mingw32 -O2 -Wall -DLUA_BUILD_AS_DLL -c -o ldump.o ldump.c<br/>gcc-mingw32 -O2 -Wall -DLUA_BUILD_AS_DLL -c -o lfunc.o lfunc.c<br/>gcc-mingw32 -O2 -Wall -DLUA_BUILD_AS_DLL -c -o lgc.o lgc.c<br/>gcc-mingw32 -O2 -Wall -DLUA_BUILD_AS_DLL -c -o llex.o llex.c<br/>gcc-mingw32 -O2 -Wall -DLUA_BUILD_AS_DLL -c -o lmem.o lmem.c<br/>gcc-mingw32 -O2 -Wall -DLUA_BUILD_AS_DLL -c -o lobject.o lobject.c<br/>gcc-mingw32 -O2 -Wall -DLUA_BUILD_AS_DLL -c -o lopcodes.o lopcodes.c<br/>gcc-mingw32 -O2 -Wall -DLUA_BUILD_AS_DLL -c -o lparser.o lparser.c<br/>gcc-mingw32 -O2 -Wall -DLUA_BUILD_AS_DLL -c -o lstate.o lstate.c<br/>gcc-mingw32 -O2 -Wall -DLUA_BUILD_AS_DLL -c -o lstring.o lstring.c<br/>gcc-mingw32 -O2 -Wall -DLUA_BUILD_AS_DLL -c -o ltable.o ltable.c<br/>gcc-mingw32 -O2 -Wall -DLUA_BUILD_AS_DLL -c -o ltm.o ltm.c<br/>gcc-mingw32 -O2 -Wall -DLUA_BUILD_AS_DLL -c -o lundump.o lundump.c<br/>gcc-mingw32 -O2 -Wall -DLUA_BUILD_AS_DLL -c -o lvm.o lvm.c<br/>gcc-mingw32 -O2 -Wall -DLUA_BUILD_AS_DLL -c -o lzio.o lzio.c<br/>gcc-mingw32 -O2 -Wall -DLUA_BUILD_AS_DLL -c -o lauxlib.o lauxlib.c<br/>gcc-mingw32 -O2 -Wall -DLUA_BUILD_AS_DLL -c -o lbaselib.o lbaselib.c<br/>gcc-mingw32 -O2 -Wall -DLUA_BUILD_AS_DLL -c -o ldblib.o ldblib.c<br/>gcc-mingw32 -O2 
-Wall -DLUA_BUILD_AS_DLL -c -o liolib.o liolib.c<br/>gcc-mingw32 -O2 -Wall -DLUA_BUILD_AS_DLL -c -o lmathlib.o lmathlib.c<br/>gcc-mingw32 -O2 -Wall -DLUA_BUILD_AS_DLL -c -o loslib.o loslib.c<br/>gcc-mingw32 -O2 -Wall -DLUA_BUILD_AS_DLL -c -o ltablib.o ltablib.c<br/>gcc-mingw32 -O2 -Wall -DLUA_BUILD_AS_DLL -c -o lstrlib.o lstrlib.c<br/>gcc-mingw32 -O2 -Wall -DLUA_BUILD_AS_DLL -c -o loadlib.o loadlib.c<br/>gcc-mingw32 -O2 -Wall -DLUA_BUILD_AS_DLL -c -o linit.o linit.c<br/>gcc-mingw32 -shared -o lua51.dll lapi.o lcode.o ldebug.o ldo.o ldump.o lfunc.o lgc.o llex.o lmem.o lobject.o lopcodes.o lparser.o lstate.o lstring.o ltable.o ltm.o lundump.o lvm.o lzio.o lauxlib.o lbaselib.o ldblib.o liolib.o lmathlib.o loslib.o ltablib.o lstrlib.o loadlib.o linit.o<br/>strip --strip-unneeded lua51.dll<br/>gcc-mingw32 -o lua.exe -s lua.o lua51.dll -lm<br/>make[2]: Leaving directory `/home/(my username)/Downloads/lua-5.1/src'<br/>make[1]: Leaving directory `/home/(my username)/Downloads/lua-5.1/src'</code></blockquote><br/>Hey, looks like it worked. 
Let's try running our new executable with WINE:<br/><blockquote><code>wine src/lua.exe<br/>fixme:advapi:RegisterTraceGuidsW (0x100778a, 0x100a060, {485e7de8-0a80-11d8-ad15-505054503030}, 1, 0x33fe00, (null), (null), 0x100a068,): stub<br/>fixme:advapi:RegisterTraceGuidsW (0x100778a, 0x100a080, {485e7de9-0a80-11d8-ad15-505054503030}, 1, 0x33fe00, (null), (null), 0x100a088,): stub<br/>fixme:advapi:RegisterTraceGuidsW (0x100778a, 0x100a0a0, {485e7dea-0a80-11d8-ad15-505054503030}, 1, 0x33fe00, (null), (null), 0x100a0a8,): stub<br/>fixme:advapi:RegisterTraceGuidsW (0x100778a, 0x100a0c0, {485e7deb-0a80-11d8-ad15-505054503030}, 1, 0x33fe00, (null), (null), 0x100a0c8,): stub<br/>fixme:advapi:RegisterTraceGuidsW (0x100778a, 0x100a0e0, {485e7dec-0a80-11d8-ad15-505054503030}, 1, 0x33fe00, (null), (null), 0x100a0e8,): stub<br/>fixme:advapi:RegisterTraceGuidsW (0x100778a, 0x100a100, {485e7ded-0a80-11d8-ad15-505054503030}, 1, 0x33fe00, (null), (null), 0x100a108,): stub<br/>fixme:win:RegisterDeviceNotificationW (hwnd=0x14fae8, filter=0x65e93c,flags=0x00000001) returns a fake device notification handle!<br/>Lua 5.1 Copyright (C) 1994-2006 Lua.org, PUC-Rio<br/>><br/></code></blockquote><br/>And there you have it: We went step-by-step through the process of installing MinGW32, adding a convenience soft link, and compiling a simple piece of software - Lua.<br/><br/>Happy cross-compiling!Kermit Alexander IIhttp://www.blogger.com/profile/02557138366170479827noreply@blogger.com0tag:blogger.com,1999:blog-8516266397528346891.post-78552198771175912162012-03-05T06:32:00.000-06:002012-07-09T22:06:49.397-05:00On piracyYes, another controversial topic: Piracy. It's pretty prevalent today - you probably know at least 5 people that have pirated software, and you might have pirated some software yourself; I myself have pirated software in the past, albeit before I knew of the legal and moral issues surrounding it.<br/><br/>So what makes piracy so popular? 
The most obvious reason is that you get, for free, a piece of software you'd normally have to pay for, and everyone loves that. Except the people who wrote the software, and that's what this short essay is going to be about.<br/><br/>First let's get our terms straight: What <em>is </em>piracy? It's basically the act of downloading a copy of a piece of software without the authorization of the software's author - in other words, copyright infringement. Now, is there anything legally wrong with piracy? Quite a bit. How about morally? Er, that's not so clear in some cases.<br/><br/>When you pirate a piece of software that's still being distributed in some legal form, you're basically saying: "Mr. Software Author, I like the software you wrote, but I don't like xxx about it (price, DRM, etc...). Shame on you, so I'm going to take your software and use it for free." I can't really see any way to justify this line of thinking, at least from Mr. Software Author's point of view.<br/><br/>You might say you're not hurting Mr. Software Author by pirating his software, and you might be right, depending upon how rich he is. But that doesn't change the fact that you took something that wasn't yours. You're not entitled to Mr. Software Author's product, and if you don't want to use it the way he wants you to use it, don't use it at all.<br/><br/>If Mr. Software Author's product is tainted with DRM, and that's the reason you're giving for pirating his product, consider this: Will your circumvention of the DRM Mr. Software Author set in place cause him to remove the already present DRM? No - on the contrary, he'll probably add more! Instead of pirating a DRM'ed product (which is still taking something that isn't yours), support a non-DRM'ed one. Lack of customers will change Mr. 
Software Author's opinion on DRM more than anything else.<br/><br/>Now, what do you do in the (very common) case of software whose parent company is either defunct or no longer distributes the software in question? This is both a legal and moral gray area. I mean, technically you're still guilty of copyright infringement, but the copyright owner no longer exists, so it's a legal gray area. It's a moral gray area because while you're still taking something that really isn't yours, you're offered no alternative way to obtain the software in question.<br/><br/>What if the parent company exists, but no longer distributes the software in question? You often see this sort of thing happening with arcade ROMs. My personal take on this case - IANAL - is that it's fine (morally so - this is a legal gray area) to pirate arcade ROMs, but only if the parent company offers no way to get the ROMs via other means. For example, Capcom offers Street Fighter Alpha 2 on the Wii Virtual Console platform, but not Street Fighter Alpha 3. So pirating Alpha 3 would be fine by me, but I wouldn't pirate Alpha 2.<br/><br/>I hope this essay on piracy has been an interesting read for you. I also hope it makes you think about the moral and legal implications of piracy a bit more - hey, you might even solve this whole legal mess. :)Kermit Alexander IIhttp://www.blogger.com/profile/02557138366170479827noreply@blogger.com0tag:blogger.com,1999:blog-8516266397528346891.post-91654438581203592072011-12-28T06:22:00.000-06:002012-07-09T22:06:49.399-05:00On permissive and copyleft licensesAs you probably know, FOSS licenses are a touchy subject - I've seen some people who won't use or contribute to a project if the project uses a license they do not like. In my humble opinion, licenses aren't a major issue (except for projects you lead) and I'll attempt to explain why below.<br/><br/>Let's start with the (A)GPL. 
This is probably the most loved/hated license of all time, for two major reasons:<br/><ul><br/> <li>The GPL is a strong copyleft license: Any derivative product must also be covered under the GPL.</li><br/> <li>Even if you leave the GPL code alone and build your own code on top of it - even with dynamic linking - your code must also be covered under the GPL (at least under the FSF's interpretation).</li><br/></ul><br/><div>For those of us who primarily develop GPL'ed code, this isn't really an issue (in such a case, you'd license your derivative product under the GPL anyway), but using the GPL prevents your code from being used in a non-GPL FOSS project, which may or may not be what you intended.</div><br/><div>This is the biggest argument against the GPL in the license debate by far, and for obvious reasons. However, unlike permissively-licensed code, GPL'ed code cannot be used in a proprietary product, no matter how many layers of derivation separate it from the original - which is the point brought up by those who do like/use the GPL.</div><br/><div>So if you don't want your code to be used in a proprietary product, or you only want your code to be used in other Free software projects, the GPL is the way to go. Otherwise, you might want to read on.</div><br/><div>Now, let's take a look at permissive licenses. 
There are quite a few of these, most notably:</div><br/><div><br/><ul><br/> <li>3-clause BSD</li><br/> <li>2-clause BSD</li><br/> <li>MIT/X</li><br/> <li>Apache</li><br/></ul><br/><div>All of them differ in subtle ways (the Apache license includes an explicit patent grant, for example), but they all have the same basic premise: Use, modify and redistribute this code however you like, but:</div><br/><div><br/><ol><br/> <li>Don't change the license on the original code</li><br/> <li>Distribute a copy of this code's license along with your product</li><br/> <li>State unambiguously (none of the licenses specify where) that your product uses my code</li><br/></ol><br/><div>Quite a bit more permissive (hence the name) than the GPL, don't you agree? These sorts of licenses are used primarily by those who aren't concerned about possibly proprietary derivative products. As far as I've seen, the permissive licenses receive the most flak for not ensuring that derivative products are FOSS as well, something the GPL ensures. In my opinion, whether or not this is a problem is something that has to be decided by the author of the code in question: Am I fine with people making a proprietary (and most likely commercial) derivative of my project? 
The way I normally go about deciding between using the (A)GPL or a permissive license in a new project - yes, I use both :) - boils down to four questions:</div><br/><div><br/><ol><br/> <li>Does the code I am going to license constitute a product that others could make a proprietary (and most likely commercial), standalone derivative of by copying verbatim?</li><br/> <li>How much time have I spent on the code I'm licensing?</li><br/> <li>Am I trying to use this code to advance the cause of FOSS, i.e., I'm making a FOSS application in a problem domain where no other FOSS application currently exists?</li><br/> <li>Are there already better FOSS applications in the problem domain for my new project?</li><br/></ol><br/><div>Questions 1 and 3 are probably the most significant: If the answer to either question is "yes" - and the answer to Question 4 is "no" - the code for the new project will almost certainly be under the GPL; if my project is the only FOSS application in its problem domain, I want to make sure that everything that touches it stays FOSS, for obvious reasons. Questions 2 and 4 are somewhat less important, but still bear some weight: If the answer to Question 4 is "yes", I'm probably going to use a permissive license - there wouldn't be much point in someone ripping off my project when better FOSS applications already exist in its problem domain anyway.</div><br/><div>Besides, as long as I'm the sole copyright holder (or have all contributors sign a CLA assigning copyright to me), I can change the license in the future if my FOSS project becomes the best in its problem domain or other FOSS projects in my project's problem domain become defunct.</div><br/></div><br/><div>You've probably noticed by this point that at least three of the four questions deal with whether someone will come along and make a proprietary, commercial spinoff of my project. 
I notice quite a few people in the FOSS community stating that since I wrote my code for "free", i.e., without incurring any monetary cost during development - which often isn't true anyway, otherwise so many projects wouldn't have a "Donate" button displayed conspicuously on their home page - I shouldn't care whether someone else makes money off of my code.</div><br/><div>And that's where Question 2 comes into play: At the very least, whenever you write code, you spend <em>time</em>. Time isn't free - as a matter of fact, it can be considered more precious than money; time spent doing anything is part of your life you'll never get back, ever. And if you've spent a significant amount of time working on your FOSS project (or you're about to start a new one, and are pretty sure you're going to spend a lot of time on it), you probably want to think about whether you're fine with someone spending perhaps an hour re-branding your product and making a profit with it.</div><br/></div><br/><div>Enough theory. How do you license a FOSS project in practice? I have one major FOSS project that I lead at the moment - <a title="OpenBlox" href="http://openblox.sourceforge.net">OpenBlox</a> - so I'll go through how I came up with a licensing scheme for that project.</div><br/></div><br/><div>How does OpenBlox do with those four questions we talked about earlier?</div><br/><div><br/><ol><br/> <li><strong>Does the code I am going to license constitute a product that others could make a proprietary (and most likely commercial), standalone derivative of by copying verbatim?</strong> In its current state, no, but very soon (most likely in 6-8 months), yes.</li><br/> <li><strong>How much time have I spent on the code I'm licensing? 
</strong>Quite a bit - I estimate I've spent around 5,500 hours on OpenBlox so far (and it's only existed for around 2 and a half years)</li><br/> <li><strong>Am I trying to use this code to advance the cause of FOSS, i.e., I'm making a FOSS application in a problem domain where no other FOSS application currently exists? </strong>Yes, definitely. There is not a single other FOSS project in OpenBlox's problem domain, although there are at least two proprietary products sharing OpenBlox's problem domain.</li><br/> <li><strong>Are there already better FOSS applications in the problem domain for my new project? </strong>Because of the answer to Question 3, the answer to this question is of course "no."</li><br/></ol><br/><div>Due to how I answered each of the four questions, the GPL looks like a pretty good choice for OpenBlox, and due to its nature as a relatively extensible and unique game engine, a permissive license works great for bundled extensions/plugins. And that's exactly what I use for it now: GPL for the main engine, with plugins and support scripts (save for core plugins that handle things like rendering) under the Apache license.</div><br/></div><br/><div>Quick note before I sign off: The LGPL is a great compromise between a strong copyleft like the GPL and a permissive license like the BSD license: People can use your code without having to change the license of their code (just like a permissive license), but all their modifications to your code must fall under the LGPL, which is pretty neat.</div>Kermit Alexander IIhttp://www.blogger.com/profile/02557138366170479827noreply@blogger.com0tag:blogger.com,1999:blog-8516266397528346891.post-39400142323875588892011-12-04T13:48:00.000-06:002012-07-09T22:06:49.380-05:00What is programming?What <em>is</em> programming? I get that question a lot from my non-programmer friends. 
The interesting thing is, I had never really asked myself that question - and if my Google-fu hasn't failed me, no-one else has really answered this question, either (at least in a way a beginning programmer or non-technical person can understand). So I'll do my best to answer that now, and I'll voice my thoughts about the answer a little too :)<br/><br/>First off, what are programmers trying to accomplish when they write a program? They're trying to solve some problem. Even when you're writing a small one-page script for fun, you've set some problem up for yourself to solve. For example, I wrote an RSS reader in Python a while ago that notified me of new news items via libnotify - a program/framework that lets you make pop-up alerts on Linux.<br/><br/>Even with such a small utility (it was under 300 lines of code, if I recall correctly), there was a problem for it to solve: Check a list of sites' RSS feeds every so often, and if there are any new items, set up a libnotify alert. Yes, it's a small problem, but it is a problem to solve nonetheless.<br/><br/>So, now that we know what programmers are trying to accomplish when they write a program, how do they write the program itself? This is another question I get often - how in the world do you write a program?<br/><br/>Well, once the programmer knows the problem he/she is trying to solve, the programmer must come up with a solution to that problem. Going back to my RSS reader example, my solution was for my reader script to be called via cron (a UNIX/Linux program that runs other programs every so often), save the etag (an identifier that changes whenever a feed's contents change) of each feed, and on each run compare the saved etag against the current etag of the feed we're trying to update. If the etags are different, that means there is new content, so fetch the new content and make a libnotify alert.<br/><br/>Note that the solution to our problem is not code, and it's not UML or anything. It's just a simple, step-by-step, human-readable solution. 
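(If you're curious what the heart of that plan looks like once it <em>does</em> become code, here's a hypothetical sketch of the etag-comparison step - the feed URLs and function name are invented for illustration, and the real reader did more bookkeeping than this:)

```python
# Hypothetical sketch of the etag-comparison step from the RSS-reader
# solution described above: compare the etags saved on the previous run
# against the etags the feeds report now.
def feeds_with_new_content(saved_etags, current_etags):
    """Return feed URLs whose etag changed (or that we've never seen)."""
    return sorted(url for url, etag in current_etags.items()
                  if saved_etags.get(url) != etag)

saved = {"https://example.com/feed": "abc123",
         "https://example.org/feed": "xyz789"}
current = {"https://example.com/feed": "def456",   # changed: new items
           "https://example.org/feed": "xyz789"}   # unchanged: skip it
print(feeds_with_new_content(saved, current))
# prints ['https://example.com/feed']
```

Everything else in the reader - the cron scheduling, fetching the feeds, raising the libnotify alerts - just wraps around that one comparison.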
That's how solutions to the problems your program is trying to solve should be - simple sentences, nothing more. UML and code snippets are useful, but if your solution cannot be explained in plain English (or your native language, if that's not English), you most certainly will not be able to implement it in code. I like the way the <a href="http://www.python.org/dev/peps/pep-0020/">Zen of Python</a> puts it:<br/><blockquote><br/><pre>If the implementation is hard to explain, it's a bad idea.<br/>If the implementation is easy to explain, it may be a good idea.</pre><br/></blockquote><br/>So, now that we know what we're trying to solve and how we're going to solve it, we have to convey our solution to the computer in a way it can understand and run - in other words, actually implement our solution in some programming language. This is the most complicated part of programming, and the most error-prone - most software bugs are created during this step, so programmers must use extra caution while performing this task.<br/><br/>Unfortunately, Murphy's Law often hits hard during this step.<br/><br/>No matter how careful you are while actually writing your solution in some programming language, you <em>will </em>make mistakes. When this happens, you'll have to find where the bug occurs, figure out why the bug occurs in the first place, and then devise a way to fix the bug. Again, going back to our RSS reader example, I made a mistake that caused the RSS reader to make a pop-up notification for every news item, even for old items. 
My solution was fine, but I made an error implementing it, so I had to fix that error.<br/><br/>Now that we've answered the question of what programmers are trying to accomplish when they write a program, and <em>how </em>they convey their solution to the computer in a way it can understand, we can answer our original question: What is programming?<br/><br/>It is the act of finding a solution for a problem, and then implementing that solution in a way a computer can understand - nothing more, nothing less. Whenever you sit down to write code, you're trying to solve some problem. This is practically identical to what mechanical and electrical engineers do for a living, which is quite interesting: Are programmers "software engineers"? Should they treat every line of code they write with the same professionalism and excellence with which a mechanical engineer treats the design of a suspension bridge that construction workers are going to build? Those questions have been the subject of much debate in the software industry for more than a decade.<br/><br/>If you're not a programmer, now you see there's no "black magic" in programming. All that a programmer does while he programs is solve problems. Who knows - now that you know programming isn't as complicated as most non-programmers make it out to be (it's still not a breeze by any means :)), you just might want to try programming out for yourself. :)<br/><br/>If you're a programmer, the next time you sit down to write some code, think about the answers to the above questions, and remember what you're doing when you're writing code - you're solving a problem. You just might be surprised how much more carefully you write code. :)Kermit Alexander IIhttp://www.blogger.com/profile/02557138366170479827noreply@blogger.com0