Kermit Alexander II's blog: FOSS

Showing posts with label FOSS. Show all posts

Thursday, May 19, 2016

FuzzBuzz and the Art of Programming

Programming: It's as much an art as it is a science. The general distinction between art and science being the realm of subjectivity - in other words, while in the realm of science, "correctness" is still debated, nobody is under the illusion that there is more than one "correct" solution. Art, on the other hand, has no single "correct solution," though at the same time, the merits of each solution are generally weighed against each other solutions, debated, and so on.

Programming is a mix of both: There are certainly "solutions" - algorithms, programs, etc. - that can be objectively proven without a shadow of a doubt to be ineffective or outright incorrect. There is also the possibility of an algorithm that solves the solution, but is objectively inferior to another, such as an algorithm that runs in O(n) time versus one that runs in O(1) time. On the other hand, there is a fair amount of subjective nuance involved in arriving to a correct solution. Design methodologies, languages, and APIs all come and go in a technological blink of an eye, and each will have its pros and cons, its proponents and its skeptics.

Nobody - sane, at least - involved in all this is under the impression that there is One True Way to do things. You could lock several seasoned programmers up in separate rooms, all of whom subscribe to the exact same design methodologies and use the exact same programming languages, ask them to design a program, and they would all likely emerge with completely separate, and likely equally or near-equally satisfactory, solutions. So, without further ado, let's talk about a simple programming problem, and some possible solutions to solve it. At the end, we'll discuss for a quick minute about which of these solutions someone might actually end up using.

The problem in question is one I've named FuzzBuzz, which was definitely not a reference to an overused programming interview question. To solve FuzzBuzz, one must write a program that considers the numbers from 1 to 100, inclusive. For each number, if the number in question is even, the program must print "Fuzz." If the number is odd, the program must print "Buzz." If the number is a multiple of 3, the program should ignore its even or odd nature and print "FuzzBuzz" instead.

This is an intentionally simplified problem, both to make it easy to make multiple solutions, and to make the solutions themselves clearer. By the time you finish reading this sentence, I bet you'll probably have thought of at least one way to solve this problem, but let me go over a couple first.

First up is probably the one that came to everyone else's mind, just a simple restating of the problem's definition in code.

for x in xrange(1, 101): # range/xrange stops at the integer before the 2nd argument
    if x % 3 == 0:
        print "FuzzBuzz"
    elif x % 2 == 0:
        print "Fuzz"
    else:
        print "Buzz"

Not really a whole lot to talk about with this one; like I said earlier, it's probably the first one you thought up yourself - it was the first one I came up with, as well. Honestly, if this were a real-world problem, most people would (and probably should) stop here. But innovation never sprung from a contented mind, right?

So, for our next solution, let's limit ourselves in some way. The modulo operator is pretty central to the prior solution, so let's take that out next.

def is_multiple_of(number, mul):
    """Return True if number is multiple of mul, false otherwise.
    Okay, so here's how this works: Normally, Python - and most
    other programming languages - chops off the decimal part of the quotient
    if both terms are integers, i.e, integer division is performed. However,
    if one of the terms is a float, then the decimal part will be kept.
    Any number plus a decimal part is going to be greater than simply the number itself,
    so we compare the two. If number is in fact a multple of mul, then there will be
    no decimal part, and both sides of the expression will be equal."""
    if float(number) / mul == number / mul:
        return True
    else:
        return False


def multiple_of_three(number):
    return is_multiple_of(number, 3)


def multiple_of_two(number):
    return is_multiple_of(number, 2)
    
    
for x in xrange(1, 101):
    if multiple_of_three(x):
        print "FuzzBuzz"
    elif multiple_of_two(x):
        print "Fuzz"
    else:
        print "Buzz"

Still pretty straightforward. Not having access to the modulo operator means we had to basically reinvent one ourselves, though. But we could go further still, couldn't we? How about no floating-point math?

def is_multiple_of(number, mul):
    """Return True if number is multiple of mul, False otherwise.
    So again we're exploiting the properties of integer division.
    For any divisor, there are at least two pairs of two consecutive dividends that will return
    the exact same quotient under integer divison, and the smaller number in each
    is a multiple of the divisor. All we have to do is check that number
    is the smaller of the pair. We do that by seeing if the result changes
    if we subtract 1. If so, number is the smaller of the pair (i.e, number - 1
    is part of some other pair), so return True.
    Otherwise, we were given the larger of the pair, or alternatively,
    some completely different number, so return False.
    """
    if number / mul > (number - 1) / mul:
        return True
    else:
        return False


def multiple_of_three(number):
    return is_multiple_of(number, 3)


def multiple_of_two(number):
    return is_multiple_of(number, 2)
    
    
for x in xrange(1, 101):
    if multiple_of_three(x):
        print "FuzzBuzz"
    elif multiple_of_two(x):
        print "Fuzz"
    else:
        print "Buzz"

Now things are getting ever-so-slightly hairy. But let's get dangerous now. How about no Boolean logic whatsoever?

from collections import deque
sequence = deque(["Buzz", "Fuzz", "FuzzBuzz", "Fuzz", "Buzz", "FuzzBuzz"])


def next_in_sequence():
    """Return next string in FuzzBuzz sequence.
    The FuzzBuzz problem follows a pattern:
        * Odd
        * Even
        * Multiple of 3
        * Even
        * Odd
        * Multiple of 3
    All we have to do is keep track of where we are in the sequence.
    To accomplish this, we use a deque and a little bit of pushing and popping
    to keep from having to either pre-generate the entire sequence beforehand,
    or from having to use an iterator.
    """
    next = sequence.popleft()
    sequence.append(next)
    return next


for x in xrange(1, 101):
    print next_in_sequence()

Alright, now we're talking. Admittedly, you probably saw this one coming if you were running the programs alongside reading this article - once you recognize the pattern, this solution is pretty obvious. But let's make one final push: No modulo operator, no math, no Boolean logic, and no loops.

from collections import deque
import sys
# Python allows the user to manually set the recursion limit.
# So, we exploit this by setting it to 101, which simulates the number of
# loop iterations we'd execute if we were using a loop.
# Eventually, we'll exceed this number, and Python will throw
# a RuntimeError, essentially accomplishing the same thing as
# a regular for/while loop.
sys.setrecursionlimit(101)


sequence = deque(["Buzz", "Fuzz", "FuzzBuzz", "Fuzz", "Buzz", "FuzzBuzz"])


def next_in_sequence():
    """Print next string in FuzzBuzz sequence.
    This is identical to the deque solution in sequence.py, save for
    printing out the next string instead of returning it, and the recursion bit,
    so go see sequence.py for an explanation of what's going on here.
    """
    next = sequence.popleft()
    sequence.append(next)
    print next
    next_in_sequence()


try:
    next_in_sequence()
except RuntimeError:
    # swallow the eventual exception due to recursion taking one step too many
    pass

I feel now is a good time to stop and discuss the various solutions, now that we have an ample amount. Like I mentioned earlier on, if this were a real-world problem, I imagine most people using the first solution. It's clear, concise, and gets its point across without needing much in the way of hand-holding. It's easy to extend if we need to tackle additional multiples, as well.

Most of the rest of the solutions are pretty silly if we examine them without the restrictions we came up with. The modulo operator is a pretty useful tool for this problem, so it makes no sense to try to work around it, or reimplement it with our own code. I could see an argument being made for the non-recursive sequence solution - some might consider it more "elegant" than the original solution, though obviously that's up for debate.

Some might also think the sequence version is faster, considering the O(1) performance time of deque.popleft() and deque.append() (collections.deque is implemented as a doubly-linked list under the hood), but at least on my machine, the initial solution is faster when testing 10,000,000 numbers, completing in about 3.1 seconds versus 3.3 seconds for the sequence version. After some fiddling around, I was able to produce a variant on the sequence solution that ran in about 2.9 seconds on the same set:

from collections import deque
sequence = deque(["Buzz", "Fuzz", "FuzzBuzz", "Fuzz", "Buzz", "FuzzBuzz"])


# Binding oft-used functions to local scope results in a speed increase
# due to decreased variable lookup time
popleft = sequence.popleft
append = sequence.append


for x in xrange(1, 101):
    # Moving the function's code to the loop body removes the function call,
    # which actually increases performance slightly
    next = popleft()
    append(next)
    print next

By this point though, we've lost a fair amount of readability, which was one of the main reasons to use the sequence solution. On the upside, we have a reasonable increase in speed, which might be useful in a performance-critical application. On a side note, the performance of the other solutions is pretty abysmal in comparison; the modulo-less solution ran in about 8.5 seconds, whereas the float-less solution ran in about 6 seconds. As for the recursion-based solution, it segfaulted on my machine before it could actually complete the 10,000,000-iteration loop.

So, to summarize: Programming is about 50% art and 50% science; there's usually more than one demonstrably correct solution to the problem, and on top of that, there's often some fair room for debate about which of the solutions one should proceed with. We presented a simple programming problem, came up with some possible solutions for it, and discussed their pros and cons.

As an added bonus, I've set up a Github repository at http://github.com/DangerOnTheRanger/fuzzbuzz that contains all the solutions in this article, as well as reference output to test your own solutions with. If you come up with a solution you like, regardless of its performance or what language it's written in, send me a pull request, and I'll add it to the repo.

Friday, July 13, 2012

Minetest Triforce

In case you're not familiar with it, Minetest is basically a free, open-source clone of Minecraft, plus some new stuff like Lua mod support. Anyway, I made a very large Triforce on gameboom's mt1.gameboom.net (port 30000) Minetest server some time ago, and this is the result:

Daytime

Nighttime

Sunday, June 10, 2012

Switching blog licenses.

I've given it some thought, and I've made a decision: I'm switching the license for the articles I write on my blog from the CC-BY-NC-SA to CC-BY-SA. I'm switching for two reasons:

It lets my articles be re-used/re-posted in far more places I do want it to be used (Wikipedia, other places using the CC-BY-SA, etc...)

I don't think anyone's going to plagiarize my work anytime soon - besides, at the very least, they have to give me credit (the BY clause).

Besides, I think the more liberal licensing terms make the blog more attractive to FOSS enthusiasts, which is a big plus for me since I need all the publicity I can get. Hopefully I'll see the rewards of switching to the CC-BY-SA license in a year or so. :)

Wednesday, March 28, 2012

Do we need a new FOSS license?

Come on, admit it: You've been involved in at least one (probably 3 or more) licensing "discussions" with your fellow FOSS peers. :P Like you probably already know (and like I covered in one of my previous posts), these "discussions" revolve mainly around whether or not the GPL is, in fact, a truly free license. I'm not going to discuss that here (for brevity), but what I will discuss is whether an "compromise" license that sits right between the GPL and a permissive license in terms of restrictions is possible.

First, we need to know what our new license can and cannot do. According to the Open Source Definition (which you can read for yourself here), here are some things our new license couldn't do:

Prevent commercial re-distribution. Commercial re-distribution is very different from proprietary re-distribution, by the way

Prevent (possibly derivative) products issued under the same license as ours

Restrict other software distributed on the same medium as ours

Okay, so now we now what we can't do, let's figure out what we can do.

The biggest problem most people have with the GPL is that it prevents other, non-GPL, FOSS software from using GPL'ed products (because the licenses are incompatible). We need to figure out a way to solve this issue in order to come up with a true "compromise" license.

The counter-argument against permissive licenses is that they allow proprietary derivatives of your project. Whether or not this is an issue is completely up to you, but for the purposes of this post, I'll assume the answer is "yes."

The Open Source Definition does allow open-source licenses to "pick and choose" what derivative projects can be licensed under; the most notable open-source license that does this is the GPL - you can derive from a GPL product and use the AGPL for your code if you so choose

What if we could allow anyone to derive from a product licensed under our hypothetical license as long as they used our new license, or an OSI-approved license? While this has the obvious disadvantage of preventing a BSD-licensed project which uses our hypothetical project from being used in a proprietary product, it's a pretty big step towards an actual "compromise" license.

That's as good a compromise as I can think of for now. Hopefully I can think of a better one in the future. :)

Wednesday, December 28, 2011

On permissive and copyleft licenses

As you probably know, FOSS licenses are a touchy subject - I've seen some people who won't use or contribute to a project if the project uses a license they do not like. In my humble opinion, licenses aren't a major issue (except for projects you lead) and I'll attempt to explain why below.

Let's start with the (A)GPL. This is probably the most loved/hated license of all time, for 2 major reasons:

The GPL is a strong copyleft license: Any derivative product must also be covered under the GPL.

Even if you leave the GPL code alone and build your own code on top of it - even with dynamic linking - your code must also be covered under the GPL.

For those of us who primarily develop GPL'ed code, this isn't really an issue (in such a case, you'd license your derivative product under the GPL anyway), but using the GPL prevents your code from being used in a non-GPL FOSS project, which may or may not be what you intended.

This is the biggest argument against the GPL in the license debate by far, and for obvious reasons. However, unlike permissive licenses, GPL'ed code cannot be used in a proprietary product, no matter how many times it is derived from, which is the point brought up by those who do like/use the GPL.

So if you don't want your code to be used in a proprietary product, or you only want your code to be used in other Free software projects, the GPL is the way to go. Otherwise, you might want to read on.

Now, let's take a look at permissive licenses. There are quite a few of these, most notably:

3-clause BSD

2-clause BSD

MIT/X

Apache

All of them differ in subtle ways (the Apache license includes a patent indemnification clause, for example), but they all have the same basic premise: Use, modify and redistribute this code however you like, but:

Don't change the license on the original code

Distribute a copy of this code's license along with your product

State unambiguously (none of the licenses specify where) that your product uses my code

Quite a bit more permissive (hence the name) than the GPL, don't you agree? These sort of licenses are used primarily by those who aren't concerned about possibly proprietary derivative products. As far as I've seen, the permissive licenses receive the most flak for not ensuring that derivative products are FOSS as well, something the GPL ensures. In my opinion, whether or not this is a problem is something that has to be decided by the author of the code in question: Am I fine with people making a proprietary (and most likely commercial) derivative of my project? The way I normally go about deciding between using the (A)GPL or a permissive license in a new project - yes, I use both :) - boils down to four questions:

Is the code I am going to license constitute a product that others could make a proprietary (and most likely commercial), standalone derivative of by copying verbatim?

How much time have I spent on the code I'm licensing?

Am I trying to use this code to advance the cause of FOSS, i.e, I'm making a FOSS application in a problem domain where no other FOSS application currently exists?

Are there already currently better FOSS applications in the problem domain for my new project?

Questions 1 and 3 are probably the most significant: If the answer to either question is "yes", unless the answer to Question 4 is "no", the code for the new project will almost certainly be under the GPL - if my project is the only FOSS application in its problem domain, I want to make sure that everything that touches it stays FOSS, for obvious reasons. Question 2 and 4 are somewhat less important, but still bear some weight: If the answer to Question 4 is "yes", I'm probably going to use a permissive license - there wouldn't be much point to someone ripping off my project with there being better FOSS applications in its problem domain anyway.

Besides, as long as I'm the sole copyright holder (or have all contributors sign a CLA assigning copyright to me), I can change the license in the future if my FOSS project becomes the best in its problem domain or other FOSS projects in my project's problem domain become defunct.

You've probably noticed by this point at least three of the four questions have to deal with whether someone will come along and make a proprietary, commercial spinoff of my project. I notice quite a few people in the FOSS community stating that since I wrote my code for "free", i.e, without any monetary fees being incurred from development of the code in question - which often isn't true anyway, otherwise so many projects wouldn't have a "Donate" button displayed conspicuously on their home page - I shouldn't care whether someone else makes money off of my code.

And that's where Question 2 comes into play: At the very least, whenever you write code, you spend time. Time isn't free - as a matter of fact, it can be considered more precious than money; time spent doing anything is part of your life you'll never get back, ever. And if you've spent a significant amount of time working on your FOSS project (or you're about to start a new one, and are pretty sure you're going to spend a lot of time on it), you probably want to think about whether you're fine with someone spending perhaps an hour re-branding your product and making a profit with it.

Enough theory. How do you license a FOSS project in practice? I have one major FOSS project that I lead at the moment - OpenBlox - so I'll go through how I came up with a licensing scheme for that project.

How does OpenBlox do with those four questions we talked about earlier?

Is the code I am going to license constitute a product that others could make a proprietary (and most likely commercial), standalone derivative of by copying verbatim? In its current state, no, but very soon (most likely in 6-8 months), yes.

How much time have I spent on the code I'm licensing? Quite a bit - I estimate I've spent around 5,500 hours on OpenBlox so far (and it's only existed for around 2 and a half years)

Am I trying to use this code to advance the cause of FOSS, i.e, I'm making a FOSS application in a problem domain where no other FOSS application currently exists? Yes, definitely. There is not a single other FOSS project in OpenBlox's problem domain, although there are at least two proprietary products sharing OpenBlox's problem domain.

Are there already currently better FOSS applications in the problem domain for my new project? Because of the answer to Question 3, the answer to this question is of course "no."

Due to how I answered each of the four questions, the GPL looks like a pretty good choice for OpenBlox, and due to its nature as a relatively extensible and unique game engine, a permissive license works great for bundled extensions/plugins. And that's exactly what I use for it now: GPL for the main engine, with plugins and support scripts (save for core plugins that handle things like rendering) under the Apache license.

Quick note before I sign off: The LGPL is a great compromise between a strong copyleft like the GPL and a permissive license like the BSD license: People can use your code without having to change the license of their code (just like a permissive license), but all their modifications to your code must fall under the LGPL, which is pretty neat.

Pages