Don’t Do This

December 22nd, 2009
Tabs and spaces... mixed! The horror!

It’s like this in a lot of places throughout the project. :(

PAX: Day -1

September 4th, 2009

(In Which I Go On A Harry Potter Bar Crawl)

Josh, Bekka, and I caught wind of this pre-PAX bar crawl that we decided to check out.

Except it’s not just a bar crawl… everyone picks one of the four houses from Hogwarts and travels around in a group with their house. Things are scheduled so that each house meets up with each other house exactly once, at which point a battle occurs.

I’m not a drinker, so I was never very clear on the rules. Something about ordering shots for members of the opposing team, and I think maybe there was a point system or something. And somehow a winner is determined? I’m not very clear on the details.

Oh, and people dress as wizards.

Ravenclaw's Head Boy

So here’s the scenario: four groups of 50+ people are wandering the streets of Seattle dressed in Hogwarts house colors meeting up at different bars to battle each other via drinking games.

If you think this sounds completely awesome, you are right.

The bars are apparently pre-informed of the event, and are good sports about the whole thing. One place was screening Harry Potter movies on a big projector on the wall, and another had Potter-themed drinks.

Feel free to order a Butter Beer, Pumpkin Juice, or Polyjuice Potion at this place

We decided to run with the Ravenclaws for a while, but eventually cut out to get back to the hotel. I seriously doubt I’ll still be awake by the time this whole thing comes to a conclusion.

The Ravenclaw banner. Go Ravenclaw!

Seattle: Days 0 and 1

September 3rd, 2009

Josh, Bekka, and I made it to Seattle yesterday around noon. PAX starts on Friday, so we’ve spent the last couple days taking in Seattle, which is an absolutely fantastic city.

We’re staying in a hotel in the middle of the downtown area (on 5th street a few blocks from Pike), so there’s a ton of stuff in walking distance.

Day 0

Right after arriving, we headed over to Pike Place Market for lunch. If you’ve never been, Pike is basically a large open-air seafood and farmer’s market with tons of little shops and restaurants.

Fresh seafood for sale at Pike Place Market

We had lunch at a little chowder place — very delicious. It’s easy to spend lots of time just walking around Pike, looking at the fresh seafood and produce, and talking with the artisans selling their products in the small shops around the market; this is exactly what we did.

For dinner, we headed to a local brewery called Rock Bottom. The food was decent, but the service was horribly slow. Bekka and Josh tell me the beer was pretty good, though.

After dinner, we hit up a couple bars around town, ending up at a small venue called The Green Room. Here we took in a set by Johanna Kunin. The show was great; check Kunin out if you’re into indie rock with fantastic vocals.

Day 1

Josh started the day (before I was awake) by going for a walk, and brought back some delicious cinnamon rolls from a nearby coffee shop. After we were all awake, we headed over to the Seattle Art Museum.

It’s a nice place, but it focuses pretty heavily on modern and contemporary art, with comparatively small spaces reserved for ancient through renaissance work (which interests me more). Also, no photography allowed in the galleries. :(

The Science Fiction Museum and Hall of Fame

After the art museum, it was back to Pike for lunch, then over to the Science Fiction Museum, one of the big places on my to-visit list.

The SciFi Museum was pretty awesome. (Although this museum also disallowed photography. What’s up with the photo-hating, Seattle?). The museum isn’t very large, but we didn’t visit the “Experience Music” portion.

The exhibits are what you’d probably expect from the name of the place: movie memorabilia, interesting books, discussions on the role of science fiction in society. Basically, a cool place to visit if you’re a nerd. Which I am.

Zeek's Pizza plus Space Needle

We ate dinner at this place called Zeek’s Pizza, which is delicious; fresh ingredients, well-made food. If you’re in Seattle and like pizza, definitely check this place out.

We’re at the hotel for now (everyone was feeling tired — lots of walking around, and Seattle is pretty much built on a giant hill), but I’m sure we’ll find something fun and exciting to do tonight.

And, of course, PAX begins tomorrow!

Why Bespin?

May 26th, 2009

If you haven’t heard of Mozilla Labs’ Bespin, the website is here.

Bespin is a web-based code editor, and seems to be generating some attention.

The question I have is: what does a web-based code editor give us? Why put a code editor in the cloud?

Most of the articles I’ve seen use a phrase like this: “…combine the speed and power of desktop-based development with the collaborative benefits of cloud computing.” A little short on details.

The articles then proceed to list all the ways Bespin will be just like a desktop editor.

Of course, a perfect emulation of a desktop editor doesn’t give us anything we don’t already have, since we already have desktop editors. So, again, what do we actually gain by having that editor in the cloud?

The Mozilla Labs blog post introducing Bespin is here, and contains a bulleted list of things Bespin will do. By my count, all but two of the bullet points are things that desktop editors and IDEs already do and Bespin needs to replicate.

Here’s what I think about the remaining two bullets:

  • Real Time Collaboration.

    I suppose there are developers out there who do real time remote pair programming or similar (although I am not one of them). I would guess there is already software for this kind of thing, but I see that Bespin could potentially make it easier.

    So, I’m willing to chalk up this advantage: easier real time code sharing.

    How often do you want to share a code session in real time, though? For “normal” online collaboration, I suspect a DVCS is what you really want.

    But maybe I just don’t understand the real time collaboration use case. What specifically would one do with this?
  • Accessible from Anywhere.

    This one is compelling — set up your coding environment once, and be able to use it anywhere.

    But think of the things you do on your desktop environment; scripts you write, OS settings you tweak, your VCS setup, etc. For Bespin to equal a desktop coding environment, these things will need to be reinvented on the server.

    Right now you can remotely access a machine via ssh, vnc, or rdp. For me, these things fulfill my “accessible from anywhere” wants. If you’re unable to use these protocols for some reason, perhaps Bespin will be useful.

I can also see how Bespin could be convenient when embedded in other web applications. A coworker suggested embedding it into WordPress, for example. You could edit your PHP or whatever online, without the (small?) inconvenience of a manual FTP push to publish.

So, it seems to me that there are a handful of cases where a web-based code editor can be convenient. I doubt Bespin will ever be as flexible or as powerful as a real desktop coding environment, though.

I suppose time will tell.

Installing Cabal on Ubuntu Hardy

March 16th, 2009

I just went through a bit of a frustrating time getting cabal (the Haskell package management system) going on a fresh Ubuntu install. (Figuring out I needed to install the zlib package mentioned in step three below was the worst of it).

I thought I’d document the steps I went through in case I ever need to do it again:

  1. Install GHC.
    sudo apt-get install ghc
  2. Download and extract the cabal installation tool: cabal-install-0.6.2.tar.gz. (This page will probably keep an up-to-date link for future versions).
  3. Make sure this zlib package is installed:

    sudo apt-get install zlib1g-dev
  4. Run the bootstrap.sh shell script extracted in step two.
  5. Either add ~/.cabal/bin to your $PATH or copy the cabal executable from that directory to a location already on your $PATH.

The Cabal-Install instruction page is located here.

Does This Seem Strange To Anybody Else?

February 11th, 2009

It’s sort of a basic Java thing, but I think this seems counter-intuitive:

public class SomeClass {
 
    class InnerClass {
        private int privateInt = 0;
    }
 
    public void someMethod() {
        InnerClass innerClass = new InnerClass();  
 
        innerClass.privateInt = 1; // <-- this is legal        
    }
 
}

In the words of Firefly’s Jubal Early… does that seem right to you?

Explaining Things to Computer Scientists

January 20th, 2009

I love this introduction to git: Git for Computer Scientists.

The reason I like it is evident right in the abstract:

Quick introduction to git internals for people who are not scared by words like Directed Acyclic Graph.

Computer Science has such a rich vocabulary for describing data structures and their transformations, yet every other introduction to git I’ve seen uses mostly command line examples and hides technical details behind analogies.

The examples and analogies are helpful, but a little precise discussion of the way a system is put together can be even more helpful. It seems to me that most programmer targeted software introductions and tutorials tend to shy away from directly invoking basic computer science concepts.

It’s a shame, because you can communicate a lot of information in a short amount of time if you use the right vocabulary.

If you’re a programmer writing a document for other programmers, why not leverage this kind of common knowledge to communicate efficiently?

Atwood’s Unfinished Game Revisited

January 2nd, 2009

My previous post described how this puzzle doesn’t give enough information to be definitively solved. We need to know the mother’s strategy for revealing the gender of the first child.

In that earlier post, I approached things from the perspective of repeated simulations, giving some code to demonstrate. Here’s a more visual argument.

My thinking is that in order to make a judgment about the likeliness of different outcomes, we need to know the probability tree for the different events: the possible genders of the children and which child is revealed first. We can deduce almost the entire tree:

bggraph

But we are never told how the mother decides which child to reveal when she has both a boy and a girl.

In my last post, I gave two ways of completing this tree:

  • If the mother always reveals the gender of the female first whenever possible:

    bggraph1

    In this case, the a priori chance of the second child being a boy is higher than 50%.

  • If the mother always randomly decides which child she will reveal the gender of first:

    bggraph2

    In this case, the a priori chance of the second child being a boy is 50%.

So, to answer the question using a given probability tree, we examine the possible states we can be in (states in which the first child revealed is ‘G’). Then we prune off the other states and refigure the odds using the rules of conditional probability.

So, depending on which tree is being used, the answer to the question is different.

  • A girl is always chosen first if possible

    bggraph1_pruned

    Probability of a boy second = 2/3

  • First child is always chosen randomly

    bggraph2_pruned

    Probability of a boy second = 1/2
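The pruning step can also be done with exact fractions instead of pictures. This is only a minimal sketch: the function name and the p_girl_if_mixed parameter (the chance the mother reveals the girl first when she has one of each) are made up for illustration, and it assumes each of BB, BG, GB, and GG has probability 1/4.

```python
from fractions import Fraction

def boy_given_girl_revealed(p_girl_if_mixed):
    """P(second child is a boy | the first child revealed is a girl).

    p_girl_if_mixed is a hypothetical parameter: the chance the mother
    reveals the girl first when she has one boy and one girl.
    """
    quarter = Fraction(1, 4)  # assumed probability of each of BB, BG, GB, GG

    # Leaves of the tree that survive pruning (first child revealed is 'G'):
    mixed = 2 * quarter * p_girl_if_mixed  # BG and GB families, girl revealed first
    two_girls = quarter                    # GG families always reveal a girl

    # Renormalize over the surviving leaves (conditional probability).
    return mixed / (mixed + two_girls)

print(boy_given_girl_revealed(Fraction(1)))     # girl always first: 2/3
print(boy_given_girl_revealed(Fraction(1, 2)))  # random choice: 1/2
```

Passing a Fraction keeps the arithmetic exact, so the two trees give exactly 2/3 and 1/2 rather than floating-point approximations.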

Jeff Atwood and the Unfinished Game

January 2nd, 2009

I don’t usually read Jeff Atwood, but Paul pointed out to me this post because he knows I like math. The post poses this problem:

Let’s say, hypothetically speaking, you met someone who told you they had two children, and one of them is a girl. What are the odds that person has a boy and a girl?

Commenters argued contentiously between 1:2 and 2:3 (and there were some arguments over semantics as well, but I’m focusing on the math right now). Atwood made this follow-up post claiming the correct answer was 2:3.

I’m not so sure this is necessarily correct.

Atwood’s reasoning is that the possible child combinations are: BB, GB, BG, and GG. We know one child is a girl, so that eliminates BB, and 2:3 remaining options have a boy. The sticking point many people have is: why are GB and BG two separate cases?

Atwood’s answer is to draw analogy with The Unfinished Game, a problem he formulates thusly:

Two players, Harry and Ted, place equal bets on who will win the best of 5 coin tosses. In each round, Harry always chooses heads (H), and Ted always chooses tails (T). Suppose they are forced to abandon the game after 3 coin tosses, with Harry ahead 2 to 1. What is the fairest way to divide the pot?

The “intuitive” answer is that the only possible continuations of the game are: H, TH, or TT (since Harry wins on the next H) so 2/3 of the pot should go to Harry since he wins 2:3 of the expected games. The real answer is that the HT and HH continuations affect the expected games equally “strongly” as TH and TT, so Harry actually has a 3:4 chance of winning. Jeff demonstrates this with a code simulation.
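The 3:4 figure can be checked by brute force rather than simulation. The sketch below is not Jeff's code; the function name and its parameters are invented here. It plays out every equally likely pair of remaining tosses, even when the game would already have been decided after the first one, and counts how many end with Harry winning.

```python
from fractions import Fraction
from itertools import product

def harry_win_probability(harry=2, ted=1, target=3, remaining=2):
    """Exact chance Harry wins, enumerating all equally likely toss sequences."""
    outcomes = list(product("HT", repeat=remaining))
    wins = 0
    for tosses in outcomes:
        h, t = harry, ted
        for toss in tosses:
            if toss == "H":
                h += 1
            else:
                t += 1
            # Stop scoring once someone reaches the target; later tosses
            # in this sequence no longer matter.
            if h == target or t == target:
                break
        if h == target:
            wins += 1
    return Fraction(wins, len(outcomes))

print(harry_win_probability())  # 3/4
```

The four sequences HH, HT, TH, and TT are each equally likely, and Harry wins in all of them except TT. Counting H as one case rather than two (HH and HT) is what produces the wrong 2:3 answer.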

The implied argument is that, just like HT and HH are “equally strong” in the Unfinished Game, so are the GBs and BGs in the child gender problem.

Again, I don’t think this is necessarily the case.

In Atwood’s original problem, there is the complication of the mother deciding to tell you one of her children is female. I think the answer hinges on the behavior of the mother.

Here is my answer to Atwood’s original child gender problem. It’s a two-parter, depending on how the mother acts:

  • Scenario 1: When deciding which child’s gender to report, the mother always picks a female if she can.

This yields the 2:3 answer.

If we do a large number of simulations where we randomly generate two genders sequentially, we’ll get BB, GB, BG, or GG equal numbers of times.

We know the mother told us one child is a girl. Whenever we generate a BB, it never makes it to the mother reporting a female gender. So, that gender generation doesn’t count and we throw it out.

We know the mother will always pick G if she can, so 100% of the BGs, GBs, and GGs meet the assumptions of our scenario: they make it to the “mother reported a girl” phase. So, Atwood’s original logic applies and 2:3 times the second child will be a boy.

To demonstrate, here’s a simulation in Python 3.0:

import random
 
genders = ["B","G"]
 
def spawn(n):
    return [random.choice(genders) for i in range(n)]
 
def run(times):
 
    boy = 0
    girl = 0
 
    valid = 0
 
    for i in range(times):
 
        children = spawn(2)
 
        if not "G" in children:
            # This scenario was already thrown out
            # by our initial assumption -- doesn't count
            continue
        else:
            valid += 1
 
        if "B" in children:
            boy += 1
        else:
            girl += 1
 
    boyRatio = float(boy) / valid
    girlRatio = float(girl) / valid
 
    print("Boy: %f" % boyRatio)
    print("Girl: %f" % girlRatio)
>>> run(10000)
Boy: 0.669774
Girl: 0.330226
  • Scenario 2: The mother randomly picks a child whose gender to report.

This yields the 1:2 answer.

Imagine running the simulation as before. Just as in the previous simulation, we have to throw out scenarios where the mother reports B, since we were assuming we had arrived at the point where she reported G. Previously, since she always reported G in the BG and GB cases, we only had to throw out the BBs.

This time, however, when we get a BG, the mother will choose B or G to report at random. 50% of the time, the mother will report B, which means we will have to throw that simulation out, just as we do the BBs. Similarly, 50% of the GBs will not count. The BGs and GBs are both at “half strength” since half of them don’t make it to the “mother reported a girl” stage. So, we’ll get (BG or GB) the same amount of time we get GG, giving us a 50-50 shot at a boy.

Here’s the Python simulation for this case:

import random
 
genders = ["B","G"]
 
def spawn(n):
    return [random.choice(genders) for i in range(n)]
 
def run(times):
 
    boy = 0
    girl = 0
 
    valid = 0
 
    for i in range(times):
 
        children = spawn(2)
 
        random.shuffle(children)   
 
        childX = children[0]
        childY = children[1]
 
        if childX != "G":
            # This scenario was already thrown out
            # by our initial assumption
            continue
        else:
            valid += 1
 
        if childY == "B":
            boy += 1
        else:
            girl += 1
 
    boyRatio = float(boy) / valid
    girlRatio = float(girl) / valid
 
    print("Boy: %f" % boyRatio)
    print("Girl: %f" % girlRatio)
>>> run(10000)
Boy: 0.495971
Girl: 0.504029

The crucial difference in the simulations is:

if not "G" in children:
    # This scenario was already thrown out
    # by our initial assumption -- doesn't count
    continue

versus:

if childX != "G":
    # This scenario was already thrown out
    # by our initial assumption -- doesn't count
    continue

So, as far as I can tell, the real answer is that we don’t know enough about the mother’s behavior to give a definitive answer of how this scenario plays out on average.
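One way to see the dependence on the mother’s behavior is to fold both simulations into a single one with a knob for her strategy. This is a rough sketch under my own assumptions: the simulate function, its p_girl_if_mixed parameter (the chance she reports the girl when she has one of each), and the seed argument are all invented for illustration.

```python
import random

def simulate(times, p_girl_if_mixed, seed=None):
    """Estimate P(family has a boy | mother reported a girl).

    p_girl_if_mixed is a hypothetical knob: 1.0 reproduces Scenario 1
    (always report the girl), 0.5 reproduces Scenario 2 (pick at random).
    """
    rng = random.Random(seed)
    boy = 0
    valid = 0

    for _ in range(times):
        children = [rng.choice("BG") for _ in range(2)]

        if "B" in children and "G" in children:
            # Mixed family: the mother's strategy decides what she reports.
            reported = "G" if rng.random() < p_girl_if_mixed else "B"
        else:
            # BB or GG: there is only one gender she can report.
            reported = children[0]

        if reported != "G":
            # Doesn't match our starting assumption -- throw it out.
            continue

        valid += 1
        if "B" in children:
            boy += 1

    return boy / valid

print("Always girl: %f" % simulate(100000, 1.0))
print("Random pick: %f" % simulate(100000, 0.5))
```

With p_girl_if_mixed set to 1.0 the estimate should land near 2/3, and with 0.5 it should land near 1/2. Values in between give answers in between, which is the point: the question as posed doesn’t pin the number down.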

Of course, this kind of reasoning is pretty tricky to pull off correctly, so I wouldn’t be surprised if a good counter arises. But I’ve thought about this problem for a while, and, for now at least, I’m pretty thoroughly convinced that this solution is correct.

My Experiences With git-svn

December 30th, 2008

For the past two days, I’ve been giving git-svn a go at work. I wanted a way to easily create task branches without touching the central subversion repository, and someone I work with tipped me off to git’s standard svn interaction (thanks Eric). Plus I was just curious to learn more about git.

Here are my experiences and impressions so far.

My Setup

At work, I’m using IntelliJ Idea 8.0.1 on a Windows XP machine. The code I’m working on right now is a module contained within a larger framework, so I’m actually dealing with two subversion repositories; one for the framework, and another for the module. The larger project points to the module via an svn:externals definition.

Installing git With Cygwin

The first thing I did was get an install of git going. Since I’m using XP, I did this via Cygwin, which is the official route to git on Windows.

To actually be able to use git-svn, I had to install two Cygwin packages: git and subversion-perl.

I ran into some issues with the subversion perl bindings, but reinstalling subversion-perl fixed the problem.

I also needed to download Error.pm from CPAN and place it in <cygwin_root>/lib/perl5.

The final (minor) issue I hit with the installation was git-svn not being in a $PATH directory by default. (If you don’t want to mess with this, you can just say git svn (no hyphen) instead, and it’s all the same — as far as I can tell.)

All in all, the installation wasn’t a walk in the park, but it wasn’t too bad either.

Git-ting The Project

As I mentioned, the code I’m working on is an inner module for a larger piece of code. I toyed with the idea of trying to git the whole thing, but making the svn:externals definition work out seemed like it would be complicated at best. So, for now, I removed the externals definition on my local copy and checked out the inner module directly. (I’d be interested to hear about other ways to handle this.)

To do the checkout, I first prepared a directory:

git-svn init -s http://url.of.the/subversion/repository

(The -s means “stdlayout” — the svn repository has a normal trunk/branches/tags layout. The docs have more info on the different options).

Then to actually grab the repository:

git-svn fetch

Git stores the entire project history along with each repository (see the git tutorial if you want to know more about git), so the fetch can take a long time. This step took about 6 hours to get the ~12k revisions in my project.

If you don’t care about having the whole history, you can tell git to use any revision number as a starting point:

git-svn fetch -rREVISION

Using It

At this point, I could start using normal git commands to manipulate my local repository. To try it out, I created a local branch to do some work in:

rdickerson@rdickerson $> git branch -a     
* master
 
rdickerson@rdickerson $> git checkout -b aa_cache_fix
Switched to a new branch "aa_cache_fix"
 
rdickerson@rdickerson $> git branch -a
* aa_cache_fix
  master

(Again, see the git tutorial if you’re confused).

I made the changes I needed and committed back into the task branch:

rdickerson@rdickerson $> git commit -a

At this point I needed to merge back into the master, but I wanted to consolidate all the incremental changes I had made in the task branch into one big commit I could pass back to subversion. To bundle all of my branch commits together, I used the --squash merge option and made the large single commit to the master branch:

rdickerson@rdickerson $> git checkout master
rdickerson@rdickerson $> git merge aa_cache_fix --squash
rdickerson@rdickerson $> git commit -a -m "Fixed AA Cache"

Then I made sure my project was up-to-date and pushed the changes back out to subversion (the runs of dashes are redactions):

rdickerson@rdickerson $> git-svn rebase
Current branch master is up to date.
 
rdickerson@rdickerson $> git-svn dcommit
Committing to --------------- ...
Authentication realm: ------------------
Password for '---------': 
        M       ------- File #1
        M       ------- File #2
Committed r12457
        M       ------- File #1
        M       ------- File #2
r12457 = 09121633e5ca898cc96cd4f2264c987c7f29fc0f (git-svn)
No changes between current HEAD and refs/remotes/git-svn
Resetting to the latest refs/remotes/git-svn

And my commit shows up in both the git and subversion logs. Hooray!

Idea Integration

There are a couple Idea git plugins available, but none of them seem 100% perfect. I went with git4idea, which seems to be the standard.

Idea lets you specify version control systems on a per-directory level, so using git for my inner module and svn for everything else was no problem:

vcs2

The plugin has a nice, simple interface for switching branches:

git_checkout_idea1

Similar things exist for merging, stashing, pushing, pulling, etc.

I did run into some issues after I created a new file, added it to git, then renamed it — everything looked great through the console, but Idea’s file statuses were messed up, even after using the “Synchronize” and “Refresh” commands. I had to reopen the project to get things right again.

Another minor thing is that the commit dialog has no indication of what branch you’re in, so it would be pretty easy to accidentally commit to the wrong branch.

Impressions

So far, using git locally has been great. I can:

  • Create, switch, and merge branches quickly and easily
  • Perform basic git operations from Idea
  • Make commits and view project history even if I’m working offline or the subversion server is down. (Which admittedly happens rarely).

Keep in mind it’s only been two days, so who knows how I’ll feel after using it for a while. So far, though, I’ve learned a lot about git and set myself up with a swank local VCS in the process, so I’m going to say it was worth the effort.