Archive for the ‘math’ Category

Atwood’s Unfinished Game Revisited

Friday, January 2nd, 2009

My previous post described how this puzzle doesn’t give enough information to be definitively solved. We need to know the mother’s strategy for revealing the gender of the first child.

In that earlier post, I approached things from the perspective of repeated simulations, giving some code to demonstrate. Here’s a more visual argument.

My thinking is that in order to make a judgment about the likeliness of different outcomes, we need to know the probability tree for the different events: the possible genders of the children and which child is revealed first. We can deduce almost the entire tree:

bggraph

But we are never told how the mother decides which child to reveal when she has both a boy and a girl.

In my last post, I gave two ways of completing this tree:

  • If the mother always reveals the gender of the female first whenever possible:

    bggraph1

  • In this case, the a priori chance of the second child being a boy is higher.

  • If the mother always randomly decides which child she will reveal the gender of first:

    bggraph2

    In this case, the a priori chance of the second child being a boy is 50%.

So, to answer the question using a given a probability tree, we examine the possible states we can be in (states in which the first child revealed is ‘G’). Then we prune off the other states and refigure the odds using the rules of conditional probability.

So, depending on which tree is being used, the answer to the question is different.

  • A girl is always chosen first if possible

    bggraph1_pruned

    Probability of a boy second = 2/3

  • First child is always chosen randomly

    bggraph2_pruned

    Probability of a boy second = 1/2

Jeff Atwood and the Unfinished Game

Friday, January 2nd, 2009

I don’t usually read Jeff Atwood, but Paul pointed out to me this post because he knows I like math. The post poses this problem:

Let’s say, hypothetically speaking, you met someone who told you they had two children, and one of them is a girl. What are the odds that person has a boy and a girl?

Commenters argued contentiously between 1:2 and 2:3 (and there were some arguments over semantics as well, but I’m focusing on the math right now). Atwood made this follow-up post claiming the correct answer was 2:3.

I’m not so sure this is necessarily correct.

Atwood’s reasoning is that the possible child combinations are: BB, GB, BG, and GG. We know one child is a girl, so that eliminates BB, and 2:3 remaining options have a boy. The sticking point many people have is: why are GB and BG two separate cases?

Atwood’s answer is to draw analogy with The Unfinished Game, a problem he formulates thusly:

Two players, Harry and Ted, place equal bets on who will win the best of 5 coin tosses. In each round, Harry always chooses heads (H), and Ted always chooses tails (T). Suppose they are forced to abandon the game after 3 coin tosses, with Harry ahead 2 to 1. What is the fairest way to divide the pot?

The “intuitive” answer is that the only possible continuations of the game are: H, TH, or TT (since Harry wins on the next H) so 2/3 of the pot should go to Harry since he wins 2:3 of the expected games. The real answer is that the HT and HH continuations affect the expected games equally “strongly” as TH and TT, so Harry actually has a 3:4 chance of winning. Jeff demonstrates this with a code simulation.

The implied argument is that, just like HT and HH are “equally strong” in the Unfinished Game, so are the GBs and BGs in the child gender problem.

Again, I don’t think this is necessarily the case.

In Atwood’s original problem, there is the complication of the mother deciding to tell you one of her children is female. I think the answer hinges on the behavior of the mother.

Here is my answer to Atwood’s original child gender problem. It’s a two-parter, depending on how the mother acts:

  • Scenario 1: When deciding which child’s gender to report, the mother always picks a female if she can.

This yields the 2:3 answer.

If we do a large number of simulations where we randomly generate two genders sequentially, we’ll get BB, GB, BG, or GG equal numbers of times.

We know the mother told us one child is a girl. So 100% of the time we generate a BB, it never makes it to the mother reporting a female gender. So, that gender generation doesn’t count and we throw it out.

We know the mother will always pick G if she can, so 100% of the BGs, BGs, and GGs meet the assumptions of our scenario — they make it to the “mother reported a girl” phase. So, Atwood’s original logic applies and 2:3 times the second child will be a boy.

To demonstrate, here’s a simulation in Python 3.0:

import random
 
genders = ["B","G"]
 
def spawn(n):
    return [random.choice(genders) for i in range(n)]
 
def run(times):
 
    boy = 0
    girl = 0
 
    valid = 0
 
    for i in range(times):
 
        children = spawn(2)
 
        if not "G" in children:
            # This scenario is was already thrown out
            # by our initial assumption -- doesn't count
            continue
        else:
            valid += 1
 
        if "B" in children:
            boy += 1
        else:
            girl += 1
 
    boyRatio = float(boy) / valid
    girlRatio = float(girl) / valid
 
    print("Boy: %f" % boyRatio)
    print("Girl: %f" % girlRatio)
>>> run(10000)
Boy: 0.669774
Girl: 0.330226
  • Scenario 2: The mother randomly picks a child whose gender to report.

This yields the 1:2 answer.

Imagine running the simulation as before. Just as in the previous simulation, we have to throw out scenarios where the mother reports B, since we were assuming we had arrived at the point where she reported G. Previously, since she always reported G in the BG and GB cases, we only had to throw out the BBs.

This time, however, when we get a BG, the mother will choose B or G to report at random. 50% of the time, the mother will report B, which means we will have to throw that simulation out — just as we do the BBs. Similarly, 50% of the BGs will not count. The BGs and GBs are both at “half strength” since half of them don’t make it to the “mother reported a girl” stage. So, we’ll get (BG or GB) the same amount of time we get GG, giving us a 50-50 shot at a boy.

Here’s the Python simulation for this case:

import random
 
genders = ["B","G"]
 
def spawn(n):
    return [random.choice(genders) for i in range(n)]
 
def run(times):
 
    boy = 0
    girl = 0
 
    valid = 0
 
    for i in range(times):
 
        children = spawn(2)
 
        random.shuffle(children)   
 
        childX = children[0]
        childY = children[1]
 
        if childX != "G":
            # This scenario is was already thrown out
            # by our initial assumption
            continue
        else:
            valid += 1
 
        if childY != "B":
            boy += 1
        else:
            girl += 1
 
    boyRatio = float(boy) / valid
    girlRatio = float(girl) / valid
 
    print("Boy: %f" % boyRatio)
    print("Girl: %f" % girlRatio)
>>> run(10000)
Boy: 0.495971
Girl: 0.504029

The crucial difference in the simulations is:

if not "G" in children:
            # This scenario is was already thrown out
            # by our initial assumption -- doesn't count
            continue

versus:

if childX != "G":
            # This scenario is was already thrown out
            # by our initial assumption -- doesn't count
            continue

So, as far as I can tell, the real answer is that we don’t know enough about the mother’s behavior to give a definitive answer of how this scenario plays out on average.

Of course, this kind of reasoning is pretty tricky to pull off correctly, so I wouldn’t be surprised if a good counter arises. But I’ve thought about this problem for a while, and, for now at least, I’m pretty thoroughly convinced that this solution is correct.

RD.B Repost: The Limits of Proving Programs Correct

Tuesday, December 30th, 2008

This post originally appeared on http://robdickerson.blogspot.com. I’m no longer using that site, but I liked this post, so I’m reposting it here.

My university required senior computer science students to complete a capstone project before they graduated. The idea was to have every person perform a research project or internship that brought together all that he or she had learned over the last four years.

My capstone project was of the independent research variety. I’m very interested in formal logic and proof systems, and wanted to examine parallels between generating a formal proof and writing a computer program.

What I ended up doing was creating a system that started out with a formal program specification written in Z-notation, and simultaneously generated a working Spark-Ada program and a proof of that program’s conformity to the Z spec.

(Since each line of code transforms the program from one state to another, you can figure pre- and post- conditions of individual lines of code. So, the problem becomes a search for a list of statements whose pre- and post- conditions chain together to take the Z-specified preconditions to the Z-specified post conditions. The sequence of statements make up the working code, and the chain of line-by-line pre- and post- conditions are the correctness proof.)

The programs it could generate were very simple (they could only use a ridiculously small subset of Ada). It could write methods that did things like swapping the values of two variables. But it worked for the most part. And it was a lot of fun to work on, and I learned a lot.

The biggest thing I learned was the limited utility of proving code correct.

Something a fellow named Benjamin Russell said on the Haskell mailing list today got me thinking about software proofs (and inspired this post). He said that, “[b]asically, if you can prove that the program is correct, then you don’t need to test it.”

I disagree.

What I learned doing my senior capstone is that when you prove a program correct, what you’re actually proving is that the program conforms to some specification. Maybe it’s a specification formally defined in a language like Z, or maybe it’s a non-formal specification that resides in your head.

But really, a specification is prone to error just like program code. As your specification gets more detailed, it starts to become another programming language. With my capstone, all I really did is write a Z-to-Spark compiler. I’d shifted the opportunity for human error up a level — now, instead of making mistakes in Ada code, people were free to make mistakes in their Z spec.

It’s like saying you can write provably correct assembly code that matches a C specification. Maybe C or Z is an easier language to reason in, but you still have to take a lot of time to examine your specification, to make sure everything in the system meshes together — really, the formal specification becomes your new coding language.

On the flip side of the coin, as your specification becomes less formal, that opportunity for error shifts back toward the coding process. Less constraints in the specification mean more freedom in writing the code, which means more opportunity for mistakes in your actual program.

I guess what I’m saying is that in this approach:

1) come up with an over-arching specification for everything
2) write code that matches the specification

being able to prove code correct (prove it meets some spec) only gets you step two. Step one is just has hard (or harder). Creating a really formal specification means you have a new programming language to deal with, and a less-formal spec means you can have inconsistencies or missing information that bleed into your code base.

There is some essential complexity in developing software that cannot be eliminated by correctness proofs — you’re just moving opportunity for error from the “coding” side of the balance to the “specification” side.

All of that being said, if you’re doing something that has to be right, like air traffic control or data encryption, and you’re willing to spend a lot of time thinking about your formal specification, then I think provably correct software is the right goal. But correctness proofs are just one part of the process. Even if everything is proved correct, you still have to really test your software. If not to test the code, then to test the spec.

I think this is one of the reasons that test-driven development is such a practical idea for “normal” software development. It recognizes that your specifications are just as incomplete and subject-to-change as your code. With TDD, your formal specification is your suite of tests, and proving your code meets the spec is easy — just run the tests.

As far as program specifications go, unit tests are a nice balance between a formal approach (a machine checks conformity) and an informal approach (the tests are always changing and never complete).

For most programming, I think this balance is the best we have.

Turing… Again

Monday, June 23rd, 2008

It seems like I keep coming back to this earlier post. Maybe I have an obsessive personality? I’ve been trying for days to figure out. So far, I don’t think I do. I have some rigorous studies planned in the coming weeks; I’ll let you know my findings.

In the meantime, I came across this interesting paper called “The Origins of the Turing Thesis Myth,” which yet again knocks down the whole “Turing Machines equal Computers” thing.

When I went on my little rant, I talked about how I think Turing Machines are not the same as modern computers because for any given computer, there is a Turing Machine that can out-compute it; that is, the TM can perform computations the computer cannot.

This paper goes conversely, examining things computers can do that Turing Machines cannot. Mainly, incorporate user interaction. Previously unbeknownst to me, Turing himself actually defined the notion of a “choice machine”; so far, my understanding of this construct is that it exists to capture the non-algorithmic nature of interaction from sources “outside” of the TM (like users). I’m still getting to beknow the idea, though, so my understanding of it is likely to change in the near future.

After reading this paper, I had to change my thinking on TMs being “more powerful” than modern computers. Although the ideas presented are mostly new to me, so far the way a computer responds to external input does seems like an effective example of something a TM cannot model that a computer can do. (Of course, the added disjunct only further validates my TM != Computer thinking, so my larger worldview is still intact. Whew.)

Anyway, an interesting paper…

Illogical Arguments

Tuesday, November 13th, 2007

I came across this article which echoes some of the same frustrations I’ve mentioned before.

I’m glad I’m not the only one bothered by this…

Stop it With the Halting Problem

Monday, February 26th, 2007

As often happens, I spent some time today checking out what was new on http://programming.reddit.com. At the top of the list was this blog post: http://programmingkungfuqi.blogspot.com/2007/02/real-reason-why-building-software-is.html

I found it to be a pretty amusing post, and I think it was worth the read. But, near the end, the author says this:

Unfortunately for software developers, this place actually exists for us. The Halting Problem shows that we can never be certain of any result about any computation. It may run properly, it may become an endless loop, or it may give the completely wrong result. Adding concurrency adds on even more uncertainty into the situation. And we can PROVE this logically.

Well, I guess I should let you know that I have a little pet peeve about people incorrectly invoking halting problem undecidability to make a point. I’m new to blogging, but my understanding is that ranting about pet peeves is what it’s is all about, so get ready.

But, before I launch into my little rant, let me say that I am not an expert in the field of formal languages and computability theory. Maybe I fall into the “computability theory enthusiast” category, if such a thing exists. Anyway, the things I’m about to say are, to the best of my knowledge, correct, but please don’t get the impression I’m sitting over here with a Ph.D. in automata theory. (Although perhaps someday I shall posses that coveted degree).

So anyway, here are the two major fallacies I often hear people fall into when they discuss the halting problem. (Oh, and Wikipedia has a nice article on the halting problem if you’d like to read up on it). And a terminology note: when I say “computer,” I mean it in the modern, digital, Von Neuman, plastic and silicon way, and not the more ethereal “anything that does computations” way. So, here we go.

Incorrect Claim #1. Computers Are Turing Machines. Turing machines are a model for all computation, so all computers are Turing machines, right? Well, not quite. Turing machines have an infinite amount of memory. No modern computer (to the best of my knowledge) does. (A better model for an actual computer is a linearly bounded automaton).

For a given computer, there are things that a Turing machine can do that the computer cannot. (For example, any computation with a result that takes more memory to express than the computer has; an infinite amount of numbers, plus the pigeon hole principle, plus the computer’s finite memory guarantees such a result exists). The point I’m going for here is that, from the perspective of computational ability, Turing machine > computer.

And, when someone says “Computers are Turing machines,” it is often followed up with something like, “and so the halting problem says we can’t decide what programs will finish on a computer.” In other words, people tend to think that computer == Turing machine -> undecidability of halting problem on computers.

Well, technically, the halting problem is decidable for any program that can be run on a deterministic machine with finite memory (like a computer). Since such a machine has a finite (albeit huge) number of possible states, you can decide whether or not a program will halt by:

1. Start it running
2. Remember each state the machine has been in
3. If you see the same state twice, reject
4. If the machine halts, accept

Since the number of states is finite, 3 or 4 is guaranteed to happen (again, by the pigeon hole principle).

Of course, actually implementing a halting decider for a real computer is wildly impractical. The author of the aforementioned post (as well as anyone who makes similar claims) is likely trying to say that, for all practical purposes, you cannot decide the halting problem on a computer. Which is quite true — just remember the “for all practical purposes” part.

Incorrect Claim #2. Halting Behavior is Undecidable For Every Program. Again, not true. I often hear people making this claim, as the author of the post under discussion does: “The Halting Problem shows that we can never be certain of any result about any computation.” (Emphasis mine). Frankly, this is false. (Well, maybe you could wiggle out of this by having a debate on the semantics of “certain of,” but let’s not.)

There are, in fact, many programs for which the halting problem is decidable. Consider the trivial example of a program that, on any input, immediately halts. The result of that computation is a halt for every possible input. Or, consider the equally trivial example of a program that, on any input, immediately goes from one state to another and back again, in an infinite loop. That computation will never halt for any input, and the algorithm described above will correctly decide the machine as non-halting.

In fact, you can even make a recognizer for the class of problems that halt. (A “recognizer” is a program that returns “true” for any input in the class it is recognizing, and either returns “false” or fails to halt on any program not in that class. This is in contrast to a “decider,” which is not allowed to infinitely loop on “false” cases). The above algorithm, for example, is such a recognizer. It makes sense when you think about it; any program that terminates can be recognized as halting on the input it was given, since it, well… halted when given that input.

So, there are plenty of programs that can be determined to halt on given inputs. What the actual ‘undecidability of the halting problem’ result says is that a decider for the halting problem cannot be constructed that works on any arbitrary machine. You can make partial “deciders” that work for some machines, and even recognizers for all machines that halt, but never a decider that decides halting for all machines. Again, a little nit-picky, but if you’re going to talk all math-like, you’ve gotta play by the rules. Watch your existential quantifiers.

Back to the post I originally quoted — there are a couple of other things about the paragraph that bother me a little (I’m not sure how getting a “wrong” result relates to the halting problem, and I’m not sure what the author is claiming can be “proved logically”), but like I said, I think most people understand the point that is being made.

And I mean no disrespect for the author; the posting was a fun read, and I enjoyed his analogy. I just wish there was a little more attention to detail when a theoretical bomb like the Halting Problem is dropped into a discussion.

Please be careful.