Thursday, September 17, 2015

Computer Science and Libertarianism

I was first introduced to this article through a now-defunct blog. Here's the essay, dug out of the Internet Archive:

I'm finally making good on my promise to post my "wild speculations" about
Computer Science IQ and Libertarian inclination.

First let me give some background on CS IQ.  I have taught at least 5,000
students how to program, which has given me a strong set of hunches about what
goes on in their heads.  But the most useful source of information came from my
work as Chief Reader for the Advanced Placement Exam in Computer Science.

AP programs allow high school students to take college-level courses at their
high schools and take a test that allows them to receive placement and usually
credit for their work.  As with all AP exams, the AP/CS Exam is divided into
two parts: multiple-choice and free-response.  In the free-response section,
students hand-write solutions to problems.  This has always been considered an
integral part of the AP program because of the (at least perceived) limitations
of multiple-choice tests.  The AP/CS exam had 50 multiple-choice and 5
free-response questions.  The free-response questions were all of the form,
"Write a piece of code that does the following..."

Obviously, the hand-written solutions need to be graded by real people.  Every
year about 60 CS teachers (called "readers") get together for 6 days to grade
10,000 exams.  As Chief Reader, I was responsible for choosing the 60 teachers,
managing their efforts for those 6 days, and setting the ultimate distribution
of AP grades.  In 1988 I made AP history by giving the all-time worst set of AP
grades ever given out (I failed almost half of them).  As a result, ETS
approved a request they had never approved before.  They gave me a diskette
(actually 2) with the raw scores for all 10,000 candidates so that I could
"study" it.  My undergraduate degree is in math with a statistics
specialization, so I'm the kind of person who likes to play with data.

One of the things I looked at was the set of correlations between various
multiple-choice questions.  A high correlation between 2 test items indicates
that candidates performed similarly on those items (i.e., those who got one
right tended to get the other right and those who got one wrong tended to get
the other wrong).  I expected to find either virtually no correlations, because
there was little repetition on the test, or clusters of correlations.  If you
were to test people on math, for example, you might find that arithmetic
questions correlated highly with arithmetic questions, algebra questions
correlated highly with algebra questions, geometry questions with geometry
questions, and so on.  I expected a similar pattern based on various
programming constructs/skills.
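
As a rough illustration of the kind of item-to-item analysis described above
(not the original analysis, whose code and data are not given here), the
sketch below scores each answer as 1 for right and 0 for wrong, computes the
Pearson correlation between every pair of questions, and counts how many
"partners" each question has above a cutoff.  The names pearson,
partner_counts, and TOY_RESPONSES, the 0.5 cutoff, and the toy data are all
assumptions made up for this example.

```python
import math


def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences of numbers."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0  # an item everyone got right (or wrong) carries no signal
    return cov / math.sqrt(var_x * var_y)


def partner_counts(responses, threshold):
    """For each question, count the other questions it correlates with.

    responses: one list per candidate, holding a 0/1 score per question.
    threshold: hypothetical cutoff for calling a correlation "high".
    """
    n_items = len(responses[0])
    # Turn rows (candidates) into columns (questions).
    columns = [[row[i] for row in responses] for i in range(n_items)]
    counts = [0] * n_items
    for i in range(n_items):
        for j in range(i + 1, n_items):
            if pearson(columns[i], columns[j]) >= threshold:
                counts[i] += 1
                counts[j] += 1
    return counts


# Invented toy data: 4 candidates, 3 questions.
TOY_RESPONSES = [
    [1, 1, 0],
    [0, 0, 1],
    [1, 1, 1],
    [0, 0, 0],
]
```

On the toy data, partner_counts(TOY_RESPONSES, 0.5) gives [1, 1, 0]: the
first two questions move together and the third is unrelated to both.  A
question like the "grandaddy" would show up as one entry with a count far
above the rest.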

What I found was highly puzzling.  Five multiple-choice questions were each
correlated with over a dozen other questions and I found virtually no other
correlations at all.  But there was no pattern to the correlations for these
five.  Let me describe one of them as an example.  It had more correlations
than any other, so I nicknamed it the "grandaddy."  It was highly correlated
with 25 other questions, yet the topic that it tested had nothing to do with
the topics covered by these other questions.

When I looked at correlations between multiple-choice and free-response, I
became even more puzzled.  There was definitely repetition between the two
halves of the test.  For example, we had a free-response question about a
technique called recursion and we also had several recursion multiple-choice
questions.  So were the multiple-choice recursion questions the most highly
correlated with the recursion free-response question?  Nope.  The grandaddy
was!  Even though the grandaddy had nothing to do with the topics being tested
in ANY of the five free-response items, it was the #1 correlated question for 4
of the 5 and was #2 for the fifth.  Furthermore, the group of five questions
mentioned above were in all cases among the top 6 correlated multiple-choice
questions for each of the free-response items, and usually they were the top
five.

Either I stumbled upon some kind of statistical fluke, or there was something
special about these 5 multiple-choice questions.  Flukes like this are highly
unlikely with a pool of 10,000.  Also, as I studied other aspects of the data,
I was surprised to find these same five questions appear in the answer to two
completely unrelated questions I pursued.  I won't bore you with the details,
but suffice it to say that I found even more evidence that these questions were
more "central" than the others.

My theory is that these 5 are CS IQ questions (particularly the grandaddy).  I
presented my data to CS faculty and students at Stanford, and they seemed to
agree with my conclusion.  They also gave me some interesting feedback about
the five questions themselves.  Everyone who looked at them agreed that they
"felt" like the kind of questions that would distinguish a computer scientist. 
One faculty member described them as "the intersection of logic and
programming."  A more apt description given by another faculty member who had
taught intro courses himself was that each question required a model of
computation, and in his experience, this was the prime distinction he had seen
between those who could program and those who could not.  It was also obvious
from the questions that logic and recursion are highly related to CS IQ.

Let me say a bit more about what I mean by a model of computation. 
Programmers are able to "play computer" in their head (sometimes requiring the
aid of a scrap of paper).  In other words, we have a model of exactly what the
computer does when it executes each statement.  For any given program, we have
a mental picture of the state the computer is in when execution begins, and we
can simulate how that state changes as each statement executes.  This is rather
abstract, so let me try to explain by giving a specific example.

Let me tell a story that is typical of those I heard from the TAs who worked
for me at the computing center.  A student comes up to the TA and says that his
program isn't working.  The numbers it prints out are all wrong.  The first
number is twice what it should be, the second is four times what it should
be, and the others are even more screwed up.  The student says, "Maybe I should
divide this first number by 2 and the second by 4.  That would help, right?" 
No, it wouldn't, the TA explains.  The problem is not in the printing routine. 
The problem is with the calculating routine.  Modifying the printing routine
will produce a program with TWO problems rather than one.  But the student
doesn't understand this (I claim because he isn't reasoning about what state
his program should be in as it executes various parts of the program).  The
student goes away to work on it.  He comes back half an hour later and says
he's closer, but the numbers are still wrong.  The TA looks at it and seems
puzzled by the fact that the first two numbers are right but the others don't
match.  "Oh," the student explains, "I added those 2 lines of code you
suggested to divide the first number by 2 and the second by 4."  The TA points
out that he didn't suggest the lines of code, but the student just shrugs his
shoulders and says, "Whatever."  The TA endeavors to get the student to think
about what change is necessary, but the student obviously doesn't get it.  The
TA has a long line of similarly confused students, so he suggests that the
student go sit down and think through his calculating procedure and exactly
what it's supposed to be doing.  Half an hour later the student is back again. 
"While I was looking over the calculating procedure, a friend of mine who is a
CS major came by and said my loop was all screwed up.  I fixed it the way he
suggested, but the numbers are still wrong.  The first number is half what it's
supposed to be and the second is one-fourth what it's supposed to be, but the
others are okay."  The TA considers for a moment whether he should bring up the
student on an honor code charge for receiving inappropriate help, but decides
that it isn't worth it (especially since that line of similarly confused
students is now twice what it was an hour ago).  He asks the student whether he
still has those lines of code in the printing routine that divide by 2 and 4
before printing.  "Oh yeah," the student exclaims, "those lines you said I
should put in.  That must be the problem."  The TA once more politely points
out that he didn't suggest the two lines of code, but the student again shrugs
and says, "Whatever.  Thanks, dude!"
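
For readers who want to "play computer" with this story, here is a minimal
sketch of what such a program might look like.  The story does not say what
the program computed, so everything here is an assumption: the task (squaring
each input), the bug (a stray doubling that accumulates on every pass through
the loop), and the names calculate_buggy, calculate_fixed, and patched_output.

```python
def calculate_buggy(values):
    """Hypothetical calculating routine with a loop bug."""
    results = []
    factor = 1
    for v in values:
        factor *= 2                    # the bug: this doubling accumulates
        results.append(v * v * factor)
    return results


def calculate_fixed(values):
    """The same routine after the friend repairs the loop."""
    return [v * v for v in values]


def patched_output(results):
    """Printing routine carrying the student's symptom patch."""
    shown = list(results)
    if len(shown) > 0:
        shown[0] = shown[0] / 2        # "divide this first number by 2"
    if len(shown) > 1:
        shown[1] = shown[1] / 4        # "and the second by 4"
    return shown


inputs = [1, 2, 3, 4]                  # correct answers: 1, 4, 9, 16

# Stage 1: buggy loop, no patch -> 2x, 4x, then worse.
print(calculate_buggy(inputs))                   # [2, 16, 72, 256]
# Stage 2: buggy loop plus patch -> first two right, rest still wrong.
print(patched_output(calculate_buggy(inputs)))   # [1.0, 4.0, 72, 256]
# Stage 3: fixed loop, patch left in -> first two now too small.
print(patched_output(calculate_fixed(inputs)))   # [0.5, 1.0, 9, 16]
```

Tracing the value of factor through the loop is exactly the kind of state
simulation the student never does.  The patch in the printing routine only
looks like progress because nobody has asked what the calculating routine's
state should be after each pass.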

The student in my hypothetical story displays the classic mistake of treating
symptoms rather than solving problems.  The student knows the program doesn't
work, so he tries to find a way to make it appear to work a little better.  As
in my example, without a proper model of computation, such fixes are likely to
make the program worse rather than better.  How can the student fix his program
if he can't reason in his head about what it is supposed to do versus what it
is actually doing?  He can't.  But many people (I dare say most people)
simply do not think of their program the way a programmer does.
As a result, it is impossible for a programmer to explain to such a person how
to find the problem in their code.  I'm convinced after years of patiently
trying to explain this to novices that most are just not used to thinking this
way while a small group of other students seem to think this way automatically,
without me having to explain it to them.

Let me try to start relating this to libertarian philosophy.  Just as
programmers have a model of computation, libertarians have what I call a model
of interaction.  Just as a programmer can "play computer" by simulating how
specific lines of code will change program state, a libertarian can "play
society" by simulating how specific actions will change societal state.  The
libertarian model of interaction cuts across economic, political, cultural, and
social issues.  For just about any given law, for example, a libertarian can
tell you exactly how such a law will affect society (minimum wage laws create
unemployment by setting a lower bound on entry-level wages, drug prohibition
artificially inflates drug prices which leads to violent turf wars, etc.).  As
another example, for any given social goal, a libertarian will be able to tell
you the problems generated by having government try to achieve that goal and
will tell you how such a goal can be achieved in a libertarian society.

I believe this is qualitatively different from other predictive models because
of the breadth of the model and the focus on transitions (both of which are
also true of programming).  On newsgroups I often see questions like:
  If we were in situation A and government took action X, what would happen?
  If we were in situation B and a corporation took action Y, what would happen?
  If we were in situation C and an individual took action Z, what would happen?
Libertarians almost always quickly answer by saying, "I'll tell you exactly
what would happen..."  And, surprisingly, the libertarians tend to give the
same answer in most cases.

I think most people find this odd about libertarians.  They understand how an
economist might be able to predict the effect of a certain law on the economy
or how a social scientist might be able to predict how drug legalization might
affect the ghettos, but they don't understand how somebody could predict all of
these things, especially someone who has no formal training.  Libertarians, on
the other hand, don't seem to understand how someone could fail to have such a
model of interaction (it would almost be like having a Supreme Court judge who
had never thought about Roe vs. Wade--ha ha).  The nonlibertarians have no
comprehensive model of interaction, and as a result, they can't communicate in
a meaningful way with those who do.  Their attention is always focused on
misleading superficial problems rather than on the underlying causes of such
problems.

When I observe how most people approach politics, it reminds me of the way my
hypothetical student approached his program.  A person notices that some people
are making $1 and $2 an hour and are having difficulty managing financially on
such a sum.  This seems bad and they want to fix it.  But they have no model of
interaction that would allow them to reason about what might cause such a
result.  So they decide to pass a minimum wage law so the problem will go away.
And it does (apparently).  There aren't any poor people making $1 and $2 an
hour anymore.  But there are suddenly lots of unemployed people who have to
live off welfare (a new problem).  Does the person make the connection and
realize that they caused this problem?  Not without a model of interaction.  So
instead they say we have to fix the unemployment problem.  And then we have to
fix the new problems generated by the fix to the unemployment problem.  And
then we have to fix the new problems generated by the new fixes.  And so on.

If you suggest that eliminating minimum wage laws and the government
interference that made those people so poor in the first place would be a
better solution, they look at you incredulously and say you must be crazy. 
This is just like the situation with my TA and the student who had added 2
lines of code to make the numbers print out correctly ("Are you crazy?  Why
would I delete those lines of code when the numbers would then print out
incorrectly?"  Because the problem is elsewhere, and that's the problem you
should be addressing, but that's difficult to explain to someone who doesn't
have a model of how his program works).  Seriously, I think the credibility gap
that existed between my TAs and the students who sought their help is similar
to the credibility gap between libertarians and nonlibertarians.  And I also
suspect that the gap will continue to exist unless and until those other people
learn to think in terms of a comprehensive model of interaction.

As usual, I've talked more than I should.  I'm not sure that I've made my
point very well, but I think it would require a great deal more time for me to
make this more comprehensible.  I suspect that the programmers who read this
message will understand me, but the others might not.  Anyway, I think I'll
leave it mostly at that, but add a few related comments.

Don Knuth, who wrote the CS equivalent of The Bible, says that the thing that
most distinguishes computer scientists is their ability to "jump levels of
abstraction."  I mentioned that programmers can "play computer," but what good
is that when you are working on a 100,000-line program?  It would take so long
to simulate the thousands of instructions and the vast amount of data that
you'd never get anywhere.  But programmers get around this by using
abstraction.  A programmer can reason about the top-level execution of a
program, for example (a macro-view, if you will).  But when necessary, he can
focus in on a program module, or a single subprogram, or a single loop, or a
single line of code (more and more of a micro-view).  A programmer can even,
when necessary, reason about how that line of code will be translated into
machine-code and even what changes are likely to happen to the physical
hardware involved.  A programmer understands a program at all of these levels
of abstraction.  It is essential that he can jump quickly between levels, and
relate information at one level to information at another level, if he is to
be able to eliminate problems in his code.  I think libertarians also exhibit
this behavior.  A libertarian can comfortably tell you how governments
interact with each other, how governments interact with corporations, how
corporations interact with each other, how corporations treat individuals,
and how individuals interact with each other. It would be impossible to have
a model of interaction without these levels of abstraction and without being
able to jump between levels when necessary (e.g., saying, "If government A
passes law X, that is likely to pressure government B to also pass law X,
which causes the corporations controlled by government B to take action Y,
which causes individuals working for those corporations to take actions Z
and W.")

Another link between libertarianism and programming is that the principles of
good programming are closely related to libertarian ideals.  We call it
"top-down programming," but anyone who has studied structured programming knows
that "central planning" is quite different.  A well-structured program will
have high-level modules that are loosely coupled (i.e., as independent as
possible).  This means that at the highest level, a program should minimize
tasks so that it performs only those tasks that are essential.  In other words,
"that program is best that programs least."  This is the principle of
decentralized government.  As another example, the structured programming
concept of information hiding is really the libertarian belief in privacy. 
Information hiding says that the internal details of a subprogram should be
independent from other subprograms (in fact, the goal is to have them INVISIBLE
to other subprograms).  This is like saying that the private choices made by
one individual that affect only that individual should not be influenced by
other individuals (and would ideally be kept entirely confidential).
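
To make information hiding concrete, here is a small sketch; the Stack class
and the reverse function are invented for this illustration and do not come
from the essay.  The caller uses only push, pop, and is_empty, so the way the
stack stores its items could change without the caller noticing.  Note that
Python hides internals only by convention (the leading underscore), whereas
some other languages enforce it.

```python
class Stack:
    """A stack whose storage is an internal detail."""

    def __init__(self):
        self._items = []     # private by convention; callers never touch it

    def push(self, item):
        self._items.append(item)

    def pop(self):
        return self._items.pop()

    def is_empty(self):
        return len(self._items) == 0


def reverse(sequence):
    """Reverse a sequence using only the public interface of Stack."""
    stack = Stack()
    for item in sequence:
        stack.push(item)
    result = []
    while not stack.is_empty():
        result.append(stack.pop())
    return result


print(reverse([1, 2, 3]))    # [3, 2, 1]
```

Because reverse never looks at _items, the list inside Stack could be
replaced by a linked structure or a fixed-size array and reverse would keep
working unchanged.  That independence is what loose coupling buys.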

I mentioned the importance of logic to CS IQ.  I believe it is equally
important to libertarian philosophy.  From my observation, libertarians tend
to think that all political questions can be answered with an almost
mathematical certitude. There is no such thing as "a friendly disagreement"
in mathematics.  If two mathematicians disagree, then one is mistaken.
Similarly, if two libertarians disagree, each asserts that the other is either
operating from a false assumption or has a flaw in his logic.  I think
nonlibertarians are really turned off by this, particularly because it comes
across as obnoxious and egotistical.  But libertarians seem to thrive on it.
The community has a kind of intellectual-warrior ethos.


--
__/\__  Jonathan S. Haas              | Jake liked his women the way he liked
\    /  positron@primenet.com         | his kiwi fruit: sweet yet tart, firm-
/_  _\  Trimark Interactive, Inc.     | fleshed yet yielding to the touch, and
  \/    Don't Tread On Me             | covered with short brown fuzzy hair.
