# Computational Thinking

## 16:198:503, Fall 2006

### Particulars

Instruction team: Matthew Stone, Ken Shan and John Asmuth

Class Time: Thursdays 1:40-4:40pm

Place: Psych A139 (RuCCS Playroom)

Recitations: Monday 12:30-1:30pm or Wednesday 6-7pm

Place: Psych A102 (RuCCS Village)

### Announcements

• Nov 29
Here are some more source code materials derived from recent lectures. Here are some resources for doing Scheme web server programming.
• Nov 9
Here are a bunch of source code materials derived from recent lectures. Some readings and pointers.
• Oct 21

I'm late getting the midterm out, after discarding a further bunch of problems as too tedious or difficult. So you get an extra day too---it's now due Oct 31. Here is a PDF format description of the problems. Most of the relevant information is duplicated in this template scheme file for you to fill in with your answers. Question two concerns a new little "programming language" called UFO, and makes reference to a first and a second sample program.

• Oct 16

A reminder: we will have a take-home midterm, going out on Thursday of this week. Here is a representative set of questions to give a flavor of the kinds of things you can be expected to know this far into the class.

Some followups to our discussion of HTML.

Some useful archive information about what we've been doing.

• This scheme file contains functions to add a list of numbers, and some analogues, leading to an implementation of the algorithm (described in class Sep 14 and in recitation Sep 25-27) to compute the diagonal of an n-dimensional box. It also develops the function to compute the output of a hotel switch in which n switches can each flip the same light on and off.
• This scheme file contains functions to determine the numerical value corresponding to a list of bits understood as a binary numeral, and to convert a numerical value to a binary numeral represented as a list of bits, as worked out in recitations Oct 2-4. It goes on to illustrate learning as search - a point mentioned briefly in class Sep 28. We can interpret a list of bits either as a binary numeral or as a boolean function. That means we can "count up" the boolean functions and check whether each one matches a set of training data.
• This scheme file contains a function to take the derivative of a mathematical expression, as worked out in recitations Oct 9-11. The resulting expressions contain lots and lots of ones and zeros; a follow-on function shows how you can simplify many of those expressions away in another step of tree transformation.
• This scheme file is a solution to the homework problem of implementing an interpreter for split.
• This scheme file implements a version of the heap construction algorithm we talked about in class Oct 12. There are a number of details of the implementation that we did not have time to cover (and in fact there are some ways of simplifying the algorithm and the implementation that it might be useful to talk about). So don't worry if you don't understand this completely yet. The most useful thing may be the definitions at the end of the file which trace out a sequence of instantiated heaps (in variables hi) by adding and removing elements. You can print out these structures and look at them.
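
As a rough illustration of the first two files above, here is a minimal sketch in Python rather than Scheme. It is not the code from the class files: every function name is hypothetical, and the bit-list conventions (most significant bit first, truth tables indexed by the value of the inputs) are assumptions made for this example.

```python
import math

def diagonal(sides):
    """Length of the diagonal of an n-dimensional box with the given sides."""
    return math.sqrt(sum(s * s for s in sides))

def hotel_switch(switches):
    """State of a light that every switch in the list can flip.

    The light ends up on exactly when an odd number of switches are on."""
    light = False
    for switch in switches:
        if switch:
            light = not light
    return light

def bits_to_number(bits):
    """Value of a list of bits read as a binary numeral (most significant first)."""
    value = 0
    for bit in bits:
        value = 2 * value + bit
    return value

def number_to_bits(n, width):
    """Binary numeral for n as a list of `width` bits (most significant first)."""
    bits = []
    for _ in range(width):
        bits.append(n % 2)
        n //= 2
    return bits[::-1]

def consistent_functions(num_inputs, training_data):
    """Learning as search: count up through every boolean function of
    num_inputs inputs and keep the ones that match the training data.

    A function is represented by its truth table, a list of 2**num_inputs
    bits; entry i is the output for the inputs whose binary value is i.
    training_data is a list of (input_bits, output_bit) pairs."""
    table_size = 2 ** num_inputs
    matches = []
    for code in range(2 ** table_size):
        table = number_to_bits(code, table_size)
        if all(table[bits_to_number(inputs)] == output
               for inputs, output in training_data):
            matches.append(table)
    return matches
```

With all four rows of the truth table for "and" as training data, the search keeps a single function; with only two rows it keeps four, which shows how much the data has and has not pinned down.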

• Oct 9

Get set up for XML in Scheme.

• Install SSAX and SXML. You can actually do this over the web. Invoke the File > Install .plt file... menu option in DrScheme. You will be prompted to enter a web address for the PLT archive you want. The first time, enter http://www.cs.rutgers.edu/~mdstone/class/503/ssax.plt then (once the installation is complete), invoke the same menu option again and enter http://www.cs.rutgers.edu/~mdstone/class/503/sxml.plt (you can cut and paste!).
• Download the test code based on what we did in class Thursday: the Scheme source testxml.scm and the two example XML files sentence.xml and lamp.xml. You should be able to open and run testxml.scm in DrScheme and inspect the structure you get.
• Familiarize yourself further with XML to make sure you understand how it works. There's a huge amount of information about XML on the web. Here are two useful sources to start:

• Oct 1

Test your understanding of evaluation. Here's a homework statement and here's the Scheme code the homework asks you to finish writing.

• Sep 29
Languages

Today we wrote an interpreter (more precisely, many of them) for a language of logical formulas. The grammar below defines the abstract syntax and concrete syntax of this language.

The set of formulas S is the smallest set such that:

• If num is a number, then the abstract syntax (make-variable num) with the concrete syntax num is in S.
• If left and right are in S, and op is a binary operator, then the abstract syntax (make-binary left op right) with the concrete syntax (left op right) is in S.
• If sub is in S, then the abstract syntax (make-negate sub) with the concrete syntax (not sub) is in S. (Here "not" is a literal symbol.)

The set of binary operators consists of two symbols:

• and
• or

For example, the following are three formulas:

• The abstract syntax
(make-variable 3)
with the concrete syntax
3
• The abstract syntax
(make-binary (make-variable 3) 'and (make-variable 2))
with the concrete syntax
(3 and 2)
• The abstract syntax
(make-binary (make-variable 3) 'and (make-negate (make-variable 2)))
with the concrete syntax
(3 and (not 2))

However, the concrete expressions (3) and ((3) and (2)) are not formulas, because the grammar never wraps a bare number in parentheses.
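
The grammar above can be sketched in Python as follows. This is not one of the interpreters written in class: the class names mirror make-variable, make-binary and make-negate, but representing concrete syntax as nested Python lists, and interpreting variable n by looking it up in a dictionary of truth values, are assumptions made for this example.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Variable:          # abstract syntax (make-variable num)
    num: int

@dataclass(frozen=True)
class Binary:            # abstract syntax (make-binary left op right)
    left: object
    op: str
    right: object

@dataclass(frozen=True)
class Negate:            # abstract syntax (make-negate sub)
    sub: object

BINARY_OPERATORS = ("and", "or")

def parse(concrete):
    """Turn concrete syntax (numbers and nested lists) into abstract syntax."""
    if isinstance(concrete, int) and not isinstance(concrete, bool):
        return Variable(concrete)
    if isinstance(concrete, list):
        if len(concrete) == 2 and concrete[0] == "not":
            return Negate(parse(concrete[1]))
        if len(concrete) == 3 and concrete[1] in BINARY_OPERATORS:
            return Binary(parse(concrete[0]), concrete[1], parse(concrete[2]))
    raise ValueError(f"not a formula: {concrete!r}")

def evaluate(formula, assignment):
    """Interpret a formula, given a dict from variable numbers to truth values."""
    if isinstance(formula, Variable):
        return assignment[formula.num]
    if isinstance(formula, Negate):
        return not evaluate(formula.sub, assignment)
    if isinstance(formula, Binary):
        left = evaluate(formula.left, assignment)
        right = evaluate(formula.right, assignment)
        return (left and right) if formula.op == "and" else (left or right)
    raise TypeError(f"not abstract syntax: {formula!r}")
```

For instance, parse([3, "and", ["not", 2]]) yields the third example formula above, while parse([3]) raises an error, matching the remark that (3) is not a formula.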

The three interpreters on my computer, in their order of appearance, were:

Remember that you can use the diff program (or equivalent utilities whose names usually contain "diff") to compare these interpreters automatically.

Finally, you can find Guy Steele's talk "Growing a Language" online.

• Sep 25
Circuits
Here is the circuit simulator demoed in class. We explored Circuits < Logic Families < RTL and Circuits < Combinatorial Logic. Play with the circuits we looked at and also some others, as described below. You can click on inputs to change their values and watch how the electrical states of the wires throughout the circuit change - green is hot.
• Check out Circuits < Combinatorial Logic < 1-of-4 Decoder and Circuits < Combinatorial Logic < 2-to-1 multiplexer. What does each of these circuits do? Write a scheme program that does the same thing. (Use a list to store the four outputs of the decoder circuit.)
• Check out Circuits < Combinatorial Logic < 2-bit comparator. The four inputs to this program are interpreted as two binary numbers. Using the meaning of comparison, write out the truth table for the logic this gate should compute. Try to figure out how the circuit works. Does considering the truth table help to understand why the circuit is designed as it is?
• Check out Circuits < Combinatorial Logic < 7 segment LED decoder. Intuitively, what does the circuit do? What do the inputs to the LED decoder mean?
• Sep 12
Here are some reading resources related to the class that you can use to follow up the material discussed in connection with the Sep 7 class meeting.
• Sep 8
For this week - by Wed evening, Sep 13 - please do the following:
1. Make sure you have a version of EMACS and DR SCHEME available somewhere you are comfortable working (ideally, the place you do all your other work). Instructions for installing Windows versions of EMACS are available here at the FTP site. You want the full distribution. (You can also copy the unpacked distribution from somebody who has done this. That's what we did in class.) Macs come with a terminal version of emacs, but you can also download friendlier ones. Because emacs is free software there are many versions available. Google is your friend.
DR Scheme is available here. It is more civilized to install.
If you have any technical questions, contact the TA John Asmuth by email as user jasmuth on the Google mail service (if you are a human you should know how to do this - Google is your friend).
2. Make sure that at some point in your life you have done a good chunk of the EMACS tutorial. It is available by running M-x help-with-tutorial or (in most cases) C-h t or Help > Tutorial on the menu. Computer scientists inevitably have some advantage here; they are more likely to have used emacs before. I don't know how to avoid the leg up this gives them.
3. Study the two programs we wrote in class:
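(defun indent-for-mail ()
  "Quote the current line for mail, then move to the next line."
  (interactive)
  (beginning-of-line)
  (insert "> ")  ; `insert-string' in older versions of Emacs
  (forward-line 1))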
and
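(defun indent-buffer-for-mail ()
  "Quote every line from the current line to the end of the buffer."
  (interactive)
  (beginning-of-line)
  (insert "> ")  ; `insert-string' in older versions of Emacs
  (end-of-line)
  ;; forward-line returns 0 when it was able to move down a line
  (if (= 0 (forward-line 1))
      (indent-buffer-for-mail)))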
(Remember that you can install new definitions in EMACS using the command M-x eval-buffer, called M-x eval-current-buffer in older versions.) Run the commands involved individually and in turn, play around and see what these functions do, see what simple variations do. Don't try to create a program that does something specific yet; just play. However, if, during the course of this homework, you find yourself doing editing that becomes routine, consider writing a program to automate that routine task. Think about what the routine is and what precise steps are required to follow it. If you can actually write a program to do this, let me know; I will be interested to see it.
4. Write two brief statements, of approximately a paragraph in length, as explained below. Use EMACS to edit the statement, so you get a further chance to play with the program and start to see how to use it for your purposes. (This part of the work, by contrast to the previous ones, will give a leg up to those with practice expressing themselves in English.)
• In your own words, summarize your motivations for coming to the class, your understanding of what you could expect to get out of the class and your sense of how you might put that knowledge to use.
• In your own words, reflect on the theme of Thursday's lecture. Why might it be important to think about formulating a specific set of abstract actions, and then be able to deploy them flexibly and creatively, when you and your colleagues need to collaboratively explore an intellectual space you care about? How might the ideas represented by EMACS - or by Newell and Simon's collaboration on Logic Theorist - be explained more simply and more relevantly for you? (Your suggestions here may merely point out deficiencies, not alternatives.)

Please email your responses to Matthew, Ken, and John. Include your two statements (as text attachments if you want to testify to your EMACS skills), and a brief summary of your schedule, indicating times you'd be available for recitations or meetings of groups for class work on Busch campus during the fall.

Some reminders about homework. First, nothing has been assigned unless I think it's absolutely necessary, either for you to keep pace with the topics of the class or for me to plan a class that works for you. There is no busywork. So I expect responses from everyone, whether you're taking the class for credit or not. Auditors who have not done the homework will be asked to leave. Others who have not done the homework will be expected to set up an appointment, explain the problem, and to help work out changes to the class and to their work habits so the problem is corrected.

### Overview

Goals. By taking this course, you will learn how to participate in interdisciplinary collaborations that depend on the techniques and results of computer science. Over the course, we will develop three important kinds of programs through hands-on case studies: an interpreter for a programming language; a web interface to a database; and a reinforcement-learning agent capable of perception, deliberation and action. We'll also present some of the theoretical background computer scientists use to understand such programs precisely - including representation, complexity, and computability - and introduce the history and culture of the field.

Requirements. Small in-class exercises, recitations and weekly homeworks will give you practice in understanding problems computationally, solving them, and critiquing solutions. Grades will be assigned based on homeworks, class participation, and take-home midterm and final exams. We'll focus on talking computationally as well as thinking computationally, so you understand how to play your part in computational projects - especially those where not everyone on the team has a hand in the programming. The central place of skills development in the class means that typical students will not be well served by attending as auditors.

Audience. The course is geared to students with some mathematical sophistication. You should be familiar with abstraction and proof from a course such as linear algebra (as 01:640:250), mathematics of probability (as 01:640:477 or 01:198:206), or formal logic (as 01:640:461 or 01:730:407). No background in computer science or programming will be presupposed; however, computer science students with interdisciplinary interests are very welcome in the class, as they are likely to benefit from our reflections on talk and teamwork in computer science. While the course will not count towards graduate requirements in computer science (other than the graduate school's general requirement that PhD students take 48 credits of coursework), students can take it together with its companion class CS 504 in the Spring to satisfy prerequisites for Rutgers's advanced graduate courses in Artificial Intelligence.

### Syllabus and Outline

There is no textbook for this class. Nothing quite like it has ever been taught before. The sketch below outlines what we will cover and aims to substitute for browsing class materials as a guide to what you will learn and how material will be presented.

As the course proceeds, you will get readings, exercises, references and other resources to complement in-class work, and they will be made available on-line during the course of the semester. Ultimately, as in any graduate class, you are responsible for your own learning. If you need something more, you have to ask.

• Sep 7
Computation.
Babbage, Emacs, Maya, DNA.

Computer science is not the study of computers. It's the general science of systems that evolve through discrete states according to finitely-specified rules. Computer scientists trace the field back through mechanical devices built in the mid-1800s (and earlier). They recognize their ideas today as readily in DNA as in silicon chips.

In fact, it takes work to understand why computers are computational devices. Good insight comes from scripting interfaces - artifacts that expose their states and actions in computational form for people to interact with, so that anything that happens in the system has a public name. Scripting is also an important principle that allows people with differing levels of technical expertise to collaborate. Classic cases of scripting include the Unix shell, the Emacs text editor, and the Maya modeling engine. Scripting languages are your friends, and you should be ready to use them - and help design them!

• Sep 14
Programs.
McCarthy, Variables, Symbols, Scheme.

Programming languages aren't always big and complicated. The language Scheme, for example, is a dialect of Lisp, which goes back to work by McCarthy in the late 1950s, and it has a basic syntax you can master in an hour or two. In many cases, you get the complete power of computation just with the ability to remember computed results, to take action conditionally, and to build larger programs out of smaller ones.

What's hard about writing programs is thinking. Specifying correct computations - like any mathematical problem solving - involves reasoning from data and conditions to discover the unknown. But computation requires a special attention to reasoning by cases because we often must analyze infinite discrete sets into an open-ended array of cases. We must also reason constructively, finding answers rather than merely showing that they exist. To explore computational ideas collaboratively, we all have to understand and be able to talk about what programming has in common with mathematical problem solving and what makes it a special skill on its own.

• Sep 21
Circuits.
Boole, Von Neumann, Dominoes, Counting.

Computation is not an intellectual construct - it is physically realized in the actual world. Circuits, for example, really have discrete states and finite rules for transitioning between them, and our computational analysis of circuits is an abstract characterization of their underlying physics.

The discrete nature of computation often leads to surprising insights into what's physically possible. That's how we earn the name computer scientists. The propositional logic of George Boole shows how to classify all possible circuits by the computation they perform. Such results enable us to harness computation for our own purposes - for example in the processor architecture pioneered by John von Neumann, where a central circuit can execute whatever instruction is specified by a control variable.
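
To make the idea of classifying circuits by the computation they perform concrete, here is a minimal sketch in Python. The gate functions and the one-bit "instruction" are hypothetical examples chosen for illustration, not circuits from the class simulator.

```python
from itertools import product

def truth_table(circuit, num_inputs):
    """Outputs of a circuit on every combination of inputs, in counting order."""
    return tuple(bool(circuit(*inputs))
                 for inputs in product([False, True], repeat=num_inputs))

def equivalent(circuit_a, circuit_b, num_inputs):
    """Two circuits perform the same computation iff their truth tables agree."""
    return truth_table(circuit_a, num_inputs) == truth_table(circuit_b, num_inputs)

def nand_direct(a, b):
    return not (a and b)

def nand_from_or(a, b):
    # A differently wired circuit: OR applied to the negated inputs.
    return (not a) or (not b)

def xor_gate(a, b):
    return a != b

def controlled_gate(control, a, b):
    """A toy 'central circuit': the control input selects the instruction.

    control False computes AND of a and b; control True computes OR."""
    return (a or b) if control else (a and b)
```

The two NAND circuits are wired differently but have the same truth table, so they count as the same computation; there are only 16 possible truth tables for two inputs, so every two-input circuit falls into one of 16 classes.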

• Sep 28
Programs as data.
Turing, Evaluation, Environments.

The ideas we've seen so far have defined computer science since its beginning. What ties them together into a coherent discipline is the insight that one can write a program whose effect is to run programs. Actually, a more exact way of saying it is that one can specify computations whose effect is to execute specified computations. In fact, the idea goes back to Turing in the 1930s who showed that one person (a "computer"), by working with pencil and paper according to one specific ("universal") set of precisely specified rules, could simulate the steps of a person working with pencil and paper according to any set of precisely specified rules whatsoever. The insight is still used constantly in practice, for example in programs that run scripts in scripting languages.

This week we take time out to explore programs that run programs, and to reflect on the ideas and skills that let us talk about, think about and work with such programs. Once we have programs that run programs, we are only a short step away from trading in programs that write programs - as we will see in the rest of the class.

• Oct 5
Representation.
Newell, Marr, Chomsky, XML.

Some kinds of data lend themselves particularly well to computational analysis - for example the mathematical data and computational data we've seen so far. But what makes computation so profoundly transformative is the possibility to represent arbitrary information about the world in a computational system, so that the system becomes, in Allen Newell's term, a physical symbol system. Nowadays, XML offers general tools for using representations in programs, while philosophical analyses of naturalized content, notably the widely-held "causal-historical" view, allow us to interpret those representations precisely as an inherently meaningful outgrowth of a computational system's interactions with its environment.

Representation is a key ingredient of computational thinking and computational explanations. Cognitive scientists have seized on this insight to introduce discrete structures that underpin computational explanations of human cognitive capacities - like Chomsky's parse trees that can represent syntactic judgments about natural language structures or Marr's visual scene descriptions that characterize the visual world. In engineering, this way of connecting systems to the world allows us to analyze computations in terms of the information systems have, not just the operations they carry out.

• Oct 12
Algorithms and insight.
Knuth, Binary search, GCD, Layout, HTML.

Representation is so central to computer science in part because we can write qualitatively better programs when we allow them to exploit implicit information about the subject-matter of their computations. A classic illustration of this is Euclid's algorithm for computing the greatest common divisor of two integers, which embodies a beautiful - and fundamentally geometrical - insight. Binary search is another elegant example, where invariants in the way an ordered set is represented allow us to determine membership in the set with almost shocking efficiency.
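
Both algorithms are short enough to sketch in full. The Python below is a minimal version written for this outline, assuming non-negative integers for the GCD and a list already sorted in increasing order for the search; the function names are ours.

```python
def gcd(a, b):
    """Euclid's algorithm for non-negative integers.

    Any common divisor of a and b also divides a % b, so the pair (b, a % b)
    has the same greatest common divisor as (a, b) but is smaller."""
    while b != 0:
        a, b = b, a % b
    return a

def contains(sorted_items, target):
    """Binary search for membership in a list sorted in increasing order.

    Invariant: if target is present at all, it lies in
    sorted_items[low:high]. Each step halves that range."""
    low, high = 0, len(sorted_items)
    while low < high:
        mid = (low + high) // 2
        if sorted_items[mid] < target:
            low = mid + 1
        elif sorted_items[mid] > target:
            high = mid
        else:
            return True
    return False
```

Because the candidate range halves at every step, a sorted list of a million items needs only about twenty comparisons.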

Many computer scientists consider Donald Knuth's study The Art of Computer Programming to epitomize the connections among representations, algorithms and insights. But all computer scientists associate Donald Knuth with the typesetting language TeX, which we all use for formatting beautiful and precise mathematical communication. TeX is one of the ancestors of the HTML language used to display web pages, and the connections among these ideas afford the opportunity to consider the representations and algorithms used in practice to describe complex multimedia presentations.

• Oct 19
Types.
Russell, Scott, Domains, Checking, Inference.

Much computer science aims at developing tools and practices for specifying computations over rich representations. A key starting point for this work is the idea of a type: a precisely specified subset of possible representations. Type theory - an extension of work by Bertrand Russell and others on the foundations of mathematics in the early 1900s - allows us to assign mathematical meanings to representations and (as Dana Scott pioneered in the 1970s) even to programs. In programming languages, type systems make it possible to check that a program has a specified type or even infer the type of a program mechanically.

Here we focus on types as an abstraction to help guide you in writing programs. If you know the type of output you need, you know a lot about the operations a program must perform to return the correct value. If you know the type of input you have, you know a lot about the conditions the program must check for and the subproblems it has to solve in each case. This makes thinking about types an important part of the skill of programming.

• Oct 26
Complexity.
Cook, Big-O, P and NP.

Representations underpin frameworks for analyzing programs as well as writing them. One important question to ask is how the size of a program output varies as a function of the size of the program input. Superficially quite similar programs behave staggeringly differently in this respect - but you can explain a lot by investigating which program outputs have a size that's asymptotically polynomial in the size of the input.

Since the work of Stephen Cook in the early 1970s, theoretical computer science has had a special place for programs that have to produce a polynomially-sized proof characterizing the solution to a problem. Such programs define the limits of tractability. If we have good heuristics for finding such solutions, we can often do so in practice, but if we need to search exhaustively for solutions, such problems become hopeless. It is widely conjectured - but not established - that good heuristics do not exist for all such problems, so that solving them algorithmically must sometimes take more than a polynomial number of steps. This conjecture is the most famous open problem in the theory of computation.
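
The contrast between checking a short proof and searching for one can be sketched in Python. The problem used here, subset sum, is our choice of a standard example and is not taken from the lecture; the function names are hypothetical.

```python
from itertools import combinations

def verify(numbers, target, certificate):
    """Check a proposed proof quickly.

    The certificate is a list of distinct positions in `numbers`; it is
    valid when the numbers at those positions add up to the target.
    Checking takes time polynomial in the size of the input."""
    return (len(set(certificate)) == len(certificate)
            and all(0 <= i < len(numbers) for i in certificate)
            and sum(numbers[i] for i in certificate) == target)

def search(numbers, target):
    """Exhaustive search: try every subset of positions, smallest first.

    With n numbers there are 2**n subsets, so this becomes hopeless
    as n grows. Returns a valid certificate, or None if there is none."""
    positions = range(len(numbers))
    for size in range(len(numbers) + 1):
        for candidate in combinations(positions, size):
            if verify(numbers, target, list(candidate)):
                return list(candidate)
    return None
```

Calling search([3, 34, 4, 12, 5, 2], 9) finds positions whose numbers sum to 9, and verify confirms the answer at a glance; the open question is whether the search can always be made as cheap as the check.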

• Nov 2
Computability.
Gödel, Kleene, Halting, Python.

Another kind of theoretical analysis comes from looking at representations of programs. This introduces the possibility of self-reference: a program can take a representation of itself as input or produce a representation of itself as output! In fact, it's pretty easy to develop examples of such programs - particularly in the scripting language Python that is widely used to deal with representations on the Web.
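
Here is one minimal sketch of such a program in Python. The names template, quine_source and run_and_capture are ours; the two-line program stored in quine_source is a well-known style of self-reproducing program, and the helper only exists so that we can run it and compare its output with its own text.

```python
import contextlib
import io

# A string with a hole (%r) where its own quoted representation goes.
template = 's = %r\nprint(s %% s)'

# Filling the hole with the template itself gives the text of a
# two-line program that prints exactly its own text.
quine_source = template % template

def run_and_capture(source):
    """Run Python source text and return whatever it prints."""
    buffer = io.StringIO()
    with contextlib.redirect_stdout(buffer):
        exec(source, {})
    return buffer.getvalue()
```

Running the program stored in quine_source prints that same text back (plus the final newline that print adds), so the program has produced a representation of itself as output.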

Self-reference is the basis for showing that not all questions can be answered by computational methods. Such results go back to Kurt Gödel's work in the 1930s and fundamentally changed our understanding of the goals and methods of mathematics and the other sciences of reasoning. Interestingly, the constructs used in languages like Python for matching and transforming expressions can be traced directly back to the work of these theoreticians - notably Stephen Kleene - as they worked to explicate the interrelationships between computational thinking and the foundations of logic and mathematics.

• Nov 9
Distributed computation.
Milner, Continuations, Web Pages.

We are now able to take another step back, and actually write a Web interface to an information source. The now-familiar ingredients of such programs include the use of XML to represent your own data, the use of HTML to specify interactions, and the use of Python to access and transform XML and HTML documents. The new idea is the idea of a computation that unfolds through the interaction of multiple distributed computational devices. On the web, we have to distinguish between a server which prepares an HTML document and a client which executes the interaction specified by the HTML. Typically, we also have a separate data layer ("back end") which handles data storage and access for the server.

As you might expect, computer scientists have developed theoretical abstractions for this kind of distributed computation that can help us to think more clearly about it. Directly applicable to web programming is Robin Milner's idea of named channels that coordinate the exchange of values (including, with the expected geeky self-reference, names for other channels). So is the idea of continuations - representations of the future of a computation.

• Nov 16
Data as programs.
Shannon, Codes, Compression, Corpora, Examples.

Recall that representations allow programs to exploit arbitrary information about the world. You shouldn't be surprised to discover that the information about the world that a computation needs may not always be available or accessible when we first write a program. The remainder of this class will help you to think about and talk about specifying computations that get the information they need from data and act accordingly.

Compression is a good case study to introduce this perspective. If you want to encode a document in a way that makes the document as short as possible, you need to know what the document is first. The best code will fit that document specifically. Compression also lets us explore the historical development of the idea of information, going back to Claude Shannon, and to consider a simple, beautiful and widely-used algorithm, due to David Huffman.
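
A minimal sketch of Huffman's algorithm in Python follows. The function names and the choice to treat each character of a string as a symbol are assumptions for this example; the code is built from the counts in one specific document, which is the point made above.

```python
import heapq
from collections import Counter

def huffman_code(text):
    """Build a prefix code (symbol -> bit string) fitted to this text."""
    counts = Counter(text)
    if not counts:
        return {}
    if len(counts) == 1:
        return {symbol: "0" for symbol in counts}
    # Heap entries are (weight, tiebreaker, tree). A tree is either a single
    # symbol or a pair (left, right). The unique tiebreaker keeps the heap
    # from ever comparing two trees.
    heap = [(weight, index, symbol)
            for index, (symbol, weight) in enumerate(sorted(counts.items()))]
    heapq.heapify(heap)
    next_id = len(heap)
    while len(heap) > 1:
        # Merge the two lightest trees into one.
        weight1, _, tree1 = heapq.heappop(heap)
        weight2, _, tree2 = heapq.heappop(heap)
        heapq.heappush(heap, (weight1 + weight2, next_id, (tree1, tree2)))
        next_id += 1
    code = {}
    def assign(tree, prefix):
        if isinstance(tree, tuple):
            assign(tree[0], prefix + "0")
            assign(tree[1], prefix + "1")
        else:
            code[tree] = prefix
    assign(heap[0][2], "")
    return code

def encode(text, code):
    return "".join(code[symbol] for symbol in text)

def decode(bits, code):
    symbols = {word: symbol for symbol, word in code.items()}
    decoded, current = [], ""
    for bit in bits:
        current += bit
        if current in symbols:   # no code word is a prefix of another
            decoded.append(symbols[current])
            current = ""
    return "".join(decoded)
```

For the text "abracadabra" the resulting code uses 23 bits, where a fixed three-bits-per-letter code for its five letters would use 33; frequent letters get short code words and rare ones get long code words.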

• Nov 21
Perception and inference.
Bayes, Hinton, supervised learning

A more general computational problem involving data is to decide which of two categories X and Y describes a particular example on the basis of a set M of measurements of the example. The right algorithm may take a rather simple form, but it obviously depends on how likely the two categories X and Y are to begin with, as well as the evidence that the measurements can offer. These are empirical questions - a programmer may not know (or care) what the exact answers are - so it's natural to draw directly on empirical data - labeled training examples - in specifying the correct computation.
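
One simple form such an algorithm can take is sketched below in Python. It makes assumptions that go beyond the paragraph above: each measurement is a single 0 or 1, the measurements are treated as independent given the category, and counts are smoothed by adding one so that nothing gets probability zero. The function names are ours.

```python
from collections import Counter, defaultdict

def train(examples):
    """Collect counts from labeled examples.

    Each example is (measurements, category), where measurements is a
    tuple of 0/1 values."""
    category_counts = Counter(category for _, category in examples)
    ones = defaultdict(int)   # (category, position) -> times measurement was 1
    for measurements, category in examples:
        for position, value in enumerate(measurements):
            ones[(category, position)] += value
    return category_counts, ones

def score(model, measurements, category):
    """Prior probability of the category times the evidence from each measurement."""
    category_counts, ones = model
    count = category_counts[category]
    result = count / sum(category_counts.values())      # how likely to begin with
    for position, value in enumerate(measurements):
        p_one = (ones[(category, position)] + 1) / (count + 2)   # add-one smoothing
        result *= p_one if value == 1 else 1 - p_one
    return result

def classify(model, measurements):
    """Pick the category with the highest score."""
    categories = sorted(model[0])
    return max(categories, key=lambda c: score(model, measurements, c))
```

Given three examples labeled X and two labeled Y, the classifier favors X when there are no measurements at all, and the measurements can then confirm or overturn that starting preference.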

This way of thinking about classification goes back to Thomas Bayes in the 1700s, and people have continued to find inspiration by conceptualizing the commonsense cognitive problems of everyday life in these terms. Indeed, network implementations of these algorithms have biological plausibility, as pioneered among others by Geoff Hinton. Nevertheless, it's important not to let these suggestive analogies go to your head - you must avoid thinking of machine learning as a substitute for programmers' substantive insight. Learning automates the drudgery in devising algorithms to solve specified problems, much as compilers automate the drudgery in specifying algorithms down to the level at which hardware can actually execute them. In neither case can you escape fundamental computer science or the need for clear computational thinking.

• Nov 30
Models, learning and complexity.
Minsky, Pearl, perceptrons, independence, bias and variance

Once we think of learning as the construction of a representation to match empirical data, we are free to draw on all the analyses of representations we have developed so far. For example, the simplest network representation of a classification algorithm - the perceptron - implicitly draws on real-world generalizations because it implements a particular independence assumption about how measurements provide evidence for categories. This means it can consider only particular hypotheses about the world - it is biased - but also means that it generalizes aggressively from each example it sees, rapidly reducing its uncertainty about what it should do - so its predictions have low variance. More expressive models have less bias and more variance - it is a fundamental tradeoff in learning.

The limitations of the perceptron were explored by Minsky and Papert in the late 1960s and came as a surprise to the field. Statistical learning methods languished until we had ways to learn with richer representations - Pearl's work on general network models of causal relationships was a watershed. At the cost of some transparency of implementation, Pearl showed how to reason about arbitrary probabilistic relationships among discrete variables, allowing model-builders to characterize specific problems with the right amount of expressive power.
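
The perceptron and its best-known limitation fit in a few lines. The Python below is a minimal sketch with hypothetical names, using the classic error-driven update rule with a step size of 1; the two data sets (logical "and" and exclusive "or") are standard illustrations chosen for this example.

```python
def predict(weights, bias, inputs):
    """Output 1 when the weighted evidence exceeds the threshold, else 0."""
    total = bias + sum(w * x for w, x in zip(weights, inputs))
    return 1 if total > 0 else 0

def train_perceptron(examples, epochs=20):
    """Adjust weights after every mistake; stop early once an epoch is error-free."""
    weights = [0] * len(examples[0][0])
    bias = 0
    for _ in range(epochs):
        mistakes = 0
        for inputs, target in examples:
            error = target - predict(weights, bias, inputs)
            if error != 0:
                mistakes += 1
                weights = [w + error * x for w, x in zip(weights, inputs)]
                bias += error
        if mistakes == 0:
            break
    return weights, bias

def accuracy(weights, bias, examples):
    correct = sum(predict(weights, bias, inputs) == target
                  for inputs, target in examples)
    return correct / len(examples)

AND_DATA = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]
XOR_DATA = [((0, 0), 0), ((0, 1), 1), ((1, 0), 1), ((1, 1), 0)]
```

Training on AND_DATA reaches perfect accuracy within a few epochs. Training on XOR_DATA never does, however long it runs, because no single weighted threshold separates the two categories: this is the bias of the representation at work.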

• Dec 7
Agency.
Von Neumann and McCarthy (again), actions and outcomes, utility and reinforcement.

Most of the time, we're only indirectly interested in specifying computations that recognize situations in the world - ultimately, we just want our programs to achieve good real-world outcomes. Still, as before, to achieve those outcomes may require real-world information that is tedious and routine to acquire. Machine learning remains as important as ever.

To apply learning here, we need to have empirically-grounded representations of the actions available to our system and the outcomes they achieve. The general perspective of decision theory (to which von Neumann of architecture fame also greatly contributed) shows that we can represent outcomes through the abstraction of utility, while foundational research in AI (going back to McCarthy of LISP fame) invites us to represent real-world action with formalisms analogous to those we've already used to represent computational action. Together, these insights allow us to specify computations in reinforcement learning - the construction of programs with optimal outcomes based on automatically-acquired empirical models of what happens in the real world when such programs are run. Characterizing reinforcement learning - and its challenges - offers a chance to synthesize all of the ideas we've seen so far on programs and processes, data and representation, complexity, abstraction and self-reference.
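
As a very small sketch of these ideas in Python, consider an agent in a corridor that is rewarded for reaching the far end. Everything here is a hypothetical toy: the corridor, the reward of 1, the discount of 0.9, and the choice to try every action in every state in turn rather than explore at random. Because this toy world is deterministic, each estimate can simply be overwritten with the newly observed value.

```python
GOAL = 4            # cells are numbered 0..GOAL; reaching GOAL ends the episode
ACTIONS = (-1, +1)  # step left or step right
DISCOUNT = 0.9      # future utility counts a little less than immediate utility

def step(state, action):
    """The world: the learner can call this but cannot look inside it."""
    next_state = min(max(state + action, 0), GOAL)
    reward = 1.0 if next_state == GOAL else 0.0
    return next_state, reward

def learn(sweeps=20):
    """Estimate the utility q[(state, action)] of each action from experience."""
    q = {(s, a): 0.0 for s in range(GOAL) for a in ACTIONS}
    for _ in range(sweeps):
        for s in range(GOAL):
            for a in ACTIONS:
                next_state, reward = step(s, a)
                if next_state == GOAL:
                    future = 0.0
                else:
                    future = max(q[(next_state, b)] for b in ACTIONS)
                # Utility now = immediate reward + discounted best utility later.
                q[(s, a)] = reward + DISCOUNT * future
    return q

def policy(q):
    """Act by choosing the action with the highest estimated utility."""
    return {s: max(ACTIONS, key=lambda a: q[(s, a)]) for s in range(GOAL)}
```

After learning, the policy steps right in every cell, and the estimated utility of doing so shrinks by the discount factor with each extra step of distance from the goal.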