Draft for ACM SIGPLAN Patterns (Language Trends)

1996

Why GAWK for AI?

Ronald P. Loui

Most people are surprised when I tell them what language we use in our
undergraduate AI programming class. That's understandable. We use
GAWK. GAWK, GNU's version of Aho, Weinberger, and Kernighan's old
pattern-scanning language, isn't even viewed as a programming language
by most people. As with PERL and TCL, most prefer to view it as a
"scripting language." It has no objects; it is not functional; it does
no built-in logic programming. Their surprise turns to puzzlement when
I confide that (a) the students are allowed to use any language they
want, and (b) with a single exception, the best work consistently comes
from those working in GAWK. (footnote: The exception was a PASCAL
programmer who is now an NSF graduate fellow getting a Ph.D. in
mathematics at Harvard.) Programmers in C, C++, and LISP haven't even
been close (we have not seen work in PROLOG or JAVA).

Why GAWK?

There are some quick answers that have to do with the pragmatics of
undergraduate programming. Then there are more instructive answers that
might be valuable to those who debate programming paradigms or to those
who study the history of AI languages. And there are some deep
philosophical answers that expose the nature of reasoning and symbolic
AI. I think the answers, especially the last ones, can be even more
surprising than the observed effectiveness of GAWK for AI.

First, it must be confessed that PERL programmers can cobble together AI
projects well, too. Most of GAWK's attractiveness is reproduced in
PERL, and the success of PERL foreshadows some of the success of GAWK.
Both are powerful string-processing languages that allow the programmer
to exploit many of the features of a UNIX environment. Both provide
powerful constructions for manipulating a wide variety of data in
reasonably efficient ways. Both are interpreted, which can reduce
development time. Both have short learning curves. The GAWK manual can
be consumed in a single lab session, and the average student can master
the language by the next morning. GAWK's automatic initialization,
implicit coercion, I/O support, and lack of pointers forgive many of
the mistakes that young programmers are likely to make. Those who have
seen C but not mastered it are happy to see that GAWK retains some of
the same sensibilities while adding what must be regarded as spoonsful
of syntactic sugar. Some will argue that PERL has superior
functionality, but for quick AI applications, the additional
functionality is rarely missed. In fact, PERL's terse syntax is not
friendly when regular expressions begin to proliferate and strings
contain fragments of HTML, WWW addresses, or shell commands. PERL
provides new ways of doing things, but not necessarily ways of doing
new things.

In the end, despite minor differences, both PERL and GAWK minimize
programmer time. Neither really provides the programmer the setting in
which to worry about minimizing run-time.

There are further simple answers. Probably the best is the fact that
undergraduate AI programming increasingly involves the Web. Oren
Etzioni (University of Washington, Seattle) has been arguing for a
while that the "softbot" is replacing the mechanical engineers' robot
as the most glamorous AI testbed. If the artifact whose behavior needs
to be controlled in an intelligent way is the software agent, then a
language that is well-suited to controlling the software environment is
the appropriate language. That would imply a scripting language. If the
robot is KAREL, then the right language is "turn left; turn right." If
the robot is Netscape, then the right language is something that can
generate "netscape -remote 'openURL(http://cs.wustl.edu/~loui)'" with
elan.

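The command generation mentioned above can be sketched in a few lines.
This is an illustration only, not code from the course: the function
name open_url_cmds and the crude URL filter are assumptions, and the AWK
program is wrapped in a POSIX shell function so that it can be run as
is. It prints the commands; it does not launch Netscape.

```shell
#!/bin/sh
# Sketch: turn URLs (one per input line) into the "netscape -remote"
# commands quoted in the text. open_url_cmds is a hypothetical name.
open_url_cmds() {
    awk '
    # Accept lines that look like http URLs and contain no single quote,
    # so the single-quoted shell argument built below stays well-formed.
    /^http:\/\/[^ ]+$/ && index($0, "\047") == 0 {
        # \047 is the octal escape for a single quote.
        printf "netscape -remote \047openURL(%s)\047\n", $0
    }'
}

# Example: the second line is not a URL and is silently dropped.
printf "%s\n" "http://cs.wustl.edu/~loui" "not a url" | open_url_cmds
```

Piping the output to sh would execute the commands, which is the sense
in which a string-processing language can drive the "robot."
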
Of course, there are deeper answers. Jon Bentley found two pearls in
GAWK: its regular expressions and its associative arrays. GAWK asks
the programmer to use the file system for data organization and the
operating system for debugging tools and subroutine libraries. There is
no issue of user interface. This forces the programmer to return to the
question of what the program does, not how it looks. There is no time
spent programming a binsort when the data can be shipped to /bin/sort
in no time. (footnote: I am reminded of my IBM colleague Ben Grosof's
advice for Palo Alto: Don't worry about whether it's highway 101 or 280.
Don't worry if you have to head south for an entrance to go north. Just
get on the highway as quickly as possible.)

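The two pearls, and the habit of shipping data to the system sort, fit
in one small sketch. The word-counting task, the function name
word_freq, and the particular sort keys are assumptions chosen for
illustration; the AWK inside is kept portable and is wrapped in a POSIX
shell function so it can be run directly.

```shell
#!/bin/sh
# Sketch: count word frequencies with an associative array, then let the
# system sort do the ordering. word_freq is a hypothetical name.
word_freq() {
    awk '
    {
        line = tolower($0)
        gsub(/[^a-z]+/, " ", line)     # regular expression: keep letters only
        n = split(line, words, " ")    # " " splits on runs of blanks
        for (i = 1; i <= n; i++)
            count[words[i]]++          # associative array keyed by the word
    }
    END {
        # Count descending, then word ascending, so ties print in a fixed order.
        sorter = "sort -k1,1nr -k2,2"
        for (w in count)
            print count[w], w | sorter
        close(sorter)
    }'
}

printf "%s\n" "the cat saw the dog" "The dog ran" | word_freq
```

No sorting code is written at all: the END block prints in whatever
order the array yields and leaves the ordering to sort.
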
There are some similarities between GAWK and LISP that are illuminating.
Both provide a powerful uniform data structure (the associative array
implemented as a hash table for GAWK and the S-expression, or list of
lists, for LISP). Both were well-supported in their environments (GAWK
being a child of UNIX, and LISP being the heart of LISP machines). Both
have trivial syntax and find their power in the programmer's willingness
to use the simple blocks to build a complex approach.

Deeper still is the nature of AI programming. AI is about
functionality and exploratory programming. It is about bottom-up design
and the building of ambitions as greater behaviors can be demonstrated.
Woe be to the top-down AI programmer who finds that the bottom-level
refinements, "this subroutine parses the sentence," cannot actually be
implemented. Woe be to the programmer who perfects the data structures
for that heapsort when the whole approach to the high-level problem
needs to be rethought, and the code is sent to the junkheap the next day.

AI programming requires high-level thinking. There have always been a few
gifted programmers who can write high-level programs in assembly language.
Most, however, need the ambient abstraction to have a higher floor.

Now for the surprising philosophical answers. First, AI has discovered
that brute-force combinatorics, as an approach to generating intelligent
behavior, does not often provide the solution. Chess, neural nets, and
genetic programming show the limits of brute computation. The
alternative is clever program organization. (footnote: One might add
that the former are the AI approaches that work, but that is easily
dismissed: those are the AI approaches that work in general, precisely
because cleverness is problem-specific.) So AI programmers always want
to maximize the content of their program, not optimize the efficiency
of an approach. They want minds, not insects. Instead of enumerating
large search spaces, they define ways of reducing search, ways of
bringing different knowledge to the task. A language that maximizes
what the programmer can attempt, rather than one that provides
tremendous control over how to attempt it, will be the AI choice in the
end.

Second, inference is merely the expansion of notation. No matter whether
the logic that underlies an AI program is fuzzy, probabilistic, deontic,
defeasible, or deductive, the logic merely defines how strings can be
transformed into other strings. A language that provides the best
support for string processing in the end provides the best support for
logic, for the exploration of various logics, and for most forms of
symbolic processing that AI might choose to call "reasoning" instead of
"logic." The implication is that PROLOG, which saves the AI programmer
from having to write a unifier, saves perhaps two dozen lines of GAWK
code at the expense of strongly biasing the logic and representational
expressiveness of any approach.

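To make "inference as string transformation" concrete, here is a
minimal sketch of forward chaining over propositional rules. It is an
illustration under stated assumptions, not the unifier mentioned above
and not code from the course: there are no variables (so no unification
is needed), the input format "premise ... -> conclusion" is invented for
the example, and the function name forward_chain is hypothetical. The
AWK is kept portable and wrapped in a POSIX shell function.

```shell
#!/bin/sh
# Sketch: propositional forward chaining. Each input line is either a
# rule "p1 p2 ... -> q" or a list of known facts. Hypothetical format.
forward_chain() {
    awk '
    # A rule has at least one premise, then "->", then one conclusion.
    NF >= 3 && $(NF - 1) == "->" {
        nrules++
        head[nrules] = $NF
        body[nrules] = ""
        for (i = 1; i <= NF - 2; i++)
            body[nrules] = body[nrules] " " $i
        next
    }
    # Any other line lists facts that are known from the start.
    { for (i = 1; i <= NF; i++) known[$i] = 1 }
    END {
        # Repeat until a full pass over the rules derives nothing new.
        changed = 1
        while (changed) {
            changed = 0
            for (r = 1; r <= nrules; r++) {
                if (head[r] in known) continue
                n = split(body[r], prem, " ")
                ok = 1
                for (i = 1; i <= n; i++)
                    if (!(prem[i] in known)) ok = 0
                if (ok) {
                    known[head[r]] = 1
                    print "derived: " head[r]
                    changed = 1
                }
            }
        }
    }'
}

printf "%s\n" "wet cold -> ice" "rain -> wet" "rain cold" | forward_chain
```

Swapping in a different logic means changing the test inside the loop
(for example, attaching weights to facts), which is the flexibility the
paragraph above argues for.
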
I view these last two points as news not only to the programming language
community, but also to much of the AI community that has not reflected on
the past decade's lessons.

In the puny language, GAWK, which Aho, Weinberger, and Kernighan thought
not much more important than grep or sed, I find lessons in AI's trends,
AI's history, and the foundations of AI. What I have found not only
surprising but also hopeful is that when I have approached the AI
people who still enjoy programming, some of them are not the least bit
surprised.

R. Loui (loui@ai.wustl.edu) is Associate Professor of Computer Science
at Washington University in St. Louis. He has published in AI Journal,
Computational Intelligence, ACM SIGART, AI Magazine, AI and Law, the ACM
Computing Surveys Symposium on AI, Cognitive Science, Minds and
Machines, and Journal of Philosophy, and is on this year's program
committees for AAAI (National AI conference) and KR (Knowledge
Representation and Reasoning).