Linguistics 684.01: Introduction to CL I (Winter 06)
This course for graduates and advanced undergraduates provides
an introduction to theory-driven computational linguistics (sometimes
referred to as "symbolic CL"), focusing on syntax/parsing. The
course emphasizes linking the formal and theoretical issues to
practical experience implementing algorithms and small grammars, based
on Prolog.
The course is the first part of the two-course introduction to CL. The second
half, 684.02, focuses on data-intensive, statistical CL and is offered
by Chris Brew in Spring.
Instructor: Detmar Meurers
- Email: dm@ling.osu.edu
- Office hours: Mondays 2:00-3:00 (or just make an
appointment by email)
- Office: 201A Oxley Hall (enter through 201 computer lab; if locked,
knock loudly)
Course meets: Tuesday and Thursday 3:30-5:18pm in 340
Central Classrooms
Course website:
http://purl.org/net/dm/06/winter/684.01/
The updated syllabus, assignments, slides, etc. will be posted there,
so check it regularly.
Course email: 684.01@ling.osu.edu
Mail sent to this address is forwarded to the official email addresses
(Name.Number@osu.edu) of all students enrolled in the class
and the instructor. Note that you should read email sent to your
official OSU account on a daily basis; it'll also help you avoid
library fines!
Anonymous feedback: If you have comments, complaints, or
ideas you'd like to send me anonymously, you can use the web form at
http://purl.org/net/dm/feedback/ to do so. Please send me
ordinary email for anything that you'd like to receive a reply to;
there really is no way for me to find out who sent me something
via the anonymous feedback form!
Students with Disabilities:
Students who need an accommodation based on the impact of a disability
should contact me to arrange an appointment as soon as possible to
discuss the course format, to anticipate needs, and to explore
potential accommodations. I rely on the Office for Disability Services
for assistance in verifying the need for accommodations and developing
accommodation strategies. Students who have not previously contacted
the Office for Disability Services are encouraged to do so
(292-3307; http://www.ods.ohio-state.edu).
Academic Misconduct: To state the obvious, academic
dishonesty is not allowed. Cheating on assignments will be reported to
the University Committee on Academic Misconduct. The most common form
of misconduct is plagiarism. Remember that any time you use the ideas
or the materials of another person, you must acknowledge that you have
done so in a citation. This includes material that you have found on
the Web. The University provides guidelines for research at
http://gateway.lib.ohio-state.edu/tutor/.
Course prerequisites: An understanding of the basics of
linguistic analysis, syntax (LING 602.01 or equiv.), and formal
foundations (LING 680 or equiv.).
Successful course participation involves:
- Regular attendance and active participation (20% of grade)
- Taking reading assignments seriously and completing five (or
six) homework assignments: some "paper and pencil", some
programming in Prolog; usually handed out Thursday and due in
Tuesday's class (60% of grade).
- Final project implementing a grammar fragment for a short (10
sentences) text of your choice, to be handed in Friday, March 10
(20% of grade). You should start on this at the end of February!
Topics:
- In the course, you will see three recurring aspects, with
increasing complexity:
  - data structures used to encode/model linguistic objects
  - formalisms for expressing grammars using these data structures
  - parsing algorithms for processing with those grammars
- The specific topics we will cover are the following:
  - Finite state machines and regular languages
    (handout, exercise sheet 1)
  - Implementing finite state machines in Prolog
    (handout, exercise sheet 2, exercise sheet 2a)
  - Towards more complex grammar formalisms: Basic formal language
    theory (handout)
  - From context free grammars to definite clause grammars
    (handout)
  - What to encode in a grammar: A DCG for English
    (handout, exercise sheet 3)
  - How to process with a grammar: Intro to Parsing
    (handout, animated slides)
  - More efficient parsing strategies
    (handout, animated slides, exercise sheet 4)
  - Remembering sub-results: Well-formed substring tables
    (handout, animated slides, exercise sheet 5)
  - Remembering sub-computations: The active chart
    (handout, exercise sheet 6, project)
  - (Chart) parsing with complex categories
    (handout)
After the lectures, the 4-up copies of the slides and the homework
sheets are posted on the web page in PDF format.
Reading material
- There is a course reader, available from the course web page or directly
at
The reader is intended as a basic guideline for the material covered
in this course. It is a revised version of the module workbook for
"Techniques in Natural Language Processing 1" by Chris Mellish, Pete
Whitelock and Graeme Ritchie, 1994, Department of Artificial
Intelligence, University of Edinburgh. I would like to thank them for
permitting me to adapt their material for this course.
- General background reading material:
  - Gerald Gazdar and Chris Mellish (1989): Natural
    Language Processing in Prolog. Wokingham, England et al.:
    Addison-Wesley.
  - Fernando Pereira and Stuart Shieber (1987): Prolog
    and Natural-Language Analysis. Stanford: CSLI Publications.
  - Daniel Jurafsky and James H. Martin (2000): Speech
    and Language Processing. Upper Saddle River, NJ: Prentice
    Hall.
Reading assignment No. 1: Chapter 1 of Jurafsky & Martin (2000)
On-line materials
Most materials for our course, as well as links to several UNIX
introductions, Prolog manuals and tutorials, are available from the
course web page or directly at
To log into the restricted area, you'll need the ID and password
mentioned in class.