Linguistics 684.01: Introduction to CL I (Winter 05)
This course for graduate students and advanced undergraduates provides
an introduction to theory-driven computational linguistics (sometimes
referred to as "symbolic CL"), focusing on syntax and parsing. The
course includes some formal background and emphasizes linking the
theoretical discussions to practical experience implementing
algorithms and small grammars in Prolog.
The course is part of the two-course introduction to CL. The second
half, 684.02, focuses on data-intensive, statistical CL and is offered
by Chris Brew in Spring.
Instructor: Detmar Meurers
- Office: 201a Oxley Hall (enter through 201 computer lab; if locked, knock loudly)
- Phone: 292-0461 (usually email works better)
- Email: dm@ling.osu.edu
- Office hours: Tuesday 11:00-12:00, and by appointment
Course meets: Tuesday and Thursday 3:30-5:18pm in 340
Central Classrooms
Course website:
http://purl.org/net/dm/05/winter/684.01/
The updated syllabus, assignments, slides, etc. will be posted there,
so check it regularly.
Course email: 684.01@ling.osu.edu
Mail sent to this address is forwarded to the official email addresses
(Name.Number@osu.edu) of all students enrolled in the class
and the instructor. Note that you should read email sent to your
official OSU account on a daily basis; it also helps you avoid
high library fines!
Anonymous feedback: If you have comments, complaints, or
ideas you'd like to send me anonymously, you can use the web form at
http://purl.org/net/dm/feedback/ to do so. Please send me
ordinary email for anything that you'd like to receive a reply
to, since there really is no way for me to find out who sent me
something via the anonymous feedback form!
Students with Disabilities:
Students who need an accommodation based on the impact of a disability
should contact me to arrange an appointment as soon as possible to
discuss the course format, to anticipate needs, and to explore
potential accommodations. I rely on the Office for Disability Services
for assistance in verifying the need for accommodations and developing
accommodation strategies. Students who have not previously contacted
the Office for Disability Services are encouraged to do so
(292-3307; http://www.ods.ohio-state.edu).
Academic Misconduct: To state the obvious, academic
dishonesty is not allowed. Cheating on assignments will be reported to
the University Committee on Academic Misconduct. The most common form
of misconduct is plagiarism. Remember that any time you use the ideas
or the materials of another person, you must acknowledge that you have
done so in a citation. This includes material that you have found on
the Web. The University provides guidelines for research at
http://gateway.lib.ohio-state.edu/tutor/.
Course prerequisites: An understanding of the basics of
linguistic analysis, syntax (LING 602.01 or equiv.), and formal
foundations (LING 680 or equiv.).
Successful course participation involves:
- Regular attendance and active participation (20% of grade)
- Taking reading assignments seriously and completing five or six
homework assignments (10% each), some paper and pencil, some
programming in Prolog (handed out Thursday, to be completed by
Tuesday's class).
- Final project implementing a grammar fragment for a short (10
sentences) text of your choice, to be handed in Friday, March 11.
(20%)
Topics:
- In the course, you will see three recurring aspects, with
increasing complexity:
  - data structures used for linguistic signs
  - formalisms for expressing grammars using these data structures
  - parsing algorithms for processing with those grammars
- The specific topics we will cover are the following:
  - Finite state machines and regular languages
    (handout, exercise sheet 1)
  - Implementing finite state machines in Prolog
    (handout, exercise sheet 2)
  - Towards more complex grammar formalisms: Basic formal language
    theory (handout)
  - From context free grammars to definite clause grammars
    (handout)
  - What to encode in a grammar: A DCG for English
    (handout, exercise sheet 3)
  - How to process with a grammar: Intro to Parsing
    (handout)
  - More efficient parsing strategies
    (handout, exercise sheet 4)
  - Remembering sub-results: Well-formed substring tables
    (handout, animated slides)
  - Remembering subcomputations: The active chart
    (handout, exercise sheet 5)
  - More complex data structures: From atomic symbols to first order
    terms to feature structures
  - Term and feature structure unification
  - PATR-II Parsing with complex categories
  - Chart-Parsing with complex categories
  - Implementing a grammar in a typed feature structure based
    parsing system
After each lecture, 4-up copies of the slides and the homework
sheets are posted on the course web page in PDF format.
Reading material
- There is a course reader, available from the course web page.
The reader is intended as a basic guideline for the material covered
in this course. It is a revised version of the module workbook for
"Techniques in Natural Language Processing 1" by Chris Mellish, Pete
Whitelock and Graeme Ritchie, 1994, Department of Artificial
Intelligence, University of Edinburgh. I would like to thank them for
permitting me to adapt their material for this course.
- General background reading material:
  - Gerald Gazdar and Chris Mellish (1989): Natural
    Language Processing in Prolog. Wokingham, England et al.:
    Addison-Wesley.
  - Fernando Pereira and Stuart Shieber (1987): Prolog
    and Natural-Language Analysis. Stanford: CSLI Publications.
  - Daniel Jurafsky and James H. Martin (2000): Speech
    and Language Processing. Upper Saddle River, NJ: Prentice
    Hall.
Reading assignment No. 1: Chapter 1 of Jurafsky & Martin (2000)
On-line materials
The code used in the course, as well as links to several UNIX
introductions, Prolog manuals, and tutorials, are available from the
course web page.