LING 520: Computational Analysis of English

Course Information

Instructor: Nick Pendar
Office: 355 Ross Hall
Office Hours: MWF 2:00-3:00 or by appointment
Phone: 515-294-3368
Email: pendar (at iastate)
Course Website: http://pendar.public.iastate.edu/ling520

Required Texts:

  • Jurafsky, D. and J. H. Martin (2000). Speech and Language Processing: An Introduction to Natural Language Processing, Computational Linguistics and Speech Recognition. Upper Saddle River, N.J.: Prentice Hall.

  • Downey, A.B. (2007). How to Think Like a (Python) Programmer. Wellesley, MA: Green Tea Press.

    Available online at http://www.greenteapress.com/thinkpython

  • Bird, S., J. Curran, E. Klein and E. Loper (2006). Introduction to Natural Language Processing.

    Available online at: http://nltk.sourceforge.net/index.php/Book

  • Other reading assigned as needed


This course is an introduction to computational linguistics with emphasis on corpus processing, symbolic natural language parsing, and grammar engineering. Topics include corpora, text/corpus processing, syntactic parsing, Lexical Functional Grammar, grammar engineering, and Python programming.

Evaluation

Evaluation in this course is based on a series of assignments, a final project, and a final take-home examination. The assignments and project together count for 70% of your grade, and the take-home exam counts for 30%.


Task                            Weight          Due Date
Seven assignments               35% (5% each)   (periodic)
Course project & presentation   35%             April 23
Take-home examination           30%             TBA

Recommended Reading

Cormen et al. (1990) is an excellent introduction to computer science and its mathematical foundations. More advanced students will also find Manning and Schütze (1999) a valuable resource on statistical natural language processing. Heift and Schulze (2007) provide a thorough overview of the use of parsers in intelligent computer-assisted language learning. Shermis and Burstein (2003) is a collection of papers on automated essay scoring and its related pedagogical considerations. Sampson and McCarthy (2004) is a collection of seminal papers in corpus and computational linguistics.

Syllabus

Week  Date     Topic                                                       Reading
1     Jan. 16  Syllabus, Introduction, Getting started with Unix           JM, Ch. 1
2     Jan. 23  Python programming (Assignment 1)                           Downey, Ch. 1, 2, 3, 5
3     Jan. 30  Python programming (Nick away); tokenizer starter code      Downey, Ch. 6, 7, 8, 9 (Assignment 1 due)
4     Feb. 6   Python programming (Assignment 2)                           Downey, Ch. 10, 11, 12
5     Feb. 13  Python programming                                          Downey, Ch. 13, 14, 15 (Assignment 2 due Tuesday)
6     Feb. 20  Python programming                                          Downey, Ch. 16, 17, 18
7     Feb. 27  Regular expressions and automata                            JM, Ch. 2, 3; Appendix A, B
8     Mar. 5   Corpora (slides, handout) (Assignment 3, updated 03/08/08)  JM, Sec. 6.1-6.2; BLK, Ch. 13
9     Mar. 12  Context-free grammars & parsing (Assignment 4)              JM, Sec. 8.1, 8.2; Ch. 9, 10 (Assignment 3 due Tuesday)
10    Mar. 19  SPRING BREAK
11    Mar. 26  Lexicalized & probabilistic parsing                         JM, Ch. 12 (Assignment 4 due 3/31)
12    Apr. 2
13    Apr. 9   Feature-based grammars; LFG                                 JM, Ch. 11; BLK, Ch. 11; Falk, TBA
14    Apr. 16  LFG Grammar Engineering                                     XLE Intro, XLE Documentation, read section 1.1 Walkthrough
15    Apr. 23  Student presentations
16    Apr. 30  Student presentations


References

Cormen, T. H., C. E. Leiserson, and R. L. Rivest (1990). Introduction to Algorithms. Cambridge, MA: The MIT Press.

Downey, A.B. (2007). How to Think Like a (Python) Programmer. Wellesley, MA: Green Tea Press.

Heift, T. and M. Schulze (2007). Errors and Intelligence in Computer-Assisted Language Learning: Parsers and Pedagogues. New York: Routledge.

Jurafsky, D. and J. H. Martin (2000). Speech and Language Processing: An Introduction to Natural Language Processing, Computational Linguistics and Speech Recognition. Upper Saddle River, N.J.: Prentice Hall.

Manning, C. D. and H. Schütze (1999). Foundations of Statistical Natural Language Processing. Cambridge, MA: The MIT Press.

Sampson, G. and D. McCarthy (Eds.) (2004). Corpus Linguistics: Readings in a Widening Discipline. New York: Continuum.

Shermis, M.D. and J.C. Burstein (Eds.) (2003). Automated Essay Scoring: A Cross-disciplinary Perspective. Mahwah, N.J.: Lawrence Erlbaum Associates.