Natural Language Processing
This module gives you a grounding in both rule-based and statistical approaches to Natural Language Processing (NLP), combining theoretical study with hands-on work in widely used software packages.
Machine processing of natural language is a key target for the application of Data Science techniques. NLP is a large and growing research field in which a range of specialised techniques continues to be developed. This module focuses on text processing and does not deal with speech or multi-modal communication.
- History of NLP and its applications
- Language processing and Python
- Curated corpora and raw data sources
- Corpus readers, stemmers and taggers
- Classification tasks: e.g. gender identification, sentiment analysis, joint/sequence classification
- Classification methods: decision trees, Naïve Bayes, MaxEnt
- Information extraction: chunking and NER (Named Entity Recognition)
- Formal grammars and parsing
- Grammars and parsing: probabilistic parsing, feature-based grammars
- Ethical and social issues around NLP
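To give a flavour of the classification topics listed above, the following is a minimal sketch of a Naïve Bayes sentiment classifier with add-one (Laplace) smoothing. It is an illustration only and is not taken from the module materials: the function names (`tokenize`, `train`, `classify`) and the toy data set `TOY_REVIEWS` are hypothetical, and it uses only the Python standard library rather than any particular NLP package.

```python
import math
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase the text and split it on whitespace (a deliberately crude tokenizer)."""
    return text.lower().split()


def train(examples):
    """Count labels and per-label word frequencies from (text, label) pairs."""
    label_counts = Counter()
    word_counts = defaultdict(Counter)
    vocab = set()
    for text, label in examples:
        label_counts[label] += 1
        for word in tokenize(text):
            word_counts[label][word] += 1
            vocab.add(word)
    return label_counts, word_counts, vocab


def classify(model, text):
    """Return the label with the highest log-probability under Naive Bayes."""
    label_counts, word_counts, vocab = model
    total_docs = sum(label_counts.values())
    best_label, best_score = None, -math.inf
    for label, count in label_counts.items():
        # Log prior: fraction of training documents carrying this label.
        score = math.log(count / total_docs)
        # Add-one smoothing: every vocabulary word gets one extra count.
        denominator = sum(word_counts[label].values()) + len(vocab)
        for word in tokenize(text):
            if word in vocab:  # words never seen in training are ignored
                score += math.log((word_counts[label][word] + 1) / denominator)
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Hypothetical toy data, invented purely for illustration.
TOY_REVIEWS = [
    ("a great and enjoyable film", "pos"),
    ("great acting and a wonderful story", "pos"),
    ("a dull and boring film", "neg"),
    ("boring plot and terrible acting", "neg"),
]

model = train(TOY_REVIEWS)
print(classify(model, "a wonderful film"))
print(classify(model, "terrible and dull"))
```

A real exercise would use a curated corpus and a proper tokenizer. The sketch only shows how word counts, a prior and smoothing combine into a classification decision.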
15 credits (150 hours)
- Coursework (30%)
- Written examination (70%)