The last five years have seen remarkable progress in the ability of computers to process and generate language. This course is about learning and applying Deep Learning approaches in Natural Language Processing (NLP), from word vectors to Transformers, with a focus on the Large Language Models, such as GPT-2, 3, 4 and 4o, Llama 2 and 3, and many others, that have exploded onto the scene. It teaches the fundamentals of neural-network-based NLP and gives students the opportunity to pursue a unique project.
The lecture material begins with the basics of word-level embeddings – their properties and how they are trained. These embeddings form the basis of the neural-network classifiers used for sentiment classification, named entity recognition and many other language tasks. A significant part of the course concerns the Transformer architecture – its structure, its training and the intuition behind both. This includes using the Transformer not only as a classifier but also in generative mode, in which language is produced in response to input language, and extends to the agentic approach to using the latest LLMs, in which multiple instances of an LLM divide a problem into doable pieces when a single model cannot do the whole task.
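To give a concrete flavour of the embedding "properties" mentioned above, here is a minimal sketch (illustrative only, not course code) that assumes the gensim library and its downloadable pre-trained GloVe vectors:

    # A minimal sketch of word-embedding properties using gensim's
    # pre-trained GloVe vectors (an assumption; the course does not
    # mandate this library).
    import gensim.downloader as api

    # Downloads ~66 MB of 50-dimensional GloVe vectors on first use.
    wv = api.load("glove-wiki-gigaword-50")

    # Nearby words in the vector space tend to be semantically related.
    print(wv.most_similar("frog", topn=5))

    # The classic analogy: vector("king") - vector("man") + vector("woman")
    # lands near vector("queen").
    print(wv.most_similar(positive=["king", "woman"], negative=["man"], topn=1))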
Much of the learning will be applied in four hands-on programming assignments. Students will also undertake a major group project, on a topic of their own choosing, that makes use of these capabilities.
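As a taste of the Transformer material described above, here is a minimal sketch of scaled dot-product attention, the mechanism at the core of the architecture. It is written in PyTorch as an assumption; this page does not mandate a framework:

    # A minimal sketch of scaled dot-product attention:
    # Attention(Q, K, V) = softmax(Q K^T / sqrt(d)) V
    import math
    import torch
    import torch.nn.functional as F

    def attention(Q, K, V):
        # Q, K, V: (batch, seq_len, d) tensors of queries, keys and values.
        d = Q.size(-1)
        scores = Q @ K.transpose(-2, -1) / math.sqrt(d)  # (batch, seq, seq)
        weights = F.softmax(scores, dim=-1)              # each row sums to 1
        return weights @ V                               # weighted sum of values

    x = torch.randn(1, 4, 8)     # a toy "sentence" of 4 token vectors
    out = attention(x, x, x)     # self-attention: Q = K = V = x
    print(out.shape)             # torch.Size([1, 4, 8])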
Instructor:
Jonathan Rose (Jonathan.Rose@utoronto.ca) - Department of Electrical and Computer Engineering. Office: Engineering Annex, Room 319.
Teaching Assistants:
Jiading Zhu
Zafar Mahmood
Weizhou Wang
Mohammadreza Safavi
Soliman Ali
Text
Speech and Language Processing, 3rd Edition Draft (August 2024 version):
https://web.stanford.edu/~jurafsky/slp3/ed3bookaug20_2024.pdf
This text is missing its first chapter, but you can grab a reasonable version of the first chapter from the second edition here: https://github.com/rain1024/slp2-pdf
Lectures
Lecture 0 - Introduction to the Course and Pre-requisites, Course Structure Notes Video
Lecture 1 - Introduction to Word Embeddings Notes Video
Lecture 2 - How Word Embeddings are Trained/Created Notes Video
Lecture 3 - Classification of Language Using Word Embeddings Notes Video
Lecture 4 - Intro to Language Models and Transformers/Project Structure Part 1 Part 2 Video
Lecture 5 - The Core Mechanisms of Transformers & Assignment 3 Notes Video
Lecture 6 - Language Generation and Project Ideation Part 1 Part 2 TA Ideas Video
Lecture 7 - LLM Scaling, Prompt Engineering, RAG, Tokenization Notes Video
Lecture 8 - Proposal Presentations Notes Video
Lecture 9 - How Large Language Models Became Good at Doing What Is Asked: RLHF & DPO Notes Video
Lecture 10 - Agentic Commentary, The Super-Prompt and Final Course Deliverables Notes Video
Assignments (tentative)
#  Date Assigned  Assignment                                                               Due
1  10-Sep         Word Embeddings – Properties, Meaning and Training                       23-Sep
2  24-Sep         Subjective/Objective Sentence Classification Using MLP and CNN           7-Oct
3  8-Oct          Training a Transformer Language Model and Using It for Classification    21-Oct
4  26-Oct         Generation of Probability Trees, Prompt Engineering and Agentic Systems  13-Nov
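The agentic systems in Assignment 4 follow the pattern described in the course overview: multiple LLM instances dividing a problem into doable pieces. The skeleton below is hypothetical; call_llm is a placeholder for any chat-completion endpoint, not a real API:

    # A hypothetical skeleton of the agentic pattern: one LLM call plans,
    # others execute the steps, and a final call combines the results.
    # call_llm() is a stand-in, not a real API; swap in any chat endpoint.
    def call_llm(prompt: str) -> str:
        # Canned reply so the skeleton runs end-to-end without an API key.
        return f"[LLM reply to: {prompt[:40]}...]"

    def solve_agentically(task: str) -> str:
        # One instance plans: break the task into smaller, doable steps.
        plan = call_llm(f"List the sub-tasks needed to accomplish:\n{task}")
        # A fresh instance handles each step, seeing earlier results as context.
        results = []
        for step in plan.splitlines():
            if step.strip():
                context = "\n".join(results)
                results.append(call_llm(f"Context:\n{context}\n\nDo this step:\n{step}"))
        # A final instance merges the partial results into one answer.
        return call_llm("Combine these partial results:\n" + "\n".join(results))

    print(solve_agentically("Summarize a 300-page report and draft a reply."))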