ECE 1786: Creative Applications of Natural Language Processing

University of Toronto/Fall 2022

Over the last five years there has been remarkable progress in the ability of computers to process and generate language. This course is about learning and applying Deep Learning approaches in Natural Language Processing (NLP), from word vectors to Transformers, including GPT-3, GPT-4 and ChatGPT. It is a project-based course that teaches the fundamentals of neural-network-based NLP and gives students the opportunity to pursue a unique project.


The course lecture material begins with the basics of word-level embeddings – their properties and training. These form the basis of neural-network-based classifiers used for sentiment classification, named entity recognition and many other language tasks. A significant part of the course is about the Transformer architecture – its structure and training. This will include the use of the Transformer as a classifier, and also in generative mode, in which language is produced in response to input language. Much of the learning will be applied in three or four hands-on programming assignments. Students will also do a major project of their own choosing that makes use of these capabilities.


Instructor:

Jonathan Rose (Jonathan.Rose@ece.utoronto.ca) - Department of Electrical and Computer Engineering.  Office: Engineering Annex, Room 319.


Teaching Assistants:

Andrew Brown  (andrewm.brown@mail.utoronto.ca)

Zining Zhu (zining@cs.toronto.edu)


Text

Speech and Language Processing, 3rd Edition Draft (January 2022 version)

 https://web.stanford.edu/~jurafsky/slp3/ed3book_jan122022.pdf


This text is missing a first chapter, but you can grab a reasonable version of the first chapter from the second edition, here: https://github.com/rain1024/slp2-pdf


Lectures

Lecture 0 - Introduction to the Course and Pre-requisites, Course Structure (Notes, Video)

Lecture 1 - Word Embedding/Vector Properties and Meanings (Notes, Video)

Lecture 2 - Training of Word Embeddings (Notes, Video)

Lecture 3 - Classification of Language using Word Embeddings (Notes, Video)

Lecture 4 - Introduction to Transformers and Project Structure (Notes, Project Slides, Video)

Lecture 5 - The Core Mechanism of Transformers: Attention (Notes, Video)

Lecture 6 - Language Generation Using Transformers and Project Ideation (Notes, Project Slides, Video)

Lecture 7 - Understanding Transformers & Tokenization (Notes, Video)

Lecture 8 - Proposal Presentations

Lecture 9 - How GPT-3 is Trained to Respond to Human Intent (Notes, Video)

Lecture 10/11 - Project Consultations (Notes)

Lecture 12 - Project Presentations (Notes)


Assignments

#   Date Assigned   Assignment                                                          Due

1   13-Sep          Word Embeddings – Properties, Meaning and Training                  26-Sep

2   27-Sep          Classification of Subjective/Objective Text                         10-Oct

3   11-Oct          Understanding, Training and Using Transformers for Classification   25-Oct

4   25-Oct          Generation of Text Using Transformers: Question Answering           15-Nov



Class Projects


A.I. Meet Yu-Gi-Oh! - Generate Card Text for Yu-Gi-Oh! Game

Aladdin Recommender - Movie Recommendation System based on Other Movies

Antiqu-ator - Generate Shakespearian Language from Regular English

Argument Gate - Rate How Convincing an Argument Is

Artistic GENREator - Generate Song Lyrics in a Specific Song Genre

Caption Me - Generate Captions of Pictures with Dogs

Coding Challenge Generator - Generate Software Coding Challenge Problems

DeCo - Classify Businesses into Multiple Categories for Investment Purposes

Destructive Language Rater - Identify Different Levels of Toxic Language

Emojimotion - Generate Emojis for a Given Text and Specified Emotion

Eng2Py - Generate Sorting Algorithms with Natural Language Queries

EZPaperSearch - Search for Papers related to a Given Paper

Fairytale - Generate Fairytales that Illustrate Specific Values at a Specific Grade Level

Fake News Detection - that’s what it does

GOTalk - A Choose-your-own Chat with Game of Thrones Characters

Grammar Error Correction - Correcting Errors in Writing

Hive Mind Investor - Stock Market Prediction Based on Social Media Stock Sentiment

IELTS composition score predictor - Predicting the Score of an English-Language Essay

MI-Con - Classify Whether a Therapy is Consistent with the Motivational Interviewing Behaviour Change Approach

Newsify - Generate a News Article

Noffence - Recognizing Hate Speech

PyOverflow - Answer Questions about Python Using the Information from the Python Manual

Remy - Generate Recipes for Cake and Cocktails

SafeChat - Identify Chat Text that has Suicidal Ideation

Sensify - Measure the Emotion in a Text

Song Genre Classifier - Classify Song Genre based on the Lyric Text

SoulsGen - Generate Cards in the Dark Souls card game

Sum(Text) - Automatic Text Summarization of News Articles

The Survey Insider - Automating Training Survey Response Classification

TrainAssist - Automating Training Survey Response Classification