CSCI 5541, NLP

Spring 2025, Tuesdays and Thursdays, 4:00pm to 5:15pm, Lind Hall L125

Course Information

Summary: The purpose of this course is to provide an overview of the computational techniques developed to enable computers to interpret and respond appropriately to ideas expressed in natural languages, rather than formal languages such as C++ or Python. The course covers text classification, distributional representations of language, large language models, and advanced techniques in ChatGPT. It spans a wide range of NLP topics, including theories, computational models, and applications along with their societal and ethical impacts. Prerequisites: maturity in linear algebra, calculus, and basic probability; familiarity with Python; CSCI 5521 (recommended) or graduate standing.

Natural Language Processing (NLP) is an interdisciplinary field that is based on theories in linguistics, cognitive science, and social science. The main focus of NLP is building computational models for applications such as machine translation and dialogue systems that can then interact with real users. Research and development in NLP therefore also includes considering important issues related to real-world AI systems, such as bias, controllability, interpretability, and ethics. This course will cover a broad range of topics related to NLP, from theories to computational models and applications to data annotation and evaluation. Students will read papers on those topics, create an annotated dataset, and implement algorithms on applications they are interested in. There will be a semester-long class project where you collect your own dataset, ensure it is accurate, develop a model using existing computing tools, evaluate the system, and consider its ethical and societal impacts.

Grades are based on the course project, participation, and programming and reading assignments. All class material will be posted on the class site. We will use Canvas for homework and project submissions and grading, and Slack for discussion and Q&A. Email inquiries will not be answered.

Instructors: James Mooney (Instructor), Risako Owan (Graduate TA), Bin Hu (Undergraduate TA), Junhan Wu (Undergraduate TA)
Class meets: Tuesday and Thursday, 4PM to 5:15PM, Lind Hall L125
Office hours: James: Friday 3pm - 3:30pm via Zoom; Risako: Wednesday 10-10:30AM Shepherd 159; Bin: Monday 10-10:30AM Keller 1-213; Junhan: Tuesday 1:30-2PM Keller 1-213
Class page: https://jimtmooney.github.io/Courses/S25/index.html
Slack: https://csci5541s25.slack.com/
Canvas: canvas.umn.edu/courses/483164

Grading and Late Policy

Grading

60% Homework (HW1/2/3/6 individual, HW4/5 team)
30% Project (team)
10% Class Participation (individual)

Late policy for deliverables

Each student is granted 5 free late days to use on homework over the semester. Once all free late days are used up, the penalty is 1 point for each additional late day. For group homework and the project, late days and penalties apply to every team member.
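As a rough illustration of how the late-day arithmetic could work, here is a minimal Python sketch. It assumes details the policy does not spell out: lateness is counted in whole days, free late days are consumed in submission order, and the function name `late_penalties` is hypothetical. It is not the official grading script.

```python
FREE_LATE_DAYS = 5    # from the policy: five free late days per student
PENALTY_PER_DAY = 1   # from the policy: 1 point per additional late day


def late_penalties(days_late_per_submission):
    """Return the point penalty for each submission, in submission order.

    Assumption: free late days are consumed in order; once they run out,
    every further late day costs PENALTY_PER_DAY points.
    """
    remaining = FREE_LATE_DAYS
    penalties = []
    for days_late in days_late_per_submission:
        covered = min(days_late, remaining)  # late days absorbed by the free budget
        remaining -= covered
        penalties.append((days_late - covered) * PENALTY_PER_DAY)
    return penalties


# Four submissions that were 2, 0, 4, and 1 days late, respectively.
print(late_penalties([2, 0, 4, 1]))  # [0, 0, 1, 1]
```

In this example the first two submissions use two free days, the third uses the remaining three and is penalized for one day, and the fourth is penalized for its single late day.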

Schedule

We will cover basic NLP representations g(x), which are used to build text classifiers P_theta(y|g(x)), language models P_theta(g(x)), and large language models P_{theta is large}(g(x)). Based on the knowledge you gain in class, your team will develop its own NLP system during the semester-long project. Pay attention to due dates and homework release dates. Lecture slides and homework/project descriptions will be available in .
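To make the notation concrete, here is a minimal Python sketch of a representation g(x) and a classifier P_theta(y|g(x)). It assumes a bag-of-words g and a softmax over linear scores; the vocabulary, labels, and weights are made up for illustration and are not taken from the course materials.

```python
import math
from collections import Counter

VOCAB = ["great", "boring", "movie"]   # hypothetical toy vocabulary
LABELS = ["positive", "negative"]      # hypothetical label set
# theta: one weight per (label, vocabulary word); values are invented for illustration
THETA = {
    "positive": [2.0, -1.0, 0.0],
    "negative": [-1.0, 2.0, 0.0],
}


def g(text):
    """Bag-of-words representation: count of each vocabulary word in the text."""
    counts = Counter(text.lower().split())
    return [counts[word] for word in VOCAB]


def classify(text):
    """Return P_theta(y | g(x)) for every label y via a softmax over linear scores."""
    features = g(text)
    scores = {y: sum(w * f for w, f in zip(THETA[y], features)) for y in LABELS}
    max_score = max(scores.values())   # subtract the max for numerical stability
    exp_scores = {y: math.exp(s - max_score) for y, s in scores.items()}
    total = sum(exp_scores.values())
    return {y: e / total for y, e in exp_scores.items()}


probs = classify("great great movie")
print(g("great great movie"))        # [2, 0, 1]
print(max(probs, key=probs.get))     # positive
```

A language model P_theta(g(x)) differs in that it assigns a probability to the text itself rather than to a label given the text.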

Date	Lectures and Dues	Readings
Jan 21	Class Overview
Jan 23	Intro to NLP; HW1 out (Jan 26)
Jan 27	Recitation on computing basics (Junhan); Colab + Jupyter Notebook Tutorial
Jan 28	Text Classification; Tutorial on Scikit-Learn and PyTorch (Risako): Scikit-Learn, PyTorch	Determining the sentiment of opinions; From Tweets to Polls: Linking Text Sentiment to Public Opinion Time Series; Recursive Deep Models for Semantic Compositionality Over a Sentiment Treebank; Text classifier with NLTK and Scikit-Learn
Jan 30	Text Classification (2); Tutorial on Finetuning & vLLM (Bin): Huggingface, vLLM	Does BERT Learn as Humans Perceive? Understanding Linguistic Styles through Lexica; Style is NOT a single variable: Case Studies for Cross-Style Language Understanding; Beyond Accuracy: Behavioral Testing of NLP Models with CheckList; Blog post on Pre-training vs Fine-tuning in LLM: Examples; Tutorial on Text classification using HuggingFace's Transformers
Feb 4	Distributional Semantics and Word Vectors; HW2 out; Project description out	From Frequency to Meaning: Vector Space Models of Semantics; Efficient Estimation of Word Representations in Vector Space; Linguistic Regularities in Continuous Space Word Representations; GloVe: Global Vectors for Word Representation; Retrofitting Word Vectors to Semantic Lexicons; Gensim's word2vec tutorial
Feb 6	Distributional Semantics and Word Vectors (2); HW1 due; Project Team Formation due
Feb 11	Language Models (1): Ngram LM, Neural LM; Colab Pro	Chapter 3 of Jurafsky and Martin; A Neural Probabilistic Language Model
Feb 13	Project Guideline; Language Models (2): RNNs, LSTMs and Sequence-to-Sequence; HW2 due (Feb 16); HW3 out	Recurrent neural network based language model; Long Short-Term Memory; Multivariable chain rule, simple version; Long Short-Term Memory; Sequence to Sequence Learning with Neural Networks
Feb 18	Language Models (3): Search and Decoding; Project brainstorming due	The Curious Case of Neural Text Degeneration; Mutual Information and Diverse Decoding Improve Neural Machine Translation; Sequence Level Training with Recurrent Neural Networks; An Actor-Critic Algorithm for Sequence Prediction; Training language models to follow instructions with human feedback
Feb 20	No Class
Feb 25	Project Proposal Pitch (1); Slides Deck for Group A	Group A: Saint Lingual (Ismail, Lily, Chiemeka, Taha) → Mentors: (Risako/Bin); Audi Quattro (Malak, Gehad, Amoligha, Abhi) → Mentors: (Risako/Junhan); NLPeak (Yiu, Lulin, Wan, Yu-Tong) → Mentors: (James/Junhan); 404 Not Found (Erina, Arunachalam, Saeid, Hahnemann) → Mentors: (James/Bin); Epoch Explorers (Anna, Nipun, Mete, Jun) → Mentors: (Risako/Bin); Nvidia & Chill (Vaibhav, Vivek, Ankit, Sai) → Mentors: (James/Bin); The Parsing Pals (Chi, Huong, Bang) → Mentors: (Risako/Bin); Mmmmmmmmm (Ryan, Mark) → Mentors: (James/Bin)
Feb 27	Project Proposal Pitch (2); Slides Deck for Group B	Group B: InterAgent Communication Lab (Isaac, Daniel, Benat, Joshua) → Mentors: (Risako/Bin); The Tokenizers (Anthony, Evan, Ajitesh, Tanmay) → Mentors: (James/Bin); Noob LP (William, Joseph, Ryan, John) → Mentors: (Risako/Junhan); InkSight (MJ, Yassin, Jordan, Akshat) → Mentors: (Risako/Junhan); Big Brains Generating Knowledge (Ben, Brandon, Gunnar, Kyle) → Mentors: (James/Junhan); Pickachu (Pranay, Aditya, Samra) → Mentors: (Risako/Junhan); The Not So Professional Linguists (Hady, Luka, Shesha, Share) → Mentors: (James/Junhan); Golden Data Retrievers (Kylie, Lucas, Jiyu, Ziqi) → Mentors: (James/Junhan); Calvin York (Calvin) → Mentor: (James)
Mar 4	Language Models (4): Evaluation and Applications; Contextualized Word Embeddings; HW4 out; HW3 due	Neural Machine Translation by Jointly Learning to Align and Translate; Perplexity of fixed-length models; BLEU: a Method for Automatic Evaluation of Machine Translation; ROUGE: A Package for Automatic Evaluation of Summaries; Deep contextualized word representations; BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding; A Primer in BERTology: What we know about how BERT works
Mar 6	Transformers (In Depth); Proposal Report due	Attention is All you Need; Tutorial on Illustrated Transformer; Language Models are Unsupervised Multitask Learners; Language Models are Few-Shot Learners; Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer
Mar 18	Scaling and Pretraining	Scaling Laws for Neural Language Models; On the Opportunities and Risks of Foundation Models; On the Dangers of Stochastic Parrots: Can Language Models Be Too Big?
Mar 20	Prompting	Chain-of-Thought Prompting Elicits Reasoning in Large Language Models; Pre-train, Prompt, and Predict: A Systematic Survey of Prompting Methods in Natural Language Processing; Prefix-Tuning: Optimizing Continuous Prompts for Generation
Mar 25	Instructing and Augmenting LLMs; Data Annotation	Training language models to follow instructions with human feedback; Augmented Language Models: a Survey; Toolformer: Language Models Can Teach Themselves to Use Tools; Internet-augmented language models through few-shot prompting for open-domain question answering; Annotation Artifacts in Natural Language Inference Data; Dataset Cartography: Mapping and Diagnosing Datasets with Training Dynamics; Everyone's Voice Matters: Quantifying Annotation Disagreement Using Demographic Information; ChatGPT Outperforms Crowd-Workers for Text-Annotation Tasks
Mar 27	Efficiency; HW4 due	Deep Compression: Compressing Deep Neural Networks with Pruning, Trained Quantization and Huffman Coding; LLM.int8(): 8-bit Matrix Multiplication for Transformers at Scale; Outrageously Large Neural Networks: The Sparsely-Gated Mixture-of-Experts Layer; The Power of Scale for Parameter-Efficient Prompt Tuning
Apr 1	LLMs as Agents (Zae); HW5 out	ReAct: Synergizing Reasoning and Acting in Language Models; MetaGPT: Meta Programming for A Multi-Agent Collaborative Framework; Generative Agents: Interactive Simulacra of Human Behavior; WebArena: A Realistic Web Environment for Building Autonomous Agents
Apr 3	Modern Evaluation	HellaSwag: Can a Machine Really Finish Your Sentence?; Measuring Massive Multitask Language Understanding; Training Verifiers to Solve Math Word Problems; Evaluating Large Language Models Trained on Code
Apr 8	DeepMind Guest Lecturer; Project midterm office-hour due
Apr 10	Parallelism and Scaling	ZeRO: Memory Optimizations Toward Training Trillion Parameter Models; The Ultra-Scale Playbook: Training LLMs on GPU Clusters; Parallelism methods
Apr 15	Alignment (Karin De Langis & Ryan Koo)	Learning to summarize from human feedback; Deep Reinforcement Learning from Human Preferences; Proximal Policy Optimization Algorithms; Direct preference optimization: Your language model is secretly a reward model; Open Problems and Fundamental Limitations of Reinforcement Learning from Human Feedback; Dynamic Multi-Reward Weighting for Multi-Style Controllable Generation; Benchmarking Cognitive Biases in Large Language Models as Evaluators
Apr 17	Multimodal NLP; HW6 out	An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale; Learning Transferable Visual Models From Natural Language Supervision; Visual Instruction Tuning; Molmo and PixMo: Open Weights and Open Data for State-of-the-Art Vision-Language Models; Scaling Rectified Flow Transformers for High-Resolution Image Synthesis; The Llama 3 Herd of Models
Apr 22	Reasoning; HW5 due	Understanding Reasoning LLMs; DeepSeek-R1: Incentivizing Reasoning Capability in LLMs via Reinforcement Learning; WebGPT: Browser-assisted question-answering with human feedback; Training Verifiers to Solve Math Word Problems; Easy-to-Hard Generalization: Scalable Alignment Beyond Human Supervision
Apr 24	Interpretability	Interpreting Language Models with Contrastive Explanations; BERT Rediscovers the Classical NLP Pipeline; Zoom In: An Introduction to Circuits; Interpretability in the Wild: A Circuit for Indirect Object Identification in GPT-2 Small; Transformer Feed-Forward Layers Build Predictions by Promoting Concepts in the Vocabulary Space; Towards Monosemanticity: Decomposing Language Models With Dictionary Learning
Apr 29	Final Project Poster (1)	Posters for Group B: InterAgent Communication Lab "A Study on Information Transmission and Transformation in LLM-Based Multi-Agent Systems for Effective Collaboration"; The Tokenizers "Optimizing Small Language Models for Cryptographic Applications"; Noob LP "The Role of Data Variety: Observing Cross-Skill Impacts Through Targeted LLM Unlearning"; InkSight "Righting Writing"; Big Brains Generating Knowledge "White Hat Hacking: Detecting AI Generated Text"; Pickachu "3D Vision-Language Model for Generalized Robotic Manipulation"; The Not So Professional Linguists "Personalized Joke Identification and Generation Model Based on User Preference"; Golden Data Retrievers "Using Sentiment Analysis to measure perception of mental health across different platforms."
May 1	Final Project Poster (2); HW6 due (May 5); Project final report due (May 8)	Posters for Group A: 404 Not Found "Multimodal Sarcasm Detection Using Vision-Language Models"; Saint Lingual "Creating New Benchmarks to Test Effectiveness of NLP Models with Code-Switching"; Audi Quattro "Testing Unexplored Minority Languages - Sourashtra"; NLPeak "Towards Smarter Segmentation: Improving SAM's Anomaly Understanding with RLHF"; Epoch Explorers "Identifying News Bias Using Simulations with LLM Agents"; Nvidia & Chill "Do LLMs Have Personality Bias? An Analysis using Demographic Profiles and Prompt Engineering"; The Parsing Pals "Speech-to-Text: Translate Dialects to Official Vietnamese Language"; Mmmmmmmmm "Circuits of Persistent States in LLMs"

Homework Details (60%)

All questions regarding homework MUST be directed to the lead TA over the Slack homework channels (e.g., #hw1, #hw2) or during their office hours. Homework 1, 2, 3, and 6 should be done individually, while homework 4 and 5 are team-based (maximum of 4 people). Your team for homework 4 and 5 should be the same as your project team. The use of outside resources (books, research papers, websites, etc.) or collaboration (students, professors, ChatGPT, etc.) must be explicitly acknowledged in your report. Check out the notes on academic integrity.

All homework is due by midnight (11:59 PM) on the due date. Because of the tight schedule, there will be no deadline extensions, but you can still use your late days. For late team homework, late days are counted against every team member. Check out the homework description and the link to Canvas for submission.

Here are the homework assignments and their due dates:

HW1: Building MLP-based text classifier with PyTorch (5 points, Individual, due: Feb 4)
HW2: Finetuning text classifier using HuggingFace (10 points, Individual, due: Feb 11)
HW3: Authorship attribution using language models (LMs) (10 points, Team, due: Mar 4)
HW4: Generating and evaluating text generated from pretrained LMs (15 points, Team, due: Mar 27)
HW5: Prompting with large language models (LLMs) (15 points, Team, due: Apr 22)
HW6: Essay writing with ChatGPT (5 points, Individual, due: May 5)

Project Details (30%)

First, carefully read the project description, as most project information, due dates, the rubric, and answers to your questions are in that document. It is your responsibility not to miss any information regarding the project. Your team (maximum of 4 people) should submit its report, a link to the code (or zipped code), and presentation slides/poster to Canvas before the deadline. Use the official ACL style templates (Overleaf or links). Here are the project deliverables and their due dates (note that some fall on weekdays):

Team formation (1 point, due: Feb 6)
Project brainstorming (1 point, due: Feb 18)
Proposal pitch (3 points, due: Feb 25 and 27) (Slides decks for Group A and Group B)
Proposal report (5 points, due: Mar 6)
Midterm office hour participation (5 points, due: Apr 8)
Poster presentation (5 points, due: Apr 29 and May 1)
Final report (10 points, due: May 8) (evaluation rubric)

Below are selected project reports and posters from previous years' NLP classes. Some projects were extended and published at top-tier workshops and conferences:

[CSCI 5541 S23] Simulating Everyone's Voice: Exploring ChatGPTs Ability to Simulate Human Annotators
[CSCI 5541 S23] Vision & Language-guided Generalized Object Grasping
[CSCI 5541 S23] Generalizability of FLAN-T5 Model Using Composite Task Prompting
[CSCI 5541 S23] Comparing the Effectiveness of Fine-tuning vs. One-Shot Learning on the Kidz Bopification Task
[CSCI 5980 F22] Generating Controllable Long-dialogue with Coherence → Published in AAAI 2024
[CSCI 8980 S22] Understanding Narrative Transportation in Fantasy Fanfiction → Published in Workshop on Narrative Understanding (WNU) @ACL 2023

Class Participation (10%)

Your class participation is thoroughly evaluated. Put your profile picture on Canvas and Slack so we can match you for the final evaluation. The following metrics will be used to grade your participation:

Participation and discussion in class
Discussion on Slack and during Office Hours for both instructor and TAs
Discussion and QA during the presentation of the project proposal and poster

We explicitly count your offline and online contributions and (min/max) normalize the counts at the end of the semester. Your participation score will be zero if you have not participated in class, on Slack, or in other discussions.
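For reference, min/max normalization rescales raw counts linearly so that the smallest count maps to 0 and the largest to the maximum score. The Python sketch below shows the idea under assumptions that are not stated above: the maximum is taken to be 10 points (matching the 10% weight), the student names and counts are invented, and the handling of the all-equal case is a guess. It is not the actual grading procedure.

```python
def min_max_normalize(counts, max_points=10.0):
    """Scale raw participation counts so that min -> 0 and max -> max_points."""
    low, high = min(counts.values()), max(counts.values())
    if high == low:
        # Assumption: when everyone has the same count the formula is undefined,
        # so give full points for any participation and zero for none.
        return {name: (max_points if c > 0 else 0.0) for name, c in counts.items()}
    return {name: max_points * (c - low) / (high - low) for name, c in counts.items()}


# Hypothetical raw counts of in-class and Slack contributions.
raw_counts = {"student_a": 0, "student_b": 5, "student_c": 20}
print(min_max_normalize(raw_counts))
# {'student_a': 0.0, 'student_b': 2.5, 'student_c': 10.0}
```

Note that under plain min/max scaling the lowest count always maps to zero, so scores depend on how the rest of the class participates.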

Prerequisites

Required: CSCI 2041 Advanced Programming Principles

Recommended: CSCI 5521 Introduction to Machine Learning or any other course that covers fundamental machine learning algorithms.

Furthermore, this course assumes:

Good coding ability, corresponding to at least a third or fourth-year undergraduate CS major. Assignments will be in Python.
Background in basic probability, linear algebra, and calculus.

Notes to students

Academic Integrity

Assignments and project reports for the class must represent individual effort unless group work is explicitly allowed. Verbal collaboration on your assignments or class projects with your classmates and instructor is acceptable. However, everything you turn in must be your own work, and you must note the names of anyone you collaborated with on each problem and cite the resources you used to learn about the problem. If you have any doubts about whether a particular action may be construed as cheating, ask the instructor for clarification before you do it. Cheating in this course will result in a grade of F for the course, and University policies will be followed.

Students with Disabilities

If you have a disability for which you are or may be requesting an accommodation, you are encouraged to contact both your instructor and the Disability Resource Center (DRC).

COVID-19

All students are expected to abide by campus policies regarding COVID-19 including masking and vaccination requirements. This is an in-person class with daily in-person activities, but we may consider a hybrid or online option. If you're feeling sick, stay at home and catch up with the course materials instead of coming to class!

Book

A textbook is not required, but the following books are the primary references:

Jurafsky and Martin, Speech and Language Processing, 3rd edition [online]
Jacob Eisenstein. Natural Language Processing

Resources

Some course materials are inspired by the slides of Chris Manning at Stanford, Carlos Guestrin at Stanford, David Bamman at UC Berkeley, and Graham Neubig at CMU.