July 28-August 7, 2025 (M-Th) 10:00am-11:30am EDT
In this two-week course, students will learn a variety of natural language processing methods for analyzing and extracting meaning from text data. The course will start with an introduction to text data, including text preprocessing and exploratory methods. The topics that follow will include machine learning models used for topic modeling, clustering, classification, sentiment analysis, and word embeddings. Students will also be introduced to web scraping. Consideration will be given to both long and short texts on a variety of subject matter. Class examples will be demonstrated primarily in R. This course assumes a bachelor's-level background in statistics or a related field and knowledge of R or Python; no prior knowledge of text analysis is assumed.
Robyn Ferg is a senior statistician at Westat. Her doctoral and postdoctoral research focused on developing methods for extracting insights from social media data. She has taught graduate and short courses at the Joint Program in Survey Methodology (University of Maryland) and given talks on text analysis at the Census Bureau, University of Maryland, University of Michigan, Michigan State University, and several national and international conferences. She has published original research on this topic in several peer-reviewed journals. She has a PhD in Statistics from the University of Michigan.
The Summer Institute in Survey Research Techniques provides rigorous, high-quality, graduate-level training in all phases of survey research. The noncredit courses are open to all and are held live online via Zoom.
Registration and payment are required.
Course fees are based on the total number of hours assigned to each course; the hours are listed in the course description. The 2025 schedule lists additional courses. If you have any questions regarding the application process, please use the online contact form or email the Summer Institute at [email protected].
