This is the first course in a two-term introduction to the integration of social science and statistical science approaches to the design, collection, and analysis of surveys. The seminar will focus on six to eight areas of statistical and methodological literature that have benefited from alternative approaches. Students demonstrate mastery of those literatures through critical review papers, ideas for extensions of the literature, and empirical projects related to research reviewed.
Doctoral Seminar I
Fundamentals of Inference
This course is one of the three fundamental courses required of all students in the Master’s Program in Survey Methodology, and focuses on the fundamentals of statistical inference in the finite population setting.
The course is designed to provide an overview and review of the fundamental ideas of making inferences about populations. It will emphasize the basic principles of probability sampling; focus on differences between making predictions and making inferences; explore the differences between randomized study designs and observational studies; consider model-based vs. design-based analytic approaches; review techniques designed to improve efficiency using auxiliary information; and consider non-probability sampling and related inferential techniques.
Fundamentals of Computing and Data Display
Empirical social scientists are often confronted with a variety of data sources and formats that extend beyond structured, easily handled survey data. With the emergence of Big Data, data from web sources in particular play an increasingly important role in scientific research. However, the potential of new data sources comes with the need for comprehensive computational skills to deal with large volumes of potentially unstructured information. Against this background, the first part of this course provides an introduction to web scraping and APIs for gathering data from the web and then discusses how to store and manage (big) data from diverse sources efficiently. The second part of the course demonstrates techniques for exploring and finding patterns in (non-standard) data, with a focus on data visualization. Tools for reproducible research will be introduced to facilitate transparent and collaborative programming. The course focuses on R as the primary computing environment, with excursions into SQL and Big Data processing tools.
Applications of Statistical Modeling
Applications of Statistical Modeling, designed and required for students on all three tracks of the two programs in survey methodology, will provide students with exposure to applications of more advanced statistical modeling tools for both substantive and methodological investigations that are not fully covered in other MPSM or JPSM courses. Modeling techniques to be covered include multilevel and marginal modeling techniques for clustered or longitudinal data (with applications to methodological studies of interviewer effects and modeling trends in the Health and Retirement Study), structural equation modeling (with an application of latent class models to methodological studies of measurement error), and classification trees (with an application to prediction of response propensity). Discussions and examples of each modeling technique will be supplemented with methods for appropriately handling complex sample designs when fitting the models. The class will focus on essential concepts, practical applications, and software, rather than extensive theoretical discussions.
Statistical Modeling and Machine Learning I
This is the first course in a two-term sequence in applied statistical methods, covering topics such as regression, analysis of variance, categorical data, and survival analysis.
Cognition, Communication, and Survey Measurement
Survey data are only as meaningful as the answers that respondents provide. Hence, the processes that underlie respondents’ answers are of crucial importance. This course draws on current theorizing in cognitive and social psychology pertaining to issues such as language comprehension, information storage and retrieval, autobiographical memory, social judgment, and the communicative dynamics of survey interviewing, to understand how respondents deal with the questions asked and how they arrive at an answer.
Fundamentals of Data Collection I
This course is the first semester of a two-semester sequence that provides a broad overview of the processes that generate data for use in social science research. Students will gain an understanding of different types of data and how they are created, as well as their relative strengths and weaknesses. A key distinction is drawn between data that are designed, primarily survey data, and those that are found, such as administrative records, remnants of online transactions, and social media content. The course combines lectures, supplemented with assigned readings, and practical exercises. In the first semester, the focus will be on the error that is inherent in data, specifically errors of representation and errors of measurement, whether the data are designed or found. The psychological origins of survey responses are examined as a way to understand the measurement error that is inherent in answers. The effects of the mode of data collection (e.g., mobile web versus telephone interview) on survey responses also are examined.