Machine Learning for Social Sciences
Special Seminars
These special seminar courses address specific research problems currently under study by faculty members. The topics will be announced in the University Time Schedule each academic term.
Doctoral Seminar II
This is the second course in a two-term seminar designed to develop skills in the identification of research problems, specification of hypotheses/theorems to extend current understanding of the field, and planning for original research. A common set of readings on four to six advanced research activities of the faculty is studied, with the faculty engaged in that research discussing areas of potential innovation. Students present and critique oral and written proposals for research.
Survey Management
This course describes modern practices in the administration of large-scale surveys. It reviews alternative management structures for large field organizations, supervisory and training regimens, handling of turnover, and the running of multiple surveys with the same staff. Practical issues in survey budgeting are reviewed with examples from actual surveys. Scheduling of sequential activities in design, data collection, and data processing is described.
Practical Tools for Study Design and Inference
This course is a statistical methods class appropriate for second-year Master’s students and PhD students. The course combines hands-on applications with a general review of the theory behind different approaches to sampling and weighting. Topics covered include:
– Sample size calculations using estimation targets based on relative standard error, margin of error, and power requirements;
– Use of mathematical programming to determine sample sizes needed to achieve estimation goals for a series of subgroups and analysis variables;
– Resources for designing area probability samples;
– Methods of sample allocation for multistage samples;
– Steps in weighting, including computation of base weights, nonresponse adjustments, and uses of auxiliary data;
– Nonresponse adjustment alternatives, including weighting cell adjustments, formation of cells using regression trees, and propensity score adjustments;
– Weighting via poststratification, raking, general regression estimation, and other types of calibration.
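As an illustration of the first topic above, the following is a minimal sketch of a sample size calculation from a margin-of-error target. It assumes simple random sampling, a proportion as the estimation target, and the normal approximation; the function name `srs_sample_size` and its parameters are hypothetical and are not taken from the course materials.

```python
import math
from statistics import NormalDist


def srs_sample_size(p, margin, confidence=0.95, population=None):
    """Sample size needed to estimate a proportion p within +/- margin.

    Assumes simple random sampling and the normal approximation.
    If a population size is given, a finite population correction is applied.
    """
    # Two-sided critical value, e.g. about 1.96 for 95% confidence.
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    # Sample size ignoring the finite population correction.
    n0 = z ** 2 * p * (1 - p) / margin ** 2
    if population is not None:
        # Finite population correction for sampling without replacement.
        n0 = n0 / (1 + (n0 - 1) / population)
    return math.ceil(n0)


print(srs_sample_size(0.5, 0.05))                    # large population
print(srs_sample_size(0.5, 0.05, population=10000))  # with correction
```

Setting p = 0.5 gives the most conservative (largest) sample size for a given margin. Designs with clustering or unequal weights would generally need a larger sample than this simple random sampling figure.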
Topics in Survey Methodology
This is an advanced course in selected topics in survey sampling. Topics to be covered include estimation and imputation approaches, small area estimation, and sampling methods for rare populations. A selection of additional topics, chosen by the instructor, will also be covered. Examples of such additional topics include sample designs for time and space, panel and rotating panel survey designs, maximizing overlap between samples, controlled selection and lattice sampling, sampling with probabilities proportionate to size without replacement, multiple frame sampling, adaptive cluster sampling, capture-recapture sampling, sampling for telephone surveys, sampling for establishment surveys, and measurement error models. Both applied and theoretical aspects of the topics will be examined.
Inference from Complex Samples
This course covers the theoretical and empirical properties of various variance estimation strategies (e.g., Taylor series approximation, replication methods, and bootstrap methods for complex sample designs) and how to incorporate those methods into inference for complex sample survey data. Variance estimation procedures are applied to descriptive estimators and to analysis techniques such as regression, analysis of variance, and analysis of categorical data. Generalized variances and design effects are presented. Methods of model-based inference for complex sample surveys are also examined, and the results are contrasted with the design-based inference used as the standard in the course. The course will use real survey data to illustrate the methods discussed in class. Students will learn to use computer software that takes account of the sample design in estimation. Students will carry out a research and analysis project using techniques and skills learned during the course. A paper describing the student’s research will be submitted at the end of the course, and each student will give a short presentation of their findings.
Analysis of Complex Sample Survey Data
This introductory course on the analysis of data from complex sample designs covers the development and handling of selection and other compensatory weights; methods for handling missing data; the effect of stratification and clustering on estimation and inference; alternative variance estimation procedures; methods for incorporating weights, stratification, clustering, and imputed values in estimation and inference procedures for complex sample survey data; and generalized design effects and variance functions.
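To make the effect of clustering on inference concrete, the sketch below uses the common approximation deff = 1 + (m - 1) * rho for a design with equal-sized clusters, where m is the number of sampled units per cluster and rho is the intraclass correlation. The function names and the numbers used are hypothetical illustrations, not course material.

```python
def design_effect(cluster_size, icc):
    """Approximate design effect for equal-sized clusters.

    cluster_size: number of sampled units per cluster (m)
    icc: intraclass correlation of the outcome within clusters (rho)
    """
    return 1 + (cluster_size - 1) * icc


def effective_sample_size(n, cluster_size, icc):
    """Size of a simple random sample with roughly the same precision."""
    return n / design_effect(cluster_size, icc)


# Hypothetical example: 1,000 interviews, 21 per cluster, rho = 0.05.
print(design_effect(21, 0.05))
print(effective_sample_size(1000, 21, 0.05))
```

Even a small intraclass correlation can inflate variances noticeably when many units are taken per cluster, which is why analyses that ignore the design tend to understate standard errors.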
Statistical Modeling and Machine Learning II
This course builds on the introduction to linear models and data analysis provided in Statistical Methods I. Topics include multivariate analysis techniques (Hotelling’s T-square, principal components, factor analysis, profile analysis, MANOVA); categorical data analysis (contingency tables, measurement of association, log-linear models for counts, logistic and polytomous regression, GEE); and lifetime data analysis (Kaplan-Meier plots, logrank test, Cox regression).
Design Seminar
This is a wide-ranging graduate seminar in which several program faculty members join with the students in attempting to solve design issues presented to the seminar by clients from the private, government, or academic sectors of research. Readings are selected from literatures not treated in other classes, and practical consulting problems are addressed.