Pennsylvania State University

Principles of Causal Inference

Course Description

DS 560. Principles of Causal Inference.

Course Staff

The Fall 2022 offering of Principles of Causal Inference is taught by Professor Vasant Honavar.

Course Schedule

Lectures: Tuesday, Thursday - 3:05pm to 4:20pm, W219, Westgate Building

Office Hours:

   Vasant Honavar: Tuesday 4:30pm - 5:30pm or by appointment.

Course Overview

Course Description: Representing, reasoning with, and learning causal effects and causal models from observational and experimental data is a hallmark of intelligence. It is also central to all scientific endeavors. Many of the difficulties in establishing the validity, generalizability, and reproducibility of scientific findings can be traced to inadequate attention to their causal underpinnings. Modern machine learning methods have been highly successful in building predictive models (e.g., for health risks from genetics, lifestyle, and other factors) from observational data. However, they are fundamentally about finding and using complex correlations between a set of predictive variables (e.g., genetics, lifestyle) and the outcome of interest (e.g., health risk). Consequently, they are fundamentally incapable of answering causal questions such as: how would one’s risk of heart disease change if one were to quit smoking? More importantly, predictive models constructed solely from observational data can yield misleading conclusions (e.g., about the effectiveness of a drug in curing a disease). Drawing valid conclusions in such settings calls for principled methods and tools for causal modeling and causal inference. Fortunately, there has been more progress on the foundations and methods of causal modeling and causal inference in the past three decades than in the rest of human existence.
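To make the point about misleading conclusions concrete, here is a minimal simulation sketch. It is not part of the course materials; the scenario, variable names, and probabilities are all assumptions chosen for illustration. A hypothetical drug raises the recovery probability by 0.10, but sicker patients are more likely to receive it, so a naive comparison of treated and untreated patients makes the drug look harmful. Averaging the within-stratum differences over the distribution of the confounder recovers the effect built into the simulation.

```python
import random


def simulate(n=200_000, seed=0):
    """Generate (z, t, y) rows from an assumed, purely illustrative model."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        # z: confounder (1 = severe illness), present in half of the patients.
        z = 1 if rng.random() < 0.5 else 0
        # t: treatment; severely ill patients are treated far more often.
        t = 1 if rng.random() < (0.8 if z else 0.2) else 0
        # y: recovery; severity lowers it by 0.4, the drug raises it by 0.1.
        p_recover = 0.7 - 0.4 * z + 0.1 * t
        y = 1 if rng.random() < p_recover else 0
        rows.append((z, t, y))
    return rows


def mean(values):
    return sum(values) / len(values)


def naive_difference(rows):
    """Recovery rate of the treated minus that of the untreated, ignoring z."""
    treated = [y for _, t, y in rows if t == 1]
    control = [y for _, t, y in rows if t == 0]
    return mean(treated) - mean(control)


def adjusted_difference(rows):
    """Average the within-stratum differences, weighted by the share of each z."""
    total = 0.0
    for z_val in (0, 1):
        stratum = [(t, y) for z, t, y in rows if z == z_val]
        weight = len(stratum) / len(rows)
        treated = [y for t, y in stratum if t == 1]
        control = [y for t, y in stratum if t == 0]
        total += weight * (mean(treated) - mean(control))
    return total


rows = simulate()
naive = naive_difference(rows)
adjusted = adjusted_difference(rows)
print(f"naive difference:    {naive:+.3f}")
print(f"adjusted difference: {adjusted:+.3f}")
```

Under these assumed parameters the naive difference is negative while the adjusted difference is close to +0.10. The adjustment is valid here only because the single confounder is observed and known by construction; real observational data rarely offer that guarantee.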

This course will give students a rigorous yet accessible treatment of the theoretical underpinnings and practice of causal inference from observational and experimental data. Topics covered include: pitfalls of standard machine learning algorithms when applied to observational data; causal inference in the absence of randomized controlled trials; causal effects and counterfactuals; eliciting causal effects from observations; the Causal Bayesian Network framework for causal inference (do-calculus, identifiability of causal effects from observations and experiments); the Potential Outcomes framework for causal inference (matching and propensity score-based methods and their advanced variants for counterfactual inference); the relationship between the Potential Outcomes and Causal Bayesian Network frameworks; and learning causal models from observations and experiments. The course will give a principled treatment of confounders as well as practical approaches to cope with them. Additional topics to be covered include mediation analysis; advanced machine learning methods for causal effect estimation; causal transportability; selection bias; and meta-analysis.

Learning Objectives

Upon successful completion of the course, students will demonstrate a broad understanding of the principles of causal inference, including the Potential Outcomes and Causal Bayesian Network frameworks, as well as their applications in the data sciences. Students will understand the implementation, adaptation, and applications of several causal inference algorithms in a high-level programming language (e.g., Python). Students will be able to identify, formulate, and solve causal inference problems that arise in the empirical sciences. Students with the necessary computational and mathematical background will also be prepared to pursue advanced research on the foundations of, and methods for, causal inference in Data Sciences and Artificial Intelligence.

Target Audience

The primary audience for the course includes graduate students and advanced undergraduates in Informatics, Computer Science and Engineering, Data Sciences, and Mathematics, as well as quantitatively inclined students in the empirical sciences (Life Sciences, Health Sciences, Behavioral Sciences, Environmental Sciences, Learning Sciences, Cognitive Sciences, Social Sciences, Public Policy, and related areas).

Recommended Preparation

Recommended preparation for the course includes basic proficiency in programming, elements of probability theory and statistics, discrete mathematics, and (optionally) machine learning. If you are not sure whether you have the necessary background, please talk to the instructor.