Syllabus

Part I (The Basics):

Fundamental Information Measures
* Entropy, relative entropy, and mutual information (for discrete distributions; definitions recalled after this outline)
* Basic properties, convexity, log-sum inequality
* Definitions for general distributions
* f-divergence
* Key properties: tensorization, data-processing inequality, variational definition

----

Compression and Gambling
* Definition of source codes
* Non-singular, uniquely decodable, and instantaneously decodable codes
* Kraft's inequality, optimal compression rate, achievability (see the code sketch after this outline)
* Connections to gambling on horse races
* Operational meanings of entropy, relative entropy, and mutual information

----
----------------------------------------

Part II (Applications):

Application 1: From Compression to Sequential Inference
* Universal compression and gambling
* Principle of testing by betting (see the sketch after this outline)
* Constructing confidence sequences
* Constructing sequential nonparametric tests

----

Application 2: Error Exponents in Hypothesis Testing
* Basics of the large deviations principle (LDP)
* Introduction to the method of types
* Sanov's theorem and information projection (statements recalled after this outline)
* Chernoff-Stein lemma

----

Application 3: Information-Theoretic Analysis of ML Algorithms
* Mutual information based generalization bounds (one standard form stated after this outline)
* Regret analysis of Thompson sampling [IF TIME PERMITS]

***OR***

Review and discussion of omitted topics

----
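For reference, the discrete-case definitions that the first module starts from: for pmfs $p$ and $q$ on a finite alphabet $\mathcal{X}$,

```latex
% Core discrete definitions for Part I: entropy, relative entropy (KL
% divergence), and mutual information (a relative entropy between the
% joint distribution and the product of its marginals).
\begin{align*}
  H(X) &= -\sum_{x \in \mathcal{X}} p(x) \log p(x) \\
  D(p \,\|\, q) &= \sum_{x \in \mathcal{X}} p(x) \log \frac{p(x)}{q(x)} \\
  I(X;Y) &= D\big(p_{XY} \,\|\, p_X \otimes p_Y\big) = H(X) - H(X \mid Y)
\end{align*}
```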
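A minimal sketch for the compression module, assuming a known pmf over a finite alphabet; the function names (`kraft_sum`, `shannon_code_lengths`) are illustrative, not from the course. It checks Kraft's inequality for binary codeword lengths and verifies that Shannon code lengths achieve an expected length within one bit of the entropy.

```python
import math

def kraft_sum(lengths):
    """Kraft sum for binary codeword lengths: sum_i 2^(-l_i)."""
    return sum(2.0 ** -l for l in lengths)

def shannon_code_lengths(pmf):
    """Shannon code lengths l_i = ceil(-log2 p_i); these always satisfy
    Kraft's inequality, so a prefix-free code with these lengths exists."""
    return [math.ceil(-math.log2(p)) for p in pmf]

pmf = [0.5, 0.25, 0.125, 0.125]
lengths = shannon_code_lengths(pmf)                 # -> [1, 2, 3, 3]
entropy = -sum(p * math.log2(p) for p in pmf)       # H(p) = 1.75 bits
avg_len = sum(p * l for p, l in zip(pmf, lengths))  # expected codeword length

assert kraft_sum(lengths) <= 1.0          # Kraft: a prefix-free code exists
assert entropy <= avg_len < entropy + 1   # achievability: within 1 bit of H
print(lengths, entropy, avg_len)
```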
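A sketch of testing by betting, under the simplifying assumptions that observations lie in [0, 1] and the bet fraction `lam` is held fixed (the course presumably develops adaptive bets); the function `betting_test` and its parameters are hypothetical, not the course's construction.

```python
import random

def betting_test(xs, mu0, lam=0.5, alpha=0.05):
    """Sequential level-alpha test of H0: E[X] = mu0, for X in [0, 1],
    by betting. Under H0 the wealth process
        W_t = prod_{s <= t} (1 + lam * (x_s - mu0))
    is a nonnegative martingale, so Ville's inequality gives
    P(sup_t W_t >= 1/alpha) <= alpha; rejecting when W_t crosses
    1/alpha therefore controls the type-I error at level alpha.
    """
    wealth = 1.0
    for t, x in enumerate(xs, start=1):
        wealth *= 1.0 + lam * (x - mu0)  # bet a fraction lam on x > mu0
        if wealth >= 1.0 / alpha:
            return t                     # stopping time: H0 rejected here
    return None                          # wealth never crossed 1/alpha

random.seed(0)
xs = [random.betavariate(4, 2) for _ in range(2000)]  # true mean 2/3
print(betting_test(xs, mu0=0.5))  # H0: mean 0.5 is false -> typically rejects
```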
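The two statements at the heart of Application 2, stated informally (finite alphabet, iid sampling; the regularity condition on the set $E$, e.g. that it equals the closure of its interior, is suppressed):

```latex
% Sanov's theorem: the empirical distribution \hat{P}_n of
% X_1, ..., X_n iid ~ Q lands in a set E of pmfs with probability
% decaying at the rate of the information projection of Q onto E:
\[
  \frac{1}{n} \log \Pr\big( \hat{P}_n \in E \big)
  \;\longrightarrow\; -\inf_{P \in E} D(P \,\|\, Q).
\]
% Chernoff-Stein lemma: for testing H_0 : P_0 against H_1 : P_1 with
% the type-I error held at most \epsilon, the optimal type-II error
% \beta_n decays exponentially with exponent D(P_0 || P_1):
\[
  -\frac{1}{n} \log \beta_n \;\longrightarrow\; D(P_0 \,\|\, P_1).
\]
```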
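One standard form of the mutual-information generalization bounds in Application 3, due to Xu and Raginsky, stated here as context under the assumption that the loss is $\sigma$-subgaussian under the data distribution:

```latex
% For a learning algorithm mapping an iid sample S = (Z_1, ..., Z_n) to
% a hypothesis W, with \ell(w, Z) \sigma-subgaussian for every w, the
% expected generalization gap is controlled by I(W; S):
\[
  \big| \mathbb{E}\!\left[ L_\mu(W) - L_S(W) \right] \big|
  \;\le\; \sqrt{\frac{2 \sigma^2 \, I(W; S)}{n}},
\]
% where L_S is the empirical risk on S and L_\mu the population risk.
```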