TILP2600 From data to model (5 cr)
Description
The course focuses on statistical modeling and estimation. It starts with linear regression with one predictor and proceeds to regression with multiple predictors and model selection. The uncertainty of estimation is described using sampling distributions and confidence intervals. The course closes with the basics of Bayesian statistics and machine learning.
Learning outcomes
A student who has successfully completed the course:
- recognizes data-generating processes and is familiar with different types of data, such as text, image, and sound,
- understands the different goals of modeling: prediction, causal inference, and explanation,
- knows the concepts of expected value and variance and the basic properties of the normal distribution,
- can fit a linear model in a simple application and interpret the model,
- knows the principles of linear regression modeling with multiple predictors (least squares method),
- can examine the suitability of the model to the data and select the predictors of the regression model according to the goal of modeling,
- understands the importance of random variation and the role of the sampling distribution in statistics,
- can determine a confidence interval and interpret the results,
- knows the principles of statistical testing and can interpret the results of tests,
- understands the basic idea of Bayesian statistics,
- understands how machine learning is, at its core, statistics.
Additional information
Lectures are given in Finnish. Please email the examiner well before the exam to find out which chapters are included in the exam.
Description of prerequisites
TILP2400 Data visualization and analysis
Study materials
The exams for foreign students are based on the book Moore & McCabe (& Craig): Introduction to the Practice of Statistics. Please email the examiner well before the exam to find out which chapters are included in the exam.