Applied Data Analysis and Machine Learning: Introduction to the course, Logistics and Practicalities

Morten Hjorth-Jensen [1, 2]

[1] Department of Physics, University of Oslo
[2] Department of Physics and Astronomy and National Superconducting Cyclotron Laboratory, Michigan State University

Aug 19, 2020












Overview of first week











Lectures and ComputerLab











Course Format











Teachers

Teachers:











Deadlines for projects (tentative)

  1. Project 1: September 28 (graded with feedback)
  2. Project 2: November 2 (graded with feedback)
  3. Project 3: December 7 (graded with feedback)

Projects are handed in using Canvas. We use GitHub as the repository for codes, benchmark calculations, etc. Comments and feedback on projects are given only via Canvas.











Recommended textbooks











Prerequisites

Basic knowledge of programming and mathematics, with an emphasis on linear algebra. Knowledge of Python and/or C++ as programming languages is strongly recommended, and experience with Jupyter notebooks is recommended. Required courses are the equivalents of the University of Oslo mathematics courses MAT1100, MAT1110, MAT1120 and at least one of the corresponding computing and programming courses INF1000/INF1110 or MAT-INF1100/MAT-INF1100L/BIOS1100/KJM-INF1100. Most universities nowadays offer a basic programming course (often compulsory) where Python is the recurring programming language.











Learning outcomes

This course aims to give you insight into and knowledge of many of the central algorithms used in Data Analysis and Machine Learning. The course is project based, and through various numerical projects, normally three, you will be exposed to fundamental research problems in these fields, with the aim of reproducing state-of-the-art scientific results. Both supervised and unsupervised methods will be covered. The emphasis is on a frequentist approach, although we will try to link it with a Bayesian approach as well. You will learn to develop and structure large codes for studying different cases where Machine Learning is applied, get acquainted with computing facilities and learn to handle large scientific projects. Good scientific and ethical conduct is emphasized throughout the course. More specifically, after this course you will











Topics covered in this course: Statistical analysis and optimization of data

The course has two central parts

  1. Statistical analysis and optimization of data
  2. Machine learning

These topics will be scattered throughout the course and may not necessarily be taught separately. Rather, we will often take an approach (during the lectures and project/exercise sessions) where, say, elements from statistical data analysis are mixed with specific Machine Learning algorithms.

Statistical analysis and optimization of data.

The following topics will be covered











Topics covered in this course: Machine Learning

The following topics will be covered

Hands-on demonstrations, exercises and projects aim at deepening your understanding of these topics.











Extremely useful tools, strongly recommended

and discussed at the lab sessions.











Other courses on Data science and Machine Learning at UiO

The link https://www.mn.uio.no/english/research/about/centre-focus/innovation/data-science/studies/ gives an excellent overview of courses on Machine Learning at UiO.

  1. STK2100 Machine learning and statistical methods for prediction and classification.
  2. IN3050 Introduction to Artificial Intelligence and Machine Learning. Introductory course in machine learning and AI with an algorithmic approach.
  3. STK-INF3000/4000 Selected Topics in Data Science. The course provides insight into selected contemporary relevant topics within Data Science.
  4. IN4080 Natural Language Processing. Probabilistic and machine learning techniques applied to natural language processing.
  5. STK-IN4300 Statistical learning methods in Data Science. An advanced introduction to statistical and machine learning. For students with a good mathematics and statistics background.
  6. INF4490 Biologically Inspired Computing. An introduction to self-adapting methods, also called artificial intelligence or machine learning.
  7. IN-STK5000 Adaptive Methods for Data-Based Decision Making. Methods for adaptive collection and processing of data based on machine learning techniques.
  8. IN5400/INF5860 Machine Learning for Image Analysis. An introduction to deep learning with particular emphasis on applications within Image analysis, but useful for other application areas too.
  9. TEK5040 Deep learning for autonomous systems. The course addresses advanced algorithms and architectures for deep learning with neural networks. The course provides an introduction to how deep-learning techniques can be used in the construction of key parts of advanced autonomous systems that exist in physical environments and cyber environments.
  10. STK4051 Computational Statistics
  11. STK4021 Applied Bayesian Analysis and Numerical Methods

© 1999-2020, Morten Hjorth-Jensen. Released under CC Attribution-NonCommercial 4.0 license