Syllabus

Data Science & Machine Learning

Code
GIK2KM
Points
7.5 Credits
Level
First Cycle
School
School of Information and Engineering
Subject field
Information Systems (IKA)
Group of Subjects
Informatics/Computer and Systems Sciences
Disciplinary Domain
Technology, 100%
This course can be included in the following main field(s) of study
1. Information Systems
2. Microdata Analysis
Progression indicator within (each) main field of study
1. G1F
2. G1F
Approved
Approved, 24 September 2020.
This syllabus is valid from 11 November 2020.

Learning Outcomes

The overall goal of the course is for students to acquire in-depth knowledge and skills in the use and development of data science software, as well as basic knowledge of data science, i.e. an interdisciplinary approach to finding, extracting and discovering patterns in data using analytical methods, domain competence and technology.

Knowledge and understanding
Upon completion of this course, students will be able to:
  • Explain the data science life cycle
  • Explain Big Data and data analysis
  • Explain methods for data preparation

Skills and abilities
Upon completion of this course, students will be able to:

  • Apply unsupervised and supervised machine learning algorithms for problem-solving
  • Apply fundamental concepts in statistics and probability theory, including key concepts like probability distributions, statistical significance, hypothesis testing and regression
  • Use programming languages for data science/data analysis
  • Perform data extraction from text
  • Use exploratory data analysis (EDA) to describe the data using summary statistics and visualisation techniques

Values and attitudes
Upon completion of this course, students will be able to:
  • Interpret and analyse the results of a data extraction process, as well as evaluate the effects of choices made during the process

Course Content

The course covers the data science process, i.e. a multidisciplinary approach to finding, extracting and discovering patterns in data through a fusion of analytical methods, domain expertise and technology. In this context, the fields of data mining, forecasting, machine learning, predictive analytics, statistics and text analytics are covered.

Within the framework of the iterative data science process, business understanding is covered through problem identification: specifying the key variables that are to serve as model targets and identifying relevant data sources. It also includes formulating questions that define the business goals and that can be addressed quantitatively with data science techniques.

To control data quality, the course covers the acquisition of raw data, data processing (ETL), data exploration and modeling. To facilitate model development and to find the model that best answers the initial questions, so-called feature engineering is used, in which distinctive features are extracted and created from the raw data.
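
As a rough illustration of feature engineering, the sketch below derives a few features from raw records using only the Python standard library. The records, the field names and the `engineer_features` helper are hypothetical and are not taken from the course material.

```python
from datetime import datetime

# Hypothetical raw records; the field names are illustrative assumptions.
RAW_ROWS = [
    {"timestamp": "2020-11-11 09:30:00", "text": "Order delayed again", "amount": "120.50"},
    {"timestamp": "2020-11-14 18:05:00", "text": "Great service", "amount": "35"},
]


def engineer_features(row):
    """Derive simple model-ready features from one raw record."""
    ts = datetime.strptime(row["timestamp"], "%Y-%m-%d %H:%M:%S")
    words = row["text"].split()
    return {
        "hour": ts.hour,                  # hour of day, 0-23
        "is_weekend": ts.weekday() >= 5,  # Saturday is 5, Sunday is 6
        "word_count": len(words),         # crude text-length feature
        "amount": float(row["amount"]),   # parse the numeric string
    }


features = [engineer_features(row) for row in RAW_ROWS]
```

Each raw record becomes a flat dictionary of numeric or boolean values, which is the kind of input most modeling tools expect. Which features are actually useful depends on the questions defined during business understanding.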

Finally, the evaluation of models and analyses, the presentation of results and deployment are discussed.

Assessment

Hand-in assignments (2.5 credits), laboratories (3 credits) and mini-tests (2 credits).

Forms of Study

Lectures, laboratory work, and assignments.

Grades

The Swedish grades U–VG.

The final grade is determined by weighting the hand-in assignments and mini-tests.

Prerequisites

  • Object-Oriented Programming, 7.5 credits, first cycle, or another course in the fundamentals of programming
  • Statistical Analysis, 7.5 credits