| Module code:  PIM-DS | 
|  | 
| Hours per week: 3V+1U (3 lecture hours + 1 exercise hour, 4 hours per week) | 
| ECTS credits: 6 | 
| Semester: 1 | 
| Mandatory course: yes | 
| Language of instruction: German
 | 
| Assessment: Written exam, Duration 120 min.
 
 [updated 13.10.2024]
 
 | 
| DFI-DS (P610-0280) Computer Science, Master, ASPO 01.10.2018, semester 1, mandatory course
 KIM-DS (P221-0051) Computer Science and Communication Systems, Master, ASPO 01.10.2017, optional course, informatics specific
 PIM-DS (P221-0051) Applied Informatics, Master, ASPO 01.10.2017, semester 1, mandatory course
 
 | 
| 60 class hours (= 45 clock hours) over a 15-week period. The total student study time is 180 hours (equivalent to 6 ECTS credits).
 The remaining 135 hours are available for class preparation, follow-up work, and exam preparation.
 
 | 
| Recommended prerequisites (modules): None.
 
 | 
| Recommended as prerequisite for: PIM-DL Deep Learning
 
 
 [updated 15.11.2021]
 
 | 
| Module coordinator: Prof. Dr. Klaus Berberich
 | 
| Lecturer: Prof. Dr. Klaus Berberich
 
 
 [updated 29.07.2024]
 
 | 
| Learning outcomes: After successfully completing this module, students will be able to use suitable data analysis methods to gain insights that support decision-making on practical questions. Students will be familiar with important data analysis methods. They will know the different types of features (e.g. nominal, ordinal, metric) and will be able to preprocess data appropriately (e.g. by normalization or standardization). Students will be able to select appropriate methods (e.g. regression or classification) for specific problems. They will be able to implement the methods they have learned in a suitable programming language (e.g. Python) or use an available implementation. Students will be able to systematically determine the parameters of the applied methods on the basis of available data and critically assess the quality of their results. They will be able to present the insights gained from the data appropriately (e.g. as visualizations) so that they are understandable to a technical or non-technical audience (e.g. decision-makers in a company).
 
 [updated 13.10.2024]
 
 | 
| Module content: 1. Introduction
 
 2. Regression
 2.1 Linear regression
 2.2 Feature transformation
 2.3 Regularization
 
 3. Classification
 3.1 Logistic regression
 3.2 Decision trees
 3.3 Naive Bayes
 3.4 Support vector machines
 
 4. Cluster analysis
 4.1 Representative-based methods (k-means and k-medoids)
 4.2 Hierarchical methods
 4.3 Density-based methods
 
 5. Association rule learning
 5.1 Finding frequent item sets (Apriori and FP-Growth)
 5.2 Determining association rules
 5.3 Finding frequent sequences (GSP and PrefixSpan)
 5.4 Finding frequent strings
 5.5 Finding frequent subgraphs
 
 6. Neural networks
 6.1 Perceptron
 6.2 Multi-layer neural networks (MLPs)
 6.3 Convolutional neural networks (CNNs)
 6.4 Recurrent neural networks (RNNs)
 
 7. Data visualization
 
 
 [updated 13.10.2024]
 
 | 
| Teaching methods/Media: Slides, practical and theoretical exercises
 
 [updated 24.02.2018]
 
 | 
| Recommended or required reading: Aggarwal C.: Data Mining - The Textbook, Springer, 2015
 
 Harrington P.: Machine Learning in Action, Manning, 2012
 
 Kelleher J., Mac Namee B. and D'Arcy A.: Fundamentals of Machine Learning for Predictive Data Analytics, MIT Press, 2015
 
 Provost F. and Fawcett T.: Data Science for Business, O'Reilly, 2013
 
 Raschka S.: Machine Learning mit Python, mitp, 2017
 
 Zaki M. J. and Meira W. Jr.: Data Mining and Analysis: Fundamental Concepts and Algorithms, Cambridge University Press, 2020
 
 [updated 13.10.2024]
 
 | 
| Module offered in: WS 2024/25, WS 2023/24, WS 2022/23, WS 2021/22, WS 2020/21, ...
 |