The course covers the following topics. Data representation in multidimensional heterogeneous datasets. Dataset cleaning and feature engineering. Application of R and/or Python for machine-learning-based visualization. Cluster analysis (e.g. k-nearest-neighbor, hierarchical cluster analysis, spectral clustering, naive Bayes) and classification (e.g. logistic regression, support vector machines, decision trees, random forests, deep neural networks, recurrent neural networks). Cloud services for big data analysis are also introduced, including tools for administration of a server platform.
The course objective is to apply open source data acquisition and analysis tools to a given problem, and to demonstrate and document a practical application in each project group. Possible problem areas include analysis of data flows from social media or from sensor measurements (e.g. captured via an Arduino processor) and off-line analysis of data.
The overall objective of the course is to train an engineering approach to technical system development and project management in small project groups, and to use and evaluate a range of practical techniques for structured program development and documentation. The course begins with planning and control, emphasizing the importance of project management, risk assessment, and role division for effective time estimation. It then covers requirement specification, including the discovery, description, control, validation, and prioritization of requirements.