Notes of the Data Science courses

Brought to you by the PMDS staff

Data Mining

Collection of questions from past exams. Data representation, Data exploration, Data preparation, Association rules, Regression, Classification, Model selection, Decision trees, Classification rules, Naive Bayes, k-Nearest Neighbors, Ensemble methods, Clustering, Hierarchical clustering, Representative-based clustering, Density-based clustering, Clustering validation

Machine Learning

Linear Regression, Basis Functions, Direct Approaches, Discriminative Approaches, Regularization Techniques, Bayesian Linear Regression, Linear Classifiers, Discriminant Functions, Probabilistic Discriminative and Generative Approaches, Bias-Variance Tradeoff, Model Ensembles, PAC-Learning and VC-Dimensions, Kernel Methods, Support Vector Machines (SVM), Reinforcement Learning, Markov Decision Processes, Dynamic Programming for MDPs, Monte Carlo Methods, Temporal Difference Learning, Model-Free Control, Multi-Armed Bandits. Topics not covered: Gaussian Processes, Radial Basis Functions and Continuous RL

Model Identification and Data Analysis 1

Stochastic processes, White Noise, MA, AR, ARX, ARMA, ARMAX, Prediction problem, PEM methods, Least Squares method, Maximum Likelihood method, Model complexity selection, Cross-validation, Final Prediction Error (FPE), Akaike Information Criterion (AIC), Minimum Description Length (MDL), Durbin-Levinson algorithm, Recursive Least Squares

Model Identification and Data Analysis 2

Black Box non-parametric systems identification of I/O systems using state space models, Parametric Black Box System Identification of I/O Systems (using a frequency domain approach), Kalman Filters, SW-sensing with Black Box Models, Gray Box System Identification, Minimum Variance Control (MVC), Discretization of Analog systems

Natural Language Processing

Error Correction, N-gram Language Models, Part-Of-Speech Tagging, Formal Grammars, Syntactic Parsing, Statistical Parsing, Dependency Parsing, Representation of sentence meaning, Semantic Analysis, Summarization, Coreference Resolution, Discourse Coherence, Dialogue Systems, Advanced Dialogue Systems, Lexicons for Sentiment, Affect and Connotation, Computational Phonology, Text-To-Speech, Machine Translation

Soft Computing

Introduction to Machine Learning, Maximum Likelihood Estimation, Perceptron, Hebbian learning, Feedforward Neural Networks, Multi Layer Perceptron, Gradient Descent, Backpropagation, Early Stopping, Weight Decay, Recurrent Neural Networks, Feedforward with Delayed Input, Elman Networks, Vanishing Gradient, LSTM, Deep Learning, Neural Networks Autoencoders, Convolutional Neural Networks

Unstructured and Streaming Data Engineering

Introduction to Big Data, Introduction to NoSQL, Graph Stores, Neo4J, Key-Value Stores, Redis, Columnar Databases, Cassandra, Document Databases, MongoDB, Introduction to Streaming, EPL, Kafka, KSQL, Spark, Flux, Web APIs, Web Scraping, Data Wrangling, Crowdsourcing

Recommender Systems

Non-Personalized Recommenders, Quality of RS, Content Based Filtering, Collaborative Filtering, Memory Based and Model Based techniques, Association Rules, SLIM, Matrix Factorization, Hybrid Recommender Systems, Context Aware Recommender Systems, Factorization Machines, Graph-Based Recommender Systems

Performance Evaluation and Applications

Performance Modelling, Workload Characterization, Workload Modelling, Probability Distributions, Trace Generation, State Machines, Stochastic Processes, Markov Chains, Phase Type Distribution, Markov Arrival Processes, Queueing Systems, Non-Markovian Queueing Systems, Queueing Networks, Separable and Multi-class Models, Advanced Queueing Network Features

Uncertainty in AI

Fuzzy Systems, Evidence Theory, Graphical Models, Probabilistic Reasoning, Bayesian Networks, Bayesian Network Inference, Dynamic Bayesian Networks

Bioinformatics and Computational Biology

Mendelian Genetics, Molecular Genetics, Biomolecular Sequence Analysis, Gene Expression Measurement and Analysis, Introduction to Biological Networks, Biomolecular Databanks, Bio-Terminologies and Bio-Ontologies,