Duration: 2 Days

Overview: While many machine learning tasks, such as propensity modelling, have become standardised to the point of near automation, detecting anomalies in large, complex datasets remains a fundamental challenge that often requires bespoke, creative solutions. There is, however, a core set of techniques and design patterns that can be built upon for anomaly detection problems in domains such as fraud detection, risk identification, and classification of rare events. Through presentations, real-world examples, discussions, and workshops, this course introduces the most important of these. This course can be delivered in R, Python, or SAS.

At Course Completion: This course has been designed to equip delegates with the most important anomaly detection techniques and design patterns, and an understanding of how they should be applied to build real-world-relevant solutions. After completing the course delegates will be able to:

  • Frame a wide range of problems as anomaly detection problems and determine the appropriate techniques and patterns to use to solve them
  • Apply appropriate techniques to perform univariate outlier detection
  • Select and apply appropriate techniques for detecting anomalies in time series data
  • Perform anomaly detection in multivariate data using machine learning techniques
  • Design and implement solutions for anomaly detection in datasets of specific formats such as graph data or transactional data
  • Evaluate the performance of anomaly detection techniques

Who Should Attend: This course is aimed at people who are familiar with the use of machine learning techniques for standard tasks like classification and forecasting, and who would like to expand their repertoire with other techniques that can be used for less standard anomaly detection problems. This course is ideally suited to people working in data analyst, data science, business analyst, statistician, or similar roles wishing to add anomaly detection skills to their repertoire.

Prerequisites: To attend this course delegates should be familiar with fundamental concepts in data manipulation, descriptive statistics, and machine learning. Specifically, delegates should be comfortable building and evaluating classification models (using techniques such as logistic regression, decision trees, support vector machines, or random forests). In addition, delegates should be capable of using the technology through which the course is delivered (a list of specific aspects of the technology used with which delegates should be familiar is available on request).

Outline: The course will run over two days and will broadly follow the timetable shown below. The course will be delivered through presentations, real-world examples, discussions, and workshops.

Day    Time       Topic
Day 1  Morning    Introduction to Anomaly Detection
                  Univariate Anomaly Detection Techniques
       Afternoon  Detecting Anomalies in Time Series Data
Day 2  Morning    Multivariate Anomaly Detection Techniques
       Afternoon  Anomaly Detection Techniques for Graph and Transactional Data