Presentations

WTC-2019

Here is the Markdown version -- made with M2MD -- of the presentation notebook:

"Anomalies, breaks, and outliers detection in time series".

Here is the abstract:

In this presentation we show, explain, and compare methods for finding anomalies, breaks, and outliers in time series. We are interested in finding anomalies in both a single time series and a collection of time series.

We (mostly) employ non-parametric methods. First, we look at some motivational examples from well-known datasets. Then we look into definitions of anomalies and definitions for measuring the success of time series anomaly detection.

For a single time series we apply both Wolfram Language (WL) built-in algorithms and additional, specialized algorithms. We discuss in more detail algorithms based on K-Nearest Neighbors (KNN), Dimension Reduction, Linear Regression, Quantile Regression, and Prefix Trees.
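As an illustration of the KNN idea for a single time series, here is a minimal sketch in Python. It is not the implementation used in the presentation: the function name `knn_anomaly_scores`, the sliding-window embedding, and the parameters `window` and `k` are assumptions made for this example. Each window of the series is scored by its mean distance to its `k` nearest other windows, so windows unlike the rest of the series get high scores.

```python
from statistics import mean


def knn_anomaly_scores(values, window=3, k=2):
    """Score each sliding window by its mean distance to its k nearest other windows."""
    # Embed the series as overlapping windows (a simple, assumed representation).
    windows = [values[i:i + window] for i in range(len(values) - window + 1)]
    scores = []
    for i, w in enumerate(windows):
        # Euclidean distances from this window to every other window, ascending.
        dists = sorted(
            sum((a - b) ** 2 for a, b in zip(w, other)) ** 0.5
            for j, other in enumerate(windows)
            if j != i
        )
        scores.append(mean(dists[:k]))
    return scores


# Hypothetical data: a flat series with one spike at index 5.
series = [1.0, 1.1, 0.9, 1.0, 1.2, 5.0, 1.0, 0.9, 1.1, 1.0]
scores = knn_anomaly_scores(series)
most_anomalous_start = max(range(len(scores)), key=scores.__getitem__)
print(most_anomalous_start, [round(s, 2) for s in scores])
```

The windows that contain the spike receive scores far above those of the other windows. The window size and `k` are tuning choices, and a real analysis would likely normalize the series first.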

For collections of time series we discuss: transformations into uniform representations, simple outlier finding based on variable distributions, finding anomalous trends, finding anomalies with KNN, and other related algorithms.
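Simple outlier finding based on a variable's distribution can be sketched as follows. This is a hedged Python example, not code from the presentation: the function `tukey_outliers`, the choice of the per-series maximum as the summary variable, and the quartile fences with factor 1.5 are all assumptions made for illustration.

```python
from statistics import quantiles


def tukey_outliers(feature_by_name, factor=1.5):
    """Return the names whose feature value lies outside the quartile (Tukey) fences."""
    values = list(feature_by_name.values())
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    lower, upper = q1 - factor * iqr, q3 + factor * iqr
    return sorted(
        name for name, v in feature_by_name.items() if v < lower or v > upper
    )


# Hypothetical collection of time series keyed by name.
collection = {
    "a": [1.0, 1.2, 0.9, 1.1],
    "b": [1.1, 1.0, 1.0, 1.2],
    "c": [0.9, 1.1, 1.0, 1.0],
    "d": [1.0, 0.9, 1.1, 1.0],
    "e": [1.0, 9.0, 1.1, 0.9],
}

# Uniform representation: one summary variable per series (here, the maximum).
max_by_name = {name: max(ts) for name, ts in collection.items()}
flagged = tukey_outliers(max_by_name)
print(flagged)
```

Only series `e` is flagged, because its maximum lies far outside the fences computed from the other maxima. Other summary variables (mean, range, slope) can be plugged in the same way.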

We also discuss how anomaly finding helps in producing faithful simulations of multi-variable datasets.

Concrete, real-life time series are used in the examples.

See the related dedicated MathematicaVsR project "Time series anomalies, breaks, and outliers detection".

UseR!-2020

Here is a video recording:

"How to simplify Machine learning workflows specifications - useR! 2020 Conference" (YouTube).

Here are the slides for the (lightning) talk: HTML, Markdown, Rmd.

Here is the (extended) abstract of the proposed presentation:

"How to simplify Machine learning workflows specifications?".

Versions: HTML, Rmd.

Here is the submitted abstract:

In this presentation we discuss a systematic approach to software development that gives us the ability to rapidly specify Machine Learning (ML) computations using both programming Domain Specific Languages (DSLs) and natural language commands. We present in detail the selection of programming paradigms, languages, and packages.

A central topic of the presentation is the transformation of sequences of natural language commands into corresponding DSL pipelines for ML computations.

We use monadic programming and code generation to implement the ML packages. We use Raku (Perl 6) for grammar specifications, parser generation, and interpreters.
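To give a feel for what a monadic pipeline looks like, here is a minimal sketch in Python. It is only an illustration under stated assumptions: the class `Pipeline`, the helper `lift`, and the step labels are hypothetical, and the R packages referred to in the abstract have their own designs. The sketch follows the writer-monad pattern: each step returns a wrapped value together with a log entry, and `bind` chains steps while accumulating the log.

```python
class Pipeline:
    """A minimal writer-style monad: a value plus a log of the steps applied."""

    def __init__(self, value, log=()):
        self.value = value
        self.log = tuple(log)

    def bind(self, step):
        # `step` takes a plain value and returns a Pipeline; the logs are concatenated.
        result = step(self.value)
        return Pipeline(result.value, self.log + result.log)


def lift(func, label):
    """Turn a plain function into a pipeline step that records `label`."""
    return lambda value: Pipeline(func(value), (label,))


result = (
    Pipeline([3.0, 1.0, 2.0, 10.0])
    .bind(lift(sorted, "sort"))
    .bind(lift(lambda xs: xs[:-1], "drop largest"))
    .bind(lift(lambda xs: sum(xs) / len(xs), "mean"))
)
print(result.value, result.log)
```

Because every step has the same shape, a code generator only needs to emit a sequence of `bind` calls, which is what makes this style convenient as a target for translated commands.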

Numerous examples are given, based on ML packages written in R and on English-based grammar rules. We look into code generation of ML workflows for supervised learning, time series analysis, latent semantic analysis, and recommendations. We show how the same natural language commands can be used to generate pipelines in other programming languages.

Finally, we discuss extensions of the presented approach to (1) handling wrong commands and spelling mistakes, (2) using multiple natural languages, and (3) making conversational agents.

RStudio::global (2021)

Here is the call for talks page: "rstudio::global() call for talks".

Here is the submitted abstract:

TITLE: Multi-language Data Wrangling Translations

ABSTRACT:

This presentation discusses how to facilitate the rapid specification of data wrangling programming code using natural language commands.

We want to do that because:

Often we have to apply the same data wrangling workflows within different programming languages and/or packages.
It might be time-consuming to express those workflows with the concrete language/package logic and syntax.
Natural language workflows are "universal".

We demonstrate data transformation code generation for different programming languages/packages. We focus on these three: R-base, R-tidyverse, Python-pandas.
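The idea of generating code for several targets from the same commands can be sketched with a toy translator. This Python example is an assumption-laden illustration, not the system described in the abstract: it uses a few regular-expression rules in place of real grammars, and the names `RULES`, `translate`, and the three supported command forms are invented for this sketch.

```python
import re

# Each rule pairs a command pattern with one code template per target.
RULES = [
    (re.compile(r"use (?:the )?data frame (\w+)$", re.IGNORECASE),
     {"pandas": "obj = {0}.copy()", "tidyverse": "{0}"}),
    (re.compile(r"select (?:the )?columns (.+)$", re.IGNORECASE),
     {"pandas": "obj = obj[[{quoted}]]", "tidyverse": "dplyr::select({bare})"}),
    (re.compile(r"sort by (.+)$", re.IGNORECASE),
     {"pandas": "obj = obj.sort_values([{quoted}])", "tidyverse": "dplyr::arrange({bare})"}),
]

# How the generated steps are joined into one pipeline per target.
SEPARATORS = {"pandas": "\n", "tidyverse": " %>%\n"}


def split_names(text):
    """Split 'a, b and c' into ['a', 'b', 'c']."""
    return [n for n in re.split(r"\s*,\s*|\s+and\s+", text.strip()) if n]


def translate(commands, target):
    """Translate semicolon-separated commands into code for `target`."""
    lines = []
    for command in commands.split(";"):
        command = command.strip()
        if not command:
            continue
        for pattern, templates in RULES:
            match = pattern.match(command)
            if match:
                names = split_names(match.group(1))
                lines.append(templates[target].format(
                    match.group(1),
                    quoted=", ".join(f'"{n}"' for n in names),
                    bare=", ".join(names),
                ))
                break
        else:
            raise ValueError(f"Unrecognized command: {command!r}")
    return SEPARATORS[target].join(lines)


commands = "use data frame dfTitanic; select columns name and age; sort by age"
pandas_code = translate(commands, "pandas")
tidyverse_code = translate(commands, "tidyverse")
print(pandas_code)
print(tidyverse_code)
```

The same command sequence yields a Python-pandas script and an R-tidyverse pipeline. Supporting another target amounts to adding one more template per rule and one more separator.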

In addition to code generation examples we also outline the utilized software strategy and architecture and the unit testing procedures.
