Presentations

WTC-2019

Here is the Markdown version -- made with M2MD -- of the presentation notebook:

Here is the abstract:

In this presentation we show, explain, and compare methods for finding anomalies, breaks, and outliers in time series. We are interested in finding anomalies in both a single time series and a collection of time series.

We (mostly) employ non-parametric methods. First, we look at some motivational examples from well-known datasets. Then we look into definitions of anomalies and into ways of measuring the success of time series anomaly detection.

For a single time series we apply both built-in Wolfram Language (WL) algorithms and additional, specialized algorithms. We discuss in more detail algorithms based on K-Nearest Neighbors (KNN), Dimension Reduction, Linear Regression, Quantile Regression, and Prefix Trees.
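As a rough illustration of the KNN idea mentioned above, here is a minimal Python sketch. It is not the presentation's implementation: the function names `knn_scores` and `find_anomalies`, the choice of `k`, and the quartile-based threshold are all assumptions made for this example. Each value is scored by its mean distance to its `k` nearest other values, and values with unusually large scores are flagged. The sketch ignores the time axis entirely; a more realistic version would probably score sliding windows rather than single values.

```python
from statistics import quantiles


def knn_scores(values, k=3):
    """Mean absolute distance from each value to its k nearest other values."""
    scores = []
    for i, v in enumerate(values):
        dists = sorted(abs(v - w) for j, w in enumerate(values) if j != i)
        scores.append(sum(dists[:k]) / k)
    return scores


def find_anomalies(values, k=3, factor=1.5):
    """Indices whose KNN score exceeds Q3 + factor * IQR of all scores.

    The threshold rule is an assumption of this sketch, not a method
    taken from the presentation.
    """
    scores = knn_scores(values, k)
    q1, _, q3 = quantiles(scores, n=4)
    threshold = q3 + factor * (q3 - q1)
    return [i for i, s in enumerate(scores) if s > threshold]


# Made-up series with one obvious spike at index 5.
series = [10.1, 10.3, 9.8, 10.0, 10.2, 25.0, 9.9, 10.4, 10.1, 9.7]
anomalies = find_anomalies(series)
print(anomalies)
```

On this made-up series only the spike at index 5 is flagged, because its nearest neighbors are all far away while every other value has several close neighbors.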

For collections of time series we discuss: transformations into uniform representations, simple outlier finding based on variable distributions, finding anomalous trends, finding anomalies with KNN, and other related algorithms.

We also discuss how finding anomalies helps in producing faithful simulations of multivariable datasets.

Concrete, real-life time series are used in the examples.

See the related dedicated MathematicaVsR project "Time series anomalies, breaks, and outliers detection".

UseR!-2020

Here is a video recording:

Here are the slides for the (lightning) talk: HTML, Markdown, Rmd.

Here is the (extended) abstract of the proposed presentation:

Versions: HTML, Rmd.

Here is the submitted abstract:

In this presentation we discuss a systematic approach to software development that gives us the ability to rapidly specify Machine Learning (ML) computations using both programming Domain Specific Languages (DSLs) and natural language commands. We present in detail the selection of programming paradigms, languages, and packages.

A central topic of the presentation is the transformation of sequences of natural language commands into corresponding DSL pipelines for ML computations.

We use monadic programming and code generation to implement the ML packages. We use Raku (Perl 6) for grammar specifications, parser generation, and interpreters.
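To give a feel for what a monadic pipeline looks like, here is a minimal Python sketch. It is only an illustration under stated assumptions: the class `Pipeline` and the steps `drop_missing`, `rescale`, and `total` are hypothetical names invented for this example, and the sketch uses a simple Maybe-like monad rather than whatever design the ML packages actually use. The point is that each step takes a plain value and returns a wrapped one, so steps chain uniformly and a failure in one step skips the rest.

```python
class Pipeline:
    """A tiny Maybe-like monad: wraps a value and short-circuits on failure."""

    def __init__(self, value, failed=False):
        self.value = value
        self.failed = failed

    @classmethod
    def unit(cls, value):
        """Lift a plain value into the pipeline."""
        return cls(value)

    def bind(self, step):
        """Apply `step` (value -> Pipeline) unless an earlier step failed."""
        if self.failed:
            return self
        return step(self.value)


# Hypothetical pipeline steps; each returns a Pipeline.
def drop_missing(xs):
    return Pipeline.unit([x for x in xs if x is not None])


def rescale(xs):
    # Fail instead of raising when there is nothing sensible to rescale by.
    if not xs or max(xs) == 0:
        return Pipeline(None, failed=True)
    top = max(xs)
    return Pipeline.unit([x / top for x in xs])


def total(xs):
    return Pipeline.unit(sum(xs))


ok = Pipeline.unit([2, None, 4, 8]).bind(drop_missing).bind(rescale).bind(total)
bad = Pipeline.unit([None, None]).bind(drop_missing).bind(rescale).bind(total)
print(ok.value, ok.failed)    # 1.75 False
print(bad.value, bad.failed)  # None True
```

Because every step has the same shape, a code generator only has to emit a sequence of `bind` calls, which is one reason this style pairs well with generated pipelines.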

We use numerous examples based on ML packages written in R and on English-based grammar rules. We look into code generation of ML workflows for supervised learning, time series analysis, latent semantic analysis, and recommendations. We show how the same natural language commands can be used to generate pipelines in other programming languages.

Finally, we discuss the extensions of the presented approach to (1) handling wrong commands and spelling mistakes, (2) using multiple natural languages, and (3) making conversational agents.

RStudio::global (2021)

Here is the call for talks page: "rstudio::global() call for talks".

Here is the submitted abstract:

TITLE: Multi-language Data Wrangling Translations

ABSTRACT:

This presentation discusses how to facilitate the rapid specification of data wrangling code using natural language commands.

We want to do that because:

  1. Often we have to apply the same data wrangling workflows within different programming languages and/or packages

  2. It might be time-consuming to express those workflows with the concrete language/package logic and syntax

  3. Natural language workflows are "universal"

We demonstrate data transformation code generation for different programming languages/packages. We focus on these three: R-base, R-tidyverse, Python-pandas.
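The idea of generating code for several targets from one command sequence can be sketched in a few lines of Python. This is a toy illustration only: the two command patterns, the `TEMPLATES` table, the placeholder variable `obj`, and the functions `parse` and `translate` are assumptions made for this example and do not reproduce the grammars or generators used in the presentation.

```python
import re

# Hypothetical per-target code templates for two toy commands.
TEMPLATES = {
    "select": {
        "R-base": "obj <- obj[, c({qcols})]",
        "R-tidyverse": "obj <- dplyr::select(obj, {cols})",
        "Python-pandas": "obj = obj[[{qcols}]]",
    },
    "sort": {
        "R-base": "obj <- obj[order(obj${col}), ]",
        "R-tidyverse": "obj <- dplyr::arrange(obj, {col})",
        "Python-pandas": 'obj = obj.sort_values("{col}")',
    },
}


def parse(command):
    """Map one English command to (operation, arguments)."""
    text = command.strip()
    m = re.fullmatch(r"select (?:the )?columns? (.+)", text, re.IGNORECASE)
    if m:
        parts = re.split(r"\s*,\s*(?:and\s+)?|\s+and\s+", m.group(1))
        cols = [c for c in parts if c]
        return "select", {
            "cols": ", ".join(cols),
            "qcols": ", ".join(f'"{c}"' for c in cols),
        }
    m = re.fullmatch(r"sort by (\w+)", text, re.IGNORECASE)
    if m:
        return "sort", {"col": m.group(1)}
    raise ValueError(f"Unrecognized command: {command!r}")


def translate(commands, target):
    """Generate one line of target-language code per command."""
    lines = []
    for command in commands:
        op, args = parse(command)
        lines.append(TEMPLATES[op][target].format(**args))
    return "\n".join(lines)


workflow = ["select columns mass and height", "sort by mass"]
for target in ("R-base", "R-tidyverse", "Python-pandas"):
    print(f"# {target}")
    print(translate(workflow, target))
```

Keeping the parsing step separate from the per-target templates is the design point: adding a new target language only means adding templates, while the natural language side stays unchanged.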

In addition to code generation examples, we outline the software strategy and architecture used, as well as the unit testing procedures.