Top Papers

Probabilistic Programming in Python using PyMC

29 Jul 2015 · pymc-devs/pymc3

Probabilistic programming (PP) allows flexible specification of Bayesian statistical models in code.

COMPUTATION

A large-scale analysis of racial disparities in police stops across the United States

18 Jun 2017 · rfordatascience/tidytuesday

We find that black drivers are stopped more often than white drivers relative to their share of the driving-age population, but that Hispanic drivers are stopped less often than whites.

APPLICATIONS

HyperTools: A Python toolbox for visualizing and manipulating high-dimensional data

28 Jan 2017 · ContextLab/hypertools

Just as the position of an object moving through space can be visualized as a 3D trajectory, HyperTools uses dimensionality reduction algorithms to create similar 2D and 3D trajectories for time series of high-dimensional observations.

OTHER STATISTICS

Computing Extremely Accurate Quantiles Using t-Digests

11 Feb 2019 · tdunning/t-digest

We present on-line algorithms for computing approximations of rank-based statistics that give high accuracy, particularly near the tails of a distribution, with very small sketches.

SEQUENTIAL QUANTILE ESTIMATION · COMPUTATION · DATA STRUCTURES AND ALGORITHMS

Estimating the number of infections and the impact of non-pharmaceutical interventions on COVID-19 in European countries: technical description update

23 Apr 2020 · ImperialCollegeLondon/covid19model

Our model estimates these changes by calculating backwards from temporal data on observed deaths to estimate the number of infections and rate of transmission that occurred several weeks prior, allowing for a probabilistic time lag between infection and death.

APPLICATIONS · METHODOLOGY
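
The probabilistic time lag between infection and death mentioned in the entry above can be illustrated with a small forward calculation: expected daily deaths as past infections weighted by a delay distribution. This is only a sketch under stated assumptions, not the code from the linked repository. The function name `expected_deaths`, the toy delay distribution, and the fatality ratio are all hypothetical values chosen for illustration.

```python
from typing import List, Sequence


def expected_deaths(
    infections: Sequence[float],
    delay_pmf: Sequence[float],
    fatality_ratio: float,
) -> List[float]:
    """Expected deaths per day: infections convolved with an infection-to-death delay.

    delay_pmf[lag] is the (assumed) probability that a death occurs `lag` days
    after infection; fatality_ratio is the (assumed) share of infections that
    end in death.
    """
    result = []
    for day in range(len(infections)):
        total = 0.0
        for lag, probability in enumerate(delay_pmf):
            if lag > day:
                break  # no infections recorded before day 0
            total += infections[day - lag] * probability
        result.append(fatality_ratio * total)
    return result


# Toy inputs (illustrative only, not real data): one burst of infections on day 0.
toy_infections = [100.0, 0.0, 0.0, 0.0, 0.0]
toy_delay_pmf = [0.0, 0.25, 0.5, 0.25]
print(expected_deaths(toy_infections, toy_delay_pmf, fatality_ratio=0.01))
```

Estimating infections from observed deaths means inverting this relationship, which the entry describes as being done probabilistically rather than by direct deconvolution.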

ruptures: change point detection in Python

2 Jan 2018 · deepcharles/ruptures

ruptures is a Python library for offline change point detection.

COMPUTATION · MATHEMATICAL SOFTWARE
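
To make "offline change point detection" in the entry above concrete, here is a minimal standard-library sketch that finds a single change in the mean of a fully observed signal by minimizing a least-squares cost. It does not use the ruptures API. The function names `sse` and `best_single_change_point` are hypothetical, and the restriction to one change point is a simplifying assumption.

```python
from typing import Sequence


def sse(values: Sequence[float]) -> float:
    """Sum of squared deviations from the segment mean (least-squares cost)."""
    if not values:
        return 0.0
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values)


def best_single_change_point(signal: Sequence[float]) -> int:
    """Return the split index k that minimizes cost(signal[:k]) + cost(signal[k:]).

    "Offline" here means the whole signal is available before the search starts.
    """
    if len(signal) < 2:
        raise ValueError("need at least two samples")
    return min(
        range(1, len(signal)),
        key=lambda k: sse(signal[:k]) + sse(signal[k:]),
    )


# Toy signal (illustrative only): the mean jumps from about 0 to about 5.
toy_signal = [0.1, -0.2, 0.0, 0.1, 5.1, 4.9, 5.0, 5.2]
print(best_single_change_point(toy_signal))
```

This exhaustive search recomputes each segment cost from scratch, which is quadratic in the signal length. That is acceptable for a short illustration but not for long signals or multiple change points.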

NILMTK: An Open Source Toolkit for Non-intrusive Load Monitoring

15 Apr 2014 · nilmtk/nilmtk

We demonstrate the range of reproducible analyses made possible by our toolkit, including the analysis of six publicly available data sets and the evaluation of both benchmark disaggregation algorithms across those data sets.

APPLICATIONS

Bambi: A simple interface for fitting Bayesian linear models in Python

19 Dec 2020 · bambinos/bambi

The popularity of Bayesian statistical methods has increased dramatically in recent years across many research areas and industrial applications.

COMPUTATION

Estimating Treatment Effects with Causal Forests: An Application

20 Feb 2019 · grf-labs/grf

We apply causal forests to a dataset derived from the National Study of Learning Mindsets, and consider resulting practical and conceptual challenges.

METHODOLOGY

Beyond Random Walk and Metropolis-Hastings Samplers: Why You Should Not Backtrack for Unbiased Graph Sampling

18 Apr 2012 · benedekrozemberczki/littleballoffur

In this paper, we propose non-backtracking random walk with re-weighting (NBRW-rw) and MH algorithm with delayed acceptance (MHDA) which are theoretically guaranteed to achieve, at almost no additional cost, not only unbiased graph sampling but also higher efficiency (smaller asymptotic variance of the resulting unbiased estimators) than the SRW-rw and the MH algorithm, respectively.

METHODOLOGY · DATA STRUCTURES AND ALGORITHMS · NETWORKING AND INTERNET ARCHITECTURE · SOCIAL AND INFORMATION NETWORKS · DATA ANALYSIS, STATISTICS AND PROBABILITY · PHYSICS AND SOCIETY
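
The non-backtracking random walk with re-weighting named in the entry above can be sketched on a toy undirected graph as follows. This is an illustrative reading of the idea, not the paper's or the linked repository's implementation. The function names, the example graph, and the choice of average degree as the target quantity are assumptions. The walk visits nodes in proportion to their degree in the long run, so each sample is weighted by 1/degree to correct that bias.

```python
import random
from typing import Dict, List, Optional

Graph = Dict[int, List[int]]


def non_backtracking_walk(
    graph: Graph, start: int, steps: int, rng: random.Random
) -> List[int]:
    """Random walk that never returns to the previous node unless it has no other choice."""
    path = [start]
    previous: Optional[int] = None
    current = start
    for _ in range(steps):
        neighbors = graph[current]
        candidates = [n for n in neighbors if n != previous]
        if not candidates:  # degree-1 node: backtracking is forced
            candidates = neighbors
        previous, current = current, rng.choice(candidates)
        path.append(current)
    return path


def reweighted_mean_degree(graph: Graph, path: List[int]) -> float:
    """Estimate the average degree from degree-biased samples using weights 1/degree."""
    inverse_degrees = [1.0 / len(graph[node]) for node in path]
    return len(path) / sum(inverse_degrees)


# Toy undirected graph (hypothetical): edges 0-1, 0-2, 0-3, 1-2, 2-3, 3-4.
toy_graph: Graph = {0: [1, 2, 3], 1: [0, 2], 2: [0, 1, 3], 3: [0, 2, 4], 4: [3]}
walk = non_backtracking_walk(toy_graph, start=0, steps=20000, rng=random.Random(0))
print(reweighted_mean_degree(toy_graph, walk))  # true average degree is 12 / 5 = 2.4
```

With weights 1/degree and the degree itself as the quantity of interest, the re-weighted estimator reduces to the harmonic mean of the sampled degrees, which is why `reweighted_mean_degree` needs only one sum.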