Shell Theory: A Statistical Model of Reality
The foundational assumption of machine learning is that the data under consideration is separable into classes; while intuitively reasonable, separability constraints have proven remarkably difficult to formulate mathematically. We believe this problem is rooted in the mismatch between existing statistical techniques and commonly encountered data: object representations are typically high dimensional, but statistical techniques tend to treat high dimensions as a degenerate case. To address this problem, we develop a dedicated statistical framework for machine learning in high dimensions. The framework derives from the observation that object relations form a natural hierarchy; this leads us to model objects as instances of high dimensional, hierarchical generative processes. Using a distance-based statistical technique, also developed in this paper, we show that in such generative processes, instances of each process in the hierarchy are almost-always encapsulated by a distinctive-shell that excludes almost-all other instances. The result is shell theory, a statistical machine learning framework in which separability constraints (distinctive-shells) are formally derived from the assumed generative process.
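The shell idea in the abstract can be illustrated with a small NumPy sketch. This is not the paper's algorithm: the Gaussian generative processes, the dimension, and the helper names `fit_shell` and `shell_score` are all assumptions made for illustration. It shows only the underlying intuition that, in high dimensions, samples from one process sit at a nearly constant distance from their mean, so a thin shell around that mean excludes samples drawn from a broader parent process.

```python
import numpy as np


def fit_shell(train):
    """Estimate a shell (center, radius) from in-class samples (hypothetical helper)."""
    center = train.mean(axis=0)
    radius = np.linalg.norm(train - center, axis=1).mean()
    return center, radius


def shell_score(x, center, radius):
    """Anomaly score: deviation of each sample's distance-to-center from the shell radius."""
    return np.abs(np.linalg.norm(x - center, axis=1) - radius)


rng = np.random.default_rng(0)
dim = 2000  # assumed dimensionality; the effect strengthens as dim grows

# Assumed two-level hierarchy: a parent process N(0, 1) generates the class mean,
# and the class process adds narrower noise N(0, 0.5^2) around that mean.
class_mean = rng.normal(0.0, 1.0, dim)
inliers_train = class_mean + rng.normal(0.0, 0.5, (500, dim))
inliers_test = class_mean + rng.normal(0.0, 0.5, (200, dim))
outliers_test = rng.normal(0.0, 1.0, (200, dim))  # instances of the parent process

center, radius = fit_shell(inliers_train)
in_scores = shell_score(inliers_test, center, radius)
out_scores = shell_score(outliers_test, center, radius)

print("shell radius:", radius)
print("largest inlier score:", in_scores.max())
print("smallest outlier score:", out_scores.min())
```

Under these assumptions the inlier distances concentrate near `0.5 * sqrt(dim)`, so every held-out inlier scores lower than every outlier. Real image features would need the normalization steps described in the paper, which this sketch omits.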
Results from the Paper
Ranked #1 on Unsupervised Anomaly Detection with Specified Settings -- 10% anomaly on STL-10 (using extra training data)
Task | Dataset | Model | Metric Name | Metric Value | Global Rank | Uses Extra Training Data | Benchmark |
---|---|---|---|---|---|---|---|
Anomaly Detection | ASSIRA Cat Vs Dog | Shell-based Anomaly (supervised) | ROC AUC | 99.9 | # 1 | ||
Unsupervised Anomaly Detection with Specified Settings -- 30% anomaly | ASSIRA Cat Vs Dog | Shell-Renormalized | AUC-ROC | 0.617 | # 6 | ||
Unsupervised Anomaly Detection with Specified Settings -- 10% anomaly | Cats and Dogs | Shell-Renormalized | AUC-ROC | 0.996 | # 1 | ||
Unsupervised Anomaly Detection with Specified Settings -- 20% anomaly | Cats and Dogs | Shell-Renormalized | AUC-ROC | 0.953 | # 1 | ||
Unsupervised Anomaly Detection with Specified Settings -- 0.1% anomaly | Cats and Dogs | Shell-Renormalized | AUC-ROC | 0.866 | # 3 | ||
Unsupervised Anomaly Detection with Specified Settings -- 20% anomaly | CIFAR-10 | Shell-Renormalized | AUC-ROC | 0.896 | # 1 | ||
Unsupervised Anomaly Detection with Specified Settings -- 1% anomaly | CIFAR-10 | Shell-Renormalized | AUC-ROC | 0.756 | # 5 | ||
Unsupervised Anomaly Detection with Specified Settings -- 0.1% anomaly | CIFAR-10 | Shell-Renormalized | AUC-ROC | 0.740 | # 5 | ||
Unsupervised Anomaly Detection with Specified Settings -- 10% anomaly | CIFAR-10 | Shell-Renormalized | AUC-ROC | 0.895 | # 2 | ||
Unsupervised Anomaly Detection with Specified Settings -- 30% anomaly | CIFAR-10 | Shell-Renormalized | AUC-ROC | 0.894 | # 1 | ||
Anomaly Detection | Fashion-MNIST | Shell-based Anomaly (supervised) | ROC AUC | 92.1 | # 8 | ||
Anomaly Detection | STL-10 | Shell-based Anomaly (supervised) | ROC AUC | 99.2 | # 1 | ||
Unsupervised Anomaly Detection with Specified Settings -- 1% anomaly | STL-10 | Shell-Renormalized | AUC-ROC | 0.829 | # 5 | ||
Unsupervised Anomaly Detection with Specified Settings -- 0.1% anomaly | STL-10 | Shell-Renormalized | AUC-ROC | 0.803 | # 4 | ||
Unsupervised Anomaly Detection with Specified Settings -- 20% anomaly | STL-10 | Shell-Renormalized | AUC-ROC | 0.999 | # 1 | ||
Unsupervised Anomaly Detection with Specified Settings -- 30% anomaly | STL-10 | Shell-Renormalized | AUC-ROC | 0.999 | # 1 | ||
Unsupervised Anomaly Detection with Specified Settings -- 10% anomaly | STL-10 | Shell-Renormalized | AUC-ROC | 0.997 | # 1 |