1 code implementation • 12 Mar 2024 • Yutong Wang, Rishi Sonthalia, Wei Hu
Under a random matrix theoretic assumption on the data distribution and an eigendecay assumption on the data covariance matrix $\boldsymbol{\Sigma}$, we demonstrate that any near-interpolator exhibits rapid norm growth: for fixed $\tau$, $\boldsymbol{\beta}$ has squared $\ell_2$-norm $\mathbb{E}[\|{\boldsymbol{\beta}}\|_{2}^{2}] = \Omega(n^{\alpha})$, where $n$ is the number of samples and $\alpha > 1$ is the exponent of the eigendecay, i.e., $\lambda_i(\boldsymbol{\Sigma}) \sim i^{-\alpha}$.
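This growth is easy to see numerically. The sketch below is my own illustration, not the paper's code: it draws Gaussian data whose covariance eigenvalues decay as $\lambda_i = i^{-\alpha}$, fits the exact minimum-$\ell_2$-norm interpolator to noise-only targets as a stand-in for a near-interpolator, and reports the squared norm, which grows quickly with the sample size $n$.

```python
import numpy as np

rng = np.random.default_rng(0)

def min_norm_sq(n, d, alpha, noise=1.0):
    """Squared l2-norm of the minimum-norm interpolator of pure-noise targets."""
    lam = np.arange(1, d + 1) ** (-alpha)           # eigendecay lambda_i ~ i^{-alpha}
    X = rng.standard_normal((n, d)) * np.sqrt(lam)  # rows ~ N(0, Sigma), Sigma diagonal
    y = noise * rng.standard_normal(n)              # noise-only labels
    beta = np.linalg.pinv(X) @ y                    # min-norm solution of X @ beta = y
    return float(np.sum(beta ** 2))
```

For $\alpha = 2$ and $d \gg n$, increasing $n$ sharply increases the squared norm, consistent with the $\Omega(n^{\alpha})$ rate.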
no code implementations • 1 Oct 2023 • Chenghui Li, Rishi Sonthalia, Nicolas Garcia Trillos
There is a large variety of machine learning methodologies that are based on the extraction of spectral geometric information from data.
no code implementations • 26 May 2023 • Chinmaya Kausik, Kashvi Srivastava, Rishi Sonthalia
Motivated by this, we study supervised denoising and noisy-input regression under distribution shift.
no code implementations • 24 May 2023 • Rishi Sonthalia, Xinyue Li, Bochao Gu
For larger values of $\mu$, we observe that the curve for the norm of the estimator still has a peak, but this no longer translates to a peak in the generalization error.
no code implementations • 24 May 2023 • Rishi Sonthalia, Anna Seigal, Guido Montufar
We define the supermodular rank of a function on a lattice.
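For concreteness, a function $f$ on the subset lattice is supermodular when $f(A \vee B) + f(A \wedge B) \ge f(A) + f(B)$ for all $A, B$, with join as union and meet as intersection. The brute-force check below is my own illustration of that definition, unrelated to the paper's code.

```python
from itertools import combinations

def subsets(ground):
    """All subsets of a finite ground set, as frozensets."""
    items = list(ground)
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]

def is_supermodular(f, ground):
    """Check f(A | B) + f(A & B) >= f(A) + f(B) over the whole subset lattice."""
    S = subsets(ground)
    return all(f(A | B) + f(A & B) >= f(A) + f(B) - 1e-12
               for A in S for B in S)
```

For example, $f(A) = |A|^2$ is supermodular, while the concave $f(A) = \sqrt{|A|}$ is not.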
2 code implementations • 23 Sep 2022 • Mario Krenn, Lorenzo Buffoni, Bruno Coutinho, Sagi Eppel, Jacob Gates Foster, Andrew Gritsevskiy, Harlin Lee, Yichao Lu, Joao P. Moutinho, Nima Sanjabi, Rishi Sonthalia, Ngoc Mai Tran, Francisco Valente, Yangxinyu Xie, Rose Yu, Michael Kopp
For that, we use more than 100,000 research papers and build up a knowledge network with more than 64,000 concept nodes.
1 code implementation • NeurIPS 2021 • Rishi Sonthalia, Gregory Van Buskirk, Benjamin Raichel, Anna C. Gilbert
While $D_l$ is not a metric, when given as input to cMDS in place of $D$, it empirically yields solutions whose distance to $D$ does not increase as the dimension grows, and whose classification accuracy degrades less than that of the cMDS solution.
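For reference, classical MDS (cMDS) as used above double-centers the squared dissimilarities and embeds via the top eigenvectors; a standard implementation sketch (not the paper's $D_l$ construction) is:

```python
import numpy as np

def classical_mds(D, dim):
    """Classical multidimensional scaling on an n x n dissimilarity matrix D."""
    n = D.shape[0]
    J = np.eye(n) - np.ones((n, n)) / n   # centering matrix
    B = -0.5 * J @ (D ** 2) @ J           # double-centered Gram matrix
    w, V = np.linalg.eigh(B)              # eigenvalues in ascending order
    idx = np.argsort(w)[::-1][:dim]       # top `dim` eigenpairs
    w_top = np.clip(w[idx], 0.0, None)    # discard negative eigenvalues
    return V[:, idx] * np.sqrt(w_top)     # n x dim embedding
```

When $D$ is exactly Euclidean and `dim` matches the true dimension, the embedding reproduces the input distances up to a rigid motion.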
no code implementations • 10 Oct 2021 • Dominic Flocco, Bryce Palmer-Toy, Ruixiao Wang, Hongyu Zhu, Rishi Sonthalia, Junyuan Lin, Andrea L. Bertozzi, P. Jeffrey Brantingham
The construction and application of knowledge graphs have seen a rapid increase across many disciplines in recent years.
no code implementations • 29 Sep 2021 • Rishi Sonthalia, Raj Rao Nadakuditi
In fact, the generalization error as a function of the number of training data points follows a double descent curve.
1 code implementation • 8 May 2020 • Rishi Sonthalia, Anna C. Gilbert
Given a set of dissimilarity measurements amongst data points, determining what metric representation is most "consistent" with the input measurements or the metric that best captures the relevant geometric features of the data is a key step in many machine learning algorithms.
3 code implementations • NeurIPS 2020 • Rishi Sonthalia, Anna C. Gilbert
Given data, finding a faithful low-dimensional hyperbolic embedding of the data is a key method by which we can extract hierarchical information or learn representative geometric features of the data.
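Concretely, such hyperbolic embeddings are often taken in the Poincaré ball, whose distance function is the standard formula below; it is included as an illustration, not as this paper's method.

```python
import math
import numpy as np

def poincare_distance(u, v):
    """Hyperbolic distance between points u, v in the open unit (Poincare) ball."""
    u = np.asarray(u, dtype=float)
    v = np.asarray(v, dtype=float)
    sq = float(np.sum((u - v) ** 2))
    denom = (1.0 - float(np.sum(u ** 2))) * (1.0 - float(np.sum(v ** 2)))
    return math.acosh(1.0 + 2.0 * sq / denom)
```

Distances blow up near the boundary of the ball, which is what lets a low-dimensional ball host tree-like (hierarchical) metrics with low distortion.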
no code implementations • 25 Sep 2019 • Anna C. Gilbert, Rishi Sonthalia
Given a set of distances amongst points, determining what metric representation is most "consistent" with the input distances or the metric that captures the relevant geometric features of the data is a key step in many machine learning algorithms.
3 code implementations • 19 Jul 2018 • Anna C. Gilbert, Rishi Sonthalia
Here, we present a new algorithm, MR-MISSING, that extends these previous algorithms and can be used to compute low-dimensional representations of data sets with missing entries.