Deep Learning and Machine Learning

Complete Paper List in Reverse Chronological Order

Selected Recent Papers in Deep Learning and Machine Learning

A. R. Hsu, Y. Cherapanamjeri, A. Y. Odisho, P. R. Carroll, B. Yu (2024). Mechanistic Interpretation through Contextual Decomposition in Transformers. https://arxiv.org/pdf/2407.00886
O. Ronen, A. I. Humayun, R. Balestriero, R. Baraniuk, B. Yu (2024). ScaLES: Scalable Latent Exploration Score for Pre-Trained Generative Networks. https://arxiv.org/pdf/2406.09657
S. Hayou, N. Ghosh, B. Yu (2024). The Impact of Initialization on LoRA Finetuning Dynamics. https://arxiv.org/pdf/2406.08447
B. Yu (2024). After Computational Reproducibility: Scientific Reproducibility and Trustworthy AI (discussion of Donoho’s paper “Data Science at the Singularity”). Harvard Data Science Review (HDSR).
N. R. Mallinar, A. Zane, S. Frei, B. Yu (2024). Minimum-Norm Interpolation Under Covariate Shift. Proc. ICML. https://arxiv.org/pdf/2404.00522
L. Sun, A. Agarwal, A. Kornblith, B. Yu, C. Xiong (2024). ED-Copilot: Reduce Emergency Department Wait Time with Language Model Diagnostic Assistance. Proc. ICML. https://arxiv.org/abs/2402.13448
S. Hayou, N. Ghosh, B. Yu (2024). LoRA+: Efficient Low Rank Adaptation of Large Models. Proc. ICML. https://arxiv.org/abs/2402.12354
N. Ghosh, S. Frei, W. Ha, B. Yu (2023). The effect of SGD batch size on autoencoder learning: sparsity, sharpness and feature learning. https://arxiv.org/abs/2308.03215
A. R. Hsu, Y. Cherapanamjeri, B. Park, T. Naumann, A. Odisho, and B. Yu (2023). Diagnosing transformers: illuminating feature space for clinical decision-making. ICLR 2023. https://arxiv.org/abs/2305.17588
C. Singh, A. R. Hsu, R. Antonello, S. Jain, A. G. Huth, B. Yu and J. Gao (2023). Explaining black box text modules in natural language with language models.
N. Ghosh, S. Mei, and B. Yu (2022). The three stages of dynamics in high-dimensional kernel methods. Proc. ICLR, 2022. https://arxiv.org/abs/2111.07167
W. Ha, C. Singh, F. Lanusse, S. Upadhyayula, and B. Yu (2021). Adaptive Wavelet Distillation from Neural Networks through Interpretation. Proc. NeurIPS 2021. (code)
L. Rieger, J. W. Murdoch, S. Singh, B. Yu (2020). Interpretations are Useful: Penalizing Explanations to Align Neural Networks with Prior Knowledge. ICML Proceedings. (code)
C. Singh, W. Ha, F. Lanusse, V. Boehm, J. Liu, B. Yu (2020). Transformation Importance with Applications to Cosmology. ICLR Workshop paper. (code)
Y. Chen, R. Dwivedi, M. J. Wainwright and B. Yu (2020) Fast Mixing of Metropolized Hamiltonian Monte Carlo: Benefits of Multi-Step Gradients, JMLR, https://arxiv.org/abs/1905.12247
R. Dwivedi, Y. Chen, M. J. Wainwright and B. Yu (2019) Log-concave Sampling: Metropolis Hastings Algorithms are Fast. JMLR. http://jmlr.org/papers/v20/19-306.html
Y. Chen, R. Dwivedi, M. J. Wainwright and B. Yu (2018) Fast MCMC Algorithms on Polytopes. JMLR. http://jmlr.org/papers/v19/18-158.html
Y. Chen, R. Abbasi-Asl, A. Bloniarz, M. Oliver, B. Willmore, J. Gallant, and B. Yu (2018) The DeepTune framework for modeling and characterizing neurons in visual cortex area V4. https://www.biorxiv.org/content/10.1101/465534v1
K. Kumbier, S. Basu, J. B. Brown, S. Celniker, and B. Yu* (2018) Refining interaction search through signed iterative Random Forests. https://arxiv.org/abs/1810.0728 (an enhanced version of iRF, PCS related)
J. Murdoch, P. Liu, and B. Yu (2018) Beyond word importance: contextual decomposition to extract interactions from LSTMs. Proc. ICLR 2018. https://arxiv.org/abs/1705.07356 (code)
S. Kunzel, J. Sekhon, P. Bickel, and B. Yu* (2019) Meta-learners for Estimating Heterogeneous Treatment Effects using Machine Learning, PNAS. 116 (10) 4156-4165. https://arxiv.org/abs/1706.03461 (code)
S. Basu, K. Kumbier, J. B. Brown, and B. Yu (2018) iterative Random Forests to discover predictive and stable high-order interactions. PNAS, 115 (8), 1943-1948. (code) (PCS related)
S. Balakrishnan, M. Wainwright, B. Yu (2017) Statistical Guarantees for the EM algorithm: from population to sample-based analysis. Annals of Statistics, 45(1), 77 - 120.
S. Wu and B. Yu (2018). Local identifiability of l1-minimization dictionary learning: a sufficient and almost necessary condition. JMLR. 18, 1 - 56.
K. Rohe, T. Qin and B. Yu* (2016). Co-clustering directed graphs to discover asymmetries and directional communities. Proc. National Academy of Sciences (PNAS), 113(45), 12679 - 12684.
Siqi Wu, Antony Joseph, Ann S. Hammonds, Susan E. Celniker, Bin Yu, and Erwin Frise (2016). Stability-driven nonnegative matrix factorization to interpret spatial gene expression and build local gene networks (with support information). PNAS, pp. 4290 - 4295. (code) (PCS related)
A. Bloniarz, H. Liu, C. Zhang, J. Sekhon, and B. Yu* (2015). Lasso adjustments of treatment effect estimates in randomized experiments. PNAS. 113, 7383 - 7390.