Deep Learning and Machine Learning
Complete Paper List in Reverse Chronological Order
Selected Recent Papers in Deep Learning and Machine Learning
- A. R. Hsu, Y. Cherapanamjeri, A. Y. Odisho, P. R. Carroll, B. Yu (2024). Mechanistic Interpretation through Contextual Decomposition in Transformers. https://arxiv.org/pdf/2407.00886
- O. Ronen, A. I. Humayun, R. Balestriero, R. Baraniuk, B. Yu (2024). ScaLES: Scalable Latent Exploration Score for Pre-Trained Generative Networks. https://arxiv.org/pdf/2406.09657
- S. Hayou, N. Ghosh, B. Yu (2024). The Impact of Initialization on LoRA Finetuning Dynamics. https://arxiv.org/pdf/2406.08447
- B. Yu (2024). After Computational Reproducibility: Scientific Reproducibility and Trustworthy AI (discussion of Donoho’s paper “Data Science at the Singularity”). Harvard Data Science Review (HDSR).
- N. R. Mallinar, A. Zane, S. Frei, B. Yu (2024). Minimum-Norm Interpolation Under Covariate Shift. Proc. ICML. https://arxiv.org/pdf/2404.00522
- L. Sun, A. Agarwal, A. Kornblith, B. Yu, C. Xiong (2024). ED-Copilot: Reduce Emergency Department Wait Time with Language Model Diagnostic Assistance. Proc. ICML. https://arxiv.org/abs/2402.13448
- S. Hayou, N. Ghosh, B. Yu (2024). LoRA+: Efficient Low Rank Adaptation of Large Models. Proc. ICML. https://arxiv.org/abs/2402.12354
- N. Ghosh, S. Frei, W. Ha, B. Yu (2023). The effect of SGD batch size on autoencoder learning: sparsity, sharpness and feature learning. https://arxiv.org/abs/2308.03215
- A. R. Hsu, Y. Cherapanamjeri, B. Park, T. Naumann, A. Odisho, and B. Yu (2023). Diagnosing transformers: illuminating feature space for clinical decision-making. ICLR 2023. https://arxiv.org/abs/2305.17588
- C. Singh, A. R. Hsu, R. Antonello, S. Jain, A. G. Huth, B. Yu and J. Gao (2023). Explaining black box text modules in natural language with language models.
- N. Ghosh, S. Mei, and B. Yu (2022). The three stages of dynamics in high-dimensional kernel methods. Proc. ICLR, 2022. https://arxiv.org/abs/2111.07167
- W. Ha, C. Singh, F. Lanusse, S. Upadhyayula, and B. Yu (2021). Adaptive Wavelet Distillation from Neural Networks through Interpretation. Proc. NeurIPS 2021. (code)
- L. Rieger, J. W. Murdoch, S. Singh, B. Yu (2020). Interpretations are Useful: Penalizing Explanations to Align Neural Networks with Prior Knowledge. ICML Proceedings. (code)
- C. Singh, W. Ha, F. Lanusse, V. Boehm, J. Liu, B. Yu (2020). Transformation Importance with Applications to Cosmology. ICLR Workshop paper. (code)
- Y. Chen, R. Dwivedi, M. J. Wainwright and B. Yu (2020) Fast Mixing of Metropolized Hamiltonian Monte Carlo: Benefits of Multi-Step Gradients. JMLR. https://arxiv.org/abs/1905.12247
- R. Dwivedi, Y. Chen, M. J. Wainwright and B. Yu (2019) Log-concave Sampling: Metropolis Hastings Algorithms are Fast. JMLR. http://jmlr.org/papers/v20/19-306.html
- Y. Chen, R. Dwivedi, M. J. Wainwright and B. Yu (2018) Fast MCMC Algorithms on Polytopes. JMLR. http://jmlr.org/papers/v19/18-158.html
- Y. Chen, R. Abbasi-Asl, A. Bloniarz, M. Oliver, B. Willmore, J. Gallant, and B. Yu (2018) The DeepTune framework for modeling and characterizing neurons in visual cortex area V4. https://www.biorxiv.org/content/10.1101/465534v1
- K. Kumbier, S. Sumanta, J. B. Brown, S. Celniker, and B. Yu* (2018) Refining interaction search through signed iterative Random Forests. https://arxiv.org/abs/1810.0728 (an enhanced version of iRF, PCS related)
- J. Murdoch, P. Liu, and B. Yu (2018) Beyond word importance: contextual decomposition to extract interactions from LSTMs. Proc. ICLR 2018. https://arxiv.org/abs/1705.07356 (code)
- S. Kunzel, J. Sekhon, P. Bickel, and B. Yu* (2019) Meta-learners for Estimating Heterogeneous Treatment Effects using Machine Learning, PNAS. 116 (10) 4156-4165. https://arxiv.org/abs/1706.03461 (code)
- S. Basu, K. Kumbier, J. B. Brown, and B. Yu (2018) iterative Random Forests to discover predictive and stable high-order interactions. PNAS, 115 (8), 1943-1948. (code) (PCS related)
- S. Balakrishnan, M. Wainwright, B. Yu (2017) Statistical Guarantees for the EM algorithm: from population to sample-based analysis. Annals of Statistics, 45(1), 77 - 120.
- S. Wu and B. Yu (2018). Local identifiability of l1-minimization dictionary learning: a sufficient and almost necessary condition. JMLR. 18, 1 - 56.
- K. Rohe, T. Qin and B. Yu* (2016). Co-clustering directed graphs to discover asymmetries and directional communities. Proc. National Academy of Sciences (PNAS), 113(45), 12679 - 12684.
- Siqi Wu, Antony Joseph, Ann S. Hammonds, Susan E. Celniker, Bin Yu, and Erwin Frise (2016). Stability-driven nonnegative matrix factorization to interpret spatial gene expression and build local gene networks (with support information). PNAS, pp. 4290 - 4295. (code) (PCS related)
- A. Bloniarz, H. Liu, C. Zhang, J. Sekhon, and B. Yu* (2015). Lasso adjustments of treatment effect estimates in randomized experiments. PNAS. 113, 7383 - 7390.