Learning Joint Interventional Effects from Single-Variable Interventions in Additive Models |
Armin Kekić, Sergio Garrido Mejia, Bernhard Schölkopf |
Generative Intervention Models for Causal Perturbation Modeling |
Nora Schneider, Lars Lorch, Niki Kilbertus, Bernhard Schölkopf, Andreas Krause |
Generalized Interpolating Discrete Diffusion |
Dimitri von Rütte, Janis Fluri, Yuhui Ding, Antonio Orvieto, Bernhard Schölkopf, Thomas Hofmann |
LLMs on the Line: Data Determines Loss-to-Loss Scaling Laws |
Prasanna Mayilvahanan, Thaddäus Wiedemer, Sayak Mallick, Matthias Bethge, Wieland Brendel |
LAION-C: An Out-of-Distribution Benchmark for Web-Scale Vision Models |
Fanfei Li, Thomas Klein, Wieland Brendel, Robert Geirhos, Roland S. Zimmermann |
Position: An Empirically Grounded Identifiability Theory Will Accelerate Self-Supervised Learning Research |
Patrik Reizinger, Randall Balestriero, David Klindt, Wieland Brendel |
When, Where and Why to Average Weights? |
Niccolò Ajroldi, Antonio Orvieto, Jonas Geiping |
An Interpretable N-gram Perplexity Threat Model for Large Language Model Jailbreaks |
Valentyn Boreiko, Alexander Panfilov, Václav Voráček, Matthias Hein, Jonas Geiping |
Great Language Models Think Alike and this Undermines AI Oversight |
Shashwat Goel, Joschka Strüber, Ilze Amanda Auzina, Karuna Chandra, Ponnurangam Kumaraguru, Douwe Kiela, Ameya Pandurang Prabhu, Matthias Bethge, Jonas Geiping |
Bayesian Neural Scaling Laws Extrapolation with Prior-Fitted Networks |
Dongwoo Lee, Dong Bok Lee, Steven Adriaensen, Juho Lee, Sung Ju Hwang, Frank Hutter, Seon Joo Kim, Hae Beom Lee |
FairPFN: A Tabular Foundation Model for Causal Fairness |
Jake Robertson, Noor Awad, Noah Hollmann, Frank Hutter, Samuel Gabriel Müller |
Tuning LLM Judge Design Decisions for 1/1000 of the Cost |
David Salinas, Omar Swelam, Frank Hutter |
Position: The Future of Bayesian Prediction Is Prior-Fitted |
Samuel Gabriel Müller, Arik Reuter, Noah Hollmann, David Rügamer, Frank Hutter |
From Low Rank Gradient Subspace Stabilization to Low-Rank Weights: Observations, Theories, and Applications |
Ajay Jaiswal, Yifan Wang, Lu Yin, Shiwei Liu, Runjin Chen, Jiawei Zhao, Ananth Grama, Yuandong Tian, Zhangyang “Atlas” Wang |
LIFT the Veil for the Truth: Principal Weights Emerge after Rank Reduction for Reasoning-Focused Supervised Fine-Tuning |
Zihang Liu, Tianyu Pang, Oleg Balabanov, Chaoqun Yang, Tianjin Huang, Lu Yin, Yaoqing Yang, Shiwei Liu |
Mask-Enhanced Autoregressive Prediction: Pay Less Attention to Learn More |
Xialie Zhuang, Zhikai Jia, Jianjin Li, Zhenyu Zhang, Li Shen, Zheng Cao, Shiwei Liu |