
ELLIS Summer Lecture Series - Explainability & Understanding of Models

Dr. Hila Chefer
Tel Aviv University

LECTURE DATE & TIME
June 1, 2026, 14:00 – 16:00

LOCATION
Friedrich Schiller University Jena, Inselplatz 5, 07743 Jena
Lecture Hall, Room 005

Registration is not required.

If you are unable to attend in person, a live stream will be available:
https://online.mmz.uni-jena.de/beta/livestream/?hsid=1113_e005

TITLE
Toward Generative Models that Understand the Visual World

ABSTRACT
Despite remarkable advances, visual generative models are still far from faithfully modeling the world, struggling with fundamental aspects such as spatial relations, physics, motion, and dynamic interactions.
In this talk, I present a line of work that tackles these challenges, based on a deep understanding of the inner mechanisms that drive models. I will begin by analyzing state-of-the-art visual generators, gaining insights into the underlying reasons for their limited understanding. Building upon these insights, I will demonstrate methods that significantly enhance both spatial and temporal reasoning in image and video generation, surpassing even resource-intensive proprietary models without relying on additional data or model scaling. I will conclude the talk by discussing open challenges and future directions for advancing faithful world modeling in visual generative models.

BIO
Dr. Hila Chefer is a Research Scientist at Black Forest Labs and an incoming Assistant Professor (Senior Lecturer) at Tel Aviv University. Her research focuses on architecture development and interpretability for visual foundation models, aiming to both understand and advance their generative capabilities.
Hila earned her PhD from Tel Aviv University under the supervision of Prof. Lior Wolf, where she developed the de facto leading methods for Transformer interpretability and the Attend-and-Excite framework for text-to-image control. Her work in video generation includes the development of Lumiere (a foundational video generation model) during her time at Google Research, and VideoJAM (a joint appearance-motion framework for motion and physics fidelity in video generation) at Meta AI. Her contributions have been recognized with several prestigious honors, including the Fulbright Postdoctoral Fellowship, the Eric and Wendy Schmidt Postdoctoral Award, and the Council for Higher Education (VATAT) Award for Outstanding PhD Students.

URL
https://hila-chefer.github.io/

Margret Keuper
University of Mannheim, Max Planck Institute for Informatics

LECTURE DATE & TIME
June 24, 2026, 16:00 – 18:00

LOCATION
Friedrich Schiller University Jena, Inselplatz 5, 07743 Jena
Lecture Hall, Room 005

Registration is not required.

If you are unable to attend in person, a live stream will be available:
https://online.mmz.uni-jena.de/beta/livestream/?hsid=1113_e005

TITLE
Reliability in Computer Vision – Trade-Offs between Robustness, Fairness and Transparency

ABSTRACT
Over the past years, we have seen tremendous progress in machine learning techniques, partly due to an enormous growth in data resources, increased compute power and advances in methodology for specialized tasks. Despite this strong progress, there are several issues with current approaches. Not only do they rely on large amounts of annotated training data, but they also require the task to be explicitly defined at a fine-grained level and the learning architecture to be optimized specifically for this task. The resulting model usually has very limited explainability and a low level of generalizability and robustness against domain shifts or adversarial examples. Yet, for many use cases, these properties are crucial for practical applicability. In this presentation, I will show samples from our recent work towards improving a model’s generalization ability and transparency, specifically (i) addressing robustness under adversarial attacks and domain shifts while not emphasizing class-wise biases, and (ii) addressing the relationship between inherent interpretability and spurious correlations.

[1] S. Agnihotri, S. Jung, M. Keuper, CosPGD: A unified white-box adversarial attack for pixel-wise prediction tasks, International Conference on Machine Learning (ICML) 2024.
[2] T. Medi, S. Jung, M. Keuper, Fair-TAT: Improving Model Fairness Using Targeted Adversarial Training, Winter Conference on Computer Vision (WACV) 2025.
[3] Y. Zhou, H. Li, Z.Q. Cheng, X. Yan, Y. Dong, M. Fritz, M. Keuper, Maxsup: Overcoming representation collapse in label smoothing, Advances in Neural Information Processing Systems (NeurIPS) 2025 (oral).
[4] E. Alshami, S. Agnihotri, B. Schiele, M. Keuper, AIM: Amending Inherent Interpretability via Self-Supervised Masking, International Conference on Computer Vision (ICCV) 2025.
[5] K. Prasse, P. Knab, S. Marton, C. Bartelt, M. Keuper, DCBM: Data-Efficient Visual Concept Bottleneck Models, International Conference on Machine Learning (ICML) 2025.

BIO
Prof. Dr. Margret Keuper is a full professor of Machine Learning with a focus on computer vision at the University of Mannheim. She is also an affiliated research leader at the Max Planck Institute for Informatics, Saarbrücken. Professor Keuper received her PhD degree from the University of Freiburg under the supervision of Prof. Thomas Brox and worked as a postdoctoral researcher at the University of Freiburg on topics related to motion estimation, segmentation, and grouping. Since 2024, Professor Keuper has been an ELLIS Fellow and a member of the ELLIS Unit Saarbrücken.

URL
https://www.uni-mannheim.de/dws/people/professors/prof-dr-ing-margret-keuper/

Simone Schaub-Meyer
Technical University of Darmstadt 

LECTURE DATE & TIME
July 8, 2026, 16:00 – 18:00

LOCATION
Friedrich Schiller University Jena, Inselplatz 5, 07743 Jena
Lecture Hall, Room 005

Registration is not required.

If you are unable to attend in person, a live stream will be available:
https://online.mmz.uni-jena.de/beta/livestream/?hsid=1113_e005

TITLE
Understanding Deep Vision Models and Its Benefits

ABSTRACT
Deep learning has led to remarkable progress in computer vision, yet benchmark accuracy alone provides only a limited view of model capabilities. In this talk, I will argue that a deeper analysis of model behavior can yield both practical improvements and conceptual insights. First, I will show how fine-grained performance analyses can lead towards simpler and more computationally efficient vision models. In the second part, I will focus on understanding model behavior through visual explanations. I will discuss the role of visual explanations in improving classification and present a method to obtain such explanations efficiently in practice. I will then highlight recent findings showing that deep networks rely not only on the presence of visual concepts, but also on their absence, and show how extending attribution and feature visualization methods makes these effects visible. Together, these perspectives illustrate how a deeper understanding of model behavior can inform the design of more efficient, reliable, and interpretable vision models.

BIO
Simone Schaub-Meyer leads a research group on image and video analysis at the Technical University of Darmstadt. Her work focuses on developing efficient, robust, and interpretable methods for visual perception. Her research is supported by the Emmy Noether Programme of the German Research Foundation. Assistant Professor Schaub-Meyer is a member of the ELLIS Unit Darmstadt and the Hessian Center for Artificial Intelligence (hessian.AI). She received her doctoral degree from ETH Zurich in collaboration with Disney Research Zurich, where her thesis on motion representation and video frame interpolation was awarded the ETH Medal.

URL
https://www.visinf.tu-darmstadt.de/visual_inference/people_vi/visinf_team_details_102784.en.jsp
