
BMVC 2024: The 35th British Machine Vision Conference
November 25, 2024 - November 28, 2024
Davide Caffagni (University of Modena and Reggio Emilia) participated in the 35th British Machine Vision Conference (BMVC), where he presented the paper “Revisiting Image Captioning Training Paradigm via Direct CLIP-based Optimization”, co-authored with Nicholas Moratelli, Marcella Cornia, Lorenzo Baraldi, and Rita Cucchiara, all from the University of Modena and Reggio Emilia.
The paper introduces Direct CLIP-based Optimization (DiCO), a novel training paradigm for image captioning that improves stability and enhances caption quality by optimizing modern metrics such as CLIP-Score while maintaining fluency. The proposed approach aligns more closely with human preferences than existing methods and generates more informative captions.