5th Workshop on Image/Video/Audio Quality Assessment in Computer Vision, VLM and Diffusion Model

Workshop Date: Mar 7, 2026

Location: TBD

Held in conjunction with WACV 2026

Important Dates/Links:

Description:

Image, video, and audio quality significantly affects machine learning and computer vision systems, yet it remains underexplored by the broader research community. Real-world applications, from streaming services and autonomous vehicles to cashier-less stores and generative AI, depend critically on robust quality assessment and improvement techniques. Despite this, most visual learning systems assume high-quality inputs, whereas in practice artifacts from capture, compression, transmission, and rendering can severely degrade performance and user experience.

This workshop is particularly timely given the explosive growth of generative AI, which introduces new challenges in quality assessment for both inputs and outputs. By bringing together researchers from industry and academia, we aim to systematically investigate how quality issues affect various visual learning tasks and to develop innovative assessment and mitigation techniques. Building on the success of our previous workshops at WACV (2022-2025), we expect to stimulate new research directions and attract more talent to this critical field, ultimately improving the robustness and reliability of computer vision applications across industries.

Topics:

This workshop addresses topics related to image/video/audio quality assessment in machine learning, computer vision, VLMs, diffusion models, and other types of generative AI. The topics include, but are not limited to:

Keynotes

Keynote Speaker: Sarah Ostadabbas

Title: "Toward Data-Efficient Dynamically-Aware Visual Intelligence"

Abstract: [TBD]

Bio: Professor Ostadabbas is an associate professor in the Electrical and Computer Engineering Department at Northeastern University (NU) in Boston, Massachusetts, USA. She joined NU in 2016 after completing her post-doctoral research at Georgia Tech, having earned her PhD at the University of Texas at Dallas in 2014. At NU, Professor Ostadabbas is Director of the Augmented Cognition Laboratory (ACLab), Director of Women in Engineering (WIE), and Co-Director of the Center for Signal Processing, Imaging, Reasoning, and Learning (SPIRAL). Her research focuses on the convergence of computer vision and machine learning, with particular emphasis on representation learning in visual perception problems. In her applied research, she has contributed significantly to the understanding, detection, and prediction of human and animal behaviors through the modeling of visual motion, considering various biomechanical factors. Professor Ostadabbas also extends her work to the Small Data Domain, including applications in medical and military fields, where data collection and labeling are costly and protected by strict privacy laws. Her solutions involve deep learning frameworks that operate effectively with limited labeled training data, incorporate domain knowledge for prior learning and synthetic data augmentation, and enhance the generalization of learning across domains by acquiring invariant representations. Professor Ostadabbas has co-authored over 130 peer-reviewed journal and conference articles and has received research awards from prestigious institutions such as the National Science Foundation (NSF), Department of Defense (DoD), Sony, MathWorks, Amazon AWS, Verizon, Oracle, Biogen, and NVIDIA. She has been honored with the NSF CAREER Award (2022) and the Sony Faculty Innovation Award (2023), was the runner-up for the Oracle Excellence Award (2023), and was recognized by LDV Capital as one of the 120+ Women Spearheading Advances in Visual Tech and AI (2024).
She has served on the organizing committees of many workshops at renowned conferences (such as CVPR, ECCV, ICCV, ICIP, ICASSP, BioCAS, CHASE, ICHI) in various roles, including Lead/Co-Lead Organizer, Program Chair, Board Member, Publicity Co-Chair, Session Chair, Technical Committee Member, and Mentor.


Keynote Speaker: Gérard G. Medioni

Title: TBD

Abstract: [TBD]

Bio: Gérard G. Medioni is a computer scientist, author, academic, and inventor. He is a vice president and distinguished scientist at Amazon and an emeritus professor of Computer Science at the University of Southern California. Medioni has made contributions to computer vision, in particular 3D sensing, surface reconstruction, and object modelling, and has translated his computer vision research into customer-facing inventions and products. He has authored four books, including "Emerging Topics in Computer Vision"; "Multimedia Systems: Algorithms, Standards, and Industry Practices"; and "A Computational Framework for Segmentation and Grouping". He has published more than 80 journal papers and 200 conference papers, with over 34,000 citations and an h-index of 88. In addition, he holds 123 patents, including "Visual tracking in video images in unconstrained environments by exploiting on-the-fly context using supporters and distracters" and "Depth mapping based on pattern matching and stereoscopic information", along with patents on Just Walk Out technology and Amazon One. Medioni is a Fellow of the Association for the Advancement of Artificial Intelligence, the Institute of Electrical and Electronics Engineers, the International Association for Pattern Recognition, and the National Academy of Inventors. He is also a member of the National Academy of Engineering.

Submission Guidelines and Review Process:

Organizers

Yarong Feng

Amazon

Joe Liu

Amazon

Qipin Chen

Amazon

Information

TBD

Contact Us

If you have any questions or inquiries, please contact us at wacv2026-image-quality-workshop@amazon.com.