This article discusses research papers that Meta presented at CVPR 2023, which focus on improving the performance and scalability of computer vision systems. The papers introduce EgoTask Translation (EgoT2), a unified approach to video tasks on footage from wearable cameras, and LaViLa, a method for learning video-language representations using large language models. Both approaches are shown to be effective on a range of video tasks and achieve top-ranked results in benchmark challenges.