Multimodality is emerging as a competitive necessity in the large language model market, with Google, OpenAI, and Microsoft all releasing products that let users enter inputs in multiple formats, such as text, image, and voice. Training AI systems on multimodal inputs will open the door to a range of new use cases, and 2023 has been a pivotal year for defining the type of experience generative AI chatbots will provide. Text-to-image models were released earlier, in 2022, but their utility is confined to creating images rather than providing a conversational experience.