AI companies are racing to develop multimodal systems, which let users interact with AI through more than one kind of input and output. OpenAI recently announced the addition of image-analysis and speech-synthesis capabilities to its GPT-3.5 and GPT-4 models, while Google is reportedly testing its yet-to-be-released Gemini model. Multimodal AI systems are not new; OpenAI is also said to be hiring experts for its Gobi project, which is expected to be a multimodal AI system built from scratch. Multimodality typically works by combining multiple specialized AI models, such as text-to-image or speech-to-text models, into a single, more comprehensive system.
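The combining-models approach described above can be sketched in a few lines. The sketch below is purely illustrative: the stub functions (`image_to_text`, `speech_to_text`, `text_model`) are hypothetical placeholders standing in for real vision, speech, and language models, not any vendor's actual API. The idea is that each input modality is converted into a shared text space, and a language model reasons over the fused result.

```python
def image_to_text(image_bytes: bytes) -> str:
    """Stub vision model: a real one would caption the image."""
    return "a photo of a cat sitting on a windowsill"

def speech_to_text(audio_bytes: bytes) -> str:
    """Stub speech model: a real one would transcribe the audio."""
    return "what animal is in this picture?"

def text_model(prompt: str) -> str:
    """Stub language model: a real one would generate a grounded reply."""
    return f"Reply based on -> {prompt}"

def multimodal_pipeline(image: bytes, audio: bytes) -> str:
    # Route each modality to its specialised model,
    # then fuse the outputs into one text prompt.
    caption = image_to_text(image)
    question = speech_to_text(audio)
    prompt = f"Image description: {caption}\nUser question: {question}"
    return text_model(prompt)

print(multimodal_pipeline(b"<image bytes>", b"<audio bytes>"))
```

Production systems differ in where the fusion happens (some train a single model on multiple modalities end to end rather than chaining separate ones), but the chained-pipeline pattern above is a common starting point.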