Apple has revealed its latest development in artificial intelligence (AI) large language models (LLMs), introducing the MM1 family of multimodal models capable of interpreting both image and text data.
Microsoft has introduced a new AI model that, it says, can process speech, vision, and text directly on-device using less compute capacity than previous models.
Apple's researchers continue to focus on multimodal LLMs, with studies exploring their use for image generation, ...
French AI startup Mistral has dropped its first multimodal model, Pixtral 12B, capable of processing both images and text. The 12-billion-parameter model, built on Mistral’s existing text-based model ...
There's a new race in technology to make AI see and hear the world around you, and ultimately make sense of it for you. OpenAI and Google showcased their latest and greatest AI ...