By MIKE MAGEE
For my parents, March, 1965 was a banner month. First, that was the month that NASA launched the Gemini program, unleashing “transformative capabilities and cutting-edge technologies that paved the way for not only Apollo, but the achievements of the space shuttle, building the International Space Station and setting the stage for human exploration of Mars.” It also was the last month that either of them took a puff of their favored cigarette brand – L&M’s.
They are long gone, but the words “Gemini” and the L’s and the M’s have taken on new meaning and relevance now six decades later.
The name Gemini reemerged with great fanfare on December 6, 2023, when Google chair, Sundar Pichai, introduced “Gemini: our largest and most capable AI model.” Embedded in the announcement were the L’s and the M’s as we see here: “From natural image, audio and video understanding to mathematical reasoning, Gemini’s performance exceeds current state-of-the-art results on 30 of the 32 widely-used academic benchmarks used in large language model (LLM) research and development.
Google’s announcement also offered a head to head comparison with GPT-4 (Generative Pretrained Transformer-4.) It is the product of a non-profit initiative, and was released on March 14, 2023. Microsoft’s helpful AI search engine, Bing, helpfully informs that, “OpenAI is a research organization that aims to create artificial general intelligence (AGI) that can benefit all of humanity…They have created models such as Generative Pretrained Transformers (GPT) which can understand and generate text or code, and DALL-E, which can generate and edit images given a text description.”
While “Bing” goes all the way back to a Steve Ballmer announcement on May 28, 2009, it was 14 years into the future, on February 7, 2023, that the company announced a major overhaul that, 1 month later, would allow Microsoft to broadcast that Bing (by leveraging an agreement with OpenAI) now had more than 100 million users.
Which brings us back to the other LLM (large language model) – GPT-4, which the Gemini announcement explores in a head-to-head comparison with its’ new offering. Google embraces text, image, video, and audio comparisons, and declares Gemini superior to GPT-4.
Mark Minevich, a “highly regarded and trusted Digital Cognitive Strategist,” writing this month in Forbes, seems to agree with this, writing, “Google rocked the technology world with the unveiling of Gemini – an artificial intelligence system representing their most significant leap in AI capabilities. Hailed as a potential game-changer across industries, Gemini combines data types like never before to unlock new possibilities in machine learning… Its multimodal nature builds on yet goes far beyond predecessors like GPT-3.5 and GPT-4 in its ability to understand our complex world dynamically.”
Expect to hear the word “multimodality” repeatedly in 2024 and with emphasis.
Continue reading…