This month, Google announced its latest attempt to usurp ChatGPT, which has reigned as the king of generative AI chatbots since its launch.
Bard (now renamed Gemini) was released in early 2023 following OpenAI’s groundbreaking LLM-powered chat interface. And to be honest, it has often seemed to be playing catch-up.
Bard’s integration with Google’s search technology gave it access to the internet from day one. On the other hand, the launch version of ChatGPT was limited to the knowledge provided during training.
But OpenAI quickly gave ChatGPT connectivity and the ability to access external information through a connection with Microsoft’s Bing. Connectivity aside, the consensus is that ChatGPT is useful for a broader range of language processing tasks.
Now Google is pulling out all the stops, including rebranding Bard with the name of the language model that runs behind the scenes and offering access to its Advanced service via a subscription priced in direct competition with ChatGPT.
So, is it ready to step into the ring and face off against the undisputed champion? Here’s an overview of both platforms, highlighting the differences you’ll want to know about when choosing which one to use.
Language model
First, it’s worth noting that both Gemini and ChatGPT are based on incredibly vast and powerful large language models (LLMs), which are far more advanced than anything publicly available in the past.
Remember that ChatGPT is just an interface that users communicate through. The language model behind it is GPT-4 (for paid users of ChatGPT Pro) or GPT-3.5 (for free users).
In Google’s case, the interface is called Gemini (formerly Bard). The language model behind it is a separate entity, but it is also called Gemini (or Gemini Ultra if you are paying for the Gemini Advanced service).
An important thing to consider is that although they both call themselves chatbots, the intended user experience is slightly different. ChatGPT is designed to enable conversation and solve problems conversationally, similar to chatting with an expert on a subject.
Gemini, on the other hand, seems designed to process information and automate tasks in a way that saves users time and effort.
From a technical perspective, the power of an LLM is often measured by the number of parameters (trainable values) in its neural network. GPT-4’s network is reported to contain approximately 1 trillion parameters, but no one outside Google knows for sure how many parameters Gemini uses.
However, this may not matter much. It may be enough to know that both are very powerful.
Subbarao Kambhampati, an AI professor at Arizona State University, recently told Wired, “Basically, most LLMs have gotten to the point where they are indistinguishable by qualitative metrics.”
In other words, it’s not the technical scale or raw capability of the model that matters most. What really matters is how the model is tuned, trained, and presented to help the user solve a problem.
And the winner is…
After using both for a while and having various conversations about different topics, I think it’s clear that ChatGPT is still the more powerful chat interface, thanks to the grunt that GPT-4 provides. But Gemini is closing the gap!
Information retrieval
One of the benefits of Gemini is that by default it takes into account all the information at its fingertips, including the internet, Google’s vast knowledge graph, and its training data.
ChatGPT, on the other hand, often tries to answer questions based only on its training data, which may result in outdated information. You can avoid this by instructing it to search the web for the latest data, but that still introduces an extra step that Gemini has shown is not really necessary.
In my experience using both platforms, I have to say that Gemini has proven to be slightly better than ChatGPT when it comes to searching online and integrating the information it finds into its responses.
When ChatGPT goes online and looks for information, its responses tend to lose some of their dynamism. It often seems to answer based on a single web search and a single source of information, rather than comprehensively analyzing all the information it has access to and drawing conclusions.
Here’s a simple example of what this means. I often use AI chatbots to get a quick overview of a company, its products, and its services. Given the same prompt (“Tell me about [URL]”), ChatGPT often simply regurgitates the website’s marketing blurb.
In the short time I tested it, Gemini seemed to take a more nuanced approach. It summarizes the information it finds and tries to generate a balanced overview.
Therefore, we can say that this is one area where Gemini has a slight lead over its rivals.
But that’s not the end of the story. ChatGPT remains the winner when it comes to intelligently parsing the information it has been trained on to create responses.
And the winner is…
Gemini is better when it comes to creating answers from online text, and ChatGPT is better when it comes to queries that don’t involve the internet, so let’s call this a tie.
Multimodal functionality
Multimodal AI is AI that can process multiple types of data. Early versions of ChatGPT only read and generated text. However, once OpenAI upgraded its “engine” to GPT-4, it gained the ability to process visual and audio data and became multimodal. Gemini, on the other hand, was multimodal from the start (though not all features were enabled right away).
ChatGPT uses the DALL-E model, also developed by OpenAI, to generate images. Gemini, on the other hand, is powered by Google’s Imagen 2 engine. Both are clearly very powerful and can produce amazing results. However, comparing the two on the same prompts, I would say that ChatGPT is more consistent when it comes to creating images that closely match what I was looking for.
One difference that others have pointed out is that Imagen 2 and Gemini are slightly better at producing photorealistic and highly detailed images. ChatGPT, on the other hand, is better at managing spatial relationships between objects in an image and better at creatively interpreting prompts.
Both can understand and write computer code across a wide range of programming languages. However, there are some differences in how this is done.
Now, I’m no programmer, but the great thing is, with ChatGPT and Gemini in front of me, I don’t have to be.
There’s no doubt that ChatGPT’s conversational prowess is a key advantage here. If you’re not sure what to do with your code or the best way to integrate it, ChatGPT is better at generating clear and helpful guidance and providing suggestions and tips.
And the winner is…
I’m handing this one to ChatGPT again. Gemini produces better photorealism, but ChatGPT is better at producing images that closely match what the user asks for in the prompt. Gemini seems to be a little better at writing technical code, but it can’t compete with ChatGPT as a conversational interface to use while building and experimenting.
(Quick note: Gemini image generation is not yet launched for European users. We hope it will be added soon.)
So which one is best?
Well, neither one is perfect by any means. Both still suffer from hallucinations and quite often end up simply giving the wrong information. For example, Gemini told me that OpenAI’s DALL-E 2 does not use diffusion modeling technology (it does). And ChatGPT told me that Gemini can’t generate images (it can).
But for my money, if I were only going to subscribe to one, I’d be inclined to choose ChatGPT Pro at this point.
There are a few things to keep in mind. If you’re deeply invested in the Google ecosystem, Gemini’s ability to integrate with Gmail and Google Docs will likely be a big draw for you. Similarly, if you’re an experienced programmer and your primary need is coding, be sure to check out Gemini (but also check out Copilot from Microsoft).
I currently think ChatGPT is better for writing and creating documents, summarizing, generating generic images, and learning through conversation. Because of this, it maintains its position as the best available today.