Battle of the Bots: A Guide to the Best Large Language Model (LLM) Chatbots
Confused by the dozens, if not hundreds, of chatbots flooding the market? Worry not: we have done the legwork of assessing the top chatbots to identify the best ones for each use case.

Just a year ago, I would have said that Artificial Intelligence (AI) was a domain largely accessible only to Big Tech, deep tech start-ups, and the data scientists and machine learning engineers who worked there.
OpenAI’s launch of ChatGPT-3.5 in November 2022 changed all of that. For the first time, AI, in the form of Large Language Models (LLMs), became available to “layman” users (i.e., non-technical / non-developer users like me!).
Fast forward just 8 months and there are now dozens, if not hundreds, of such chatbots flooding the market. The emergence of these powerful and versatile chatbots promises to change the way we work and live for the better. Yet it can be challenging to navigate the sea of options, especially in the face of such rapid development.
Many of you will therefore be asking these questions: Which chatbot truly delivers on its promise? Which one is best for my specific needs?
Best of Breed
I've undertaken a detailed assessment of the leading chatbots on the market. By examining factors such as pricing / cost, performance, versatility, and ease of use, I've identified the top performers in a variety of categories.
Whether you're a business looking to jumpstart productivity, an educator seeking a study aid, or simply an AI enthusiast curious about the latest developments, we've got you covered.
So, without further ado, let's dive into our findings in the Battle of the Bots!
N.B. As this assessment has been developed for non-technical / non-developer users, the comparative coding capabilities of each chatbot have not been factored into the evaluations.
Overall Winner: ChatGPT
The first Generative AI entrant remains the clear market leader, with key strengths in terms of versatility, analytical and logical reasoning capabilities, and for the paid version, integration with third-party tools (known as plugins) and Chrome extensions.
Overall Runner Up: Google Bard
Google Bard got off to a rocky start, but has returned stronger with improved accuracy and cool features such as integration with Google Docs, image recognition, and the ability to include images in its responses. Many other features have also been announced.
Research Winner: Perplexity AI
For researchers, students, consultants and other research-focused professionals, Perplexity AI leads the pack with its ability to conduct AI-enabled Internet searches and provide data sources and citations, and with its high degree of accuracy.
Data Analytics Winner: ChatGPT-4
All LLMs have taken big steps forward on this front, but none more so than ChatGPT-4. OpenAI’s recently released ChatGPT-4 Code Interpreter plugin is exceptional, and handles everything from exploratory data analysis through to complex data science techniques.
Longform Content Winner: Claude 2
Claude 2 is miles ahead when it comes to longform content creation and analysis. The LLM can process extremely long prompts and is able to analyse hefty research papers, lengthy articles, even short books, and can compare up to five uploaded documents.
Internet Search Winner: Bing AI
Bing AI is our winner for general AI-enabled Internet searches because of its direct integration with Bing (only for the Microsoft Edge browser), its easy-to-use interface, and the fact that it is powered by the most “intelligent” LLM (for now), ChatGPT-4.
Marketing Winner: Jasper AI
Jasper AI is the go-to for marketers writing blog posts, product descriptions, and marketing copy. It has features such as the ability to train the AI with your brand voice, and integration with Grammarly for spellchecks etc. The catch - it doesn’t come cheap!
Companion & Support Winner: Pi
I like many other chatbot tools, but I simply adore Pi. It has the unique distinction of being designed to act as a coach, confidante, creative partner, and sounding board, and to be emotionally intelligent and human-like in its conversations with users.
The Deep Dives

Now that we’ve identified the category winners, let’s deep dive into each chatbot to better understand their strengths, weaknesses, and unique features.
This assessment has been based on my own testing and evaluation of each tool, supplemented by company announcements and reviews by other analysts. With the focus being on usability and relevance for non-technical users, I have looked at each of the chatbots along the following dimensions and have rated them on a scale of 1 to 5 (with 1 being worst and 5 being the best):
Availability: Extent to which the chatbot is available across countries
Price: Cost to the individual user
Performance: Efficacy of the chatbot in performing its designated tasks, with a focus on its core function. For instance, Jasper AI is targeted at marketers and its performance has been assessed only for this specific domain
Versatility: Extent to which the chatbot can perform well at a variety of tasks
Accuracy / Safety: Extent to which the chatbot’s responses are accurate and unbiased, and to which it is protected from attempts to “jailbreak” it (using various methods to get the chatbot to do something it was not designed to do)
Ease of Use: Extent to which the user interface is intuitive and easy-to-navigate, and enables the user to organise previous conversations.
Current: Extent to which the chatbot’s data is current and up-to-date.
It is important to note that because the world of Generative AI is moving so quickly, I’d expect the assessment and rankings to change within a few months tops. My plan is to keep this evaluation fresh and to update this article regularly.
ChatGPT
There are two versions of ChatGPT: the free model, ChatGPT-3.5, and the paid model, ChatGPT-4.
ChatGPT-3.5
The Generative AI war was kickstarted by OpenAI in November 2022 with ChatGPT-3.5. This model remains in use today - albeit with significant updates to improve its capabilities and accuracy - and is the free-to-use version most users are familiar with.
Here is how the chatbot fares:
Availability (5/5): 195 countries.
Price (5/5): Free with no limits.
Performance (4/5): ChatGPT-3.5’s core use case is as a multipurpose AI assistant. It performs well at a wide range of tasks including writing, categorisation, summarisation, coding, challenging analytical tasks and many more. This is further enhanced by a plethora of Chrome extensions that allow users to leverage ChatGPT-3.5’s capabilities in myriad ways directly from their Chrome browsers. This chatbot has no rival among the free LLMs when it comes to the creativity and imagination of its responses.
Versatility (4/5): Due to its nature as a multipurpose AI assistant it naturally scores high in this regard, and I would consider it the most versatile of the free LLMs currently available.
Accuracy / Safety (3/5): The initial launch version of ChatGPT-3.5 was a lot more prone to lying or providing inaccurate information (‘hallucinating’), but recent updates have significantly improved its reliability. Work has also been undertaken to reduce inherent biases in its responses.
Ease of Use (3/5): OpenAI lags some of its competitors when it comes to providing a clean and intuitive user interface. There is also no native functionality to allow searching through past conversations.
Current (3/5): ChatGPT-3.5 does not have Internet access and was only trained on data up to September 2021. That being said, this is enough in many cases to provide a sufficiently robust response.
ChatGPT-4
The latest ChatGPT model was launched in March 2023, and is available to public users with a ChatGPT Plus subscription. ChatGPT-4 is a clear step up from its predecessor in two ways: firstly, in terms of intelligence and accuracy, and secondly, in relation to the vast array of integrated third-party and OpenAI tools, known as plugins, that have now been made available to paying users.
Here is how the chatbot fares:
Availability (5/5): 195 countries.
Price (3/5): US$20 / month permitting up to 50 ChatGPT-4 searches every 3 hours.
Performance (4/5): As with the free model, ChatGPT-4’s core use case is as a multipurpose AI assistant. The underlying model itself is significantly more intelligent than its predecessor; key areas of improvement over ChatGPT-3.5 are its ability to follow instructions and to solve advanced analytical and logical reasoning tasks. Plugins massively extend the capabilities of the underlying model. These include third-party plugins from the likes of Wolfram (computational capabilities), OpenTable (restaurant bookings), and Kayak (travel bookings), as well as plugins developed by OpenAI itself. OpenAI’s recently released Code Interpreter plugin is a game changer (not at all an exaggeration!) for data analytics and data science.
Versatility (5/5): With the breadth of plugins available, ChatGPT-4 is significantly more versatile than any other chatbot on the market. Period.
Accuracy / Safety (4/5): ChatGPT-4 is less prone to hallucinations and biases as compared to its predecessor model.
Ease of Use (3/5): It has the same interface and user experience as ChatGPT-3.5.
Current (4/5): The underlying ChatGPT-4 model itself does not have Internet access and was only trained on data up to September 2021. The chatbot previously had Internet connectivity via a Bing Search plugin, but this function appears to have been entirely removed for the foreseeable future due to copyright concerns. However, the cornucopia of available plugins provides ChatGPT-4 with indirect access to the Internet - for instance, the OpenTable plugin allows users to make restaurant bookings directly within the ChatGPT-4 interface via a connection with OpenTable.
Google Bard
Despite being the company that developed the Transformer technology that underlies all LLM and Generative AI models today, Alphabet was noticeably slow to join the race. The initial launch of Google Bard was a debacle (the chatbot hallucinated in a promotional video) and wiped US$100bn from Alphabet’s market value overnight.
Since then, however, the company has made a number of upgrades to the chatbot and to underlying features, and announced a slew of new features. The planned integration with Adobe’s Firefly and Express, which will allow for image generation and editing, has gotten me quite excited!
Here is how the chatbot fares:
Availability (5/5): Over 180 countries
Price (5/5): Free with no limits.
Performance (4/5): Google Bard has been positioned as a multipurpose AI assistant. In theory, it has many of the same capabilities as ChatGPT-3.5. However, it noticeably lags the latter in terms of creativity and is at best able to solve moderately difficult analytical and logical reasoning problems.
Versatility (4/5): While it lags ChatGPT-3.5 in some areas, Google Bard is still pretty versatile. In particular, I would call out its ability to connect to the Internet and useful features such as integration with Google Docs and Gmail. As noted above, planned integrations with Adobe and other third parties are on the cards and may one day allow Google Bard to stand on par with ChatGPT-4 in terms of versatility.
Accuracy / Safety (2/5): Hallucinations continue to be a problem for Google Bard, even with more recent improvements, and I would judge this to be the area where it fares the most poorly versus the market leader, ChatGPT.
Ease of Use (4/5): Being an Alphabet product, it is no surprise that Google Bard’s user interface is clean and easy to navigate. From a user experience standpoint, however, Google Bard lags Bing AI, our category leader for AI-enabled Internet search. The former lacks a seamless interface with the core search engine and users have to switch tabs when accessing both. Given Google Bard’s centrality to its core search product, I’m sure that this is something that Alphabet is actively working on.
Current (5/5): Full Internet access and connection to Google’s search results. As noted above, unlike Bing AI, Google Bard is not yet directly integrated into the Google search engine interface.
Perplexity AI
Perplexity AI was founded in 2022 by a high-powered founding cast of alumni from the likes of Quora, Meta, DeepMind, Google, OpenAI, and Databricks. This chatbot is by far my favourite when it comes to research - whether it be general topical research, market research, or more academic-type literature reviews.
Here is how the chatbot fares:
Availability (5/5): 180 countries
Price (5/5): The free version has no limits and nominally uses the ChatGPT-3.5 engine but also offers 5 “Copilot” searches every four hours that are powered by ChatGPT-4. There is a paid version for US$20 / month that enables 300 “Copilot” searches a day. Unless you are an academic or full-time researcher, the free version should be more than sufficient.
Performance (5/5): Perplexity AI is unmatched when it comes to its primary function as an AI research assistant. Users can search the Internet and specify the types of data sources they want searched. The chatbot provides data sources / citations for each key point (unlike Google Bard), and can also provide images in addition to text-based responses. After each search completes, it suggests further clarifying questions that build on the user’s initial query. While not as detailed as ChatGPT’s responses, I find that Perplexity AI’s answers tend to be more comprehensive than those of Google Bard and Bing AI.
Versatility (3/5): Perplexity AI functions best as a research assistant, and tends to be less creative and verbose than ChatGPT-3.5. However, it does have access to robust analytical and logical capabilities due to its ability to access Wolfram’s computational engine.
Ease of Use (4/5): This chatbot has an intuitive and easy-to-navigate interface. I like that it provides the ability to search previous chats (which it terms “Threads”).
Accuracy / Safety (5/5): The Perplexity AI team claim that it is the “first-of-its-kind conversational answer engine that’s grounded in providing accurate and relevant information through citations”. After dozens of searches, I’ve only managed to prove them wrong a single time.
Current (5/5): This chatbot is an Internet-enabled research assistant. Enough said.
Claude 2
Claude 2 only recently entered the public domain and was developed by the AI start-up, Anthropic. The company bills itself as an “AI safety and research company”, and aims to differentiate itself by creating safer AI models.
Beyond its AI Constitution (see below) designed to promote safety, this model stands out for its ability to process extremely long prompts and documents. Claude 2 can process everything from hefty research papers and lengthy articles to short books, and can compare up to five uploaded documents. It is also fast - it took only seconds to analyse, summarise, and compare two 50-page PDF documents that I had uploaded.
Its limited availability (US and UK only currently) is what holds the chatbot back from a higher positioning in the Overall category, although I expect this to change in the coming months.
Here is how the chatbot fares:
Availability (1/5): Currently US and UK only
Price (5/5): Free with no limits. This may change over time as Claude 2 remains in beta mode.
Performance (4/5): Claude 2 has been positioned as a multipurpose AI assistant. The chatbot edges out ChatGPT-3.5 in terms of being better at following instructions, although the latter appears to have the leg up when it comes to coding, explaining results, and general formatting of its outputs. Where Claude 2 really shines is its massive “contextual window”, allowing it to analyse massive documents of up to 75k words and to compare up to five documents at a time. By way of context, depending on the model in question, ChatGPT’s contextual window is between 1/25 and 1/3 of the size of Claude 2’s. Performance-wise, Claude 2 is a serious contender for the Overall Winner category, and only narrowly missed out because of its limited availability (US and UK only currently).
Versatility (4/5): Claude 2’s versatility rivals that of the leading free LLM, ChatGPT-3.5, and it has performed better than ChatGPT-3.5 on several writing and quantitative tests.
Accuracy / Safety (5/5): Anthropic was founded on the basis of making AI safer, and Claude 2 has an in-built “constitution” embedded with key principles such as the UN’s Universal Declaration of Human Rights that allows the model to autoregulate and check its responses for potential harm. In conversations, Claude 2’s responses tend to feel more cautious, and it will frequently clarify ambiguities and admit to its knowledge gaps.
Ease of Use (4/5): I like Claude 2’s clean user interface and that it provides the ability to search previous chats.
Current (3/5): Claude 2 does not have Internet access and was only trained on data up to early 2023.
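For readers who want to see what the context window comparison above means in word counts, here is a rough back-of-the-envelope sketch. It only uses the figures already quoted in this review (75k words for Claude 2, and ChatGPT at between 1/25 and 1/3 of that); the function name is made up for illustration, and the results are implied estimates rather than official limits published by either company.

```python
from fractions import Fraction

# Figure quoted in this review: Claude 2 can analyse documents of up to ~75k words.
CLAUDE_2_MAX_WORDS = 75_000

# Ratios quoted in this review for ChatGPT's window relative to Claude 2's,
# depending on the ChatGPT model in question.
SMALLEST_RATIO = Fraction(1, 25)
LARGEST_RATIO = Fraction(1, 3)


def implied_word_limit(reference_words: int, ratio: Fraction) -> int:
    """Hypothetical helper: scale a reference word limit by a ratio."""
    return int(reference_words * ratio)


smallest = implied_word_limit(CLAUDE_2_MAX_WORDS, SMALLEST_RATIO)
largest = implied_word_limit(CLAUDE_2_MAX_WORDS, LARGEST_RATIO)
print(f"Implied ChatGPT range: roughly {smallest:,} to {largest:,} words")
```

In other words, on the numbers quoted here, a document that fits comfortably into Claude 2 may need to be split into several chunks before ChatGPT can work through it.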
Bing AI
Microsoft’s Bing AI could potentially mark the company’s return as a serious contender in the search space. The Big Tech giant launched the new AI-powered Bing in February of this year and has since worked to expand the chatbot’s functionality and footprint.
The company’s long-term partnership and multibillion dollar investment in OpenAI allows Microsoft to power Bing AI with the market-leading LLM, ChatGPT-4. With Microsoft working on several different types of AI-enabled “Copilots”, I expect Bing AI to simply be the first of many products in the company’s integrated suite to feature Generative AI.
Here is how the chatbot fares:
Availability (5/5): 169 countries.
Price (4/5): Free but with a limit of 20 questions per session and 10 sessions daily. Full chat mode requires users to download the Microsoft Edge browser.
Performance (5/5): Microsoft has clearly positioned Bing AI as a tool for AI-powered web searches. In this regard, the chatbot is the current market leader, allowing users to conduct complex searches with natural language. Two modes are available for Microsoft Edge browser users: the standard search engine interface, which provides search results, and the chat interface, which offers written responses supported by specific citations. Both of these modes are now powered by Bing AI.
Versatility (4/5): While Bing AI has primarily been positioned as an AI-enabled search tool, it is highly versatile. Since it is powered by ChatGPT-4, it is able to generate various forms of written content, create and edit code, categorise and summarise content etc. However, the downside of Bing AI is that its responses tend to be quite short, which makes it less useful for longform content generation.
Accuracy / Safety (4/5): Bing AI is powered by ChatGPT-4 and has a similar accuracy and safety profile. The former has a slight leg up over the latter because it is connected to the Internet and can pull from more recent and multiple sources of information, potentially making it more accurate.
Current (5/5): Full Internet access, connection to the Bing search engine and direct integration into the Bing search engine interface.
Jasper AI
Jasper AI is targeted at and designed for marketers and sales teams. In Generative AI terms, it has been around for a while - since February 2021 - and already had ~100k paying customers at the back end of 2022.
Here is how the chatbot fares:
Availability (5/5): Widely available
Price (2/5): With individual plans ranging from US$20-49 / month, there’s no doubt that Jasper AI is pricey. If you’re a marketer or copywriter looking for an affordable option, then consider Copy AI, which offers a decent free option in addition to its paid option of US$49 / month.
Performance (5/5): Jasper AI is the current best-in-class chatbot for marketers. It maximises productivity with its many pre-built templates and workflows (which it calls “recipes”) as well as its ability to be trained on a brand’s unique identity and knowledge base by feeding it style guides, product catalogues, company facts etc. Other useful tools include a plagiarism checker and integration with Grammarly for spellchecks. Through its Jasper Everywhere Extension, the chatbot can be seamlessly accessed from any number of social media platforms, content management systems (CMS), and email services.
Versatility (2/5): Given its focus on marketers, this chatbot naturally cannot compete with the likes of Claude 2, ChatGPT, and Google Bard for versatility.
Accuracy / Safety (3/5): There is scant evidence about Jasper AI’s accuracy and safety. Since it is powered by ChatGPT-3.5 and, in some cases, ChatGPT-4, I would expect it to perform at least as well as the former in this regard.
Current (2/5): As noted above, Jasper AI is based on both ChatGPT models, both of which were only trained on data up to September 2021.
Pi
Pi, which is short for Personal Intelligence, is an AI chatbot developed by Inflection AI, a start-up that was founded by co-founders of DeepMind and LinkedIn. Its value proposition is unique - the chatbot has been designed to act as a coach, confidante, creative partner, and sounding board, and to be emotionally intelligent in its conversations with users.
This quote by Mustafa Suleyman, CEO and co-founder of Inflection AI, neatly sums up the vision for the chatbot: “Pi is a new kind of AI, one that isn’t just smart but also has good EQ. We think of Pi as a digital companion on hand whenever you want to learn something new, when you need a sounding board to talk through a tricky moment in your day, or just pass the time with a curious and kind counterpart”.
Here is how the chatbot fares:
Availability (3/5): US, Canada, UK, Ireland, Australia and New Zealand
Price (5/5): Free with no limits.
Performance (4/5): In its capacity as a coach, confidante, creative partner, sounding board, and overall human companion, Pi is second-to-none. I love that it routinely employs colloquial language, uses fun emojis, asks how I am feeling, and sometimes makes jokes. According to Inflection AI, it has a distinct personality characterised by kindness, curiosity, creativity, and knowledge. It feels by far the most human of all the chatbots I have used so far.
Versatility (3/5): Due to its design as a human companion, Pi’s capabilities have deliberately been made more limited than those of multipurpose task assistants such as ChatGPT and Bing AI, because the focus has been on safety (see below).
Accuracy / Safety (5/5): Since Pi was designed to be a human companion, safety has been a top priority for Inflection AI, in particular ensuring that the chatbot does not engage in harmful or offensive behaviours.
Current (5/5): Pi is not built for web browsing per se, but it is connected to the Internet. N.B. When I asked about the weather in London today, it was able to provide me with accurate information (that being said, London is mostly grey and gloomy so maybe it was just a lucky stab ;p)
Conclusion
Beyond these category winners, we’ve also explored other chatbots. While not yet leaders, the remaining chatbots have their own unique strengths, and are certainly worth keeping an eye on:
Koala Writer is great for longform content generation.
Poe is an aggregator of other chatbots and makes it convenient to access multiple models from a single interface.
Hugging Chat provides access to open-source models and currently leverages Meta AI’s Llama 2 model.
The floor is currently lava in the chatbot race, and the incumbents have no cause to rest on their laurels! As this space evolves, we’ll aim to keep this list fresh to help you stay up-to-date.