Five for Friday: Issue #3
On OpenAI’s o1 model, Microsoft Copilot Wave 2, Google NotebookLM’s new Audio Overview feature, Fei-Fei Li’s new startup, and Waymo’s sterling safety record

Welcome to this week's Five for Friday, where we continue to unpack the latest and greatest in the world of AI. Before we jump in, a quick heads up: Five for Friday will be taking a well-deserved holiday break next week. So savour this edition, and we'll be back with more cutting-edge updates in a fortnight!
#1 The Good, the Bad, and the Strawberry
OpenAI’s new model can outthink PhDs but is not without its limitations

OpenAI has released its latest AI model series, o1, which has generated mixed reactions in the AI community. Previously codenamed “Strawberry” and Project Q*, the models demonstrate impressive capabilities in certain areas, but also have limitations that have led to polarised views on their significance.
The two models, o1-Preview and o1-Mini, are available to ChatGPT Plus and Teams subscribers as part of their subscriptions. o1-Preview, the more capable but slower model, is capped at 30 messages per week, while o1-Mini is capped at 50 messages per week.
First the “Good”.
The model exceeds human PhD-level accuracy on a benchmark of physics, biology, and chemistry problems (GPQA) and also excels at coding, maths, and logic-based problems. It has also reportedly scored an IQ of 120, which would put it ahead of roughly 90% of humans. My own game theory experiment yielded impressive results, while others have demonstrated many new capabilities.
Now the “Bad”.
The model takes significantly longer to “think” (anything from 5 seconds to 2 minutes in my experience) compared to other frontier Large Language Models (LLMs). The stringent weekly usage caps and high API costs mean that it is relatively costly for both ordinary users and developers to employ.
The model also appears to represent a step backward in transparency, as the “thinking” process is largely hidden from end users. AI safety researchers have also pointed to the model’s apparent capacity for “deceit”: 0.8% of o1-Preview’s responses were flagged as “deceptive”, with 0.56% being some form of hallucination, of which roughly two-thirds (0.39%) appeared to be intentional. While the rate is low, this is nonetheless a perplexing and worrying phenomenon.
Finally the “Strawberry”.
It’s clear that o1 represents a significant step up in LLM capabilities, but it’s also apparent that it will take users a while to understand how best to utilise it. OpenAI researchers have themselves admitted that they haven’t quite figured out the best use cases for o1.
For most day-to-day use cases such as content generation, simple problem solving, document analysis, and idea generation, users will find (at least for now) that other frontier models (e.g., GPT-4o, Claude 3.5 Sonnet) are more economical to use and will also likely perform better.
Perspectives:
I’m undoubtedly excited about o1’s capabilities. While not quite what many in the AI community were hoping for, o1 nonetheless represents a watershed moment for AI, similar to when ChatGPT was first released in November 2022. Yes, people were amazed when GPT-3.5 was initially released. But for months after that, most people were still trying to get their heads around how LLMs could be adopted into their workflows.
Some critics have pointed out that the o1 models are not truly “reasoning” and are therefore a dead end on the road toward Artificial General Intelligence (AGI). But seriously, as ordinary users, who cares? I’m more interested in its potential to open up new and useful applications across many new domains.
#2 Clippy Who?
Does Microsoft Copilot stand a chance with its sequel release?
Salesforce CEO Marc Benioff recently quipped, "I really think that Copilot is the new Clippy. It's cute. It's fun. It does some things, and then you are not really using it."
With its recent announcements of “Wave 2” features for Copilot, the Redmond giant is now making a valiant effort to prove that sometimes, sequels can actually be better than the original. (George Lucas, are you taking notes?)
Let's dive into the three big-ticket items that might just make Benioff eat his words:
Copilot Pages: This new feature introduces a collaborative workspace within Copilot. Pages allows users to edit, add to, and share AI-generated content with team members. It's designed to facilitate real-time collaboration between humans and AI, potentially streamlining group projects and brainstorming sessions.
Copilot in Excel: Copilot in Excel is now generally available to all users, offering support for complex formulas, data visualization, and conditional formatting. More importantly (and the feature I’ve been waiting for), Copilot in Excel with Python has also been released in public preview. This integration allows users to leverage Python's data analysis capabilities within Excel using natural language prompts, without requiring coding expertise, similar to OpenAI’s Advanced Data Analysis capabilities.
Copilot Agents: Microsoft has rebranded and improved its AI assistant creation tool, now called Copilot Agents. This feature allows users to build custom AI assistants that can access and utilise information from an organization's OneDrive and SharePoint documents. It's designed to be more user-friendly than its predecessor and aligns closely with OpenAI's GPT Builder in functionality.
Other notable enhancements include improvements to Copilot in Outlook (with a new inbox prioritisation feature), PowerPoint (introducing Narrative Builder for quick presentation drafts), Teams (now summarising both spoken and chat conversations), and Word (offering improved integration of external resources).
Perspectives:
While Wave 2 went largely under the radar, I’ve been quite impressed by the new feature announcements and might actually give Copilot a second (or third, or fourth) go.
I am cautiously optimistic that Microsoft is getting into the swing of incrementally improving Copilot based on customer feedback. Frankly, this fits the usual Microsoft pattern of initially releasing buggy, unfit-for-purpose products, only to revamp them later on.
The company does face the challenge of overcoming previous disappointments among users, but this hill is by no means insurmountable given the huge advantage it has from its extensive Microsoft 365 / Office user base.
#3 From Text to Talk
Google’s NotebookLM summarises documents to generate podcasts

In a move that might make podcast hosts nervously adjust their microphones, Google has rolled out a new feature for NotebookLM that turns boring old text into a chatty AI-powered podcast (or any other form of conversational audio).
For those who are less familiar with NotebookLM, it debuted in the US earlier this year and went global over the summer. The AI tool is primarily designed to help users organise and make sense of complex information.
NotebookLM allows users to upload various types of documents, including Google Docs, PDFs, text files, Google Slides, and even web URLs. The tool then analyses these sources, providing summaries, answering questions, and helping users create content based on the uploaded information. Think creating FAQs from operating manuals, student notes from textbooks, or executive summaries from large reports or documents.
The new Audio Overview feature goes a step further, transforming written content into podcast-style audio summaries. The sample podcast on the NotebookLM website, as well as the audio book review of Gary Marcus’s “Taming Silicon Valley” created by Robert Maciejko (founder of the INSEAD Alumni AI community, of which I am a member), are incredibly (and slightly eerily) human-like.
But before you start planning your AI-powered podcast empire, there are a few wrinkles to iron out. The AI hosts currently only speak English (no Klingon option yet), may occasionally introduce inaccuracies (keep your fact-checking hats on), and can't be interrupted (much like that one friend we all have).
Perspectives:
The line between human and AI-created content continues to blur faster than a speed-reader on espresso. When I first heard of Audio Overviews, I was sceptical, as I primarily listen to podcasts for human stories. Yet the human-like quality of conversations produced by Audio Overviews has made me reconsider whether I would, in future, be open to an AI-generated podcast.
For marketers, content creators, educators, and communications professionals, NotebookLM could become a powerful tool for rapidly synthesising information and generating novel and compelling content in conversational audio, a format that has so far mostly eluded AI.
Its potential to aid those working with visually impaired individuals is particularly noteworthy, potentially democratising access to vast swathes of previously written-only content.
#4 The Third Dimension of AI
Fei-Fei Li's World Labs aims to bridge gap between digital and physical
Move over, flat-screen AI. Fei-Fei Li, often referred to as the "Godmother of AI," is bringing the third dimension to AI with her new startup, World Labs.
Launched with a cool $230 million in funding from the likes of Andreessen Horowitz, NEA, and Radical Ventures, the company is already valued at over $1 billion.
World Labs is developing what it calls "Large World Models" (LWMs), designed to help AI understand and interact with the 3D world. The company's first product, expected to launch in 2025, will target professionals such as artists, designers, developers, and engineers.
Li's background makes her well-suited to lead this venture. She previously headed AI at Google Cloud, currently advises the White House task force on AI, and is known for her work on ImageNet, which laid the foundation for modern visual object recognition.
Perspectives:
Spatial intelligence could allow AI to operate more effectively in the real world, supporting fields like robotics with applications in manufacturing, retail, and healthcare. It could also enable new applications in augmented and virtual reality, where AI could be used to better represent the physical world in virtual environments.
Perhaps most intriguingly, this development could bring us a step closer to Artificial General Intelligence (AGI). AI's current limited understanding of the physical world is considered one of the barriers to it attaining human-like cognition. By improving AI's ability to comprehend and interact with three-dimensional space, World Labs might be paving the way for more advanced, generalised AI systems.
#5 When AI Takes the Wheel
Waymo's vehicles show much lower crash rates than human drivers

Waymo, Alphabet’s robotaxi arm, has just released a compelling safety report. After 22 million miles of driverless cruising, Waymo reported 84% fewer airbag deployment crashes, 73% fewer injury crashes, and 48% fewer police-reported crashes compared to human drivers covering the same distance in the same regions.
They reported 200 crashes, which works out to roughly one crash every 110,000 miles. Timothy B. Lee, who dug into the data, found that of the 23 serious accidents (defined as those that caused an injury, an airbag deployment, or both), a large majority (16) involved another car rear-ending a Waymo.
The rest of the accidents were directly attributable to human error (e.g., human-driven car running a red light, getting side-swiped by a human-driven vehicle in an adjacent lane). Only in two cases, where a vehicle turned left across the path of a Waymo vehicle, is the blame less clear.
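For the numerically inclined, the crash-rate arithmetic above can be verified directly from the figures quoted in Waymo’s report and Lee’s analysis (a quick sanity check, not part of either source):

```python
# Sanity check of the Waymo crash-rate figures quoted above.
total_miles = 22_000_000   # driverless miles logged by Waymo
total_crashes = 200        # crashes reported over that distance

miles_per_crash = total_miles / total_crashes
print(f"One crash every {miles_per_crash:,.0f} miles")

# Breakdown of the 23 serious accidents analysed by Timothy B. Lee
serious_accidents = 23
rear_endings = 16
share = rear_endings / serious_accidents
print(f"Rear-endings: {share:.0%} of serious accidents")
```

At 110,000 miles per crash, with rear-endings making up about 70% of the serious incidents, the numbers underline how rarely the Waymo vehicle itself appears to be at fault.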
The quality and transparency of Waymo’s data release is considered high and sets a new standard for transparency in the autonomous vehicle industry. It challenges other self-driving car companies to publish similarly detailed safety records, potentially driving industry-wide improvements in safety and public trust.
Perspectives:
The limited scope of Waymo's operations is a crucial factor to consider. While the results are promising, comprehensive safety assessments will require testing in more diverse environments, including different weather conditions, rural areas, and a wider range of urban settings (most of Waymo’s driving has taken place only in Phoenix and San Francisco).
While Waymo's safety record is impressive, the true challenge lies in changing public perception. Despite years of data showing self-driving cars to be safer than human drivers, psychological barriers and societal resistance remain significant hurdles to widespread adoption.
Justin Tan is passionate about supporting organisations and teams to navigate disruptive change towards sustainable, robust growth. He founded Evolutio Consulting in 2021 to help senior leaders upskill and accelerate the adoption of AI within their organisations through AI literacy and proficiency training, and also works with clients to design and build bespoke AI solutions that drive growth and productivity for their businesses. If you're pondering how to harness these technologies in your business, or simply fancy a chat about the latest developments in AI, why not reach out?