Latency, Inference & UX
Last week OpenAI unveiled its new o1 model series — a set of models designed to tackle complex problems by spending more time “thinking” before they respond. While some were quick to dismiss these latest developments as a damp squib, they are overlooking an ongoing shift. It’s not always about increasing model sizes or adding more data; what will matter for most enterprises (and applications) going forward is how efficiently AI models can think and reason before providing an answer.
This is where advancements in inference techniques come into play. Unlike training, which focuses on feeding the model vast amounts of data, improving inference is about enhancing how effectively the model applies its learned knowledge to solve real-world problems. By optimising inference processes (in GPT-o1’s case, by enabling the model to reason through complex tasks, try different strategies, and refine its responses before delivering an answer), we can break through the diminishing returns that many thought would halt AI progress.
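To make that concrete, here is a minimal sketch of what inference-time reasoning can look like from the application side: draft an answer, critique it, refine it, and only then show the result. The complete() helper and the prompts are hypothetical stand-ins for any chat-completion API; this illustrates the general technique, not OpenAI’s actual o1 mechanism, which is learned during training rather than stitched together from prompts.

```python
# A sketch of inference-time reasoning: draft, critique, refine.
# `complete()` is a hypothetical stand-in for a chat-completion call.

def complete(prompt: str) -> str:
    """Placeholder for a language-model API call."""
    raise NotImplementedError  # wire up your provider of choice here

def reason_then_answer(question: str, rounds: int = 2) -> str:
    # First pass: produce a step-by-step draft.
    draft = complete(f"Answer step by step:\n{question}")
    for _ in range(rounds):
        # Ask the model to inspect its own draft for errors or gaps.
        critique = complete(
            f"Question:\n{question}\n\nDraft:\n{draft}\n\n"
            "List any errors or gaps in this draft."
        )
        # Fold the critique back into an improved draft.
        draft = complete(
            f"Question:\n{question}\n\nDraft:\n{draft}\n\n"
            f"Critique:\n{critique}\n\nRewrite the answer, fixing the issues."
        )
    # Only the refined answer reaches the user; the intermediate
    # drafts are the "thinking" that the latency pays for.
    return draft
```

Each extra round trades latency for answer quality, which is exactly the dial the rest of this piece is about.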
With the groundwork already laid by training large models, the emphasis now is on how efficiently these models can use that foundation to deliver nuanced, thoughtful responses. The future of AI will not always be about bigger models; it will be about models that know how to use their knowledge better, reasoning through complex issues instead of just providing surface-level answers.
OpenAI’s new o1 series marks a big step in this direction: a family of models that process information more effectively, enabling them to solve complex tasks that were previously out of reach. These models are no longer simply reacting; they are responding, thinking the problem through before generating answers, thanks to behind-the-scenes chain-of-thought reasoning.
According to the announcement, these models are starting to excel in areas like science, coding, and math, achieving results that rival top human performers in challenging fields. This is a total rethink of how AI tackles problems, moving from quick, surface-level answers to deeper, more thoughtful responses.
As we’ve seen in the past, the real breakthroughs come when we start paying attention to the underlying trends. The question is not if AI will surpass human-level reasoning in certain tasks, but when — and we’re probably much closer to that reality than many might realise.
Follow the lines.
Latency as Thinking
One other aspect of GPT-o1 that has piqued my interest is how it reveals part of its reasoning process to users directly within the user interface. Traditionally, AI interactions have been swift and opaque, providing instant responses without insight into the underlying thought process. Yet, with the o1 models, the AI’s chain-of-thought is on display. For the immediate user, this approach transforms AI from a mysterious black box into an open book, allowing them to follow the model as it “thinks.”
But why does this happen? Why do we like to think that machines ‘think’? One key aspect is anthropomorphism: the natural human tendency to attribute human-like qualities to non-human entities. When an AI takes time to respond, we perceive it as “thinking,” which makes the interaction feel more personal and engaging.
Studies have shown that anthropomorphic design features can significantly impact users’ trust and engagement with computer systems. For example, Reeves and Nass (1996) found that people tend to treat computers and media as real social actors, responding to them in social ways. By incorporating human-like behaviours — such as brief pauses to simulate thinking — AI systems can leverage this tendency to build stronger relationships.
Brief pauses allow for cognitive engagement, giving users time to process the interaction more deeply, which heightens their involvement and makes the eventual response more satisfying. The anticipation built during the pause also fosters trust in the AI’s responses: users feel that the AI is carefully considering their input, reflecting how human expectations of thoughtful communication tend to equate time spent with effort. This has long been demonstrated: research by Sundar and Nass (2000) showed that users are more satisfied with computer agents that respond after a slight delay, as it mimics human conversational norms.
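In practice, this often shows up as a minimum-delay pattern: even when an answer arrives near-instantly, the interface holds it for a beat so the exchange keeps a conversational rhythm. A minimal sketch, with the threshold picked purely for illustration:

```python
import asyncio
import time

MIN_PAUSE_SECONDS = 1.2  # illustrative value, not an empirical optimum

async def reply_with_pause(generate) -> str:
    """Hold near-instant replies briefly to mimic conversational rhythm."""
    start = time.monotonic()
    answer = await generate()  # any coroutine returning the model's reply
    elapsed = time.monotonic() - start
    # Pad only when the model was faster than the minimum pause;
    # never slow down an already-slow response.
    if elapsed < MIN_PAUSE_SECONDS:
        await asyncio.sleep(MIN_PAUSE_SECONDS - elapsed)
    return answer
```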
The natural anthropomorphic perception of AI thinking presents several opportunities for UX design. By incorporating elements that highlight the AI’s thought process, designers can make interactions feel more conversational and less transactional, enhancing overall engagement.
This is important for many reasons, the most obvious being adoption, where latency beyond a few seconds has been shown to significantly hinder uptake. As GPT-o1 demonstrates, displaying the AI’s chain-of-thought not only fills in that wait time but also demystifies the technology, cleverly turning latency into part of the experience. In this sense, transparency — or even the perception of transparency — becomes a powerful tool for building user confidence.
Some readers might remember how The Sims filled its loading screens with humorous messages (“Leeb, Leefuh, Lurve”). Those messages didn’t just fill the wait; they enhanced the player’s experience, making the wait itself part of the game. That is essentially what this is about: transforming wait times into opportunities for engagement. But instead of jokes, GPT-o1 leverages latency and transparency to build trust in AI, facilitating both engagement and adoption.
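In code, the trick is simple: while the model call is in flight, rotate through short status messages so the wait reads as activity rather than silence. The sketch below assumes a hypothetical fetch_answer() standing in for a real model call; the copy is placeholder text in the spirit of those loading screens.

```python
import asyncio
import itertools

async def fetch_answer(question: str) -> str:
    """Hypothetical stand-in for a slow model call."""
    await asyncio.sleep(6)  # simulate several seconds of inference latency
    return "Here is a carefully considered answer."

# Placeholder status copy; swap in whatever fits your product's voice.
STATUS_MESSAGES = ["Thinking...", "Weighing approaches...", "Checking the details..."]

async def answer_with_status(question: str) -> str:
    task = asyncio.create_task(fetch_answer(question))
    for message in itertools.cycle(STATUS_MESSAGES):
        print(message)  # in a real UI, update the status line instead
        try:
            # Return as soon as the answer lands; otherwise rotate the copy.
            return await asyncio.wait_for(asyncio.shield(task), timeout=2)
        except asyncio.TimeoutError:
            continue

if __name__ == "__main__":
    print(asyncio.run(answer_with_status("Why is the sky blue?")))
```

GPT-o1’s version of this swaps canned copy for glimpses of the model’s actual reasoning, but the mechanics of filling the wait are the same.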
OpenAI’s o1 series is a step towards AI that thinks more like us, providing deeper, more thoughtful responses. As AI advances, it should — and will — be designed in ways that mirror us as well.
At the end of the day, ChatGPT was never meant to be a consumer-facing product; it was, and always will be, a research tool that others can build things on top of. While these user-centric touches barely scratch the surface of what’s possible in terms of AI and UX, they still illustrate how intentional design choices can transform AI from a simple utility into a trusted companion.
This is the trajectory.
Get in touch to learn more about how to build AI solutions that stick.