Beyond Clicks and Codes: New Age of AI Agents From the Ashes of RPA

Delving into the evolution of automation, up to the adaptive reasoning capabilities of today's Tactile AI Agents

This article has been co-written by AI. All images were decided on and created by AI based on the overall article. The voice-over is an AI deep fake of me (currently working on improving it, hopefully ready for the next article).

I've always had a bit of a love-hate relationship with Robotic Process Automation (RPA). On the one hand, its promise to automate almost any process using basic interactions like mouse clicks and keystrokes was incredibly appealing. On the other, its reliance on hardcoded logic made it inflexible and somewhat limited. But as we stand on the brink of a new era with AI agents, it's timely to revisit the role of RPA in the future of AI-enabled automation.

The Promise and Pitfalls of Traditional RPA

RPA was touted as a universal automation solution, capable of handling a wide range of tasks by mimicking human interaction with computers. Yet, its reliance on predefined rules and inability to adapt to new scenarios were significant drawbacks. The technology was like a skilled musician playing from a rigidly defined score, unable to improvise.

The Rise of Generative AI: A New Hope for Automation

Enter generative AI, the game-changer that promised to solve the adaptability issue plaguing traditional automation. Unlike its predecessors, generative AI isn't bound by hardcoded rules. It learns from data, making decisions and generating responses based on patterns it has absorbed. This capability allows it to tackle tasks that are too complex or nuanced for standard RPA solutions.

AI Agents: The Next Evolution in Automation

The advent of AI agents marks a pivotal shift in the automation landscape. These entities are not your ordinary bots; they are powered by the flexible reasoning capabilities of generative AI, enabling them to perform tasks with an understanding and adaptability that mimic human cognition. From narrow, task-specific applications, we're witnessing the emergence of entities with broad, adaptable expertise. For example, conversational AI agents can handle complex customer service interactions, while OpenAI's GPT ecosystem demonstrates capabilities ranging from writing assistance to complex problem-solving, showcasing the diverse potential of AI agents.

Addressing the Long Tail Problem

The long tail problem has been a persistent issue in automation, where addressing each process variation or logic scenario typically requires hardcoded logic. This approach is not scalable. For instance, automating customer service might work for common queries but falls short when faced with unique or complex issues. AI agents, however, can learn and adapt to handle these outliers, significantly broadening the scope of automation.
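To make the long tail concrete, here is a minimal sketch of a rule-based handler of the kind described above. Everything in it is hypothetical and for illustration only: the `RULES` table, the action names, and the sample queries are assumptions, not taken from any real product. It simply shows that a lookup of predefined rules covers the common queries and silently fails on everything else.

```python
from typing import List, Optional

# Hypothetical hardcoded rules, in the spirit of a traditional RPA-style bot.
RULES = {
    "reset password": "send_password_reset_link",
    "track order": "look_up_order_status",
    "cancel subscription": "open_cancellation_form",
}


def rule_based_handler(query: str) -> Optional[str]:
    """Return an action only when the query matches a predefined rule exactly."""
    return RULES.get(query.strip().lower())


def coverage(queries: List[str]) -> float:
    """Fraction of queries that the hardcoded rules can handle."""
    handled = sum(1 for q in queries if rule_based_handler(q) is not None)
    return handled / len(queries)


# Two common queries and two long-tail variations (all made up for this sketch).
sample_queries = [
    "Reset password",
    "track order",
    "my parcel arrived but the box contained someone else's order",
    "cancel subscription but keep my loyalty points",
]

unhandled = [q for q in sample_queries if rule_based_handler(q) is None]
print(f"coverage: {coverage(sample_queries):.0%}, unhandled: {len(unhandled)}")
```

Each unhandled query would need yet another hand-written rule, which is exactly the part that does not scale.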


AI Agents Are a New Concept, but They're Already Evolving. Towards the Tactile AI Agent?

Over the last year, the evolution of AI agents has predominantly focused on equipping them with various "skills" to accomplish specific tasks. This development approach has seen AI agents gain abilities ranging from web searches and flow diagram creation to writing comprehensive articles (such as this one!).

But as AI agents gain new skills or expand their capabilities, they typically run into challenges reminiscent of the past, such as the need to manually develop new integrations and interfaces for each capability. However, a shift is emerging within the industry that takes a different approach to overcoming these constraints. Instead of merely accumulating discrete skills, there is a movement towards endowing AI agents with the fundamental motor skills of digital and physical interaction.

For the purpose of this article, let's term the next step in this evolution as "Tactile AI Agents." These agents leverage fundamental modes of interaction—such as clicks and keystrokes—to interact with digital environments in ways previously limited to humans.

A prime example is MultiOn, which at its core teaches AI agents to navigate the web browser. This creates the flexibility to perform many tasks, such as navigating Google Calendar to create new invites or purchasing goods from online retailers, for which APIs and integrations may not exist. This approach isn't confined to the digital domain; it's moving into the hardware realm, as seen with OpenAI's announcement and the Rabbit release, marking a significant leap towards comprehensive automation.

Blending Cognitive and Tactile Automation

Tactile AI Agents represent a fusion of generative AI's cognitive capabilities with the tactile, interaction-based methodologies of traditional RPA. This combination allows the shortcomings of each approach to be overcome. Generative AI serves as the brain of old-school, rule-bound RPA, while techniques from RPA serve as the eyes, arms, and legs of AI agents.
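The brain-plus-limbs split can be sketched as a simple loop: something decides the next primitive action, and something else carries it out as a click or a keystroke. The sketch below is a toy under loud assumptions. `FakeScreen` is a simulated form rather than a real browser, `stub_planner` is a hand-written stand-in for where a generative model would sit, and every name here is hypothetical. It is not how MultiOn or any other product works.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Union


@dataclass
class Click:
    target: str  # name of the field or button to click


@dataclass
class TypeText:
    text: str  # text to type into the currently focused field


Action = Union[Click, TypeText]


@dataclass
class FakeScreen:
    """Simulated form with two text fields and a 'save' button (assumption)."""
    fields: Dict[str, str] = field(default_factory=lambda: {"title": "", "date": ""})
    focused: Optional[str] = None
    submitted: bool = False

    def apply(self, action: Action) -> None:
        """The 'arms and legs': execute one primitive interaction."""
        if isinstance(action, Click):
            if action.target == "save":
                self.submitted = True
            elif action.target in self.fields:
                self.focused = action.target
            else:
                raise ValueError(f"nothing to click named {action.target!r}")
        else:
            if self.focused is None:
                raise ValueError("no field is focused")
            # Simplification: typing overwrites the focused field.
            self.fields[self.focused] = action.text


def stub_planner(goal: Dict[str, str], screen: FakeScreen) -> Optional[Action]:
    """The 'brain': observe the screen and choose the next action.
    A real agent would query a generative model here; this is a stub."""
    for name, wanted in goal.items():
        if screen.fields[name] != wanted:
            return Click(name) if screen.focused != name else TypeText(wanted)
    return None if screen.submitted else Click("save")


def run_agent(goal: Dict[str, str], screen: FakeScreen, max_steps: int = 20) -> List[Action]:
    """Observe, decide, act, and repeat until the planner reports it is done."""
    log: List[Action] = []
    for _ in range(max_steps):
        action = stub_planner(goal, screen)
        if action is None:
            break
        screen.apply(action)
        log.append(action)
    return log


screen = FakeScreen()
log = run_agent({"title": "Team sync", "date": "2024-05-01"}, screen)
for step in log:
    print(step)
```

The point of the design is that the executor only knows clicks and keystrokes, so swapping the stub for a smarter planner would not require any new integration on the "limbs" side.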

A Future Where AI Agents Are Embodied in Digital and Physical Domains

Imagine AI entities autonomously navigating both digital and physical landscapes, undertaking tasks ranging from the mundane to the complex.

As we explore the potential of Tactile AI Agents, we're not just observing the evolution of automation but the birth of a new partnership between humans and digital entities. These agents will transform our interactions with the digital and physical world.

They would truly become a digital embodiment of human beings, sitting in front of laptops and tapping away on devices just as we do.

Thank you for making it to the end! If you found this insightful or valuable, please consider subscribing to get notified when the next article is released!