OpenAI’s Next Move Towards an ‘Agentic’ Future

Samsung is rolling out generative AI across its devices, and now OpenAI is getting in on the action with a new tool called Operator, announced on January 23, 2025. Operator builds on the same technology as ChatGPT but runs inside its own dedicated web browser, which lets it carry out tasks such as ordering groceries or booking travel largely on its own.

In a recent blog post, OpenAI hinted that Operator could unlock fresh engagement possibilities for businesses but didn’t go into detail about how that would work.

So, what exactly is Operator? It’s an app that combines a web browser with the generative AI model GPT-4o. OpenAI developed it to enhance GPT-4o’s ability to navigate and interact with typical web pages. What sets Operator apart is its knack for making multi-step plans and self-correcting when things go off track. It’s specifically trained to deal with common web elements like buttons and forms.

Right now, Operator is in beta. OpenAI plans to gather feedback from early users to refine the tool. If you’re a ChatGPT Pro subscriber, you can sign up for Operator today, and it will soon be available to Plus, Team, and Enterprise users as well. Eventually, OpenAI will incorporate Operator’s features into ChatGPT more broadly, and the model that powers Operator, the Computer-Using Agent (CUA), will soon be accessible through OpenAI’s API.

How does Operator actually work? The CUA uses a technique OpenAI calls an “inner monologue” to follow a logical path and adapt when faced with surprises. It takes screenshots of web pages and uses a virtual mouse and keyboard to navigate them. Just like with ChatGPT, you can give Operator custom instructions that it will remember, such as your favorite airline.
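The loop described above (look at the page, reason about it, act, repeat) can be sketched in a few lines of Python. This is only an illustration under loud assumptions: `FakePage`, `Action`, `plan_next`, and `run_agent` are hypothetical names invented here, the "screenshot" is a plain dictionary rather than an image, and the planner is a hand-written rule rather than a model. It is not OpenAI's implementation or API.

```python
from dataclasses import dataclass, field


@dataclass
class Action:
    kind: str          # "type", "click", or "done"
    target: str = ""   # which form field or button the action applies to
    text: str = ""     # text to enter for "type" actions


@dataclass
class FakePage:
    """Hypothetical stand-in for a web page; a real agent would see pixels."""
    fields: dict = field(default_factory=dict)
    submitted: bool = False

    def screenshot(self) -> dict:
        # A real screenshot is an image; here it is a plain snapshot dict.
        return {"fields": dict(self.fields), "submitted": self.submitted}

    def apply(self, action: Action) -> None:
        # Simulates the virtual keyboard ("type") and mouse ("click").
        if action.kind == "type":
            self.fields[action.target] = action.text
        elif action.kind == "click" and action.target == "submit":
            self.submitted = True


def plan_next(goal: dict, observation: dict):
    """Hypothetical planner: returns (inner-monologue note, next action)."""
    for name, value in goal.items():
        if observation["fields"].get(name) != value:
            note = f"Field '{name}' is missing or wrong; fill it in."
            return note, Action("type", name, value)
    if not observation["submitted"]:
        return "All fields look right; submit the form.", Action("click", "submit")
    return "Form submitted; task complete.", Action("done")


def run_agent(goal: dict, page: FakePage, max_steps: int = 10) -> list:
    """Observe, reason, act, and repeat until done or out of steps."""
    monologue = []
    for _ in range(max_steps):
        observation = page.screenshot()               # perceive
        thought, action = plan_next(goal, observation)  # reason
        monologue.append(thought)
        if action.kind == "done":
            break
        page.apply(action)                            # act
    return monologue
```

Because the planner re-reads the page on every step, a field that starts out wrong (say, a pre-filled airline that does not match the user's stored preference) is simply corrected on the next pass, which is a toy version of the self-correction the article describes.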

Users can prompt Operator in natural language, but it won’t log into sites, enter payment details, or solve CAPTCHAs; it hands those steps back to you. Operator also declines sensitive tasks such as banking transactions and won’t take part in pivotal decisions like hiring an employee. If it encounters an interface it can’t navigate, it defers to the user as well.
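One way to picture these handoff rules is as a small gate that runs before each step. The sketch below is purely illustrative: the category labels, the `Decision` enum, and the `gate` function are hypothetical names chosen for this example, and the article does not describe how OpenAI actually implements these checks.

```python
from enum import Enum


class Decision(Enum):
    PROCEED = "proceed"
    HAND_TO_USER = "hand_to_user"
    REFUSE = "refuse"


# Hypothetical labels for the step types named in the article.
HANDOFF_STEPS = {"login", "payment_details", "captcha"}
REFUSED_TASKS = {"banking_transaction", "hiring_decision"}


def gate(step: str, can_navigate: bool = True) -> Decision:
    """Decide whether the agent acts, hands control back, or declines."""
    if step in REFUSED_TASKS:
        return Decision.REFUSE
    if step in HANDOFF_STEPS or not can_navigate:
        # Sensitive input or an unfamiliar interface goes back to the user.
        return Decision.HAND_TO_USER
    return Decision.PROCEED
```

For example, `gate("captcha")` returns `Decision.HAND_TO_USER`, while an ordinary step such as `gate("search_flights")` returns `Decision.PROCEED`.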

OpenAI collaborated with various companies to ensure Operator could interact smoothly with their platforms, including DoorDash, Instacart, OpenTable, Priceline, StubHub, Thumbtack, and Uber.

However, the early version of Operator still struggles with complicated interfaces, such as those used for creating slideshows or managing calendar events.

Operator enters a competitive landscape, sharing some features with rivals like Google Gemini and Apple Intelligence. It’s also reminiscent of Microsoft’s Recall feature in that both rely on screenshots, although Recall captures them to make past activity searchable rather than to navigate. While some functions overlap, Operator’s ability to autonomously navigate websites could set it apart. The concept of agentic AI, where generative models take on multi-step tasks for users, is gaining traction, yet there are still limits to what these products can do.
