Web browsers are entering a new era where AI skills take over from extensions

“The browser is bigger than chat. It’s a more sticky product, and it’s the only way to build agents. It’s the only way to build end-to-end workflows.” These were the comments of Perplexity CEO Aravind Srinivas in a recent interview. The Perplexity co-founder was talking about the future of web browsers, AI agents, and automations in web browsers.

Srinivas was bullish on the prospects, partly because his company is already testing a buzzy new browser called Comet. Currently in an invite-only beta phase, the browser comes with an agent that can handle complex and time-consuming tasks on your behalf.

AI skills are the new work champions

All the talk of AI agents and skills sounds like a bunch of tech jargon, so let me break it down for you. In the Dia browser, I recently created a skill called “expand.” How did I do it, even though I didn’t write a single line of code? I simply described it in the following words:

“When I use this skill and paste a snippet, do a deep web search, and pull up the entire history in the form of an article in a timely order. Pull information only from reliable news outlets.”

I read and write articles for a living, and I often come across snippets and events in articles that I am not familiar with. For such scenarios, all I have to do is select the relevant text (or copy-paste it in the chat sidebar) and use a “/” command to summon the “expand” skill.

As described above, the AI agent in the Dia browser will search for mentions of my target in top news outlets and create a brief report about it in chronological order. This saves me a lot of precious time that would otherwise be spent on wild Google Search attempts.

But more importantly, I don’t even have to open another tab, and I can ask follow-up questions in the same chat box within the active reading tab. It’s quick and convenient. I don’t know of an extension that can do exactly what this “expand” skill does for me.

Nor could one exist. I created the skill with a specific purpose and intent. And I can create as many as I want, or fine-tune each one further to suit my workflow. I’ve created another one called “research” that references a work (or phrase) and performs web research by looking exclusively at peer-reviewed science papers.

The Dia user community is even saving some money by creating skills that hunt for coupon codes available on products right before checkout. For my Amazon shopping, I’ve created one that combines the reviews, ratings, and features of products across different Amazon tabs, creates a comparison table, and helps me make the best choice. All of that happens by typing a single word!

Another one quickly checks my emails for grammatical errors and adherence to my style guide. There’s one that creates quiz-based reading material for kids I teach at a nearby non-profit institution, based on the learning material I have prepared.

Just made a @diabrowser skill that instantly saved me money pic.twitter.com/YbSAclRrtQ
— Egor (@eg0rev) July 23, 2025

The students love the fun and playful tone in their multiple-choice questions that test their current affairs knowledge. There’s even an official Dia gallery where you can find skills created by Dia users, and a crowd-sourced web dashboard where you can find even more.

But here’s the main reason why I think browser skills are a bigger deal than extensions. Anyone can create them by simply describing what they want. With extensions, you need coding knowledge and a basic understanding of how the web and browser architecture work.

Security is another reason that I would put more faith in browser skills than extensions. There is a long history of browser extensions being weaponized by bad actors to seed malware. An average user can’t inspect or make sense of an extension’s inner workings, and only realizes the folly when the damage has been done.

The situation with AI skills in browsers is as transparent as it gets. How exactly a skill works is described in detail, in natural language, and without any hidden caveats. You just need to read it thoroughly, or just copy-paste it and create your own with extra modifications. That approach is flexible, a lot safer, and puts the power entirely in users’ hands.

Browser agents are here to stay

Next, we have browser agents. Opera browser has already implemented one, and it is already offering a more advanced version called Operator. Then you have tools like ChatGPT Agent and Perplexity’s Comet browser. Think of them as Siri, but for web browsing.

Agents are more suited for complex, time-consuming tasks. And they work best when they get access to the services you visit on a daily basis, like your email and Calendar. For example, this is what I did in Perplexity’s Comet browser last night:

“Check my inbox and give me an update on all the interview requests with a scientist or company executive that I intended to proceed with. Focus on conversations where I expressed the possibility of virtual interviews, instead of an in-person meeting.”

Without opening another tab, the built-in Assistant went through my Gmail inbox, looked up the relevant emails, and then provided me with a list of such interactions in a well-formatted view. For added convenience, it even included one-click Gmail links so that I could open each email chain directly without having to dig in manually.

It’s great for a lot of other things. For example, during a Twitter AMA, I simply asked it to pick the responses by the speaker and list them as bullet points. That saved me a lot of back-and-forth time opening and closing X conversation chains.

For travel planning, shopping, or even consuming videos, the assistant in Comet browser works fine. The only “ick” is that if you need it to get more personal work done, you will need to provide access to connectors. For example, to handle your Gmail, Calendar, and Drive, you will need to enable access.

I did it for my WhatsApp account as well, and it worked really well in the Comet browser. Not everyone will feel comfortable doing that, and the caution is totally warranted. For such scenarios, Google and OpenAI offer similar agentic features for Gemini and ChatGPT, respectively.

There is no going back

Just the way you create skills in Dia by simply typing or narrating your requirements, Gemini and ChatGPT also let you create custom agents for specific tasks. Google calls them Gems, while OpenAI refers to them as GPTs. And yes, you can share them just like skills. Using them is free, but to create them, you’ll need a subscription that costs $20 per month.

I’ve created numerous Gems and custom GPTs to speed up my mundane chores. For personal social posting, I’ve created a Gem that breaks down articles I’ve written into smaller bits, which are then posted as a chain on X. Likewise, I’ve created custom agents to handle my emails.

One of the Gems simply needs me to type “yes” or “no,” and it will accordingly write a polite response while picking up all the context from the email. With connectors coming into the picture, you can link them to as many services as you want.

The best part about these Gems is that you can effortlessly use them across a desktop browser and mobile apps as well. Extensions require you to stick with a desktop browser. Some mobile browsers do support extensions, but they are rare.

Moreover, extensions don’t offer the same flexibility and peace of mind as custom browser skills or agents created by users. ChatGPT Agent and Google’s Project Mariner are a new breed of AI assistants that are tailor-made for web-based tasks, just like the assistant built within Perplexity’s Comet browser.

Unlike an extension, they can handle multi-step workflows, and you can take over at any stage. Furthermore, you can modify the inner workings of your web browsing automation and tailor the AI skills to your exact specifications, something that’s not possible with extensions.

Of course, they are not perfect. “At the same time, you can take over it and complete the things when it’s not able to do it because no AI agent is foolproof, especially when we are at a time when reasoning models are still far from perfection,” admits Perplexity’s CEO.

But the shift is evident. Browser extensions are not going to vanish overnight, but browsing agents and AI skills created by users are going to take over. It’s only a matter of time before the barriers (read: subscription fee) come down!
