
WebBrain is a free, open-source browser agent for Chrome and Firefox. It reads pages, extracts data, and automates multi-step tasks. Unlike most browser AI plugins, it can also run entirely on a local model. It is built by Emre Sokullu and licensed under MIT. The full source lives on GitHub. Run the agent against a local model, and no page data leaves your machine. Connect a cloud API when you want more capability.

What is WebBrain?

WebBrain lives in your browser’s side panel
Will WebBrain be listed and available on the Chrome Web Store by July 31, 2026?
WebBrain is a free, open-source browser extension for Chrome and Firefox that uses AI to read web pages, extract data, and automate tasks. It can run entirely on a local model installed on the user's machine, so page data never leaves the device, or it can connect to cloud APIs for more capability. The tool operates in two modes: Ask mode, which reads pages without making changes, and Act mode, which clicks, types, and automates multi-step workflows. Security is built in by design: the agent starts in read-only mode and requires approval before it takes consequential actions such as submitting forms or making purchases.
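To make the Ask/Act split and the approval step concrete, here is a minimal Python sketch of how such a permission gate could work. It is not WebBrain's code: the class name `ActionGate`, the action names, and the `approve` callback are all hypothetical, chosen only to illustrate the behaviour described above.

```python
from enum import Enum
from typing import Callable, List


class Mode(Enum):
    ASK = "ask"  # read-only: the agent may look but not touch
    ACT = "act"  # the agent may click, type, and automate


# Hypothetical action names, used only for this illustration.
READ_ONLY_ACTIONS = {"read_page", "extract_data"}
CONSEQUENTIAL_ACTIONS = {"submit_form", "purchase"}


class ActionGate:
    """Decides whether an action may run, given the mode and user approval."""

    def __init__(self, mode: Mode = Mode.ASK,
                 approve: Callable[[str], bool] = lambda action: False):
        self.mode = mode          # starts read-only unless told otherwise
        self.approve = approve    # assumed to ask the user; denies by default
        self.log: List[str] = []  # record of every decision

    def request(self, action: str) -> bool:
        if action in READ_ONLY_ACTIONS:
            allowed = True                  # reading is always permitted
        elif self.mode is Mode.ASK:
            allowed = False                 # no page changes in Ask mode
        elif action in CONSEQUENTIAL_ACTIONS:
            allowed = self.approve(action)  # needs explicit approval
        else:
            allowed = True                  # routine Act-mode step
        self.log.append(f"{action}:{'allowed' if allowed else 'blocked'}")
        return allowed
```

In this sketch a gate created with no arguments permits only reading, while a gate in `Mode.ACT` lets routine steps such as a click through and defers form submissions and purchases to the `approve` callback.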

In this tutorial, we build a RAG-Anything workflow and use it to explore how multimodal retrieval works across text, tables, equations, and images. We start by preparing the Colab environment, installing the required packages, and securely entering our OpenAI API key at runtime to keep the notebook practical and safe to run. We then create a synthetic multimodal report, generate a chart and PDF, convert the content into RAG-Anything’s direct content_list format, and insert it into the retrieval

Cursor hopes to continue offering third-party AI models after it's acquired by SpaceX, testing the relationships between frontier AI labs.

Most browser automation runs from the outside. Playwright, Puppeteer, Selenium, and browser-use all drive a browser from an external process. They read the page through screenshots or the Chrome DevTools Protocol. Alibaba’s Page Agent takes the opposite path. The agent lives inside the webpage as plain JavaScript. It reads the live DOM as text and acts as the real user. No headless browser, no screenshots, no multi-modal model. The project is open-source under the MIT license. The c