Show HN: I wrote an open-source browser alternative for Computer Use for any LLM

gregpr07 · Nov 5, 2024

Hey HN,
I made Browser-Use, an open-source tool that lets any LangChain-supported LLM execute tasks directly in the browser using only function calling.
It allows you to build agents that interact with web elements using natural language prompts. We created a layer that simplifies website interaction for LLMs by extracting XPaths and interactive elements like buttons and input fields (and other fancy things). This enables you to design custom web automation and scraping functions without manual inspection through DevTools.
Hasn't this been done a lot of times? Good question. As a general SaaS tool, yes, but I think a lot of people are going to try to build their own web automation agents from scratch. The idea is to provide the groundwork (a library) for the hard parts, so that not everyone has to repeat these steps:

Parse HTML in an LLM-friendly way (clickable items + screenshots)
Provide nice function calls for everything inside the browser
Create reusable agent classes
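
To make the first step a bit more concrete, here is a minimal sketch of what "LLM-friendly" parsing could look like: walk the HTML, keep only interactive elements, and give each one an index plus a positional XPath that a function call can later refer to. This is an illustration using only Python's standard library, not Browser-Use's actual code; the names InteractiveElementExtractor, extract_interactive_elements and format_for_llm are hypothetical, and the set of tags treated as interactive is an assumption.

```python
from html.parser import HTMLParser

# Assumption: which tags count as "interactive" for this sketch.
INTERACTIVE_TAGS = {"a", "button", "input", "select", "textarea"}
# Void elements never get a closing tag, so they are not pushed on the stack.
VOID_TAGS = {"input", "br", "img", "hr", "meta", "link"}


class InteractiveElementExtractor(HTMLParser):
    """Hypothetical helper: collects interactive elements with a positional XPath."""

    def __init__(self):
        super().__init__()
        self.path = []            # XPath steps of the currently open elements
        self.open_tags = []       # tag names of the currently open elements
        self.open_records = []    # record (or None) for each open element
        self.child_counts = [{}]  # per nesting level: tag -> children seen so far
        self.elements = []        # extracted records, in document order

    def handle_starttag(self, tag, attrs):
        counts = self.child_counts[-1]
        counts[tag] = counts.get(tag, 0) + 1
        step = f"{tag}[{counts[tag]}]"
        record = None
        if tag in INTERACTIVE_TAGS:
            attr_map = dict(attrs)
            record = {
                "index": len(self.elements),
                "tag": tag,
                "xpath": "/" + "/".join(self.path + [step]),
                "label": attr_map.get("placeholder") or attr_map.get("name") or "",
            }
            self.elements.append(record)
        if tag not in VOID_TAGS:
            self.path.append(step)
            self.open_tags.append(tag)
            self.open_records.append(record)
            self.child_counts.append({})

    def handle_endtag(self, tag):
        # Only pop when the closing tag matches the innermost open element.
        if tag in VOID_TAGS or not self.open_tags or self.open_tags[-1] != tag:
            return
        self.path.pop()
        self.open_tags.pop()
        self.open_records.pop()
        self.child_counts.pop()

    def handle_data(self, data):
        text = data.strip()
        if not text:
            return
        # Attach visible text to the innermost open interactive element, if any.
        for record in reversed(self.open_records):
            if record is not None:
                record["label"] = (record["label"] + " " + text).strip()
                break


def extract_interactive_elements(html):
    parser = InteractiveElementExtractor()
    parser.feed(html)
    return parser.elements


def format_for_llm(elements):
    """Render one compact line per element, suitable for pasting into a prompt."""
    return "\n".join(
        f"[{e['index']}] <{e['tag']}> {e['label']}".rstrip() for e in elements
    )


if __name__ == "__main__":
    page = """<html><body>
    <form><input name="q" placeholder="Search"><button>Go</button></form>
    <a href="/about">About us</a>
    </body></html>"""
    print(format_for_llm(extract_interactive_elements(page)))
```

The point of the index is that the LLM only has to say something like "click element 1"; the code then looks up the stored XPath for that index, so the model never needs to see or produce raw selectors.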

What is this NOT? An all-knowing AI agent that can solve all your problems.
The vision: create repeatable tasks on the web just by prompting your agent, without having to care about the how.
To showcase what the text extraction can do, we made a few demos, such as:

Applying for multiple software engineering jobs in San Francisco
Opening new tabs to search for images of Albert Einstein, Oprah Winfrey, and Steve Jobs
Finding the cheapest one-way flight from London to Kyrgyzstan for December 25th

I’d be interested in feedback on how this tool fits into your automation workflows. Try it out and let me know how it performs on your end.
We are Gregor & Magnus and we built this in 5 days.

Points: 56

# Comments: 33
