
How we built and deployed an app with 1 prompt

We used Devin to build a working app with 1 prompt

We just got access to Devin, and it’s wild.

With one prompt, we built and deployed a fully functional app (GeoGuessr for house prices).

What to expect

  • How we did it

  • What we love about Devin

  • What frustrates us about Devin

  • Our take

The app concept is simple: Guess the price of the house.

However, the reasoning it requires is complex enough to test Devin’s abilities.

To start, we told Devin to go to Zillow and get house data: specifically each house’s address, price, and image URL. Devin has a built-in browser, so it’s able to browse the web and perform tasks:

Searching for a city:

Capturing the image:

We then told it to build the game. This is where it began to face issues handling the database and code (we’ll talk more on this later).
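To give a sense of what the game involves, here is a minimal sketch of the data and the guessing logic. The three fields match what we asked Devin to collect; everything else is our own assumption for illustration, including the `House` and `score_guess` names, the linear scoring rule, and the placeholder record. It is not the code Devin wrote.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class House:
    """One listing, with the three fields we asked Devin to collect."""
    address: str
    price: int       # listing price in US dollars
    image_url: str


def score_guess(actual_price: int, guess: int, max_points: int = 1000) -> int:
    """Hypothetical scoring rule: points fall off linearly with relative error."""
    if actual_price <= 0:
        raise ValueError("actual_price must be positive")
    relative_error = abs(guess - actual_price) / actual_price
    # A guess that is off by 100% or more earns nothing.
    return round(max_points * max(0.0, 1.0 - relative_error))


# Placeholder record, not real listing data.
house = House(
    address="123 Example St, Springfield",
    price=400_000,
    image_url="https://example.com/house.jpg",
)

print(score_guess(house.price, 300_000))
```

Even this toy version has to handle the listing data, the player's input, and the scoring, which is why the task is a reasonable test for an agent.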

We copied the data it outputted and created a new chat.

New chat prompt:

Devin has a built-in planner that outlines all the tasks it intends to complete and executes them sequentially, one by one.

As Devin progresses through its tasks, it continuously updates you with messages about its current status and any new developments.

Before deploying, Devin tests the website on its localhost. If an error is detected, Devin can identify it much like a human would, and then return to the code to fix it.

Devin also performs this process after deploying the website: it visits the live site and actively engages with it, for example by playing the game, to ensure everything functions correctly.

Additionally, Devin directly deploys your site to Netlify.
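The plan, run, test, and fix cycle described above can be sketched as a simple loop. This is only our mental model of the behavior we observed, not Devin's actual implementation; `run_plan`, the `Step` type, and the pretend build below are all hypothetical.

```python
from typing import Callable, List, Tuple

# Each step pairs a name, an action, and a check that reports whether it worked.
Step = Tuple[str, Callable[[], None], Callable[[], bool]]


def run_plan(steps: List[Step], max_attempts: int = 3) -> List[str]:
    """Run steps in order, repeating a step until its check passes."""
    log: List[str] = []
    for name, action, check in steps:
        for attempt in range(1, max_attempts + 1):
            action()
            if check():
                log.append(f"{name}: passed on attempt {attempt}")
                break
            log.append(f"{name}: failed on attempt {attempt}")
        else:
            # The inner loop finished without a passing check.
            raise RuntimeError(f"{name} still failing after {max_attempts} attempts")
    return log


state = {"builds": 0}


def build() -> None:
    state["builds"] += 1


def site_works() -> bool:
    # Pretend the first build has a bug and the second one fixes it.
    return state["builds"] >= 2


log = run_plan([("build and test locally", build, site_works)])
for line in log:
    print(line)
```

The status lines collected in `log` play the role of the progress messages Devin sends while it works.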

What we love about Devin

Devin closes the loop, unlike ChatGPT and Claude, which can only generate code without verifying its functionality. Devin not only generates the code but also tests it, identifies bugs, and fixes them. In another instance, when it encountered an error, Devin searched the web for documentation to resolve the issue effectively.

Devin also has a built-in code editor:

Disclaimer: Devin can be frustrating

What frustrates us most about Devin is that its performance tends to decline the longer the chat goes on. Devin is also significantly more expensive than Claude: it costs $50 per month for 40 ACUs, the metric used to count Devin’s runs (not prompts). However, it’s not entirely clear how these ACUs are calculated.
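Taking the quoted plan at face value, the per-ACU price is easy to work out. The sketch below does that arithmetic; the number of ACUs a run consumes is a made-up figure, since we do not know how ACUs are actually calculated.

```python
MONTHLY_PRICE_USD = 50.0  # plan price quoted in this post
ACUS_PER_MONTH = 40       # ACUs included in that plan

cost_per_acu = MONTHLY_PRICE_USD / ACUS_PER_MONTH


def run_cost(acus_used: float) -> float:
    """Dollar cost of a run that consumes the given number of ACUs."""
    return acus_used * cost_per_acu


# Hypothetical: assume one run consumes 3 ACUs.
example_run_cost = run_cost(3)

print(f"Cost per ACU: ${cost_per_acu:.2f}")
print(f"Cost of a 3-ACU run: ${example_run_cost:.2f}")
```

At $1.25 per ACU, the real cost of building an app depends entirely on how many ACUs each run burns, which is the part that remains opaque.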

The first three chats with Devin are usually quite impressive, but subsequent ones tend to diminish in quality with each run. If you can complete your entire application within those initial three runs, you’re in good shape. But beyond that, it starts to get frustrating.

Cognition Labs really needs to expand Devin’s context window to improve its performance over extended interactions.

One thing is for sure…

With advancements in AI and automation tools like Devin, the process of coding, testing, and deploying applications is becoming more streamlined and accessible than ever before. Soon, building an app could be as simple as writing a few sentences, creating new opportunities for builders.
