Structured output contracts beat scraping glue

Feb 2026

Selector-based scraping is brittle by design. You write 'grab the text from div.product-price' or 'click the button with data-testid=submit'. This works until the site redesigns, renames its classes, or reorganizes its markup. Then the selectors stop matching.

These failures are especially insidious because nothing raises an error. The script runs, returns empty data, and you do not find out until someone complains that the data is wrong. By then you have been collecting bad data for days.
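To make the silent failure concrete, here is a minimal sketch using only Python's standard library. The HTML snippets, the class names, and the helper scrape_prices are made up for illustration; real scrapers usually use a CSS selector library, but the failure mode is the same.

```python
from html.parser import HTMLParser


class ClassTextExtractor(HTMLParser):
    """Collects the text inside every <div> whose class list contains target_class."""

    def __init__(self, target_class: str) -> None:
        super().__init__()
        self.target_class = target_class
        self.depth = 0  # greater than 0 while we are inside a matching div
        self.results: list[str] = []

    def handle_starttag(self, tag, attrs):
        if tag != "div":
            return
        classes = (dict(attrs).get("class") or "").split()
        if self.depth > 0:
            self.depth += 1  # nested div inside a match
        elif self.target_class in classes:
            self.depth = 1
            self.results.append("")

    def handle_endtag(self, tag):
        if tag == "div" and self.depth > 0:
            self.depth -= 1

    def handle_data(self, data):
        if self.depth > 0:
            self.results[-1] += data


def scrape_prices(html: str) -> list[str]:
    """Selector-style extraction: the equivalent of 'div.product-price'."""
    parser = ClassTextExtractor("product-price")
    parser.feed(html)
    parser.close()
    return [text.strip() for text in parser.results]


# Same logical data, before and after a hypothetical redesign.
before_redesign = '<div class="product-price">$19.99</div>'
after_redesign = '<div class="price-tag">$19.99</div>'

print(scrape_prices(before_redesign))  # ['$19.99']
print(scrape_prices(after_redesign))   # [] and no exception is raised
```

The second call is the problem case: the price is still on the page, but the caller gets an empty list that looks like a valid result.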

With BrowserPilot you describe the output shape as a JSON contract instead: 'return an array of products with name (string), price (number), and rating (number)'. The agent is responsible for satisfying that contract, however the page is built that day. If the fields are in a table, the agent reads the table. If they are in cards with nested divs, the agent navigates the structure. If the site changes its layout entirely, the agent still reads the same logical data.
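As a rough illustration of what such a contract means on the consuming side, the sketch below checks a result against the product shape described above. It does not use BrowserPilot's API or contract syntax, which this post does not show; PRODUCT_CONTRACT, ContractViolation, and validate_products are hypothetical names chosen for the example.

```python
from numbers import Real

# Hypothetical contract: field name -> required Python type.
PRODUCT_CONTRACT = {"name": str, "price": Real, "rating": Real}


class ContractViolation(Exception):
    """Raised when a result does not match the declared output shape."""


def validate_products(result: object) -> list[dict]:
    """Return the result unchanged if it is a list of well-formed products."""
    if not isinstance(result, list):
        raise ContractViolation(f"expected a list, got {type(result).__name__}")
    for index, item in enumerate(result):
        if not isinstance(item, dict):
            raise ContractViolation(f"item {index} is not an object")
        for field, expected in PRODUCT_CONTRACT.items():
            if field not in item:
                raise ContractViolation(f"item {index} is missing '{field}'")
            value = item[field]
            # bool is a subclass of int in Python, so reject it explicitly.
            if isinstance(value, bool) or not isinstance(value, expected):
                raise ContractViolation(
                    f"item {index} field '{field}' has type {type(value).__name__}"
                )
    return result


good = [{"name": "Desk lamp", "price": 19.99, "rating": 4.5}]
print(validate_products(good) == good)  # True

try:
    validate_products([{"name": "Desk lamp", "price": "$19.99", "rating": 4.5}])
except ContractViolation as error:
    print(error)  # item 0 field 'price' has type str
```

Whether an empty list should count as a violation is a design choice; this sketch only checks the shape of the items that are present.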

This is more robust because the agent is bound to the intent ('get the price'), not the implementation ('read the text of div.product-price'). The intent changes far less often than the implementation.

When the contract cannot be satisfied (maybe the site is down, or the data is not there), you get a clear failure instead of silently wrong data. A handler that gets an error message can retry, fall back to a cache, or alert a human. A handler that gets an empty array from a failed selector has no way to know anything went wrong.
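One way a handler might act on an explicit failure is sketched below: retry, then alert a human, then fall back to the last good result. Everything here is an assumption for illustration. ContractError stands in for whatever error a contract-based task raises, and run_task, cache, and alert are placeholders supplied by the caller, not part of any real API.

```python
import time
from typing import Callable


class ContractError(Exception):
    """Hypothetical error raised when a task cannot satisfy its output contract."""


def fetch_with_fallback(
    run_task: Callable[[], list[dict]],
    cache: dict,
    alert: Callable[[str], None],
    retries: int = 2,
    delay_seconds: float = 0.0,
) -> list[dict]:
    """Retry on failure, then alert and fall back to the cached result."""
    last_error: ContractError | None = None
    for attempt in range(retries + 1):
        try:
            products = run_task()
            cache["products"] = products  # remember the last good result
            return products
        except ContractError as error:
            last_error = error
            if attempt < retries:
                time.sleep(delay_seconds)
    # All attempts failed: tell a human, then serve stale data if we have any.
    alert(f"task failed after {retries + 1} attempts: {last_error}")
    if "products" in cache:
        return cache["products"]
    raise last_error


# Usage with a stand-in task that fails once, then succeeds.
attempts = {"count": 0}


def flaky_task() -> list[dict]:
    attempts["count"] += 1
    if attempts["count"] == 1:
        raise ContractError("site unavailable")
    return [{"name": "Desk lamp", "price": 19.99, "rating": 4.5}]


cache: dict = {}
alerts: list[str] = []
print(fetch_with_fallback(flaky_task, cache, alerts.append))
print(alerts)  # [] because the retry succeeded
```

None of these branches is reachable when the failure arrives as an empty array, because the empty array never raises.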

Contracts also make regressions easier to catch. If a task returns data whose structure no longer matches the contract, the mismatch shows up on that run and you catch it immediately. If a selector-based task is quietly returning incomplete data because its selectors drifted, you might not catch it for weeks.