Turn any URL into agent-ready data: clean markdown, contact info, a company description, pricing plans, or one combined page-to-JSON summary.
Web Extraction is a set of page-scraping building blocks for AI agents: url-to-markdown fetches a page and returns clean, markdown-ish text (nav/footer/scripts/cookie-banners stripped); extract-contact-info pulls emails, phones, social links and contact URLs (and follows a same-origin contact page for more signal); extract-company-description returns a short description from og/meta or the first real paragraph; extract-pricing heuristically parses plans (name, price, currency, billing period, features) from a pricing page; page-to-agent-json combines title, summary, links, prices and contacts in a single fetch. All heuristic (regex/DOM cleanup, no LLM pass in v1). Pay per request via x402; no API key needed from the caller.
Endpoints
POST
/extract-company-description
$0.07Return a short company description from a homepage: og:description > meta description > first meaningful paragraph. Cleaned, single-line, truncated to ~280 chars. Never invents text that is not on the page.
POST
/extract-contact-info
$0.07Extract emails, phone numbers, social links and contact URLs from a page. Known placeholder emails are filtered; a same-origin contact page is fetched for extra signal and merged (deduplicated).
POST
/extract-pricing
$0.07Heuristically parse pricing plans (name, price, currency, billing period, features) from a pricing page. Each plan gets its own detected currency. Returns plans [] with confidence 0 when nothing matches. Best-effort, not a guaranteed parser for every layout.
POST
/page-to-agent-json
$0.07One combined, agent-readable summary of a page — title, short summary, links, prices and contacts — assembled from a SINGLE fetch (the most economical of the five). entities is [] in v1 (no entity recognition yet).
POST
/url-to-markdown
$0.07Fetch a URL, strip noise (script/style/nav/footer/cookie banners) and return the page title plus markdown-ish text (headings as #..######, paragraphs/list-items as lines). Not a full HTML-to-MD conversion.