Blog

Thinking out loud.

Notes on web crawling, AI agents, and structured data.

12 posts

Latest

2026-03-18

7 min read

Firecrawl vs Apify vs Last Crawler: Which Web Scraping API Should You Use?

A technical comparison of three approaches to web data extraction — selector-based, platform-based, and AI-powered.

Comparison, Web Scraping, Developer Tools

2026-03-17

5 min read

How to Extract Product Data from Any Ecommerce Site

A step-by-step guide to extracting product names, prices, specs, and reviews from any online store using JSON schemas.

Tutorial, Ecommerce, JSON Extraction

2026-03-16

6 min read

Web Scraping Without Getting Blocked: What Actually Works in 2026

Proxies, headless browsers, rate limiting — the traditional anti-block toolkit is failing. Here's what works now.

Web Scraping, Anti-Bot, Guide

2026-03-15

6 min read

Why AI Agents Need a Better Web Scraping API

Most web scraping APIs weren't built for AI agents. Here's why that matters and what a purpose-built approach looks like.

AI Agents, Web Scraping, RAG

2026-03-14

6 min read

How to Feed Live Web Data to Your LLM Agent

A practical guide to giving your AI agent real-time web access with structured data output.

AI Agents, LLMs, Tutorial

2026-03-12

5 min read

Structured Data for Embeddings: The Missing Piece in Your RAG Pipeline

Web scraping for RAG pipelines usually means embedding noise alongside signal. Structured extraction fixes that from the start.

RAG, LLMs, Embeddings

2026-03-10

8 min read

From Web Crawl to Vector Database: Building a RAG Pipeline

An end-to-end tutorial: crawl web pages, extract clean content, chunk it, embed it, and store it in a vector database for retrieval.

RAG, Vector Database, Tutorial

2026-03-08

4 min read

Web Scraping Without Getting Blocked Is Still Broken in 2026

Proxies, headless browsers, CAPTCHA solving — the traditional approach to web scraping without getting blocked keeps getting more expensive and fragile.

Web Scraping, Opinion

2026-03-05

5 min read

JSON Schema Web Scraping: Turn Any Website into a Typed JSON API

JSON schemas are emerging as a declarative extraction language for the web — describe the shape of data you want, get exactly that back from any URL.

JSON, AI Extraction, Developer Tools

2026-02-28

7 min read

How to Build Web Tools for AI Agents That Actually Work

AI agent tool use is only as good as the tools themselves. Here's how to build structured, typed web tools that agents can actually reason with.

AI Agents, Tool Use, Architecture

2026-02-20

5 min read

Automated Competitor Price Monitoring with Structured Web Crawling

Manual competitive intelligence is always stale. Here's how to build automated competitor price monitoring with structured crawling.

Competitive Intel, Automation, Use Cases

2026-02-15

4 min read

Turn Any Website into an API and Extract Structured Data from Any URL

No API? No problem. Here's how to turn any website into an API and extract structured data from any URL in under a minute.

Developer Tools, APIs, Tutorial

