2026-02-20
5 min read
Automated Competitor Price Monitoring with Structured Web Crawling
Every company monitors competitors. Most do it badly — someone manually checks competitor websites once a month, screenshots some pricing pages, and pastes findings into a Google Doc. By the time the report reaches decision-makers, the data is stale.
There's a better way.
Define what you're tracking
Start with the specific data points that matter to your business. For most companies, this is some combination of:
- Pricing — plan names, prices, feature limits, enterprise tiers (similar to how you'd extract product data from ecommerce sites)
- Product features — changelog entries, new capabilities, deprecated features
- Positioning — homepage copy, taglines, value propositions
- Content — blog posts, case studies, documentation updates
- Job openings — what roles they're hiring for signals where they're investing
Each of these maps cleanly to a JSON schema. With a URL-to-JSON API, you define the structure once and extract it from any competitor's page.
```javascript
const pricingSchema = {
  plans: [{
    name: "string",
    price: "string",
    billing_period: "string",
    features: ["string"],
    limits: "string",
    cta_text: "string"
  }],
  enterprise: "boolean",
  free_tier: "boolean"
};
```
Schedule crawls for automated competitor price monitoring
Set up a recurring crawl — daily for pricing and features, weekly for content and jobs. Store every extraction with a timestamp. Now you have a time series of your competitor's public data.
```javascript
// Daily crawl job
const competitors = [
  { name: "Acme", pricing: "https://acme.com/pricing" },
  { name: "Globex", pricing: "https://globex.com/plans" },
];

for (const comp of competitors) {
  const data = await crawler.json(comp.pricing, { schema: pricingSchema });
  await db.insert("competitor_pricing", {
    competitor: comp.name,
    data,
    crawled_at: new Date(),
  });
}
```
Diff and alert
The raw data isn't the valuable part — the changes are. Diff today's extraction against yesterday's. When something changes, send an alert.
Price drop? Slack message to the sales team. New feature announced? Notify product. Hiring a bunch of ML engineers? Flag for the exec team.
```javascript
const today = await getLatestExtraction(competitor);
const yesterday = await getPreviousExtraction(competitor);
const changes = diff(yesterday.data, today.data);

if (changes.length > 0) {
  await slack.send("#competitive-intel", {
    text: `${competitor} changed their pricing page`,
    changes: changes.map(c => `${c.field}: ${c.old} → ${c.new}`),
  });
}
```
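The `diff` helper is left abstract above. One minimal way to implement it — a sketch, not a prescription — is to flatten both snapshots into dot-separated paths and compare leaf values, which produces the `{ field, old, new }` records the alert code expects:

```javascript
// Flatten a nested object into { "dot.path": leafValue } pairs.
// Array indices become path segments, e.g. "plans.0.price".
function flatten(obj, prefix = "", out = {}) {
  for (const [key, value] of Object.entries(obj ?? {})) {
    const path = prefix ? `${prefix}.${key}` : key;
    if (value !== null && typeof value === "object") {
      flatten(value, path, out);
    } else {
      out[path] = value;
    }
  }
  return out;
}

// Compare two snapshots and report every leaf that changed,
// was added, or was removed.
function diff(oldData, newData) {
  const a = flatten(oldData);
  const b = flatten(newData);
  const changes = [];
  for (const field of new Set([...Object.keys(a), ...Object.keys(b)])) {
    if (a[field] !== b[field]) {
      changes.push({ field, old: a[field], new: b[field] });
    }
  }
  return changes;
}
```

Flattening keeps the diff logic trivial, at the cost of noisy output when a plan is inserted mid-array; for pricing pages that change rarely, that trade-off is usually fine.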
Build a dashboard, not a document
Instead of a monthly report that nobody reads, build a live dashboard. Show current competitor pricing side-by-side. Graph price changes over time. Track feature parity. Display new content and job postings.
The data is structured and timestamped. Building the dashboard is the easy part.
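As a sketch of the "graph price changes over time" piece: a pure function that turns stored snapshot rows into one series per plan name, ready for any charting library. The row shape matches the `competitor_pricing` insert above; the function name is ours.

```javascript
// Build a per-plan price time series from stored pricing snapshots.
// Each row is { crawled_at, data } as written by the daily crawl job.
function priceHistory(rows) {
  const series = {};
  for (const row of rows) {
    for (const plan of row.data.plans ?? []) {
      (series[plan.name] ??= []).push({
        date: row.crawled_at,
        price: plan.price,
      });
    }
  }
  return series;
}
```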
What this actually looks like in practice
One of our early users tracks 12 competitors across pricing, features, and blog content. Their setup:
- 15 schemas covering pricing pages, feature comparison tables, changelog pages, and blog listings
- Daily crawls at 6 AM, results stored in Postgres
- Diffing pipeline that compares daily snapshots and posts to Slack
- Weekly digest auto-generated from the week's changes
Total engineering time to set up: about two days with a URL-to-JSON crawling API. Previously, this was a part-time job for an analyst.
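The weekly digest step could be as simple as grouping the week's change records by competitor and formatting them as plain text. This sketch assumes each stored change record carries a `competitor` property alongside the `{ field, old, new }` shape from the diff step:

```javascript
// Group a week's change records by competitor and render a
// plain-text digest suitable for posting to Slack or email.
function weeklyDigest(changes) {
  const byCompetitor = {};
  for (const c of changes) {
    (byCompetitor[c.competitor] ??= []).push(c);
  }
  return Object.entries(byCompetitor)
    .map(([name, list]) =>
      [
        `${name} (${list.length} changes)`,
        ...list.map(c => `  ${c.field}: ${c.old} → ${c.new}`),
      ].join("\n"))
    .join("\n\n");
}
```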
The compound effect
The historical data is where this gets interesting. After three months of daily crawls, you can answer questions like:
- How often does Competitor X change their pricing?
- Are they gradually adding features to their free tier?
- What topics are they writing about more frequently?
- Are they hiring more in engineering or sales?
These patterns are invisible in point-in-time snapshots. They only emerge from consistent, structured data collection over time.
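A question like "how often does Competitor X change their pricing?" falls out of the accumulated change log with a few lines of aggregation. This sketch assumes each change record has a `competitor` name and an ISO-format `detected_at` timestamp:

```javascript
// Count pricing changes per competitor per month from the change log.
// detected_at is assumed to be an ISO string like "2026-02-20T06:00:00Z".
function changeFrequency(changes) {
  const counts = {};
  for (const c of changes) {
    const month = c.detected_at.slice(0, 7); // "YYYY-MM"
    const key = `${c.competitor}:${month}`;
    counts[key] = (counts[key] ?? 0) + 1;
  }
  return counts;
}
```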
FAQ
How do I set up automated competitor price monitoring?
Define a JSON schema for the pricing data you want to track, crawl each competitor's pricing page on a daily schedule, and store the results with timestamps in a database. Then diff each day's extraction against the previous one and alert on changes. The whole pipeline can be set up in a day or two.
What data should I track for competitive intelligence automation?
The highest-value signals are pricing (plan names, prices, feature limits), product positioning (homepage copy, value props), new feature announcements (changelog pages), and hiring patterns (job listings). Each maps cleanly to a JSON schema you can crawl on a recurring schedule.
Is automated web crawling of competitor sites legal?
Crawling publicly accessible pages is generally permitted, but you should review the site's terms of service and robots.txt, avoid overwhelming their servers, and consult a lawyer for your specific jurisdiction and use case. Most competitive intelligence use cases involving public pricing pages are standard industry practice.