We Scraped 1M Sites: Here's What We Found

2026-03-24

In January 2026, our Data Intelligence team completed one of the largest independent web crawls ever conducted in Southeast Asia. We scraped, crawled, and analysed 1,002,847 websites — including 312,000 Thai domains, 198,000 Indonesian domains, 156,000 Malaysian domains, 142,000 Vietnamese domains, and 194,000 from the Philippines, Singapore, and other regional markets.

The goal was simple: understand the actual state of digital readiness across the region. Not what agencies claim in pitch decks. Not what industry reports assume. The real, messy, data-driven truth.

What we found was simultaneously encouraging and alarming. The gap between best practices and actual implementation is enormous — which means the opportunity for businesses willing to invest properly is equally enormous.

This is the full report. At 2BKK, we are making this data public because we believe transparency drives better decisions. If you want to see how your specific competitors stack up, contact us for a custom competitive intelligence report.

Methodology

Before diving into the findings, here is how we conducted the study.

Crawling infrastructure: We used a distributed crawling system running across 14 data centres in Asia-Pacific, executing approximately 4.2 million page loads over 21 days. We crawled up to 5 pages per domain (the homepage plus up to 4 internal pages) to capture representative technical characteristics; domains with fewer reachable pages contributed fewer page loads.
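To make the per-domain sampling concrete, here is a minimal sketch of how a homepage plus up to four internal pages could be selected. It is an illustration under stated assumptions, not the crawler used in the study: the names `LinkCollector` and `pick_sample_pages` are hypothetical, links are taken in document order, and "internal" is assumed to mean "same host as the homepage".

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse


class LinkCollector(HTMLParser):
    """Collects href values from <a> tags in document order."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.hrefs.append(href)


def pick_sample_pages(homepage_url, homepage_html, max_internal=4):
    """Return the homepage plus up to `max_internal` distinct same-host pages."""
    collector = LinkCollector()
    collector.feed(homepage_html)
    host = urlparse(homepage_url).netloc
    pages = [homepage_url]
    for href in collector.hrefs:
        # Resolve relative links and drop fragments so /about#team == /about.
        absolute = urljoin(homepage_url, href).split("#")[0]
        parsed = urlparse(absolute)
        # Assumption: only http(s) links on the same host count as internal.
        if parsed.scheme not in ("http", "https") or parsed.netloc != host:
            continue
        if absolute not in pages:
            pages.append(absolute)
        if len(pages) == 1 + max_internal:
            break
    return pages
```

A domain whose homepage exposes fewer than four internal links simply yields a shorter list, which is one way a crawl of this shape ends up below five page loads per domain.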

Data collected per domain:

Validation: We cross-validated our findings against the HTTP Archive and Chrome UX Report for the subset of domains that appear in both datasets. Our measurements correlated at 0.94 (Pearson) for Core Web Vitals, giving us confidence in the accuracy of our methodology.
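For readers who want to run the same kind of cross-check on their own measurements, the Pearson coefficient can be computed in a few lines. This is a generic sketch of the statistic, not our validation pipeline; the function name `pearson` and the idea of pairing one value per domain from each dataset are assumptions made for illustration.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length samples."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two samples of equal length >= 2")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of co-deviations and of squared deviations from each mean.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sample")
    return cov / sqrt(var_x * var_y)
```

In practice `xs` would hold one metric (for example LCP) as measured by your own crawl and `ys` the same metric for the same domains from a reference dataset. A value near 1 means the two sources rank and scale domains similarly.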

Finding 1: Thailand's Tech Stack Landscape

Content Management Systems:

The CMS landscape in Thailand reveals a market still heavily dependent on legacy platforms:

The important insight: Only 4% of Thai websites use modern JavaScript frameworks (Next.js, Nuxt, Gatsby, Remix). This matters because these frameworks make it easier to achieve good Core Web Vitals scores and support rendering strategies such as server-side rendering and incremental static regeneration, which are becoming critical for AI SEO.

The technology gap between Thai websites and their Western counterparts is narrowing — but the gap between Thai market leaders and Thai market laggards is widening at an alarming rate.
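Framework usage can often be inferred from markers left in the served HTML. The sketch below shows one simple way to do that. The marker strings are common fingerprints chosen for illustration, and both `FRAMEWORK_MARKERS` and `detect_frameworks` are hypothetical names; this is not the detection logic behind the figures above, and heuristics like these will miss sites that strip or rename their build artefacts.

```python
# Assumed fingerprints: substrings these frameworks commonly leave in HTML.
FRAMEWORK_MARKERS = {
    "Next.js": ("__NEXT_DATA__", "/_next/static/"),
    "Nuxt": ("__NUXT__", "/_nuxt/"),
    "Gatsby": ('id="___gatsby"',),
    "Remix": ("__remixContext",),
}


def detect_frameworks(html):
    """Return the sorted names of frameworks whose markers appear in `html`."""
    return sorted(
        name
        for name, markers in FRAMEWORK_MARKERS.items()
        if any(marker in html for marker in markers)
    )
```

Plain substring matching keeps the check cheap enough to run on every crawled page, at the cost of occasional false positives when a marker string appears in ordinary page content.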

Analytics and Tracking:

That last number is staggering. Nearly one in three Thai websites has no analytics tracking whatsoever. These businesses are flying completely blind.

Finding 2: SEO Implementation Is Shockingly Poor

This is where the data gets brutal.

Title Tags:

Meta Descriptions:

Heading Structure:

Structured Data / Schema Markup:

This is perhaps the most alarming finding given how critical structured data is becoming for AI-powered search:

To put this in perspective, the global average for structured data adoption is approximately 38%. Thailand is at 23%. The competitive advantage for businesses that implement structured data properly is enormous.
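Checking a page for structured data is straightforward when it uses JSON-LD. The following sketch extracts the schema.org types declared in `application/ld+json` script tags. It is a simplified illustration, not our audit tooling: `JsonLdExtractor` and `schema_types` are hypothetical names, and the sketch ignores Microdata, RDFa, and types nested under `@graph`.

```python
import json
from html.parser import HTMLParser


class JsonLdExtractor(HTMLParser):
    """Collects parsed JSON-LD payloads from <script type="application/ld+json">."""

    def __init__(self):
        super().__init__()
        self._in_jsonld = False
        self._buffer = []
        self.blocks = []

    def handle_starttag(self, tag, attrs):
        script_type = (dict(attrs).get("type") or "").lower()
        if tag == "script" and script_type == "application/ld+json":
            self._in_jsonld = True
            self._buffer = []

    def handle_data(self, data):
        if self._in_jsonld:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == "script" and self._in_jsonld:
            self._in_jsonld = False
            try:
                self.blocks.append(json.loads("".join(self._buffer)))
            except json.JSONDecodeError:
                pass  # Malformed markup is treated as absent.


def schema_types(html):
    """Return the sorted top-level @type values found in a page's JSON-LD."""
    extractor = JsonLdExtractor()
    extractor.feed(html)
    types = set()
    for block in extractor.blocks:
        for item in block if isinstance(block, list) else [block]:
            if not isinstance(item, dict):
                continue
            declared = item.get("@type")
            if isinstance(declared, str):
                types.add(declared)
            elif isinstance(declared, list):
                types.update(t for t in declared if isinstance(t, str))
    return sorted(types)
```

An empty result means no parseable JSON-LD was found, which is the situation most of the sites in this finding are in.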

Internal Linking:

Finding 3: Page Speed Is a Crisis

Core Web Vitals data tells a story of widespread underperformance:

Largest Contentful Paint (LCP), target under 2.5s:

Cumulative Layout Shift (CLS), target under 0.1:

Interaction to Next Paint (INP, which replaced First Input Delay as a Core Web Vital in March 2024), target under 200ms:
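The three targets can be turned into a simple rating function. In the sketch below, the "good" cutoffs are the targets stated in this section. The "poor" cutoffs (4.0s, 0.25, 500ms) are Google's published Core Web Vitals thresholds rather than figures from this report, and the names `THRESHOLDS` and `rate` are hypothetical.

```python
# (good_up_to, needs_improvement_up_to) per metric; anything above is "poor".
THRESHOLDS = {
    "lcp_s": (2.5, 4.0),    # Largest Contentful Paint, seconds
    "cls": (0.1, 0.25),     # Cumulative Layout Shift, unitless
    "inp_ms": (200, 500),   # Interaction to Next Paint, milliseconds
}


def rate(metric, value):
    """Classify a measurement as 'good', 'needs improvement', or 'poor'."""
    good, needs_improvement = THRESHOLDS[metric]
    if value <= good:
        return "good"
    if value <= needs_improvement:
        return "needs improvement"
    return "poor"
```

For example, `rate("lcp_s", 3.1)` falls in the middle band: the page misses the 2.5s target but is not yet in the range Google labels as poor.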

The primary speed killers:

Finding 4: Mobile Optimisation Gap

With 78% of Thai internet traffic coming from mobile devices, mobile optimisation should be a top priority. The data suggests otherwise.

The mobile speed crisis: Only 24% of Thai websites deliver a fast experience on mobile. This is a critical failure point because Google indexes and ranks sites primarily on their mobile version (mobile-first indexing), and AI Overviews are even more prominent on mobile devices.

Finding 5: SSL/Security Landscape

Finding 6: Open Graph and Social Sharing

Social sharing metadata controls how content appears when shared on LINE, Facebook, and X (formerly Twitter), all platforms with enormous reach in Thailand.
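A quick way to audit a page's sharing metadata is to check for the four properties the Open Graph protocol marks as required: `og:title`, `og:type`, `og:image`, and `og:url`. The sketch below does that for a single HTML document. It is an illustrative check, not the scoring used in this report; `OpenGraphParser` and `missing_open_graph` are hypothetical names, and platform-specific tags such as Twitter Cards are out of scope.

```python
from html.parser import HTMLParser

# The four properties the Open Graph protocol defines as required.
REQUIRED_OG = ("og:title", "og:type", "og:image", "og:url")


class OpenGraphParser(HTMLParser):
    """Collects <meta property="og:..." content="..."> pairs."""

    def __init__(self):
        super().__init__()
        self.tags = {}

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attributes = dict(attrs)
        prop = (attributes.get("property") or "").lower()
        content = attributes.get("content")
        if prop.startswith("og:") and content:
            # Keep the first occurrence if a property is repeated.
            self.tags.setdefault(prop, content)


def missing_open_graph(html):
    """Return the required Open Graph properties that the page does not set."""
    parser = OpenGraphParser()
    parser.feed(html)
    return [name for name in REQUIRED_OG if name not in parser.tags]
```

An empty list means the page covers the basics; anything returned is a tag to add before the next share.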

Finding 7: Content and Language Patterns

Content length:

Thai websites are dramatically underinvesting in content depth. This aligns with our findings in the AI SEO report — topical authority requires comprehensive content, and most Thai sites are not coming close.

Language:

Finding 8: E-Commerce Specific Insights

We analysed 47,000 Thai e-commerce websites separately. The findings are particularly relevant for sellers considering their own web presence alongside marketplace channels.

Finding 9: Thailand vs Regional Comparison

How does Thailand compare with other Southeast Asian markets?

Overall digital readiness score (our composite metric, 0-100):

Thailand sits in the middle of the pack. We are more advanced than Indonesia and Vietnam but significantly behind Singapore and Malaysia. The primary gaps are in technical SEO implementation, page speed, and structured data adoption.

Areas where Thailand leads:

Areas where Thailand lags:

Finding 10: The Opportunity Matrix

Based on all of this data, here is where the biggest opportunities lie for Thai businesses:

Quick wins (implementable in days, high impact):

Medium-term wins (weeks to months):

Strategic wins (months to build, compounding returns):

How We Use This Data

At 2BKK, this scraping infrastructure is not a one-time project — it is a living system. We continuously crawl competitor websites, marketplace listings, and search results to provide our clients with real-time competitive intelligence.

Our Data Intelligence & Scraping service gives you access to:

What This Means for Your Business

If you have read this far, you understand the landscape. The question is: what will you do about it?

The data shows that the bar in Thailand is still relatively low. Basic technical SEO, proper structured data, decent page speed, and comprehensive content put you ahead of 70-80% of Thai websites. That is the good news.

The challenging news is that the bar is rising fast. Google's AI Overviews are raising the stakes for content quality. Mobile performance is becoming a ranking gatekeeper. And the businesses that are investing now — many of them our clients — are building moats that will be increasingly expensive to breach.

In a market where 77% of websites have no structured data, simply implementing schema markup is a competitive advantage. But that window is closing.

Whether you work with 2BKK or not, the action items from this research are clear:

If you want a custom competitive analysis based on this dataset, or if you want to set up continuous monitoring for your industry, get in touch with our team. We will show you exactly where you stand — and exactly what to do about it.
