Zillow is the largest real estate marketplace in the United States, with data on over 135 million properties. For real estate investors, proptech startups, and AI agents building property analysis tools, Zillow's data is invaluable — listing prices, Zestimates (automated valuations), tax history, price trends, school ratings, and neighborhood stats.
The problem? Zillow killed its free public API in 2021. The Bridge API is MLS-partner-only. The Zestimate API is discontinued. If you want Zillow data programmatically, scraping (or a scraping API like Mantis) is your only option.
This guide covers 4 methods to extract Zillow data in 2026, from basic Python scripts to production-ready API calls. We'll cover anti-bot bypassing, legal considerations, and real-world use cases with complete code examples.
What Data Can You Extract from Zillow?
| Data Type | Available? | Notes |
|---|---|---|
| Property listings (for sale) | ✅ | Search results + detail pages |
| Rental listings | ✅ | Via /homes/for_rent/ |
| Sale prices / Zestimates | ✅ | Embedded in __NEXT_DATA__ JSON |
| Price history | ✅ | Historical sale prices per property |
| Tax history | ✅ | Annual tax assessments |
| Property details (beds, baths, sqft) | ✅ | Structured data in JSON payload |
| Photos | ⚠️ | URLs available; copyright applies (VHT v. Zillow) |
| Neighborhood data | ✅ | Walk score, transit score, crime |
| School ratings | ✅ | GreatSchools data embedded |
| Agent information | ✅ | Listing agent, brokerage |
| Market trends | ✅ | Median prices, inventory, DOM by ZIP |
| Recently sold | ✅ | Via /homes/recently_sold/ |
Method 1: Python + Requests + BeautifulSoup
Search Results Scraping
Zillow search pages load property data via an internal API. The initial page render includes a __NEXT_DATA__ script tag with all listing data in JSON format.
```python
import requests
from bs4 import BeautifulSoup
import json
import time
import random

headers = {
    'User-Agent': 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) '
                  'AppleWebKit/537.36 (KHTML, like Gecko) '
                  'Chrome/120.0.0.0 Safari/537.36',
    'Accept': 'text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8',
    'Accept-Language': 'en-US,en;q=0.9',
    'Accept-Encoding': 'gzip, deflate, br',
    'Connection': 'keep-alive',
}

def scrape_zillow_search(location, page=1):
    """Scrape Zillow search results for a location."""
    # Zillow uses URL-encoded search paths
    url = f'https://www.zillow.com/homes/{location}_rb/'
    if page > 1:
        url += f'{page}_p/'
    response = requests.get(url, headers=headers)
    if response.status_code != 200:
        print(f"Blocked or error: {response.status_code}")
        return None
    soup = BeautifulSoup(response.text, 'html.parser')
    # Extract the __NEXT_DATA__ JSON payload
    script_tag = soup.find('script', {'id': '__NEXT_DATA__'})
    if not script_tag:
        print("No __NEXT_DATA__ found — likely blocked by PerimeterX")
        return None
    data = json.loads(script_tag.string)
    # Navigate to the search results
    try:
        results = (data['props']['pageProps']['searchPageState']
                   ['cat1']['searchResults']['listResults'])
    except (KeyError, TypeError):
        print("Search results structure changed")
        return None
    properties = []
    for listing in results:
        properties.append({
            'zpid': listing.get('zpid'),
            'address': listing.get('address'),
            'price': listing.get('unformattedPrice') or listing.get('price'),
            'beds': listing.get('beds'),
            'baths': listing.get('baths'),
            'sqft': listing.get('area'),
            'zestimate': listing.get('zestimate'),
            'status': listing.get('statusText'),
            'listing_type': listing.get('listingType'),
            'broker': listing.get('brokerName'),
            'detail_url': listing.get('detailUrl'),
            'latitude': listing.get('latLong', {}).get('latitude'),
            'longitude': listing.get('latLong', {}).get('longitude'),
        })
    return properties

# Example: scrape Austin, TX listings
listings = scrape_zillow_search('Austin-TX')
if listings:
    for p in listings[:5]:
        print(f"{p['address']} — ${p['price']:,} | {p['beds']}bd/{p['baths']}ba | {p['sqft']} sqft")
        print(f"  Zestimate: ${p['zestimate']:,}" if p['zestimate'] else "  Zestimate: N/A")
    time.sleep(random.uniform(3, 7))  # Always add delays between requests
```
Property Detail Scraping
```python
def scrape_property_detail(zpid_or_url):
    """Extract full property details from a Zillow listing page."""
    if str(zpid_or_url).startswith('http'):
        url = zpid_or_url
    else:
        url = f'https://www.zillow.com/homedetails/{zpid_or_url}_zpid/'
    response = requests.get(url, headers=headers)
    soup = BeautifulSoup(response.text, 'html.parser')
    script_tag = soup.find('script', {'id': '__NEXT_DATA__'})
    if not script_tag:
        return None
    data = json.loads(script_tag.string)
    try:
        # gdpClientCache is a JSON string embedded inside the page JSON
        cache = json.loads(data['props']['pageProps']['componentProps']['gdpClientCache'])
        # The cache key varies — extract the first (and usually only) entry
        cache_key = list(cache.keys())[0]
        details = cache[cache_key]['property']
    except (KeyError, TypeError, json.JSONDecodeError):
        return None
    return {
        'zpid': details.get('zpid'),
        'address': details.get('address', {}).get('streetAddress'),
        'city': details.get('address', {}).get('city'),
        'state': details.get('address', {}).get('state'),
        'zip': details.get('address', {}).get('zipcode'),
        'price': details.get('price'),
        'zestimate': details.get('zestimate'),
        'rent_zestimate': details.get('rentZestimate'),
        'beds': details.get('bedrooms'),
        'baths': details.get('bathrooms'),
        'sqft': details.get('livingArea'),
        'lot_size': details.get('lotSize'),
        'year_built': details.get('yearBuilt'),
        'property_type': details.get('homeType'),
        'description': details.get('description'),
        'price_history': details.get('priceHistory', []),
        'tax_history': details.get('taxHistory', []),
        'schools': details.get('schools', []),
        'walk_score': details.get('walkScore'),
        'transit_score': details.get('transitScore'),
        'listing_agent': details.get('attributionInfo', {}).get('agentName'),
        'broker': details.get('attributionInfo', {}).get('brokerName'),
    }
```
Method 2: Playwright Headless Browser
Playwright with stealth configuration is the most reliable free method for scraping Zillow. It handles JavaScript rendering and can bypass PerimeterX with proper fingerprint management.
```python
import asyncio
from playwright.async_api import async_playwright
import random

async def scrape_zillow_playwright(location, max_pages=3):
    """Scrape Zillow search results using Playwright with stealth."""
    async with async_playwright() as p:
        browser = await p.chromium.launch(
            headless=True,
            args=[
                '--disable-blink-features=AutomationControlled',
                '--disable-features=IsolateOrigins,site-per-process',
                '--no-sandbox',
            ]
        )
        context = await browser.new_context(
            viewport={'width': 1920, 'height': 1080},
            user_agent='Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) '
                       'AppleWebKit/537.36 (KHTML, like Gecko) '
                       'Chrome/120.0.0.0 Safari/537.36',
            locale='en-US',
            timezone_id='America/New_York',
        )
        # Remove the webdriver flag and patch the permissions API
        await context.add_init_script("""
            Object.defineProperty(navigator, 'webdriver', {
                get: () => undefined
            });
            const originalQuery = window.navigator.permissions.query;
            window.navigator.permissions.query = (parameters) =>
                parameters.name === 'notifications'
                    ? Promise.resolve({ state: Notification.permission })
                    : originalQuery(parameters);
        """)
        page = await context.new_page()
        all_listings = []
        for page_num in range(1, max_pages + 1):
            url = f'https://www.zillow.com/homes/{location}_rb/'
            if page_num > 1:
                url += f'{page_num}_p/'
            print(f"Scraping page {page_num}: {url}")
            await page.goto(url, wait_until='networkidle', timeout=30000)
            # Wait for listings to render
            await page.wait_for_selector('article[data-test="property-card"]',
                                         timeout=10000)
            # Extract __NEXT_DATA__
            next_data = await page.evaluate("""
                () => {
                    const el = document.getElementById('__NEXT_DATA__');
                    return el ? JSON.parse(el.textContent) : null;
                }
            """)
            if next_data:
                try:
                    results = (next_data['props']['pageProps']['searchPageState']
                               ['cat1']['searchResults']['listResults'])
                    for listing in results:
                        all_listings.append({
                            'zpid': listing.get('zpid'),
                            'address': listing.get('address'),
                            'price': listing.get('unformattedPrice'),
                            'beds': listing.get('beds'),
                            'baths': listing.get('baths'),
                            'sqft': listing.get('area'),
                            'zestimate': listing.get('zestimate'),
                            'status': listing.get('statusText'),
                            'url': listing.get('detailUrl'),
                        })
                except (KeyError, TypeError):
                    print(f"Failed to parse page {page_num}")
            # Human-like delay between pages
            await asyncio.sleep(random.uniform(4, 8))
        await browser.close()
        return all_listings

# Run the scraper
listings = asyncio.run(scrape_zillow_playwright('Austin-TX', max_pages=3))
for p in listings[:10]:
    print(f"{p['address']} — ${p['price']:,} | Zestimate: ${p.get('zestimate') or 'N/A'}")
```
Map-Based Search (Advanced)
Zillow's map search uses an internal API endpoint that accepts bounding box coordinates. This is powerful for geographic data collection:
```python
async def scrape_zillow_map_api(page, bounds):
    """Intercept Zillow's internal search API for map-based queries."""
    api_responses = []

    async def handle_response(response):
        if 'search/GetSearchPageState' in response.url:
            try:
                data = await response.json()
                api_responses.append(data)
            except Exception:
                pass

    page.on('response', handle_response)
    # Navigate to map search with bounds (URL-encoded searchQueryState JSON)
    search_url = (
        f"https://www.zillow.com/homes/"
        f"?searchQueryState=%7B%22mapBounds%22%3A%7B"
        f"%22north%22%3A{bounds['north']}%2C"
        f"%22south%22%3A{bounds['south']}%2C"
        f"%22east%22%3A{bounds['east']}%2C"
        f"%22west%22%3A{bounds['west']}"
        f"%7D%7D"
    )
    await page.goto(search_url, wait_until='networkidle')
    await asyncio.sleep(3)
    return api_responses
```
Method 3: Node.js + Puppeteer
Puppeteer with the stealth plugin is excellent for Zillow scraping. The stealth plugin patches all common fingerprinting vectors that PerimeterX checks.
```javascript
const puppeteer = require('puppeteer-extra');
const StealthPlugin = require('puppeteer-extra-plugin-stealth');
puppeteer.use(StealthPlugin());

async function scrapeZillow(location, maxPages = 3) {
  const browser = await puppeteer.launch({
    headless: 'new',
    args: [
      '--no-sandbox',
      '--disable-setuid-sandbox',
      '--disable-blink-features=AutomationControlled',
    ],
  });
  const page = await browser.newPage();
  await page.setViewport({ width: 1920, height: 1080 });
  const allListings = [];

  for (let pageNum = 1; pageNum <= maxPages; pageNum++) {
    let url = `https://www.zillow.com/homes/${location}_rb/`;
    if (pageNum > 1) url += `${pageNum}_p/`;
    console.log(`Scraping page ${pageNum}...`);
    await page.goto(url, { waitUntil: 'networkidle2', timeout: 30000 });

    // Extract property data from __NEXT_DATA__
    const listings = await page.evaluate(() => {
      const scriptEl = document.getElementById('__NEXT_DATA__');
      if (!scriptEl) return [];
      try {
        const data = JSON.parse(scriptEl.textContent);
        const results = data.props.pageProps.searchPageState
          .cat1.searchResults.listResults;
        return results.map(listing => ({
          zpid: listing.zpid,
          address: listing.address,
          price: listing.unformattedPrice || listing.price,
          beds: listing.beds,
          baths: listing.baths,
          sqft: listing.area,
          zestimate: listing.zestimate,
          status: listing.statusText,
          latitude: listing.latLong?.latitude,
          longitude: listing.latLong?.longitude,
          detailUrl: listing.detailUrl,
          broker: listing.brokerName,
        }));
      } catch (e) {
        return [];
      }
    });

    allListings.push(...listings);
    console.log(`  Found ${listings.length} listings on page ${pageNum}`);

    // Random delay between pages
    const delay = 4000 + Math.random() * 4000;
    await new Promise(r => setTimeout(r, delay));
  }

  await browser.close();
  return allListings;
}

// Property detail extraction
async function scrapePropertyDetail(page, url) {
  await page.goto(url, { waitUntil: 'networkidle2', timeout: 30000 });
  const details = await page.evaluate(() => {
    const scriptEl = document.getElementById('__NEXT_DATA__');
    if (!scriptEl) return null;
    try {
      const data = JSON.parse(scriptEl.textContent);
      const cache = JSON.parse(
        data.props.pageProps.componentProps.gdpClientCache
      );
      const key = Object.keys(cache)[0];
      const prop = cache[key].property;
      return {
        zpid: prop.zpid,
        address: prop.address?.streetAddress,
        city: prop.address?.city,
        state: prop.address?.state,
        zip: prop.address?.zipcode,
        price: prop.price,
        zestimate: prop.zestimate,
        rentZestimate: prop.rentZestimate,
        beds: prop.bedrooms,
        baths: prop.bathrooms,
        sqft: prop.livingArea,
        lotSize: prop.lotSize,
        yearBuilt: prop.yearBuilt,
        homeType: prop.homeType,
        priceHistory: prop.priceHistory?.slice(0, 10),
        taxHistory: prop.taxHistory?.slice(0, 5),
        schools: prop.schools,
        walkScore: prop.walkScore,
      };
    } catch (e) {
      return null;
    }
  });
  return details;
}

// Usage
(async () => {
  const listings = await scrapeZillow('Austin-TX', 2);
  console.log(`Total: ${listings.length} properties`);
  listings.slice(0, 5).forEach(l => {
    console.log(`${l.address} — $${l.price?.toLocaleString()} | ${l.beds}bd/${l.baths}ba`);
  });
})();
```
Method 4: Mantis API (Production-Ready)
For production applications — especially AI agents that need reliable, structured real estate data — Mantis API handles all the complexity: PerimeterX bypass, proxy rotation, JavaScript rendering, and data extraction in a single call.
```python
# Python — Extract Zillow property data with Mantis API
import requests

response = requests.post('https://api.mantisapi.com/v1/extract', json={
    'url': 'https://www.zillow.com/homedetails/123-Main-St-Austin-TX-78701/12345678_zpid/',
    'schema': {
        'address': 'string - Full property address',
        'price': 'number - Listing price in dollars',
        'zestimate': 'number - Zillow Zestimate value',
        'beds': 'number - Number of bedrooms',
        'baths': 'number - Number of bathrooms',
        'sqft': 'number - Living area in square feet',
        'lot_size': 'string - Lot size',
        'year_built': 'number - Year the property was built',
        'property_type': 'string - Type (Single Family, Condo, etc)',
        'price_history': 'array - Last 5 sale prices with dates',
        'schools': 'array - Nearby schools with ratings',
        'walk_score': 'number - Walk Score rating',
    }
}, headers={
    'Authorization': 'Bearer YOUR_API_KEY',
})

property_data = response.json()['data']
print(f"{property_data['address']}")
print(f"Price: ${property_data['price']:,} | Zestimate: ${property_data['zestimate']:,}")
print(f"{property_data['beds']}bd / {property_data['baths']}ba / {property_data['sqft']:,} sqft")
print(f"Built: {property_data['year_built']} | Type: {property_data['property_type']}")
```
```javascript
// Node.js — Batch extract Zillow search results
const axios = require('axios');

(async () => {
  const response = await axios.post('https://api.mantisapi.com/v1/extract', {
    url: 'https://www.zillow.com/homes/Austin-TX_rb/',
    schema: {
      listings: [{
        address: 'string',
        price: 'number',
        beds: 'number',
        baths: 'number',
        sqft: 'number',
        zestimate: 'number',
        status: 'string - For Sale, Pending, etc',
      }]
    }
  }, {
    headers: { 'Authorization': 'Bearer YOUR_API_KEY' }
  });

  const { listings } = response.data.data;
  console.log(`Found ${listings.length} properties`);
  listings.forEach(l => {
    console.log(`${l.address} — $${l.price.toLocaleString()} | ${l.beds}bd/${l.baths}ba`);
  });
})();
```
Skip the Anti-Bot Battle
Zillow's PerimeterX protection blocks most scrapers. Mantis handles it automatically — structured property data in one API call.
Zillow's Anti-Bot Defenses
Zillow has some of the most aggressive anti-bot measures of any website. Here's what you're up against:
PerimeterX (HUMAN Security)
Zillow uses PerimeterX — now rebranded as HUMAN Security — as its primary bot detection layer. It checks:
- Browser fingerprinting: Canvas, WebGL, AudioContext, fonts, plugins — hundreds of signals
- Behavioral analysis: Mouse movements, scroll patterns, click timing, keystroke dynamics
- JavaScript challenges: Invisible JS that must execute correctly in a real browser environment
- TLS fingerprinting: JA3/JA4 signatures to detect headless browsers and non-browser HTTP clients
- Cookie validation: The _px3 and _pxvid cookies must be set by their JS
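In practice, the first thing a scraper needs is to recognize when PerimeterX has served a block page instead of real content. Below is a minimal heuristic sketch; the marker strings (`px-captcha`, `perimeterx`) are assumptions based on commonly reported block pages, not a documented contract, so adjust them to what you actually observe:

```python
def looks_px_blocked(status_code, body):
    """Heuristic: does this response look like a PerimeterX block page?

    Checks (all assumptions, tune to observed behavior):
    - a 403 status is the typical hard block
    - block pages often embed a px-captcha widget
    - a 200 response with no __NEXT_DATA__ payload is a soft block
    """
    if status_code == 403:
        return True
    lowered = body.lower()
    if 'px-captcha' in lowered or 'perimeterx' in lowered:
        return True
    if status_code == 200 and '__next_data__' not in lowered:
        return True
    return False
```

Call this on every response and switch proxies (or back off) as soon as it returns True, rather than parsing garbage.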
IP Rate Limiting
- Residential IPs: ~20-50 requests before throttling
- Datacenter IPs: Often blocked immediately
- Rate limit triggers a CAPTCHA or 403 response
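Given those limits, production scrapers usually pair exponential backoff (after a block) with proxy rotation (before hitting the per-IP ceiling). A sketch of both, where the proxy endpoints are hypothetical placeholders:

```python
import itertools
import random

def backoff_delays(attempts, base=5.0, cap=300.0):
    """Exponential backoff schedule with jitter: ~5s, ~10s, ~20s..., capped."""
    delays = []
    for attempt in range(attempts):
        delay = min(cap, base * (2 ** attempt))
        delays.append(delay + random.uniform(0, delay * 0.25))  # add jitter
    return delays

PROXIES = [  # hypothetical residential proxy endpoints
    'http://user:pass@res-proxy-1.example.com:8000',
    'http://user:pass@res-proxy-2.example.com:8000',
    'http://user:pass@res-proxy-3.example.com:8000',
]
_proxy_cycle = itertools.cycle(PROXIES)

def next_proxy():
    """Rotate proxies well before the ~20-50 requests/IP threshold."""
    p = next(_proxy_cycle)
    return {'http': p, 'https': p}
```

With `requests`, you would pass `proxies=next_proxy()` on each call and sleep through a `backoff_delays()` schedule whenever a block is detected.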
Search Query Detection
- Zillow monitors for systematic search patterns (sequential ZIP codes, grid-based map scans)
- Excessive API calls to GetSearchPageState trigger blocks
- Map-based searches with programmatic bounding box changes are flagged
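To avoid those pattern flags, randomize the work order and perturb coordinates between runs instead of walking a neat grid. A hedged sketch (the 2% jitter is an illustrative choice):

```python
import random

def randomized_scan_plan(zip_codes, seed=None):
    """Shuffle ZIP codes so requests don't walk them sequentially."""
    rng = random.Random(seed)
    plan = list(zip_codes)
    rng.shuffle(plan)
    return plan

def jitter_bounds(bounds, pct=0.02, seed=None):
    """Nudge a map bounding box by up to `pct` of its span so repeated
    scans never reuse identical coordinates (a grid-scan signature)."""
    rng = random.Random(seed)
    lat_span = bounds['north'] - bounds['south']
    lon_span = bounds['east'] - bounds['west']
    return {
        'north': bounds['north'] + rng.uniform(-pct, pct) * lat_span,
        'south': bounds['south'] + rng.uniform(-pct, pct) * lat_span,
        'east': bounds['east'] + rng.uniform(-pct, pct) * lon_span,
        'west': bounds['west'] + rng.uniform(-pct, pct) * lon_span,
    }
```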
Bypassing Zillow's Defenses
| Defense | Countermeasure | Difficulty |
|---|---|---|
| PerimeterX fingerprinting | Stealth browser plugins, realistic fingerprints | 🔴 Hard |
| JavaScript challenges | Full browser rendering (Playwright/Puppeteer) | 🟡 Medium |
| IP rate limiting | Rotating residential proxies | 🟢 Easy |
| TLS fingerprinting | Use real browser TLS stack (not curl/requests) | 🟡 Medium |
| Behavioral analysis | Random delays, human-like mouse movements | 🟡 Medium |
| Cookie validation | Let PerimeterX JS run, preserve cookies | 🟢 Easy |
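For the behavioral-analysis row, note that a fixed `sleep(5)` between requests is itself a bot signature. One common countermeasure, sketched here with illustrative numbers, is to draw delays from a skewed distribution: mostly short pauses, with an occasional long one, like a person stopping to read a listing:

```python
import random

def human_pause(rng=random):
    """Draw a think-time delay in seconds: usually 3-8s, with ~10% chance
    of a 20-60s break. The proportions are illustrative, not measured."""
    if rng.random() < 0.1:
        return rng.uniform(20, 60)   # occasional long pause
    return rng.uniform(3, 8)         # typical gap between page loads
```

Use it as `time.sleep(human_pause())` between navigations in any of the scrapers above.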
Zillow API vs Scraping vs Mantis
| Feature | Zillow API (Discontinued) | DIY Scraping | Mantis API |
|---|---|---|---|
| Availability | ❌ Shut down 2021 | ✅ Works | ✅ Works |
| Property listings | ❌ | ✅ | ✅ |
| Zestimates | ❌ (API discontinued) | ✅ | ✅ |
| Price history | ❌ | ✅ | ✅ |
| Tax history | ❌ | ✅ | ✅ |
| School ratings | ❌ | ✅ | ✅ |
| Anti-bot handling | N/A | You manage | Handled |
| Proxy management | N/A | You manage | Included |
| Structured output | Was JSON | You parse | JSON |
| Rate limits | Was 1K/day | ~20-50/IP | Per plan |
| Cost | Was free | Proxy costs ($50-200/mo) | From $29/mo |
| Maintenance | N/A | High (constant updates) | None |
Real-World Use Cases
1. Investment Property Analyzer
Build an AI agent that evaluates rental properties by comparing listing price, Zestimate, rent Zestimate, and local market trends.
```python
def analyze_investment(property_data):
    """Calculate rough investment metrics for a rental property."""
    price = property_data['price']
    rent_estimate = property_data.get('rent_zestimate') or 0
    zestimate = property_data.get('zestimate') or price
    # Monthly expense estimates (80% LTV at 6.5%, interest-only approximation)
    monthly_mortgage = price * 0.8 * 0.065 / 12
    tax_history = property_data.get('tax_history') or [{}]
    monthly_taxes = tax_history[0].get('taxPaid', price * 0.012) / 12
    monthly_insurance = price * 0.004 / 12
    monthly_expenses = monthly_mortgage + monthly_taxes + monthly_insurance
    cash_flow = rent_estimate - monthly_expenses
    # Cap rate uses NOI (rent minus taxes/insurance, excluding financing)
    noi = (rent_estimate - monthly_taxes - monthly_insurance) * 12
    cap_rate = noi / price * 100
    price_to_rent = price / (rent_estimate * 12) if rent_estimate else None
    price_vs_zestimate = (price - zestimate) / zestimate * 100 if zestimate else None
    return {
        'monthly_cash_flow': round(cash_flow, 2),
        'cap_rate': round(cap_rate, 2),
        'price_to_rent_ratio': round(price_to_rent, 1) if price_to_rent else None,
        'price_vs_zestimate': (round(price_vs_zestimate, 1)
                               if price_vs_zestimate is not None else None),
        'verdict': ('BUY' if cap_rate > 6 and price_vs_zestimate is not None
                    and price_vs_zestimate < -5
                    else 'CONSIDER' if cap_rate > 4 else 'PASS'),
    }
```
2. Market Trend Monitor
Track real estate market conditions across ZIP codes — median prices, inventory levels, days on market, and price changes.
```python
import requests
from datetime import datetime

def track_market_trends(zip_codes, mantis_api_key):
    """Monitor market trends across multiple ZIP codes."""
    trends = {}
    for zip_code in zip_codes:
        response = requests.post('https://api.mantisapi.com/v1/extract', json={
            'url': f'https://www.zillow.com/homes/{zip_code}_rb/',
            'schema': {
                'total_listings': 'number - Total properties for sale',
                'median_price': 'number - Median listing price',
                'listings': [{
                    'price': 'number',
                    'days_on_market': 'number',
                    'price_cut': 'boolean - Has the price been reduced',
                }]
            }
        }, headers={'Authorization': f'Bearer {mantis_api_key}'})
        data = response.json()['data']
        listings = data.get('listings', [])
        prices = [l['price'] for l in listings if l.get('price')]
        dom_values = [l['days_on_market'] for l in listings if l.get('days_on_market')]
        price_cuts = sum(1 for l in listings if l.get('price_cut'))
        avg_dom = sum(dom_values) / len(dom_values) if dom_values else None
        trends[zip_code] = {
            'date': datetime.now().isoformat(),
            'total_listings': data.get('total_listings', len(listings)),
            'median_price': sorted(prices)[len(prices) // 2] if prices else None,
            'avg_days_on_market': avg_dom,
            'price_cut_pct': (price_cuts / len(listings) * 100) if listings else 0,
            'market_temp': ('HOT' if avg_dom is not None and avg_dom < 15
                            else 'WARM' if avg_dom is not None and avg_dom < 30
                            else 'COLD'),
        }
    return trends
```
3. AI Agent Property Scout
An AI agent that finds undervalued properties by comparing listing prices to Zestimates and analyzing price history trends.
```python
import requests

def find_undervalued_properties(location, min_discount_pct=10, mantis_api_key=None):
    """Find properties listed below their Zestimate."""
    response = requests.post('https://api.mantisapi.com/v1/extract', json={
        'url': f'https://www.zillow.com/homes/{location}_rb/',
        'schema': {
            'listings': [{
                'address': 'string',
                'price': 'number - Listing price',
                'zestimate': 'number - Zillow Zestimate',
                'beds': 'number',
                'baths': 'number',
                'sqft': 'number',
                'days_on_market': 'number',
                'url': 'string - Detail page URL',
            }]
        }
    }, headers={'Authorization': f'Bearer {mantis_api_key}'})
    listings = response.json()['data']['listings']
    deals = []
    for listing in listings:
        price = listing.get('price', 0)
        zestimate = listing.get('zestimate', 0)
        if price and zestimate:
            discount = (zestimate - price) / zestimate * 100
            if discount >= min_discount_pct:
                deals.append({
                    **listing,
                    'discount_pct': round(discount, 1),
                    'savings': zestimate - price,
                })
    # Sort by biggest discount
    deals.sort(key=lambda x: x['discount_pct'], reverse=True)
    return deals

# Find properties at least 10% below Zestimate
deals = find_undervalued_properties('Austin-TX', min_discount_pct=10, mantis_api_key='YOUR_KEY')
for d in deals[:5]:
    print(f"🏠 {d['address']}")
    print(f"   Listed: ${d['price']:,} | Zestimate: ${d['zestimate']:,} | {d['discount_pct']}% below")
    print(f"   {d['beds']}bd/{d['baths']}ba | {d['sqft']:,} sqft | {d.get('days_on_market', '?')} days on market")
    print()
```
Legal Considerations
Key Legal Precedents
VHT, Inc. v. Zillow Group (2019): VHT sued Zillow for using professional listing photos beyond license terms. The Ninth Circuit held that listing photos can be copyrighted and that unauthorized use infringes. This is the critical takeaway: you can scrape listing facts (prices, addresses, details), but using copyrighted photos requires permission.
hiQ Labs v. LinkedIn (2022): The Ninth Circuit confirmed that scraping publicly available data is not a violation of the Computer Fraud and Abuse Act (CFAA). This supports scraping Zillow's public listings.
Van Buren v. United States (2021): The Supreme Court narrowed the CFAA — accessing publicly available data doesn't constitute "exceeding authorized access."
Practical Guidelines
- ✅ Safe: Scraping public listing data (prices, addresses, details, Zestimates)
- ✅ Safe: Storing data for analysis and research
- ⚠️ Caution: Republishing listing photos without permission (VHT v. Zillow precedent)
- ⚠️ Caution: Republishing Zestimates without attribution
- ❌ Avoid: Accessing data behind login walls
- ❌ Avoid: Overloading Zillow's servers with excessive requests
- ❌ Avoid: Creating a competing real estate marketplace with scraped data
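The "don't overload their servers" guideline can be enforced mechanically rather than left to discipline. A small per-domain limiter sketch; the 5-second default is an illustrative choice, not a Zillow-published number:

```python
import time

class PoliteLimiter:
    """Enforce a minimum interval between requests to the same domain."""

    def __init__(self, min_interval=5.0):
        self.min_interval = min_interval
        self._last = {}  # domain -> monotonic timestamp of last request

    def wait_time(self, domain, now=None):
        """Seconds to wait before the next request to `domain` is polite."""
        now = time.monotonic() if now is None else now
        last = self._last.get(domain)
        if last is None:
            return 0.0
        return max(0.0, self.min_interval - (now - last))

    def record(self, domain, now=None):
        """Mark that a request to `domain` was just made."""
        self._last[domain] = time.monotonic() if now is None else now
```

Usage: `time.sleep(limiter.wait_time('zillow.com'))` before each request, then `limiter.record('zillow.com')` after it.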
Frequently Asked Questions
Is it legal to scrape Zillow?
Scraping publicly available Zillow data is generally legal under the hiQ v. LinkedIn and Van Buren v. US precedents. However, Zillow's ToS prohibit automated scraping, and the company actively enforces them via cease-and-desist letters. Listing photos may be copyrighted (VHT v. Zillow, 2019). Stick to public data and respect rate limits.
Does Zillow have an API?
Zillow shut down its free public API (Zillow Web Services) in 2021. The Bridge API is MLS-partner-only. The Zestimate API is discontinued. For programmatic access to Zillow data, scraping or a third-party API like Mantis is the only practical option.
How do I avoid getting blocked by Zillow?
Use rotating residential proxies, a stealth-configured headless browser (Playwright or Puppeteer with stealth plugin), random delays of 3-8 seconds between requests, and realistic browser fingerprints. Avoid datacenter IPs — PerimeterX blocks them immediately.
What is the __NEXT_DATA__ trick?
Zillow is built on Next.js, which embeds page data in a <script id="__NEXT_DATA__"> tag. This contains structured JSON with all property data — prices, details, history, schools — eliminating the need to parse HTML. It's the most reliable extraction method.
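The trick doesn't even require an HTML parser; the standard library is enough. A regex-based sketch (the tag id is real; everything else here is a generic Next.js convention, not Zillow-specific):

```python
import json
import re

# Match the JSON body of the Next.js data script tag
NEXT_DATA_RE = re.compile(
    r'<script[^>]*id="__NEXT_DATA__"[^>]*>(.*?)</script>',
    re.DOTALL,
)

def extract_next_data(html):
    """Pull the Next.js page payload out of raw HTML; dict or None."""
    match = NEXT_DATA_RE.search(html)
    if not match:
        return None
    try:
        return json.loads(match.group(1))
    except json.JSONDecodeError:
        return None
```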
Can I scrape Zillow Zestimates?
Yes, Zestimates are publicly displayed on property pages and embedded in the __NEXT_DATA__ JSON payload. They can be extracted programmatically. If republishing, attribute them as Zillow's proprietary valuation.
What's the best language for scraping Zillow?
Python with Playwright is the most popular choice — Zillow requires JavaScript rendering, and Playwright handles it well with stealth configuration. Node.js with Puppeteer is equally effective. For production, Mantis API provides structured data without managing browsers or proxies.
Extract Zillow Data Without the Headaches
PerimeterX, rotating proxies, browser fingerprints... or one API call. Your choice.
Related Guides
- How to Scrape Google Search Results in 2026
- How to Scrape Amazon Product Data in 2026
- How to Scrape LinkedIn Profiles & Jobs in 2026
- How to Scrape Twitter (X) Data in 2026
- How to Scrape Instagram Data in 2026
- How to Scrape YouTube Data in 2026
- How to Scrape Reddit Data in 2026
- How to Scrape TikTok Data in 2026
- How to Scrape Facebook Data in 2026
- How to Scrape eBay Data in 2026
- Web Scraping with Python: Complete Guide
- Web Scraping with Node.js: Complete Guide
- How to Avoid Getting Blocked While Web Scraping