Web Scraping with Selenium and Python in 2026: The Complete Guide

Q: How do I make Selenium undetectable?

Use undetected-chromedriver instead of standard ChromeDriver, set realistic window sizes and user agents, disable automation flags (--disable-blink-features=AutomationControlled), add random delays between actions, rotate proxies, and handle cookies/sessions properly. For advanced anti-bot systems, you may also need to spoof WebGL, Canvas, and AudioContext fingerprints. Even with all this, detection is an arms race — web scraping APIs handle it automatically.

Q: Should I use Selenium or an API for web scraping?

Use Selenium when you need complex browser interactions (multi-step forms, custom JavaScript, login flows) on a small number of sites, or when you need fine-grained control over browser behavior. Use a web scraping API like Mantis when you need scale (thousands of pages), reliability (built-in anti-detection), or cost efficiency (no browser infrastructure). Most teams start with Selenium and switch to APIs as they scale beyond a few hundred pages per day.

Published March 15, 2026 · 18 min read · Updated for Selenium 4.x

Selenium is the most widely used browser automation framework in the world. With over 30,000 GitHub stars and millions of active users, it's the tool most developers learn first for web scraping. This guide covers everything you need to scrape the modern web with Selenium and Python in 2026, from basic setup to production-ready anti-detection.

We'll cover setup, headless Chrome, explicit waits, pagination, login handling, proxy rotation, stealth techniques, and when it makes sense to switch to a web scraping API instead.

Setting Up Selenium with Python
Basic Web Scraping
Headless Chrome Configuration
Waiting Strategies
Handling Pagination
Infinite Scroll Pages
Login and Authentication
Anti-Detection and Stealth
Proxy Rotation
Screenshots and PDFs
Production-Ready Scraper
Selenium vs Playwright vs APIs
Cost Analysis
FAQ

1. Setting Up Selenium with Python

Installation

# Install Selenium and webdriver-manager
pip install selenium webdriver-manager

# Or with all common extras
pip install selenium webdriver-manager beautifulsoup4 lxml

As of Selenium 4.6, you no longer need to download ChromeDriver manually: Selenium Manager handles it automatically. The third-party webdriver-manager package remains an option if you want explicit control over which driver binary is installed:

from selenium import webdriver
from selenium.webdriver.chrome.service import Service
from selenium.webdriver.chrome.options import Options
from webdriver_manager.chrome import ChromeDriverManager

# Selenium 4.x setup
options = Options()
service = Service(ChromeDriverManager().install())
driver = webdriver.Chrome(service=service, options=options)

# Navigate to a page
driver.get("https://example.com")
print(driver.title)

# Always clean up
driver.quit()

Selenium Manager (Built-in, No Extra Dependencies)

from selenium import webdriver

# Selenium 4.6+ handles driver management automatically
driver = webdriver.Chrome()
driver.get("https://example.com")
print(driver.title)
driver.quit()
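
Since every example ends with driver.quit(), it can help to wrap driver creation in a context manager so the browser is released even when scraping raises. The following is a minimal sketch: `managed_driver` and its `factory` parameter are hypothetical names introduced here, not part of Selenium.

```python
from contextlib import contextmanager


def _default_factory():
    # Imported lazily so the helper can be tested without a browser installed.
    from selenium import webdriver
    return webdriver.Chrome()


@contextmanager
def managed_driver(factory=_default_factory):
    """Yield a driver and always call quit() on exit, even after an error."""
    driver = factory()
    try:
        yield driver
    finally:
        driver.quit()


# Usage (launches a real browser):
# with managed_driver() as driver:
#     driver.get("https://example.com")
#     print(driver.title)
```

Passing a factory rather than a driver keeps creation and cleanup in one place, and lets tests substitute a stub object.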

2. Basic Web Scraping

Selenium finds elements using locators. The recommended approach in Selenium 4 is the By class:

from selenium import webdriver
from selenium.webdriver.common.by import By

driver = webdriver.Chrome()
driver.get("https://quotes.toscrape.com")

# Find elements by CSS selector
quotes = driver.find_elements(By.CSS_SELECTOR, "div.quote")

for quote in quotes:
    text = quote.find_element(By.CSS_SELECTOR, "span.text").text
    author = quote.find_element(By.CSS_SELECTOR, "small.author").text
    tags = [tag.text for tag in quote.find_elements(By.CSS_SELECTOR, "a.tag")]
    print(f"{text}\n  — {author} | Tags: {', '.join(tags)}\n")

driver.quit()

Common Locator Strategies

# By ID
element = driver.find_element(By.ID, "search-input")

# By class name
elements = driver.find_elements(By.CLASS_NAME, "product-card")

# By CSS selector (most flexible)
element = driver.find_element(By.CSS_SELECTOR, "div.results > a.link")

# By XPath (for complex traversals)
element = driver.find_element(By.XPATH, "//div[@data-testid='price']/span")

# By link text
element = driver.find_element(By.LINK_TEXT, "Next Page")

# By partial link text
element = driver.find_element(By.PARTIAL_LINK_TEXT, "Next")
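
One practical difference between the two lookup methods: find_element raises NoSuchElementException when nothing matches, while find_elements returns an empty list. For optional fields, a small helper built on find_elements avoids a try/except per field. This is a sketch; `text_or_default` is a hypothetical helper name, and the `CSS` constant simply mirrors the string value of `By.CSS_SELECTOR`.

```python
CSS = "css selector"  # Same string value as selenium's By.CSS_SELECTOR


def text_or_default(parent, selector, default=None, by=CSS):
    """Return the stripped text of the first match under parent, or default."""
    matches = parent.find_elements(by, selector)
    if not matches:
        return default
    return matches[0].text.strip()


# Usage with a real driver or element:
# rating = text_or_default(card, ".rating", default="n/a")
```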

3. Headless Chrome Configuration

For scraping, you almost always want headless mode — no visible browser window, faster execution, and lower resource usage:

from selenium import webdriver
from selenium.webdriver.chrome.options import Options

options = Options()

# Core headless settings
options.add_argument("--headless=new")  # New headless mode (Chrome 109+)
options.add_argument("--no-sandbox")  # Often needed in containers; avoid elsewhere
options.add_argument("--disable-dev-shm-usage")  # Avoids /dev/shm exhaustion in Docker

# Performance optimizations
options.add_argument("--disable-gpu")
options.add_argument("--disable-extensions")
options.add_argument("--window-size=1920,1080")

# Reduce memory usage by skipping image loading
options.add_argument("--blink-settings=imagesEnabled=false")

# Set a realistic user agent (keep the version in step with the installed Chrome)
options.add_argument(
    "--user-agent=Mozilla/5.0 (Windows NT 10.0; Win64; x64) "
    "AppleWebKit/537.36 (KHTML, like Gecko) Chrome/122.0.0.0 Safari/537.36"
)

driver = webdriver.Chrome(options=options)
driver.get("https://example.com")
print(f"Page title: {driver.title}")
driver.quit()
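
When several scrapers share similar settings, the switches can be assembled in one function and applied to an Options object afterwards. The sketch below is one way to do that; `build_chrome_args` and `apply_args` are hypothetical helpers, and only switches already used in this section appear in it.

```python
def build_chrome_args(headless=True, window_size=(1920, 1080),
                      block_images=False, user_agent=None):
    """Return a list of Chrome command-line switches for a scraping profile."""
    args = []
    if headless:
        args.append("--headless=new")
    width, height = window_size
    args.append(f"--window-size={width},{height}")
    if block_images:
        args.append("--blink-settings=imagesEnabled=false")
    if user_agent:
        args.append(f"--user-agent={user_agent}")
    return args


def apply_args(options, args):
    """Add each switch to an object exposing add_argument (e.g. Options)."""
    for arg in args:
        options.add_argument(arg)
    return options


# Usage:
# from selenium.webdriver.chrome.options import Options
# options = apply_args(Options(), build_chrome_args(block_images=True))
```

Keeping the switch list as plain data makes it easy to log or compare configurations between runs.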

4. Waiting Strategies

Modern websites load content dynamically. Avoid fixed time.sleep() calls wherever a concrete condition can be checked instead, and use Selenium's built-in waits:

Explicit Waits (Recommended)

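from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.support.ui import WebDriverWait

# `options` is the Options object built in the headless configuration section
driver = webdriver.Chrome(options=options)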
driver.get("https://example.com/dynamic-page")

# Wait up to 10 seconds for an element to appear
wait = WebDriverWait(driver, 10)

# Wait for element to be present in DOM
element = wait.until(
    EC.presence_of_element_located((By.CSS_SELECTOR, "div.results"))
)

# Wait for element to be clickable
button = wait.until(
    EC.element_to_be_clickable((By.ID, "load-more"))
)

# Wait for text to appear in element
wait.until(
    EC.text_to_be_present_in_element((By.ID, "status"), "Complete")
)

# Wait for element to disappear (loading spinner)
wait.until(
    EC.invisibility_of_element_located((By.CSS_SELECTOR, ".spinner"))
)

# Custom wait condition
def results_loaded(driver):
    items = driver.find_elements(By.CSS_SELECTOR, ".result-item")
    return len(items) > 0

wait.until(results_loaded)

Implicit Waits (Simpler but Less Control)

# Sets a default wait time for all find_element calls
driver.implicitly_wait(10)  # Wait up to 10 seconds

# Now all find_element calls will wait up to 10 seconds
# before throwing NoSuchElementException
element = driver.find_element(By.CSS_SELECTOR, "div.results")

Best practice: Use explicit waits for specific conditions, and avoid mixing implicit and explicit waits (it can cause unpredictable timeout behavior).
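
Custom wait conditions do not have to be plain functions: any callable that takes the driver and returns a truthy value works with WebDriverWait.until, which returns that value. A parameterized condition can therefore be written as a small class. This is a sketch; `element_count_at_least` is a hypothetical name, and `CSS` mirrors the string value of `By.CSS_SELECTOR`.

```python
CSS = "css selector"  # Same string value as selenium's By.CSS_SELECTOR


class element_count_at_least:
    """Wait condition: at least `minimum` elements match the selector."""

    def __init__(self, selector, minimum, by=CSS):
        self.selector = selector
        self.minimum = minimum
        self.by = by

    def __call__(self, driver):
        elements = driver.find_elements(self.by, self.selector)
        # Return the elements when satisfied so until() hands them back;
        # return False to keep polling.
        return elements if len(elements) >= self.minimum else False


# Usage with a real driver:
# from selenium.webdriver.support.ui import WebDriverWait
# items = WebDriverWait(driver, 10).until(
#     element_count_at_least(".result-item", 20)
# )
```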

5. Handling Pagination

from selenium import webdriver
from selenium.common.exceptions import NoSuchElementException, TimeoutException
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC

def scrape_paginated_site(base_url, max_pages=10):
    # `options` is the Options object built in the headless configuration section
    driver = webdriver.Chrome(options=options)
    wait = WebDriverWait(driver, 10)
    all_items = []

    try:
        driver.get(base_url)

        for page in range(max_pages):
            # Wait for results to load
            wait.until(
                EC.presence_of_element_located((By.CSS_SELECTOR, ".product-card"))
            )

            # Extract data from current page
            cards = driver.find_elements(By.CSS_SELECTOR, ".product-card")
            for card in cards:
                item = {
                    "name": card.find_element(By.CSS_SELECTOR, "h3").text,
                    "price": card.find_element(By.CSS_SELECTOR, ".price").text,
                    "url": card.find_element(By.CSS_SELECTOR, "a").get_attribute("href"),
                }
                all_items.append(item)

            print(f"Page {page + 1}: scraped {len(cards)} items")

            # Try to click "Next" button
            try:
                next_btn = driver.find_element(By.CSS_SELECTOR, "a.next-page")
                # get_attribute returns None when the attribute is absent
                if "disabled" in (next_btn.get_attribute("class") or ""):
                    break
                next_btn.click()

                # Wait for the old cards to be replaced by the next page
                wait.until(EC.staleness_of(cards[0]))
            except (NoSuchElementException, TimeoutException):
                break  # No next button, or the page never changed

    finally:
        driver.quit()

    return all_items

items = scrape_paginated_site("https://example.com/products")
print(f"Total items scraped: {len(items)}")
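
The scraped items hold raw strings, and paginated listings sometimes repeat an item across pages, so a light clean-up pass is usually worthwhile before storing results. The sketch below assumes US-style price formatting (comma thousands separator, dot decimal point) and the item keys used above; `parse_price` and `dedupe_by_url` are hypothetical helpers.

```python
import re
from decimal import Decimal

# First number in the text, allowing thousands separators and decimals.
_NUMBER = re.compile(r"\d[\d,]*(?:\.\d+)?")


def parse_price(text):
    """Extract a Decimal from strings like '$1,299.99'; None if no number."""
    match = _NUMBER.search(text or "")
    if not match:
        return None
    return Decimal(match.group().replace(",", ""))


def dedupe_by_url(items):
    """Keep the first item seen for each URL, preserving order."""
    seen = set()
    unique = []
    for item in items:
        if item["url"] not in seen:
            seen.add(item["url"])
            unique.append(item)
    return unique
```

Decimal is used instead of float so that prices compare and sum exactly.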

6. Infinite Scroll Pages

import time
from selenium import webdriver
from selenium.webdriver.common.by import By

def scrape_infinite_scroll(url, max_scrolls=20, scroll_pause=2):
    driver = webdriver.Chrome(options=options)
    items = set()

    try:
        driver.get(url)
        last_height = driver.execute_script("return document.body.scrollHeight")

        for i in range(max_scrolls):
            # Scroll to bottom
            driver.execute_script("window.scrollTo(0, document.body.scrollHeight);")
            # A fixed pause is a pragmatic fallback here: there is no single
            # element to wait for while the feed loads more items
            time.sleep(scroll_pause)

            # Collect items
            elements = driver.find_elements(By.CSS_SELECTOR, ".feed-item")
            for el in elements:
                items.add(el.text)

            # Check if we've reached the bottom
            new_height = driver.execute_script("return document.body.scrollHeight")
            if new_height == last_height:
                print(f"Reached bottom after {i + 1} scrolls")
                break
            last_height = new_height

            print(f"Scroll {i + 1}: {len(items)} unique items")
    finally:
        driver.quit()  # Release the browser even if scraping fails

    return list(items)

results = scrape_infinite_scroll("https://example.com/feed")

7. Login and Authentication

from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
import pickle
import os

class AuthenticatedScraper:
    def __init__(self):
        self.driver = webdriver.Chrome(options=options)
        self.wait = WebDriverWait(self.driver, 10)
        self.cookies_file = "cookies.pkl"

    def login(self, username, password):
        """Login and save cookies for session reuse."""
        # Check for saved cookies first
        if os.path.exists(self.cookies_file):
            self.driver.get("https://example.com")
            with open(self.cookies_file, "rb") as f:
                cookies = pickle.load(f)  # Only unpickle files this scraper wrote
            for cookie in cookies:
                self.driver.add_cookie(cookie)
            self.driver.refresh()

            # Verify we're logged in
            try:
                self.wait.until(
                    EC.presence_of_element_located((By.CSS_SELECTOR, ".user-menu"))
                )
                print("Restored session from cookies")
                return True
            except Exception:
                pass  # Cookies expired, login fresh

        # Fresh login
        self.driver.get("https://example.com/login")

        username_field = self.wait.until(
            EC.presence_of_element_located((By.NAME, "username"))
        )
        username_field.clear()
        username_field.send_keys(username)

        password_field = self.driver.find_element(By.NAME, "password")
        password_field.clear()
        password_field.send_keys(password)

        # Click login button
        login_btn = self.driver.find_element(By.CSS_SELECTOR, "button[type='submit']")
        login_btn.click()

        # Wait for login to complete
        self.wait.until(
            EC.presence_of_element_located((By.CSS_SELECTOR, ".user-menu"))
        )

        # Save cookies
        with open(self.cookies_file, "wb") as f:
            pickle.dump(self.driver.get_cookies(), f)
        print("Login successful, cookies saved")
        return True

    def scrape_protected_page(self, url):
        """Scrape a page that requires authentication."""
        self.driver.get(url)
        self.wait.until(
            EC.presence_of_element_located((By.CSS_SELECTOR, ".content"))
        )
        return self.driver.find_element(By.CSS_SELECTOR, ".content").text

    def close(self):
        self.driver.quit()

8. Anti-Detection and Stealth

Default Selenium is trivially detected by anti-bot systems. Here's how to reduce your detection footprint:

Using undetected-chromedriver

# pip install undetected-chromedriver
import undetected_chromedriver as uc

# Automatically patches ChromeDriver to avoid detection
driver = uc.Chrome(headless=True, version_main=122)
driver.get("https://nowsecure.nl")  # Anti-bot test site

# Check if we passed
print(driver.title)
driver.quit()

Manual Stealth Configuration

from selenium import webdriver
from selenium.webdriver.chrome.options import Options

options = Options()
options.add_argument("--headless=new")
options.add_argument("--window-size=1920,1080")

# Key anti-detection flags
options.add_argument("--disable-blink-features=AutomationControlled")
options.add_experimental_option("excludeSwitches", ["enable-automation"])
options.add_experimental_option("useAutomationExtension", False)

# Realistic user agent
options.add_argument(
    "user-agent=Mozilla/5.0 (Windows NT 10.0; Win64; x64) "
    "AppleWebKit/537.36 (KHTML, like Gecko) Chrome/122.0.0.0 Safari/537.36"
)

driver = webdriver.Chrome(options=options)

# Remove navigator.webdriver flag
driver.execute_cdp_cmd("Page.addScriptToEvaluateOnNewDocument", {
    "source": """
        Object.defineProperty(navigator, 'webdriver', {get: () => undefined});
        Object.defineProperty(navigator, 'languages', {get: () => ['en-US', 'en']});
        Object.defineProperty(navigator, 'plugins', {
            get: () => [1, 2, 3, 4, 5]
        });
        window.chrome = { runtime: {} };
    """
})

driver.get("https://bot.sannysoft.com")
driver.save_screenshot("stealth_test.png")
driver.quit()

Human-Like Behavior

import random
import time
from selenium.webdriver.common.action_chains import ActionChains
from selenium.webdriver.common.by import By

def human_like_delay(min_sec=0.5, max_sec=2.0):
    """Random delay to mimic human behavior."""
    time.sleep(random.uniform(min_sec, max_sec))

def human_like_scroll(driver):
    """Scroll like a human — not perfectly to the bottom."""
    total_height = driver.execute_script("return document.body.scrollHeight")
    viewport = driver.execute_script("return window.innerHeight")
    current = 0

    while current < total_height:
        scroll_amount = random.randint(200, viewport)
        current += scroll_amount
        driver.execute_script(f"window.scrollTo(0, {current});")
        time.sleep(random.uniform(0.3, 1.2))

def human_like_type(element, text):
    """Type text character by character with random delays."""
    for char in text:
        element.send_keys(char)
        time.sleep(random.uniform(0.05, 0.15))

def random_mouse_movement(driver):
    """Move mouse to random positions on the page."""
    actions = ActionChains(driver)
    body = driver.find_element(By.TAG_NAME, "body")
    for _ in range(random.randint(2, 5)):
        # Since Selenium 4.3, offsets are measured from the element's center,
        # so keep them small enough to stay inside a 1920x1080 viewport
        x = random.randint(-400, 400)
        y = random.randint(-300, 300)
        actions.move_to_element_with_offset(body, x, y)
        actions.pause(random.uniform(0.1, 0.5))
    actions.perform()
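
The scroll logic in human_like_scroll can be separated from the browser by generating the scroll offsets first, which makes the randomness testable on its own. The sketch below is one possible variant, not a replacement for the function above: `scroll_positions` is a hypothetical helper, and it differs by clamping the last step to the page height and by tolerating viewports smaller than the minimum step.

```python
import random


def scroll_positions(total_height, viewport, min_step=200, rng=random):
    """Yield increasing scroll offsets in random steps, ending at total_height."""
    max_step = max(min_step, viewport)  # Keeps randint's range valid
    current = 0
    while current < total_height:
        current = min(total_height, current + rng.randint(min_step, max_step))
        yield current


# Usage with a real driver:
# total = driver.execute_script("return document.body.scrollHeight")
# view = driver.execute_script("return window.innerHeight")
# for y in scroll_positions(total, view):
#     driver.execute_script(f"window.scrollTo(0, {y});")
#     time.sleep(random.uniform(0.3, 1.2))
```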

9. Proxy Rotation

from selenium import webdriver
from selenium.webdriver.chrome.options import Options
from selenium.webdriver.common.by import By
import random

# Chrome's --proxy-server flag ignores embedded credentials, so these proxies
# must authenticate by IP allowlist. For username/password proxies, see the
# Selenium Wire approach below.
PROXIES = [
    "http://proxy1.example.com:8080",
    "http://proxy2.example.com:8080",
    "http://proxy3.example.com:8080",
]

def create_driver_with_proxy(proxy_url=None):
    """Create a Chrome driver with proxy support."""
    options = Options()
    options.add_argument("--headless=new")
    options.add_argument("--disable-blink-features=AutomationControlled")

    if proxy_url:
        options.add_argument(f"--proxy-server={proxy_url}")

    return webdriver.Chrome(options=options)

# Rotate proxy per browser session (Chrome reads the flag only at startup)
proxy = random.choice(PROXIES)
driver = create_driver_with_proxy(proxy)
driver.get("https://httpbin.org/ip")
print(driver.find_element(By.TAG_NAME, "body").text)
driver.quit()
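
Picking a proxy at random works for a handful of requests, but longer runs usually benefit from cycling through the list evenly and retiring proxies that keep failing. The following is a minimal sketch of such a pool; `ProxyPool`, its method names, and the failure threshold are assumptions made for illustration.

```python
class ProxyPool:
    """Round-robin proxy selection that skips proxies with repeated failures."""

    def __init__(self, proxies, max_failures=3):
        if not proxies:
            raise ValueError("at least one proxy is required")
        self._order = list(proxies)
        self._failures = {proxy: 0 for proxy in self._order}
        self._index = 0
        self.max_failures = max_failures

    def next_proxy(self):
        """Return the next usable proxy, or raise if all are retired."""
        for _ in range(len(self._order)):
            proxy = self._order[self._index % len(self._order)]
            self._index += 1
            if self._failures[proxy] < self.max_failures:
                return proxy
        raise RuntimeError("all proxies have been retired")

    def mark_failed(self, proxy):
        """Record a failed request through this proxy."""
        self._failures[proxy] += 1

    def mark_ok(self, proxy):
        """Reset the failure count after a successful request."""
        self._failures[proxy] = 0


# Usage with the helper above:
# pool = ProxyPool(PROXIES)
# proxy = pool.next_proxy()
# driver = create_driver_with_proxy(proxy)
```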

Using Selenium Wire for Advanced Proxy Control

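# pip install selenium-wire
# Note: the selenium-wire repository has been archived, so test it against
# your Selenium version before relying on it
from selenium.webdriver.common.by import By
from seleniumwire import webdriver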

proxy_options = {
    "proxy": {
        "http": "http://user:pass@proxy.example.com:8080",
        "https": "http://user:pass@proxy.example.com:8080",
    }
}

driver = webdriver.Chrome(seleniumwire_options=proxy_options)
driver.get("https://httpbin.org/ip")
print(driver.find_element(By.TAG_NAME, "body").text)

# Selenium Wire also lets you intercept/modify requests
for request in driver.requests:
    if request.response:
        print(f"{request.url} → {request.response.status_code}")

driver.quit()

10. Screenshots and PDFs

# Viewport screenshot (only the currently visible area)
driver.save_screenshot("page.png")

# Element screenshot
element = driver.find_element(By.CSS_SELECTOR, ".chart-container")
element.screenshot("chart.png")

# Full page screenshot (works in headless mode, where the window can exceed the screen)
def full_page_screenshot(driver, filename):
    """Capture the full page by resizing the window to the document size."""
    total_height = driver.execute_script("return document.body.scrollHeight")
    total_width = driver.execute_script("return document.body.scrollWidth")
    driver.set_window_size(total_width, total_height)
    driver.save_screenshot(filename)
    driver.set_window_size(1920, 1080)  # Reset

# Save as PDF (Chrome headless)
def save_as_pdf(driver, filename):
    """Save page as PDF using Chrome DevTools Protocol."""
    import base64
    result = driver.execute_cdp_cmd("Page.printToPDF", {
        "printBackground": True,
        "preferCSSPageSize": True,
    })
    with open(filename, "wb") as f:
        f.write(base64.b64decode(result["data"]))

11. Production-Ready Scraper

Here's a complete, production-ready scraper with retries, error handling, and data export:

import json
import csv
import time
import random
import logging
from dataclasses import dataclass, asdict
from typing import List, Optional
from selenium import webdriver
from selenium.webdriver.chrome.options import Options
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
from selenium.common.exceptions import (
    TimeoutException, StaleElementReferenceException,
    NoSuchElementException, WebDriverException
)

logging.basicConfig(level=logging.INFO)
logger = logging.getLogger(__name__)

@dataclass
class Product:
    name: str
    price: str
    url: str
    rating: Optional[str] = None
    reviews: Optional[int] = None

class ProductScraper:
    def __init__(self, headless=True, max_retries=3):
        self.max_retries = max_retries
        self.options = Options()
        if headless:
            self.options.add_argument("--headless=new")
        self.options.add_argument("--no-sandbox")
        self.options.add_argument("--disable-dev-shm-usage")
        self.options.add_argument("--disable-blink-features=AutomationControlled")
        self.options.add_argument("--window-size=1920,1080")
        self.options.add_argument(
            "user-agent=Mozilla/5.0 (Windows NT 10.0; Win64; x64) "
            "AppleWebKit/537.36 (KHTML, like Gecko) Chrome/122.0.0.0 Safari/537.36"
        )
        self.driver = None
        self.wait = None

    def start(self):
        self.driver = webdriver.Chrome(options=self.options)
        self.wait = WebDriverWait(self.driver, 15)
        logger.info("Browser started")

    def stop(self):
        if self.driver:
            self.driver.quit()
            logger.info("Browser stopped")

    def _retry(self, func, *args, **kwargs):
        """Retry a function with exponential backoff."""
        for attempt in range(self.max_retries):
            try:
                return func(*args, **kwargs)
            except (TimeoutException, WebDriverException) as e:
                if attempt == self.max_retries - 1:
                    raise  # Out of attempts: surface the last error without sleeping
                wait_time = (2 ** attempt) + random.uniform(0, 1)
                logger.warning(
                    f"Attempt {attempt + 1}/{self.max_retries} failed: {e}. "
                    f"Retrying in {wait_time:.1f}s"
                )
                time.sleep(wait_time)

    def scrape_page(self, url) -> List[Product]:
        """Scrape all products from a single page."""
        self.driver.get(url)
        self.wait.until(
            EC.presence_of_element_located((By.CSS_SELECTOR, ".product-card"))
        )

        products = []
        cards = self.driver.find_elements(By.CSS_SELECTOR, ".product-card")

        for card in cards:
            try:
                product = Product(
                    name=card.find_element(By.CSS_SELECTOR, "h3").text.strip(),
                    price=card.find_element(By.CSS_SELECTOR, ".price").text.strip(),
                    url=card.find_element(By.CSS_SELECTOR, "a").get_attribute("href"),
                )
                try:
                    product.rating = card.find_element(
                        By.CSS_SELECTOR, ".rating"
                    ).text.strip()
                except NoSuchElementException:
                    pass
                products.append(product)
            except StaleElementReferenceException:
                logger.warning("Stale element, skipping card")
                continue

        return products

    def scrape_all_pages(self, start_url, max_pages=50) -> List[Product]:
        """Scrape products across multiple pages."""
        all_products = []
        url = start_url

        for page_num in range(1, max_pages + 1):
            logger.info(f"Scraping page {page_num}: {url}")

            products = self._retry(self.scrape_page, url)
            all_products.extend(products)
            logger.info(f"  Found {len(products)} products (total: {len(all_products)})")

            # Find next page
            try:
                next_link = self.driver.find_element(By.CSS_SELECTOR, "a.next-page")
                url = next_link.get_attribute("href")
                time.sleep(random.uniform(1, 3))  # Polite delay
            except NoSuchElementException:
                logger.info("No more pages")
                break

        return all_products

    @staticmethod
    def export_json(products: List[Product], filename: str):
        with open(filename, "w") as f:
            json.dump([asdict(p) for p in products], f, indent=2)
        logger.info(f"Exported {len(products)} products to {filename}")

    @staticmethod
    def export_csv(products: List[Product], filename: str):
        with open(filename, "w", newline="") as f:
            writer = csv.DictWriter(f, fieldnames=["name", "price", "url", "rating", "reviews"])
            writer.writeheader()
            writer.writerows(asdict(p) for p in products)
        logger.info(f"Exported {len(products)} products to {filename}")

# Usage
if __name__ == "__main__":
    scraper = ProductScraper(headless=True)
    try:
        scraper.start()
        products = scraper.scrape_all_pages("https://example.com/products")
        scraper.export_json(products, "products.json")
        scraper.export_csv(products, "products.csv")
        print(f"\nScraped {len(products)} products successfully!")
    finally:
        scraper.stop()
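
The backoff logic in _retry can also be pulled out into a standalone helper, which makes it testable without a browser. The sketch below doubles the delay on each attempt, adds up to one second of jitter, and caps the delay; `retry_with_backoff`, the cap, and the injectable `sleep` and `rng` parameters are assumptions introduced for illustration.

```python
import random
import time


def retry_with_backoff(func, max_retries=3, base=1.0, cap=30.0,
                       exceptions=(Exception,), sleep=time.sleep,
                       rng=random.random):
    """Call func() until it succeeds, sleeping base * 2**attempt plus jitter."""
    if max_retries < 1:
        raise ValueError("max_retries must be at least 1")
    for attempt in range(max_retries):
        try:
            return func()
        except exceptions:
            if attempt == max_retries - 1:
                raise  # Last attempt: propagate the error
            delay = min(cap, base * (2 ** attempt)) + rng()
            sleep(delay)


# Usage with the scraper above:
# from selenium.common.exceptions import WebDriverException
# products = retry_with_backoff(
#     lambda: scraper.scrape_page(url), exceptions=(WebDriverException,)
# )
```

Injecting `sleep` and `rng` keeps tests fast and deterministic while production code uses the defaults.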

12. Selenium vs Playwright vs Web Scraping APIs

Feature	Selenium	Playwright	Mantis API
Setup complexity	Medium (driver management)	Low (auto-install)	None (HTTP calls)
Speed per page	3-15 seconds	1-8 seconds	1-5 seconds
Memory per instance	300-600 MB	200-500 MB	0 (serverless)
Anti-detection	Manual (undetected-chromedriver)	Manual (stealth plugin)	Built-in
Proxy management	Manual or selenium-wire	Built-in	Built-in
JavaScript rendering	Yes (full browser)	Yes (full browser)	Yes (cloud rendering)
Auto-waiting	Explicit waits required	Built-in auto-wait	N/A (returns when ready)
Community size	Largest user base (30K+ GitHub stars)	Growing (65K+ GitHub stars)	API-based
Language support	Python, Java, JS, C#, Ruby	Python, JS, Java, .NET	Any (HTTP/REST)
Scaling	Hard (infrastructure)	Hard (infrastructure)	Easy (API calls)
AI data extraction	No	No	Yes (built-in)
Best for	Legacy projects, broad language support	New projects, performance	Production at scale

13. Cost Analysis: DIY Selenium vs. Mantis API

Cost Component	DIY Selenium	Mantis API
Compute (cloud VMs for browsers)	$150-500/mo	$0
Residential proxies	$50-200/mo	$0 (included)
CAPTCHA solving	$20-100/mo	$0 (handled)
ChromeDriver management	$0 (time cost)	$0
Developer time (maintenance)	10-20 hrs/mo	~0
Total monthly cost	$200-800 + time	$29-299
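
To compare the monthly totals at a given volume, it can help to convert them to a cost per 1,000 pages. The sketch below does that arithmetic; the 5,000 pages/day volume is an assumed figure chosen only for illustration, developer time is excluded, and whether a given API plan covers that volume is not stated in the table.

```python
def cost_per_thousand_pages(monthly_cost, pages_per_day, days_per_month=30):
    """Convert a monthly bill into dollars per 1,000 scraped pages."""
    if pages_per_day <= 0 or days_per_month <= 0:
        raise ValueError("volume must be positive")
    pages_per_month = pages_per_day * days_per_month
    return monthly_cost / pages_per_month * 1000


# Assumed volume: 5,000 pages/day, i.e. 150,000 pages/month.
for label, low, high in [("DIY Selenium", 200, 800), ("Mantis API", 29, 299)]:
    low_cpm = cost_per_thousand_pages(low, 5000)
    high_cpm = cost_per_thousand_pages(high, 5000)
    print(f"{label}: ${low_cpm:.2f} to ${high_cpm:.2f} per 1,000 pages")
```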

14. Frequently Asked Questions

Is Selenium good for web scraping in 2026?

Selenium remains a solid choice for web scraping in 2026, especially for JavaScript-heavy sites that require full browser rendering. It has the largest community of any browser automation tool, extensive documentation, and supports Chrome, Firefox, Edge, and Safari. However, it's slower than HTTP-based scraping and newer tools like Playwright offer better performance. For large-scale scraping, a web scraping API like Mantis is more cost-effective and reliable.

Is Selenium or Playwright better for web scraping?

Playwright is generally faster and has better built-in features (auto-waiting, network interception, multi-browser from one API). Selenium has a much larger community, more tutorials, better IDE integration, and a longer track record. For new projects in 2026, Playwright is the better technical choice, but Selenium is perfectly capable and many developers prefer it for its familiarity and ecosystem.

Can websites detect Selenium scraping?

Yes. Default Selenium instances are easily detected through the navigator.webdriver property, ChromeDriver-specific JavaScript variables, missing browser plugins, and WebDriver protocol fingerprints. Tools like undetected-chromedriver help bypass basic detection, but sophisticated anti-bot systems can still detect Selenium through behavioral analysis, TLS fingerprinting, and HTTP/2 characteristics.

How much does Selenium web scraping cost to run?

Running Selenium at scale costs $200-800/month: cloud compute for headless browsers ($150-500), residential proxies ($50-200), and CAPTCHA solving ($20-100). Each browser instance uses 300-600MB RAM. A web scraping API like Mantis costs $29-299/month and handles everything automatically.

How do I make Selenium undetectable?

Use undetected-chromedriver, set realistic window sizes and user agents, disable automation flags, add random delays between actions, rotate proxies, and handle cookies properly. For advanced anti-bot systems, you may also need to spoof WebGL, Canvas, and AudioContext fingerprints. Web scraping APIs handle all of this automatically.

Should I use Selenium or an API for web scraping?

Use Selenium for complex browser interactions on a small number of sites. Use an API like Mantis when you need scale, reliability, or cost efficiency. Most teams start with Selenium and switch to APIs as they scale beyond a few hundred pages per day.

Conclusion

Selenium remains one of the most popular tools for web scraping in 2026. Its massive community, multi-language support, and battle-tested reliability make it a solid choice — especially if you're already familiar with it from testing.

For small to medium scraping projects (up to a few hundred pages per day), Selenium with undetected-chromedriver and residential proxies can handle most sites. But as you scale, the infrastructure cost and maintenance burden grow quickly.

That's where a web scraping API shines — one API call replaces hundreds of lines of Selenium code, and you never worry about driver updates, proxy rotation, or anti-detection again.
