Web Scraping for Sports, Betting & Fantasy: How AI Agents Track Odds, Stats & Player Data in 2026
The global sports betting market surpassed $85 billion in 2025, with the US alone generating $15B+ in handle after legalization swept through 38 states. Fantasy sports adds another $22B. And behind every successful bettor, DFS player, and sports analytics startup sits one common need: better data, faster.
Yet professional sports data is shockingly expensive. Sportradar charges $1M+/year for real-time feeds. Stats Perform (Opta) commands $500K-$2M annually. Even basic historical data packages from providers like SportsDataIO start at $500-$2,000/month. All while much of the underlying data (odds, box scores, injury reports, roster moves) is published freely across dozens of public websites.
In this guide, you'll build an AI-powered sports intelligence system that scrapes betting odds across sportsbooks, tracks player stats and injury reports, monitors line movements, generates fantasy projections, and uses GPT-4o to find edges that humans miss.
Why AI Agents Are Transforming Sports Analytics
Sports data has characteristics that make it perfect for AI agent automation:
- Time-critical: Odds move in seconds after injury news breaks. A 30-second edge in line shopping can mean the difference between +EV and -EV. Manual monitoring of 10+ sportsbooks is physically impossible.
- Massive volume: The NFL alone generates 50,000+ player-game stat combinations per season. Add NBA, MLB, NHL, soccer, tennis, and you're looking at millions of data points per week.
- Fragmented sources: Odds live on DraftKings, FanDuel, BetMGM, and 20+ other books. Stats span ESPN, Basketball-Reference, FBRef, and league official sites. Injury reports come from team Twitter accounts, beat reporters, and official league transactions.
- Pattern-rich: Line movements correlate with sharp money. Injury impact varies by position and scheme. Weather affects totals. These patterns are discoverable, provided you have the data.
Architecture: The 6-Step Sports Intelligence Pipeline
- Source Discovery: Identify sportsbooks, stats sites, injury feeds, and transaction wires to monitor
- AI-Powered Extraction: Use the Mantis WebPerception API to scrape and structure sports data from complex, JS-heavy pages
- SQLite Storage: Store historical odds, player stats, injuries, and line movements locally
- Edge Detection: Flag line discrepancies, sharp money movements, injury impacts, and value bets
- GPT-4o Analysis: AI generates projections, identifies correlations, and produces betting/fantasy insights
- Slack/Discord Alerts: Real-time notifications for odds changes, injury news, and detected edges
Step 1: Define Your Sports Data Models
First, create Pydantic schemas for structured sports data extraction:
```python
from pydantic import BaseModel
from typing import Optional
from enum import Enum


class Sport(str, Enum):
    NFL = "nfl"
    NBA = "nba"
    MLB = "mlb"
    NHL = "nhl"
    NCAAF = "ncaaf"
    NCAAB = "ncaab"
    SOCCER = "soccer"
    TENNIS = "tennis"
    MMA = "mma"


class BettingOdds(BaseModel):
    """Betting odds from a sportsbook for a specific market."""
    sportsbook: str    # "draftkings", "fanduel", "betmgm", "caesars"
    sport: Sport
    event_name: str    # "Chiefs vs Eagles" or "Lakers vs Celtics"
    event_date: str
    market_type: str   # "spread", "moneyline", "total", "player_prop"
    selection: str     # "Chiefs -3.5", "Over 47.5", "Mahomes Over 2.5 TDs"
    odds_american: int # -110, +150, etc.
    odds_decimal: Optional[float] = None
    line: Optional[float] = None  # Spread or total number
    implied_probability: Optional[float] = None
    previous_odds: Optional[int] = None
    previous_line: Optional[float] = None
    movement_direction: Optional[str] = None  # "steam", "reverse", "stable"
    scraped_at: str


class PlayerStats(BaseModel):
    """Player performance statistics from a game or season."""
    player_name: str
    team: str
    sport: Sport
    season: str  # "2025-26"
    game_date: Optional[str] = None
    opponent: Optional[str] = None
    minutes_played: Optional[float] = None
    # Universal stats
    points: Optional[float] = None
    assists: Optional[float] = None
    rebounds: Optional[float] = None
    # Sport-specific
    passing_yards: Optional[float] = None
    rushing_yards: Optional[float] = None
    touchdowns: Optional[int] = None
    strikeouts: Optional[int] = None
    era: Optional[float] = None
    goals: Optional[int] = None
    shots_on_target: Optional[int] = None
    # Advanced
    usage_rate: Optional[float] = None
    true_shooting_pct: Optional[float] = None
    war: Optional[float] = None
    epa_per_play: Optional[float] = None
    xg: Optional[float] = None  # Expected goals (soccer)
    source: str
    scraped_at: str


class InjuryReport(BaseModel):
    """Player injury status from official or media sources."""
    player_name: str
    team: str
    sport: Sport
    injury_type: str  # "knee", "hamstring", "concussion", "illness"
    status: str       # "out", "doubtful", "questionable", "probable", "day-to-day"
    details: Optional[str] = None
    game_date: Optional[str] = None
    estimated_return: Optional[str] = None
    impact_rating: Optional[str] = None  # "high", "medium", "low"
    source: str  # "official_report", "beat_reporter", "team_announcement"
    reported_at: str


class LineMovement(BaseModel):
    """Tracked line movement over time for a specific market."""
    sport: Sport
    event_name: str
    event_date: str
    market_type: str
    sportsbook: str
    opening_line: float
    current_line: float
    opening_odds: int
    current_odds: int
    movement_size: float  # Absolute change
    movement_pct: float
    tickets_pct: Optional[float] = None  # % of tickets on this side
    money_pct: Optional[float] = None    # % of money on this side
    sharp_indicator: bool  # True if reverse line movement detected
    timestamp: str
```
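The `odds_decimal` and `implied_probability` fields are better filled in client-side than delegated to the extraction model's arithmetic. A minimal sketch of the standard conversions (the helper names here are my own):

```python
def american_to_decimal(odds: int) -> float:
    """Convert American odds to decimal (European) odds."""
    if odds > 0:
        return 1 + odds / 100
    return 1 + 100 / abs(odds)


def american_to_implied_prob(odds: int) -> float:
    """Convert American odds to the implied win probability."""
    if odds > 0:
        return 100 / (odds + 100)
    return abs(odds) / (abs(odds) + 100)
```

For a standard -110 line, decimal odds are about 1.909 and implied probability about 52.4%; two -110 sides sum to roughly 104.8%, and that overround is the book's vig.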
Step 2: Scrape Betting Odds Across Sportsbooks
Use the Mantis WebPerception API to extract real-time odds from multiple sportsbooks:
```python
import requests
import json
from datetime import datetime

MANTIS_API_KEY = "your-mantis-api-key"
BASE_URL = "https://api.mantisapi.com/v1"


def scrape_sportsbook_odds(sportsbook: str, sport: str, url: str) -> list[BettingOdds]:
    """Scrape current betting odds from a sportsbook."""
    # Step 1: Capture the odds page with JS rendering.
    # Sportsbooks are heavily JS-rendered with dynamic updates.
    response = requests.post(
        f"{BASE_URL}/scrape",
        headers={"Authorization": f"Bearer {MANTIS_API_KEY}"},
        json={
            "url": url,
            "render_js": True,
            "wait_for": "[class*='odds'], [class*='line'], [class*='spread']",
            "timeout": 30000
        }
    )
    page_data = response.json()

    # Step 2: AI-powered extraction of odds
    extraction = requests.post(
        f"{BASE_URL}/extract",
        headers={"Authorization": f"Bearer {MANTIS_API_KEY}"},
        json={
            "content": page_data["content"],
            "schema": BettingOdds.model_json_schema(),
            "prompt": f"""Extract ALL betting odds from this {sportsbook} {sport} page.
For each game/event, capture:
- Both sides of the spread with odds (e.g., Chiefs -3.5 (-110), Eagles +3.5 (-110))
- Both moneyline odds (e.g., Chiefs -175, Eagles +150)
- Total (over/under) with odds for both sides
- Any featured player props if visible
Convert all odds to American format.
Calculate implied probability from odds.
Note the event date/time.""",
            "multiple": True
        }
    )
    return [BettingOdds(**o) for o in extraction.json()["data"]]


# Monitor major US sportsbooks
sportsbook_urls = {
    "DraftKings": {
        "nfl": "https://sportsbook.draftkings.com/leagues/football/nfl",
        "nba": "https://sportsbook.draftkings.com/leagues/basketball/nba",
        "mlb": "https://sportsbook.draftkings.com/leagues/baseball/mlb",
        "nhl": "https://sportsbook.draftkings.com/leagues/hockey/nhl",
    },
    "FanDuel": {
        "nfl": "https://sportsbook.fanduel.com/football/nfl",
        "nba": "https://sportsbook.fanduel.com/basketball/nba",
        "mlb": "https://sportsbook.fanduel.com/baseball/mlb",
        "nhl": "https://sportsbook.fanduel.com/hockey/nhl",
    },
    "BetMGM": {
        "nfl": "https://sports.betmgm.com/en/sports/football/nfl",
        "nba": "https://sports.betmgm.com/en/sports/basketball/nba",
        "mlb": "https://sports.betmgm.com/en/sports/baseball/mlb",
        "nhl": "https://sports.betmgm.com/en/sports/hockey/nhl",
    },
    "Caesars": {
        "nfl": "https://www.caesars.com/sportsbook-and-casino/sports/football/nfl",
        "nba": "https://www.caesars.com/sportsbook-and-casino/sports/basketball/nba",
        "mlb": "https://www.caesars.com/sportsbook-and-casino/sports/baseball/mlb",
        "nhl": "https://www.caesars.com/sportsbook-and-casino/sports/hockey/nhl",
    },
}

# Scrape all books for today's games
for book, sports in sportsbook_urls.items():
    for sport, url in sports.items():
        try:
            odds = scrape_sportsbook_odds(book, sport, url)
            print(f"✅ {book} {sport.upper()}: {len(odds)} odds captured")
        except Exception as e:
            print(f"❌ {book} {sport}: {e}")
```
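The pipeline's later steps (line-movement detection, injury impact) query local `betting_odds` and `player_stats` tables that need to exist before the first scrape runs. A minimal schema sketch, with column names matching the Pydantic models; the `tickets_pct`/`money_pct` columns that `detect_line_movement` reads would be populated from a separate betting-splits source, which is an assumption of this sketch:

```python
import sqlite3


def init_db(path: str = "sports.db") -> sqlite3.Connection:
    """Create the tables the pipeline reads and writes, if missing."""
    conn = sqlite3.connect(path)
    conn.executescript("""
        CREATE TABLE IF NOT EXISTS betting_odds (
            id INTEGER PRIMARY KEY AUTOINCREMENT,
            sportsbook TEXT, sport TEXT, event_name TEXT, event_date TEXT,
            market_type TEXT, selection TEXT,
            odds_american INTEGER, line REAL,
            tickets_pct REAL,  -- from betting-splits sources, if available
            money_pct REAL,
            scraped_at TEXT
        );
        CREATE TABLE IF NOT EXISTS player_stats (
            id INTEGER PRIMARY KEY AUTOINCREMENT,
            player_name TEXT, team TEXT, sport TEXT, season TEXT,
            game_date TEXT, minutes_played REAL,
            points REAL, assists REAL, rebounds REAL,
            source TEXT, scraped_at TEXT
        );
        -- Speeds up the "opening line" lookup done per market
        CREATE INDEX IF NOT EXISTS idx_odds_market
            ON betting_odds (sportsbook, event_name, market_type, selection, scraped_at);
    """)
    return conn
```

Call `init_db()` once at startup and insert a snapshot row per scraped market on every pass; the history of snapshots is what makes movement detection possible.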
Step 3: Track Player Stats & Performance
Scrape player statistics from multiple sources for comprehensive coverage:
```python
def scrape_player_stats(sport: str) -> list[PlayerStats]:
    """Scrape player stats from sports reference sites."""
    stat_sources = {
        "nba": [
            {
                "url": "https://www.basketball-reference.com/leagues/NBA_2026_per_game.html",
                "source": "basketball_reference",
                "prompt": """Extract per-game statistics for all NBA players:
- Player name, team, games played, minutes
- Points, rebounds, assists, steals, blocks
- Field goal %, 3-point %, free throw %
- Turnovers, personal fouls
- Usage rate and true shooting % if available"""
            },
            {
                "url": "https://www.nba.com/stats/players/traditional",
                "source": "nba_official",
                "prompt": """Extract official NBA player statistics:
- All traditional stats (PTS, REB, AST, STL, BLK)
- Minutes per game
- Shooting splits (FG%, 3P%, FT%)
- Plus/minus"""
            }
        ],
        "nfl": [
            {
                "url": "https://www.pro-football-reference.com/years/2025/passing.htm",
                "source": "pfr",
                "prompt": """Extract NFL passing statistics:
- Player name, team, games
- Completions, attempts, completion %
- Passing yards, TDs, interceptions
- Passer rating, QBR if available
- Yards per attempt, sack data"""
            },
            {
                "url": "https://www.pro-football-reference.com/years/2025/rushing.htm",
                "source": "pfr",
                "prompt": """Extract NFL rushing statistics:
- Player name, team, games
- Rushing attempts, yards, TDs
- Yards per carry, longest run
- Fumbles and fumbles lost"""
            }
        ],
        "mlb": [
            {
                "url": "https://www.baseball-reference.com/leagues/majors/2026-standard-batting.shtml",
                "source": "baseball_reference",
                "prompt": """Extract MLB batting statistics:
- Player name, team, games, plate appearances
- AVG, OBP, SLG, OPS
- Home runs, RBIs, stolen bases
- WAR if available"""
            }
        ],
        "soccer": [
            {
                "url": "https://fbref.com/en/comps/9/stats/Premier-League-Stats",
                "source": "fbref",
                "prompt": """Extract Premier League player statistics:
- Player name, team, position, games, minutes
- Goals, assists, xG, xAG
- Shots, shots on target
- Progressive passes, progressive carries
- Tackles, interceptions"""
            }
        ]
    }

    all_stats = []
    for src in stat_sources.get(sport, []):
        try:
            response = requests.post(
                f"{BASE_URL}/scrape",
                headers={"Authorization": f"Bearer {MANTIS_API_KEY}"},
                json={"url": src["url"], "render_js": True, "timeout": 30000}
            )
            extraction = requests.post(
                f"{BASE_URL}/extract",
                headers={"Authorization": f"Bearer {MANTIS_API_KEY}"},
                json={
                    "content": response.json()["content"],
                    "schema": PlayerStats.model_json_schema(),
                    "prompt": src["prompt"],
                    "multiple": True
                }
            )
            stats = [PlayerStats(**s) for s in extraction.json()["data"]]
            all_stats.extend(stats)
            print(f"✅ {src['source']}: {len(stats)} player records")
        except Exception as e:
            print(f"❌ {src['source']}: {e}")
    return all_stats
```
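Because the same sport is scraped from multiple sources, the combined list will contain duplicate player rows. One reasonable approach, sketched below with a hypothetical helper, is to keep a single record per `(player_name, team)` and prefer sources in a configurable order (official site over aggregator, say):

```python
def dedupe_stats(records: list[dict], preferred: list[str]) -> list[dict]:
    """Keep one record per (player_name, team), preferring sources that
    appear earlier in `preferred`. Unknown sources rank last."""
    rank = {src: i for i, src in enumerate(preferred)}
    best: dict[tuple, dict] = {}
    for rec in records:
        key = (rec["player_name"].lower(), rec["team"].lower())
        current = best.get(key)
        if current is None or rank.get(rec["source"], 99) < rank.get(current["source"], 99):
            best[key] = rec
    return list(best.values())
```

Run it over `[s.model_dump() for s in all_stats]` before storage; key on a league player ID instead of name if your sources expose one, since name spellings vary.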
Step 4: Monitor Injuries & Roster Moves
Injury information is the single biggest edge in sports betting, because injuries move lines more than anything else:
```python
def scrape_injury_reports(sport: str) -> list[InjuryReport]:
    """Scrape injury reports from official and media sources."""
    injury_sources = {
        "nba": [
            {
                "url": "https://www.espn.com/nba/injuries",
                "prompt": """Extract ALL NBA injury reports:
- Player name and team
- Injury type (knee, ankle, back, illness, etc.)
- Status: Out, Doubtful, Questionable, Day-to-Day, Probable
- Injury details and estimated return timeline
- Date of report/update"""
            },
            {
                "url": "https://www.cbssports.com/nba/injuries/",
                "prompt": """Extract NBA injury information:
- Player name, team, position
- Injury description
- Status and expected return
- Date updated"""
            }
        ],
        "nfl": [
            {
                "url": "https://www.espn.com/nfl/injuries",
                "prompt": """Extract ALL NFL injury reports:
- Player name, team, position
- Injury type
- Practice participation status (Full, Limited, DNP)
- Game status (Out, Doubtful, Questionable)
- Week and opponent"""
            }
        ],
        "mlb": [
            {
                "url": "https://www.espn.com/mlb/injuries",
                "prompt": """Extract ALL MLB injury reports including IL placements:
- Player name, team, position
- Injury type and IL designation (10-day, 15-day, 60-day)
- Expected return date
- Rehab assignment details if any"""
            }
        ]
    }

    all_injuries = []
    for src in injury_sources.get(sport, []):
        try:
            response = requests.post(
                f"{BASE_URL}/scrape",
                headers={"Authorization": f"Bearer {MANTIS_API_KEY}"},
                json={"url": src["url"], "render_js": True, "timeout": 30000}
            )
            extraction = requests.post(
                f"{BASE_URL}/extract",
                headers={"Authorization": f"Bearer {MANTIS_API_KEY}"},
                json={
                    "content": response.json()["content"],
                    "schema": InjuryReport.model_json_schema(),
                    "prompt": src["prompt"],
                    "multiple": True
                }
            )
            injuries = [InjuryReport(**i) for i in extraction.json()["data"]]
            all_injuries.extend(injuries)
        except Exception as e:
            print(f"❌ Injury source error: {e}")
    return all_injuries


def assess_injury_impact(injury: InjuryReport, conn) -> dict:
    """Assess the betting impact of an injury using historical data."""
    cursor = conn.cursor()

    # Average the player's LAST 10 games to gauge importance.
    # Aggregate over a subquery: AVG with ORDER BY/LIMIT on the outer
    # query would average over all rows, not just the most recent 10.
    cursor.execute("""
        SELECT AVG(points), AVG(assists), AVG(rebounds), AVG(minutes_played)
        FROM (
            SELECT points, assists, rebounds, minutes_played
            FROM player_stats
            WHERE player_name = ? AND sport = ?
            ORDER BY game_date DESC LIMIT 10
        )
    """, (injury.player_name, injury.sport))
    recent_stats = cursor.fetchone()

    # Games played vs. missed this season
    cursor.execute("""
        SELECT
            COUNT(CASE WHEN minutes_played > 0 THEN 1 END) AS games_played,
            COUNT(CASE WHEN minutes_played = 0 OR minutes_played IS NULL THEN 1 END) AS games_missed
        FROM player_stats
        WHERE player_name = ? AND sport = ? AND season = '2025-26'
    """, (injury.player_name, injury.sport))
    availability = cursor.fetchone()

    return {
        "player": injury.player_name,
        "team": injury.team,
        "status": injury.status,
        "avg_stats": recent_stats,
        "games_played": availability[0] if availability else 0,
        "games_missed": availability[1] if availability else 0,
        "estimated_line_impact": "TBD"
    }
```
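Different sources spell statuses differently ("Day-To-Day", "GTD", "Questionable"), and downstream filters compare lowercase strings, so it pays to normalize at ingestion. A small sketch; the alias table is illustrative, not exhaustive:

```python
# Common abbreviations mapped to the canonical statuses used in InjuryReport.
# This mapping is illustrative; extend it per source as you encounter variants.
STATUS_ALIASES = {
    "gtd": "questionable",  # "game-time decision"
    "dtd": "day-to-day",
    "ir": "out",            # injured reserve
    "o": "out",
    "d": "doubtful",
    "q": "questionable",
    "p": "probable",
}


def normalize_status(raw: str) -> str:
    """Lowercase, trim, and map common abbreviations to canonical statuses."""
    s = raw.strip().lower().replace("_", "-")
    return STATUS_ALIASES.get(s, s)
```

Apply it to each `InjuryReport.status` right after extraction so comparisons like `status in ("out", "doubtful")` behave consistently across sources.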
Step 5: AI-Powered Edge Detection & Analysis
Use GPT-4o to find betting edges, generate fantasy projections, and produce actionable insights:
```python
from openai import OpenAI

client = OpenAI()


def find_betting_edges(odds_data: list, injuries: list, stats: list):
    """Use AI to identify potential betting edges across markets."""
    analysis = client.chat.completions.create(
        model="gpt-4o",
        messages=[{
            "role": "system",
            "content": """You are a professional sports analyst and quantitative
betting researcher. Analyze odds, injury, and statistical data to:

1. LINE SHOPPING: Identify the best available odds across sportsbooks
   for each market. Flag lines where one book is significantly off
   from consensus (potential value).
2. INJURY IMPACT: Assess how current injuries should move lines vs
   how they actually have moved. Flag potential under-reactions.
3. SHARP vs PUBLIC: When ticket% and money% diverge, that indicates
   sharp action. Flag reverse line movements.
4. STATISTICAL EDGES: Identify players whose recent performance
   suggests their props are mispriced (e.g., a player averaging 28 PPG
   last 5 games with an O/U of 23.5).
5. CORRELATION PLAYS: Identify same-game parlay correlations that
   books may not properly account for.

Be specific with numbers. Provide expected value calculations.
Always note this is analysis, not gambling advice."""
        }, {
            "role": "user",
            "content": f"""Analyze today's sports betting landscape:

CURRENT ODDS (across sportsbooks):
{json.dumps([o.model_dump() for o in odds_data[:50]], indent=2)}

ACTIVE INJURIES:
{json.dumps([i.model_dump() for i in injuries[:30]], indent=2)}

RECENT PLAYER STATS:
{json.dumps([s.model_dump() for s in stats[:40]], indent=2)}

Find edges, line discrepancies, and value opportunities."""
        }]
    )
    return analysis.choices[0].message.content


def generate_fantasy_projections(stats: list, injuries: list, sport: str):
    """Generate AI-powered fantasy sports projections."""
    projections = client.chat.completions.create(
        model="gpt-4o",
        messages=[{
            "role": "system",
            "content": f"""You are an expert {sport.upper()} fantasy sports analyst.
Generate player projections based on:

1. RECENT FORM: Weight recent games more heavily (last 5 > last 15 > season)
2. MATCHUP: Consider opponent's defensive rankings at each position
3. INJURY CONTEXT: Adjust for teammates out (more opportunity) or
   player limitations (reduced minutes/snap count)
4. PACE & GAME ENVIRONMENT: High totals = more fantasy points
5. HOME/AWAY SPLITS: Some players perform significantly differently

Output specific stat projections and DFS salary value ratings.
Format as a ranked list with confidence levels."""
        }, {
            "role": "user",
            "content": f"""Generate fantasy projections for today's {sport.upper()} slate:

PLAYER STATS (recent):
{json.dumps([s.model_dump() for s in stats[:30]], indent=2)}

INJURIES AFFECTING SLATE:
{json.dumps([i.model_dump() for i in injuries[:20]], indent=2)}

Provide specific stat projections, DFS value ratings, and stack recommendations."""
        }]
    )
    return projections.choices[0].message.content
```
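The prompts above serialize dozens of records as indented JSON, and most of the models' `Optional` fields will be `None`, which just burns tokens. A small helper worth considering (name and threshold are my own) drops empty fields and uses compact separators before the records hit the prompt:

```python
import json


def compact(records: list[dict], keep_max: int = 50) -> str:
    """Serialize records for an LLM prompt: drop None/empty fields and
    use compact separators to cut the token count substantially."""
    slim = [
        {k: v for k, v in rec.items() if v not in (None, "", [])}
        for rec in records[:keep_max]
    ]
    return json.dumps(slim, separators=(",", ":"))
```

Use `compact([o.model_dump() for o in odds_data])` in place of `json.dumps(..., indent=2)`; for sparse models like `PlayerStats` this can shrink the payload several-fold.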
```python
def detect_line_movement(conn, current_odds: list) -> list[LineMovement]:
    """Detect significant line movements by comparing to stored odds."""
    movements = []
    cursor = conn.cursor()
    for odds in current_odds:
        # Earliest stored snapshot for this market = the opening line
        cursor.execute("""
            SELECT line, odds_american, scraped_at
            FROM betting_odds
            WHERE sportsbook = ? AND event_name = ? AND market_type = ? AND selection = ?
            ORDER BY scraped_at ASC LIMIT 1
        """, (odds.sportsbook, odds.event_name, odds.market_type, odds.selection))
        opening = cursor.fetchone()

        if opening and odds.line is not None:
            open_line, open_odds, open_time = opening
            if open_line is not None:
                movement = abs(odds.line - open_line)
                if movement >= 0.5:  # Significant movement threshold
                    # Check for reverse line movement (sharp indicator)
                    cursor.execute("""
                        SELECT tickets_pct, money_pct
                        FROM betting_odds
                        WHERE event_name = ? AND market_type = ?
                        ORDER BY scraped_at DESC LIMIT 1
                    """, (odds.event_name, odds.market_type))
                    action = cursor.fetchone()

                    sharp = False
                    if action and action[0] and action[1]:
                        # Reverse line movement: line moves opposite to public tickets
                        if action[0] > 60 and odds.line < open_line:
                            sharp = True
                        elif action[0] < 40 and odds.line > open_line:
                            sharp = True

                    movements.append(LineMovement(
                        sport=odds.sport,
                        event_name=odds.event_name,
                        event_date=odds.event_date,
                        market_type=odds.market_type,
                        sportsbook=odds.sportsbook,
                        opening_line=open_line,
                        current_line=odds.line,
                        opening_odds=open_odds,
                        current_odds=odds.odds_american,
                        movement_size=movement,
                        movement_pct=(movement / abs(open_line)) * 100 if open_line != 0 else 0,
                        tickets_pct=action[0] if action else None,
                        money_pct=action[1] if action else None,
                        sharp_indicator=sharp,
                        timestamp=datetime.now().isoformat()
                    ))
    return movements
```
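The `BettingOdds.movement_direction` field ("steam", "reverse", "stable") is never populated above. A heuristic sketch that mirrors the reverse-line-movement test in `detect_line_movement`; the half-point and 60/40 thresholds are assumptions, not established constants:

```python
from typing import Optional


def classify_movement(open_line: float, current_line: float,
                      tickets_pct: Optional[float]) -> str:
    """Heuristic movement label:
    - "stable":  moved less than half a point
    - "reverse": line moved opposite to a lopsided public side (RLM)
    - "steam":   any other significant move
    """
    delta = current_line - open_line
    if abs(delta) < 0.5:
        return "stable"
    if tickets_pct is not None:
        # Same convention as detect_line_movement's sharp check
        if tickets_pct > 60 and delta < 0:
            return "reverse"
        if tickets_pct < 40 and delta > 0:
            return "reverse"
    return "steam"
```

Set `odds.movement_direction = classify_movement(open_line, odds.line, tickets_pct)` when building each `LineMovement`, so snapshots stored back in SQLite carry the label.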
Step 6: Real-Time Alerting
Send alerts to Slack or Discord when edges are detected:
```python
import os

SLACK_WEBHOOK = os.environ.get("SLACK_WEBHOOK_URL")
DISCORD_WEBHOOK = os.environ.get("DISCORD_WEBHOOK_URL")


def send_sports_alert(edges: str, movements: list, injuries: list):
    """Send sports intelligence alerts."""
    blocks = [{
        "type": "header",
        "text": {"type": "plain_text", "text": "📊 Sports Intelligence Alert"}
    }]

    # Sharp line movements
    sharp_moves = [m for m in movements if m.sharp_indicator]
    if sharp_moves:
        move_text = "\n".join([
            f"• 🔥 {m.event_name}: {m.market_type} moved {m.opening_line} → {m.current_line} "
            f"({m.sportsbook}): SHARP ACTION (public {m.tickets_pct:.0f}% vs money {m.money_pct:.0f}%)"
            for m in sharp_moves[:5]
        ])
        blocks.append({
            "type": "section",
            "text": {"type": "mrkdwn", "text": f"*🎯 Sharp Money Detected:*\n{move_text}"}
        })

    # High-impact injuries (statuses lowercased so "Out"/"OUT" also match)
    high_impact = [i for i in injuries if i.status.lower() in ("out", "doubtful")]
    if high_impact:
        injury_text = "\n".join([
            f"• 🏥 {i.player_name} ({i.team}): {i.status.upper()}, {i.injury_type}"
            for i in high_impact[:8]
        ])
        blocks.append({
            "type": "section",
            "text": {"type": "mrkdwn", "text": f"*🏥 Key Injuries:*\n{injury_text}"}
        })

    # AI edge analysis (truncated)
    if edges:
        summary = edges[:2000] + "..." if len(edges) > 2000 else edges
        blocks.append({
            "type": "section",
            "text": {"type": "mrkdwn", "text": f"*🤖 AI Edge Analysis:*\n{summary}"}
        })

    if SLACK_WEBHOOK:
        requests.post(SLACK_WEBHOOK, json={"blocks": blocks})

    # Also send to Discord for the sports community
    if DISCORD_WEBHOOK:
        discord_msg = "## 📊 Sports Intelligence Alert\n\n"
        if sharp_moves:
            discord_msg += "**Sharp Money:**\n" + "\n".join([
                f"- {m.event_name}: {m.opening_line} → {m.current_line} (SHARP)"
                for m in sharp_moves[:5]
            ]) + "\n\n"
        requests.post(DISCORD_WEBHOOK, json={"content": discord_msg[:2000]})
```
🚀 Start Building Your Sports Intelligence Agent
Track odds across sportsbooks, monitor injuries, and detect edges automatically. Free tier includes 100 API calls/month.
Get Your API Key →

Cost Comparison: Traditional vs. AI Agent
| Platform | Annual Cost | Coverage | Customization |
|---|---|---|---|
| Sportradar | $100K - $2M+ | Real-time feeds, all major leagues | API access, fixed schemas |
| Stats Perform (Opta) | $500K - $2M | Soccer, cricket, deep event data | API + widgets |
| SportsDataIO | $6K - $24K | US sports, odds, projections | REST API |
| Action Network Pro | $600 - $2,400 | Odds, sharp action, basic analytics | Dashboard only |
| AI Agent + Mantis | $348 - $3,588 | Any public source, real-time | Fully customizable, your models |
Sportradar and Stats Perform own the real-time, in-play data market, and that is genuinely hard to replicate. But for pre-game odds comparison, injury monitoring, line movement tracking, and historical stats, an AI agent checking public sources every few minutes delivers tremendous value at a fraction of the cost.
Use Cases by Segment
1. Sports Bettors & Syndicates
Line shop across 10+ sportsbooks simultaneously to always get the best number. Track line movements from open to close to identify sharp vs. public action. Monitor injury news and quantify expected line impact before books adjust. Build custom models using historical odds and results data that would cost $50K+ from a data provider.
2. Fantasy Sports Players (DFS & Season-Long)
Generate ownership projections by scraping DFS optimizer sites and forums. Track late-breaking injury news that affects player value minutes before lock. Build custom projection models using multi-source stat data. Monitor lineup percentages to find contrarian plays in GPP tournaments.
3. Sports Media & Content Creators
Automate data-driven content: "This week's biggest line movements" or "Injury report breakdown." Generate real-time odds comparison graphics for social media. Track betting market consensus to identify the games where sharps and the public disagree most, which make for great content hooks.
4. Sports Analytics Startups
Bootstrap your data layer without $1M+ Sportradar contracts. Build MVP products using scraped public data, then upgrade to official feeds once you have revenue. Focus your funding on model development and UX instead of data acquisition. Validate product-market fit before committing to expensive data partnerships.
Advanced: Multi-Book Arbitrage & Value Detection
Combine odds from multiple sportsbooks to find guaranteed profit opportunities:
```python
def find_arbitrage_opportunities(all_odds: list) -> list:
    """Find arbitrage opportunities across sportsbooks."""
    # Group odds by event and market
    markets = {}
    for odds in all_odds:
        key = f"{odds.event_name}|{odds.market_type}"
        markets.setdefault(key, []).append(odds)

    arb_opportunities = []
    for market_key, odds_list in markets.items():
        # For two-way markets (spread, moneyline, total):
        # find the best odds on each side across all books.
        sides = {}
        for odds in odds_list:
            # Naive side detection: first word of the selection ("Chiefs", "Over", ...)
            selection_side = odds.selection.split()[0] if odds.selection else "unknown"
            sides.setdefault(selection_side, []).append(odds)

        if len(sides) == 2:
            side_keys = list(sides.keys())
            best_a = max(sides[side_keys[0]], key=lambda x: x.odds_american)
            best_b = max(sides[side_keys[1]], key=lambda x: x.odds_american)

            # Convert to implied probability
            prob_a = american_to_implied(best_a.odds_american)
            prob_b = american_to_implied(best_b.odds_american)
            total_implied = prob_a + prob_b

            if total_implied < 1.0:  # Arbitrage exists!
                profit_pct = (1.0 / total_implied - 1.0) * 100
                # Stake each side in proportion to its own implied probability
                # so the payout is identical whichever side wins.
                arb_opportunities.append({
                    "market": market_key,
                    "profit_pct": profit_pct,
                    "side_a": {
                        "selection": best_a.selection,
                        "sportsbook": best_a.sportsbook,
                        "odds": best_a.odds_american,
                        "stake_pct": (prob_a / total_implied) * 100
                    },
                    "side_b": {
                        "selection": best_b.selection,
                        "sportsbook": best_b.sportsbook,
                        "odds": best_b.odds_american,
                        "stake_pct": (prob_b / total_implied) * 100
                    }
                })
    return sorted(arb_opportunities, key=lambda x: -x["profit_pct"])


def american_to_implied(odds: int) -> float:
    """Convert American odds to implied probability."""
    if odds > 0:
        return 100 / (odds + 100)
    return abs(odds) / (abs(odds) + 100)


def find_expected_value_bets(odds: list, model_probabilities: dict) -> list:
    """Compare model probabilities to market odds to find +EV bets."""
    ev_bets = []
    for o in odds:
        market_key = f"{o.event_name}|{o.selection}"
        if market_key in model_probabilities:
            model_prob = model_probabilities[market_key]
            implied_prob = american_to_implied(o.odds_american)
            edge = model_prob - implied_prob
            if edge > 0.03:  # 3%+ edge threshold
                # Kelly Criterion for optimal bet sizing
                decimal_odds = (o.odds_american / 100 + 1) if o.odds_american > 0 \
                    else (100 / abs(o.odds_american) + 1)
                kelly = (model_prob * decimal_odds - 1) / (decimal_odds - 1)
                ev_bets.append({
                    "selection": o.selection,
                    "event": o.event_name,
                    "sportsbook": o.sportsbook,
                    "odds": o.odds_american,
                    "model_probability": model_prob,
                    "implied_probability": implied_prob,
                    "edge": edge,
                    "kelly_fraction": kelly,
                    "half_kelly": kelly / 2  # Conservative sizing
                })
    return sorted(ev_bets, key=lambda x: -x["edge"])
```
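A quick sanity-check of the arbitrage math with invented numbers: if two books each post +105 on opposite sides of a total, the implied probabilities sum to less than 1 and a risk-free return exists.

```python
def american_to_implied(odds: int) -> float:
    """Same conversion used in the pipeline, repeated for self-containment."""
    if odds > 0:
        return 100 / (odds + 100)
    return abs(odds) / (abs(odds) + 100)


# Hypothetical disagreement: Over +105 at book A, Under +105 at book B.
prob_over = american_to_implied(105)   # 100/205 ≈ 0.488
prob_under = american_to_implied(105)
total = prob_over + prob_under          # ≈ 0.976, under 1.0: an arb exists
profit_pct = (1 / total - 1) * 100      # ≈ 2.5% locked-in return
stake_over = prob_over / total          # stake each side ∝ its implied probability
```

With equal odds on both sides the stake split is exactly 50/50; as one side's price improves, more of the bankroll shifts toward the other side to keep the payout equal either way.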
Compliance & Responsible Gambling
Sports data scraping carries unique legal and ethical considerations:
- Public odds data: Sportsbook odds displayed on public-facing websites are generally considered public information. However, some books explicitly prohibit scraping in their Terms of Service. Respect robots.txt and rate limits.
- Stats are facts: Sports statistics (scores, box scores, player stats) are factual data and generally not copyrightable. However, proprietary advanced metrics and compiled databases may have protections.
- Sportsbook ToS: Many sportsbooks prohibit automated odds scraping. Using scraped odds for personal analysis is different from redistributing them commercially. Be aware of the distinction.
- State regulations: Sports betting legality varies by state. Ensure your use complies with local gambling regulations. Some states restrict certain types of betting analysis tools.
- Responsible gambling: Any system that facilitates sports betting should include responsible gambling resources. Set loss limits, track ROI honestly, and never bet more than you can afford to lose.
- Rate limiting: Sportsbook websites handle heavy traffic but aggressive scraping can trigger IP bans. Use reasonable intervals (30+ seconds between requests to the same domain) and cache effectively.
- No insider information: Using non-public injury or team information for betting purposes may violate sports integrity regulations. Only use publicly available data.
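The 30-seconds-per-domain guidance above is easy to enforce mechanically. A small sketch (class name and design are my own; the clock and sleep functions are injectable so the behavior is testable without real waiting):

```python
import time
from urllib.parse import urlparse


class DomainThrottle:
    """Enforce a minimum interval between requests to the same domain."""

    def __init__(self, min_interval: float = 30.0,
                 clock=time.monotonic, sleep=time.sleep):
        self.min_interval = min_interval
        self._clock = clock
        self._sleep = sleep
        self._last: dict[str, float] = {}

    def wait(self, url: str) -> float:
        """Block until the domain is safe to hit again; return seconds waited."""
        domain = urlparse(url).netloc
        now = self._clock()
        waited = 0.0
        last = self._last.get(domain)
        if last is not None:
            remaining = self.min_interval - (now - last)
            if remaining > 0:
                self._sleep(remaining)
                waited = remaining
        self._last[domain] = self._clock()
        return waited
```

Call `throttle.wait(url)` immediately before each `requests.post` to the scrape endpoint; different domains proceed without delay, repeat hits to the same domain are spaced out.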
Getting Started
Ready to build your sports intelligence system? Here's the quick start:
- Get a Mantis API key at mantisapi.com: the free tier includes 100 API calls/month
- Start with one sport: pick NFL, NBA, or MLB and scrape odds from 3-4 sportsbooks
- Add injury monitoring: scrape ESPN and CBS Sports injury reports every 30 minutes
- Track line movements: store odds snapshots and detect significant movements
- Layer in AI analysis: GPT-4o finds edges, generates projections, and identifies sharp action
- Scale across sports: once your pipeline works for one sport, adding others is straightforward
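These cadences can be wired up with plain cron; a sketch of a possible schedule, with placeholder paths and script names standing in for your own entry points:

```shell
# Illustrative crontab; all paths and script names are placeholders.
# Odds snapshots every 5 minutes:
*/5 * * * * /usr/bin/python3 /opt/sports/scrape_odds.py
# Injury reports every 30 minutes:
*/30 * * * * /usr/bin/python3 /opt/sports/scrape_injuries.py
# Full player-stats refresh every 6 hours:
0 */6 * * * /usr/bin/python3 /opt/sports/scrape_stats.py
```

Keep the odds cadence within the free tier's call budget, or tighten it only for the hours before game time when lines actually move.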
🚀 Build Your Sports Intelligence Agent
Track odds, injuries, and line movements across every major sport. Free tier includes 100 API calls/month.
Get Your API Key →

Further Reading
- The Complete Guide to Web Scraping with AI Agents in 2026
- Web Scraping for Price Monitoring: Build an AI-Powered Price Tracker
- Web Scraping for Market Research: Analyze Competitors, Trends & Opportunities
- Structured Data Extraction with AI: Clean Data from Any Page
- Web Scraping for Financial Data: Track Markets, SEC Filings & Earnings
- Web Scraping for Media & Entertainment: Track Content, Streaming & Ad Data