Web Scraping for Insurance & InsurTech: How AI Agents Track Premiums, Claims & Risk Data in 2026
The global insurance industry generates over $6 trillion in annual premiums, making it one of the largest financial sectors on the planet. Property & casualty alone accounts for $2.5 trillion, with commercial lines growing 8-12% annually as climate risk, cyber threats, and regulatory complexity increase.
Yet the insurance industry remains one of the most data-dependent, and most data-expensive, sectors. Verisk Analytics ($2.5B revenue), LexisNexis Risk Solutions, and AM Best charge $5,000–$30,000/month for the actuarial data, loss ratios, and competitive intelligence that carriers need to price risk accurately.
What if an AI agent could monitor competitor rate filings, track catastrophe events, scrape claims data, and analyze market hardening trends, all automatically, for a fraction of the cost?
In this guide, you'll build an AI-powered insurance intelligence system that scrapes premium rates, regulatory filings, catastrophe data, and competitor products, then uses GPT-4o to generate underwriting insights and Slack alerts.
Why AI Agents Are Transforming Insurance Data
Insurance data has unique characteristics that make it ideal for AI agent automation:
- Regulatory transparency: State insurance departments publish rate filings, financial statements, and market conduct reports; all public data, but scattered across 50+ state portals.
- Catastrophe sensitivity: A single hurricane, wildfire, or cyber breach can shift an entire market segment. Real-time monitoring of weather events, claims reports, and loss estimates is critical.
- Competitive density: The US alone has 5,900+ insurance companies. Tracking competitor rate changes, new product launches, and market exits requires massive scale.
- Market cycles: Insurance markets alternate between "hard" (rising rates, tighter capacity) and "soft" (falling rates, excess capacity) cycles. Detecting inflection points early creates significant competitive advantage.
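The cycle position in that last bullet can be estimated directly from filing data. A minimal sketch of a market-temperature heuristic (the thresholds here are illustrative assumptions, not actuarial standards):

```python
def market_temperature(rate_changes: list[float]) -> dict:
    """Summarize a batch of filed rate changes (in percent).

    A high share of increases plus a high average change suggests a
    hardening market; the opposite suggests softening.
    """
    if not rate_changes:
        return {"avg_change": 0.0, "increase_share": 0.0, "signal": "no_data"}
    avg = sum(rate_changes) / len(rate_changes)
    inc_share = sum(1 for r in rate_changes if r > 0) / len(rate_changes)
    if avg > 5 and inc_share > 0.7:      # broad, sizable increases
        signal = "hardening"
    elif avg < 0 and inc_share < 0.4:    # mostly decreases
        signal = "softening"
    else:
        signal = "stable"
    return {"avg_change": round(avg, 2),
            "increase_share": round(inc_share, 2),
            "signal": signal}
```

Even this crude summary, run per line of business, gives the anomaly detector in Step 7 something to compare against.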
Architecture: The 6-Step Insurance Intelligence Pipeline
Here's the complete system architecture:
- Source Discovery – Identify state DOI portals, NAIC databases, AM Best, carrier websites, and catastrophe data sources
- AI-Powered Extraction – Use the Mantis WebPerception API to scrape and structure insurance data from complex regulatory portals
- SQLite Storage – Store historical rate filings, financial data, and catastrophe events locally
- Change Detection – Flag rate changes >5%, new filings, catastrophe events, and competitor product launches
- GPT-4o Analysis – AI interprets market conditions, predicts impact, and recommends underwriting actions
- Slack/Email Alerts – Real-time notifications for underwriters, actuaries, and product managers
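The six steps above compose into a simple run loop. A minimal orchestration sketch, with placeholder lambdas standing in for the real scrapers built later in this guide (`run_pipeline` and the step names are hypothetical, not part of any API):

```python
from typing import Callable

def run_pipeline(steps: list[tuple[str, Callable[[dict], dict]]]) -> dict:
    """Run pipeline steps in order, threading a shared context dict through.

    Each step receives the context and returns new keys to merge in; a
    failing step is recorded but does not abort the remaining steps.
    """
    context: dict = {"errors": []}
    for name, step in steps:
        try:
            context.update(step(context) or {})
        except Exception as exc:
            context["errors"].append(f"{name}: {exc}")
    return context

# Placeholder steps; in the full system these call the scrapers built below.
pipeline = [
    ("discover", lambda ctx: {"sources": ["CA_DOI", "FL_DOI"]}),
    ("extract", lambda ctx: {"filings": [
        {"state": s, "rate_change_pct": 9.5} for s in ctx["sources"]]}),
    ("detect", lambda ctx: {"alerts": [
        f for f in ctx["filings"] if f["rate_change_pct"] > 5]}),
]
```

Isolating failures per step matters here: a single unreachable DOI portal shouldn't take down catastrophe monitoring for the day.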
Step 1: Define Your Insurance Data Models
First, create Pydantic schemas for structured insurance data extraction:
from pydantic import BaseModel
from typing import Optional, List
from datetime import datetime
from enum import Enum
class InsuranceLine(str, Enum):
PERSONAL_AUTO = "personal_auto"
HOMEOWNERS = "homeowners"
COMMERCIAL_PROPERTY = "commercial_property"
GENERAL_LIABILITY = "general_liability"
WORKERS_COMP = "workers_comp"
CYBER = "cyber"
PROFESSIONAL_LIABILITY = "professional_liability"
D_AND_O = "d_and_o"
COMMERCIAL_AUTO = "commercial_auto"
UMBRELLA = "umbrella"
class RateFiling(BaseModel):
"""State DOI rate filing data from SERFF or state portals."""
state: str # Two-letter state code
company_name: str
naic_code: Optional[str] = None
line_of_business: str
filing_type: str # "rate", "rule", "form", "rate_and_rule"
serff_tracking: Optional[str] = None
rate_change_pct: Optional[float] = None # Overall rate change requested
effective_date: Optional[str] = None
status: str # "pending", "approved", "disapproved", "withdrawn"
premium_impact: Optional[str] = None # Dollar impact estimate
filing_date: str
disposition_date: Optional[str] = None
url: Optional[str] = None
class CarrierFinancial(BaseModel):
"""Insurance company financial data from NAIC or AM Best."""
company_name: str
naic_code: str
am_best_rating: Optional[str] = None # A++, A+, A, A-, B++, etc.
direct_written_premium: Optional[float] = None
net_written_premium: Optional[float] = None
loss_ratio: Optional[float] = None # Losses / earned premium
combined_ratio: Optional[float] = None # Loss ratio + expense ratio
surplus: Optional[float] = None
year: int
line_of_business: Optional[str] = None
class CatastropheEvent(BaseModel):
"""Natural catastrophe and large-loss event data."""
event_name: str
event_type: str # "hurricane", "wildfire", "tornado", "flood", "cyber", "earthquake"
date: str
location: str # State/region affected
estimated_insured_loss: Optional[float] = None # In dollars
estimated_economic_loss: Optional[float] = None
pcs_number: Optional[str] = None # Verisk PCS catastrophe number
affected_lines: List[str] = [] # Lines of business impacted
status: str # "developing", "estimated", "final"
source: str
class CompetitorProduct(BaseModel):
"""Competitor insurance product and pricing intelligence."""
company_name: str
product_name: str
line_of_business: str
target_market: str # "small_commercial", "middle_market", "personal", etc.
key_features: List[str] = []
coverage_highlights: Optional[str] = None
pricing_model: Optional[str] = None # "usage_based", "parametric", "traditional"
distribution: Optional[str] = None # "direct", "agent", "broker", "embedded"
launch_date: Optional[str] = None
states_available: List[str] = []
url: str
Step 2: Scrape State DOI Rate Filings
State insurance departments publish all rate filings; this is the most valuable competitive intelligence in insurance:
import requests
import json
import sqlite3
from datetime import datetime
MANTIS_API_KEY = "your-mantis-api-key"
BASE_URL = "https://api.mantisapi.com/v1"
def scrape_rate_filings(state: str, doi_url: str) -> list[RateFiling]:
"""Scrape rate filings from a state DOI portal or SERFF."""
# Step 1: Capture the filing search results
response = requests.post(
f"{BASE_URL}/scrape",
headers={"Authorization": f"Bearer {MANTIS_API_KEY}"},
json={
"url": doi_url,
"render_js": True,
"wait_for": "table, .filing-results, .search-results",
"timeout": 30000
}
)
page_data = response.json()
# Step 2: AI-powered extraction of filing data
extraction = requests.post(
f"{BASE_URL}/extract",
headers={"Authorization": f"Bearer {MANTIS_API_KEY}"},
json={
"content": page_data["content"],
"schema": RateFiling.model_json_schema(),
"prompt": f"""Extract all insurance rate filings from this {state} DOI portal.
For each filing, capture:
- Company name and NAIC code
- Line of business (auto, homeowners, commercial, cyber, etc.)
- Filing type (rate, rule, form)
- Requested rate change percentage (look for +X% or -X%)
- Effective date and filing status
- SERFF tracking number if available
Return as a list of filing records. Pay special attention to
rate change percentages – these are the most important data points.""",
"multiple": True
}
)
filings = [RateFiling(**{**f, "state": state}) for f in extraction.json()["data"]]  # pin the known state code, even if extraction also returned one
return filings
# Key state DOI portals – focus on largest premium states first
state_doi_portals = {
"CA": "https://interactive.web.insurance.ca.gov/apex_extprd/f?p=142:1",
"FL": "https://apps.fldfs.com/SRDW/Search/RateFilingSearch.aspx",
"TX": "https://filingaccess.serff.com/sfa/search/filingSummary.xhtml?state=TX",
"NY": "https://myportal.dfs.ny.gov/web/guest/rate-applications",
"PA": "https://filingaccess.serff.com/sfa/search/filingSummary.xhtml?state=PA",
"IL": "https://filingaccess.serff.com/sfa/search/filingSummary.xhtml?state=IL",
"OH": "https://filingaccess.serff.com/sfa/search/filingSummary.xhtml?state=OH",
"NJ": "https://filingaccess.serff.com/sfa/search/filingSummary.xhtml?state=NJ",
"GA": "https://filingaccess.serff.com/sfa/search/filingSummary.xhtml?state=GA",
"NC": "https://filingaccess.serff.com/sfa/search/filingSummary.xhtml?state=NC",
}
all_filings = []
for state, url in state_doi_portals.items():
try:
filings = scrape_rate_filings(state, url)
all_filings.extend(filings)
print(f"✅ {state}: {len(filings)} rate filings captured")
except Exception as e:
print(f"❌ {state}: {e}")
Step 3: Track Carrier Financial Health
Monitor carrier financial strength, loss ratios, and combined ratios from NAIC and AM Best:
def scrape_carrier_financials() -> list[CarrierFinancial]:
"""Scrape carrier financial data from NAIC and AM Best."""
financial_sources = [
{
"url": "https://content.naic.org/cipr-topics/insurance-industry-financial-results",
"prompt": """Extract insurance industry financial results including:
- Direct written premium by line of business
- Loss ratios and combined ratios by line
- Year-over-year premium growth rates
- Policyholder surplus trends
Focus on the most recent year and prior year for comparison."""
},
{
"url": "https://web.ambest.com/ratings-services/best-ratings",
"prompt": """Extract AM Best rating actions including:
- Company name and NAIC code
- Current AM Best rating (A++, A+, A, etc.)
- Rating outlook (stable, positive, negative)
- Any recent upgrades or downgrades
- Financial strength rating rationale"""
}
]
all_financials = []
for source in financial_sources:
response = requests.post(
f"{BASE_URL}/scrape",
headers={"Authorization": f"Bearer {MANTIS_API_KEY}"},
json={"url": source["url"], "render_js": True}
)
extraction = requests.post(
f"{BASE_URL}/extract",
headers={"Authorization": f"Bearer {MANTIS_API_KEY}"},
json={
"content": response.json()["content"],
"schema": CarrierFinancial.model_json_schema(),
"prompt": source["prompt"],
"multiple": True
}
)
financials = [CarrierFinancial(**f) for f in extraction.json()["data"]]
all_financials.extend(financials)
return all_financials
Step 4: Monitor Catastrophe Events & Loss Estimates
Track natural catastrophes, cyber events, and large losses in real-time:
def scrape_catastrophe_events() -> list[CatastropheEvent]:
"""Scrape catastrophe events from insurance industry sources."""
cat_sources = [
{
"url": "https://www.iii.org/fact-statistic/facts-statistics-catastrophes",
"prompt": """Extract recent catastrophe events with insured loss estimates.
For each event, capture: name, type (hurricane, wildfire, tornado, flood),
date, location, insured loss estimate, economic loss estimate,
and which insurance lines are affected."""
},
{
"url": "https://www.ncei.noaa.gov/access/billions/",
"prompt": """Extract billion-dollar weather and climate disaster events.
For each, capture: event name, type, date range, states affected,
total cost estimate, and deaths. Focus on events from the past 12 months."""
},
{
"url": "https://www.artemis.bm/news/",
"prompt": """Extract recent catastrophe loss estimates and reinsurance-relevant events.
Focus on: named storms, wildfires, earthquakes, flooding events,
and any industry loss warranty (ILW) trigger events.
Include estimated insured losses and affected reinsurance layers."""
}
]
all_events = []
for source in cat_sources:
response = requests.post(
f"{BASE_URL}/scrape",
headers={"Authorization": f"Bearer {MANTIS_API_KEY}"},
json={"url": source["url"], "render_js": True}
)
extraction = requests.post(
f"{BASE_URL}/extract",
headers={"Authorization": f"Bearer {MANTIS_API_KEY}"},
json={
"content": response.json()["content"],
"schema": CatastropheEvent.model_json_schema(),
"prompt": source["prompt"],
"multiple": True
}
)
events = [CatastropheEvent(**e) for e in extraction.json()["data"]]
all_events.extend(events)
return all_events
Step 5: Track Competitor Products & InsurTech Launches
Monitor competitor product launches, InsurTech funding, and new market entrants:
def scrape_competitor_products() -> list[CompetitorProduct]:
"""Scrape competitor insurance products and InsurTech launches."""
product_sources = [
{
"url": "https://www.insurancejournal.com/news/national/",
"prompt": """Extract new insurance product announcements including:
- Company name and product name
- Line of business and target market
- Key features and coverage innovations
- Distribution model (direct, agent, embedded)
- States where available
Focus on product launches from the past 30 days."""
},
{
"url": "https://www.insurtech.com/news/",
"prompt": """Extract InsurTech company news including:
- New product launches and partnerships
- Funding rounds and valuations
- Technology innovations (AI underwriting, parametric, usage-based)
- Market expansion announcements"""
},
{
"url": "https://www.carriermanagement.com/news/",
"prompt": """Extract carrier management news including:
- New program launches and appetite changes
- Market entry/exit announcements
- Leadership changes at major carriers
- M&A activity in the insurance space"""
}
]
all_products = []
for source in product_sources:
response = requests.post(
f"{BASE_URL}/scrape",
headers={"Authorization": f"Bearer {MANTIS_API_KEY}"},
json={"url": source["url"], "render_js": True}
)
extraction = requests.post(
f"{BASE_URL}/extract",
headers={"Authorization": f"Bearer {MANTIS_API_KEY}"},
json={
"content": response.json()["content"],
"schema": CompetitorProduct.model_json_schema(),
"prompt": source["prompt"],
"multiple": True
}
)
products = [CompetitorProduct(**p) for p in extraction.json()["data"]]
all_products.extend(products)
return all_products
Step 6: Store Everything in SQLite
Create a local database for historical tracking and trend analysis:
def init_insurance_db():
"""Initialize SQLite database for insurance intelligence."""
conn = sqlite3.connect("insurance_intel.db")
c = conn.cursor()
c.execute("""CREATE TABLE IF NOT EXISTS rate_filings (
id INTEGER PRIMARY KEY AUTOINCREMENT,
state TEXT, company_name TEXT, naic_code TEXT,
line_of_business TEXT, filing_type TEXT,
serff_tracking TEXT UNIQUE,
rate_change_pct REAL, effective_date TEXT,
status TEXT, premium_impact TEXT,
filing_date TEXT, disposition_date TEXT, url TEXT,
scraped_at TEXT DEFAULT CURRENT_TIMESTAMP
)""")
c.execute("""CREATE TABLE IF NOT EXISTS carrier_financials (
id INTEGER PRIMARY KEY AUTOINCREMENT,
company_name TEXT, naic_code TEXT,
am_best_rating TEXT,
direct_written_premium REAL, net_written_premium REAL,
loss_ratio REAL, combined_ratio REAL,
surplus REAL, year INTEGER,
line_of_business TEXT,
scraped_at TEXT DEFAULT CURRENT_TIMESTAMP,
UNIQUE(naic_code, year, line_of_business)
)""")
c.execute("""CREATE TABLE IF NOT EXISTS catastrophe_events (
id INTEGER PRIMARY KEY AUTOINCREMENT,
event_name TEXT, event_type TEXT,
date TEXT, location TEXT,
estimated_insured_loss REAL,
estimated_economic_loss REAL,
pcs_number TEXT UNIQUE,
affected_lines TEXT, status TEXT, source TEXT,
scraped_at TEXT DEFAULT CURRENT_TIMESTAMP
)""")
c.execute("""CREATE TABLE IF NOT EXISTS competitor_products (
id INTEGER PRIMARY KEY AUTOINCREMENT,
company_name TEXT, product_name TEXT,
line_of_business TEXT, target_market TEXT,
key_features TEXT, coverage_highlights TEXT,
pricing_model TEXT, distribution TEXT,
launch_date TEXT, states_available TEXT, url TEXT UNIQUE,
scraped_at TEXT DEFAULT CURRENT_TIMESTAMP
)""")
conn.commit()
return conn
def store_filings(conn, filings: list[RateFiling]):
"""Store rate filings with deduplication."""
c = conn.cursor()
new_count = 0
for f in filings:
try:
c.execute("""INSERT OR IGNORE INTO rate_filings
(state, company_name, naic_code, line_of_business,
filing_type, serff_tracking, rate_change_pct,
effective_date, status, premium_impact,
filing_date, disposition_date, url)
VALUES (?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?)""",
(f.state, f.company_name, f.naic_code, f.line_of_business,
f.filing_type, f.serff_tracking, f.rate_change_pct,
f.effective_date, f.status, f.premium_impact,
f.filing_date, f.disposition_date, f.url))
if c.rowcount > 0:
new_count += 1
except sqlite3.IntegrityError:
pass
conn.commit()
return new_count
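The UNIQUE constraint on `serff_tracking` is what makes re-running the scraper safe. A self-contained demonstration of the dedup behavior, using an in-memory database and simplified columns:

```python
import sqlite3

def demo_dedup() -> int:
    """Insert the same SERFF tracking number twice; UNIQUE keeps one row."""
    conn = sqlite3.connect(":memory:")
    conn.execute("""CREATE TABLE rate_filings (
        serff_tracking TEXT UNIQUE, company_name TEXT, rate_change_pct REAL)""")
    filing = ("ABCD-123456789", "Example Mutual", 12.5)
    for _ in range(2):  # simulate two scraper runs capturing the same filing
        conn.execute("INSERT OR IGNORE INTO rate_filings VALUES (?, ?, ?)", filing)
    total = conn.execute("SELECT COUNT(*) FROM rate_filings").fetchone()[0]
    conn.close()
    return total
```

One caveat: SQLite's UNIQUE allows multiple NULLs, so filings without a SERFF tracking number are never deduplicated by this constraint alone.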
Step 7: Anomaly Detection & AI Analysis
Detect significant rate changes, catastrophe developments, and market shifts, then use GPT-4o to generate underwriting insights:
from openai import OpenAI
client = OpenAI()
def detect_insurance_anomalies(conn) -> list[dict]:
"""Detect anomalies in insurance data."""
c = conn.cursor()
anomalies = []
# 1. Large rate increases (>10%)
c.execute("""
SELECT state, company_name, line_of_business,
rate_change_pct, filing_date, status
FROM rate_filings
WHERE scraped_at > datetime('now', '-24 hours')
AND rate_change_pct > 10
ORDER BY rate_change_pct DESC
""")
for row in c.fetchall():
anomalies.append({
"type": "LARGE_RATE_INCREASE",
"severity": "critical" if row[3] > 25 else "high",
"state": row[0],
"company": row[1],
"line": row[2],
"rate_change_pct": row[3],
"filing_date": row[4],
"status": row[5]
})
# 2. Rate decreases (potential market softening signal)
c.execute("""
SELECT state, company_name, line_of_business,
rate_change_pct, filing_date
FROM rate_filings
WHERE scraped_at > datetime('now', '-24 hours')
AND rate_change_pct < -5
ORDER BY rate_change_pct ASC
""")
for row in c.fetchall():
anomalies.append({
"type": "RATE_DECREASE",
"severity": "important",
"state": row[0],
"company": row[1],
"line": row[2],
"rate_change_pct": row[3],
"filing_date": row[4]
})
# 3. New catastrophe events or updated loss estimates
c.execute("""
SELECT event_name, event_type, location,
estimated_insured_loss, status
FROM catastrophe_events
WHERE scraped_at > datetime('now', '-24 hours')
AND (status = 'developing' OR estimated_insured_loss > 1000000000)
""")
for row in c.fetchall():
anomalies.append({
"type": "CATASTROPHE_EVENT",
"severity": "critical" if (row[3] or 0) > 5e9 else "high",
"event_name": row[0],
"event_type": row[1],
"location": row[2],
"insured_loss": row[3],
"status": row[4]
})
# 4. Carriers running poor combined ratios (>110%) – a downgrade risk signal
c.execute("""
SELECT company_name, am_best_rating, combined_ratio
FROM carrier_financials
WHERE scraped_at > datetime('now', '-7 days')
AND combined_ratio > 110
""")
for row in c.fetchall():
anomalies.append({
"type": "POOR_COMBINED_RATIO",
"severity": "important",
"company": row[0],
"am_best_rating": row[1],
"combined_ratio": row[2]
})
return anomalies
def analyze_insurance_market(anomalies: list[dict], filings: list, events: list) -> str:
"""Use GPT-4o to generate strategic insurance market analysis."""
market_context = {
"anomalies": anomalies,
"filing_summary": {
"total_filings": len(filings),
"avg_rate_change": sum(f.rate_change_pct or 0 for f in filings) / len(filings) if filings else 0,
"max_increase": max((f.rate_change_pct or 0 for f in filings), default=0),
"max_decrease": min((f.rate_change_pct or 0 for f in filings), default=0),
"lines_with_increases": list(set(
f.line_of_business for f in filings if (f.rate_change_pct or 0) > 5
)),
"states_with_hardening": list(set(
f.state for f in filings if (f.rate_change_pct or 0) > 10
))
},
"catastrophe_summary": {
"active_events": len([e for e in events if e.status == "developing"]),
"total_insured_losses": sum(e.estimated_insured_loss or 0 for e in events),
"event_types": list(set(e.event_type for e in events))
}
}
response = client.chat.completions.create(
model="gpt-4o",
messages=[{
"role": "system",
"content": """You are an insurance market analyst AI. Analyze the following
market data and provide:
1. MARKET STATUS: Hard or soft market? Direction of rates by line of business.
2. RATE TRENDS: Which lines are hardening/softening? Which states show the most movement?
3. CATASTROPHE IMPACT: How are recent events affecting pricing and capacity?
4. COMPETITOR SIGNALS: What are rate filings telling us about competitor strategy?
5. UNDERWRITING RECOMMENDATIONS: Where should we grow? Where should we pull back?
6. REINSURANCE IMPLICATIONS: How might cat losses affect treaty renewals?
Be specific with numbers. Distinguish between personal and commercial lines.
Flag anything requiring immediate underwriting action."""
}, {
"role": "user",
"content": f"Insurance market data:\n{json.dumps(market_context, indent=2, default=str)}"
}],
temperature=0.3
)
return response.choices[0].message.content
Step 8: Real-Time Alerts via Slack
Send structured alerts to underwriters, actuaries, and product managers:
SLACK_WEBHOOK = "https://hooks.slack.com/services/YOUR/WEBHOOK/URL"  # your Slack incoming webhook
def send_insurance_alert(anomalies: list[dict], analysis: str):
"""Send insurance market alerts via Slack."""
import requests as req
severity_emoji = {
"critical": "🔴",
"high": "🟠",
"important": "🟡",
"minor": "🔵"
}
anomaly_text = ""
for a in sorted(anomalies, key=lambda x: {"critical": 0, "high": 1, "important": 2, "minor": 3}[x["severity"]]):
emoji = severity_emoji[a["severity"]]
if a["type"] == "LARGE_RATE_INCREASE":
anomaly_text += f"{emoji} *RATE INCREASE* – {a['company']} ({a['state']}): "
anomaly_text += f"+{a['rate_change_pct']:.1f}% on {a['line']} ({a['status']})\n"
elif a["type"] == "RATE_DECREASE":
anomaly_text += f"{emoji} *RATE DECREASE* – {a['company']} ({a['state']}): "
anomaly_text += f"{a['rate_change_pct']:.1f}% on {a['line']}\n"
elif a["type"] == "CATASTROPHE_EVENT":
loss_str = f"${a['insured_loss']/1e9:.1f}B" if a['insured_loss'] else "TBD"
anomaly_text += f"{emoji} *CAT EVENT* – {a['event_name']}: "
anomaly_text += f"{a['event_type']} in {a['location']} (est. loss: {loss_str})\n"
elif a["type"] == "POOR_COMBINED_RATIO":
anomaly_text += f"{emoji} *HIGH COMBINED RATIO* – {a['company']}: "
anomaly_text += f"{a['combined_ratio']:.1f}% (rated {a['am_best_rating']})\n"
message = f"""🛡️ *Insurance Market Intelligence Report*
━━━━━━━━━━━━━━━━━━━━━━━━━
*Anomalies Detected: {len(anomalies)}*
{anomaly_text}
━━━━━━━━━━━━━━━━━━━━━━━━━
*AI Analysis:*
{analysis}
_Powered by Mantis WebPerception API – monitoring 50 state DOIs + industry sources_"""
req.post(SLACK_WEBHOOK, json={"text": message})
Step 9: Automated Scheduling
Run the full pipeline on a schedule: twice daily for rate filings, every two hours for catastrophe events, and weekly for competitor scans:
import schedule
import time
def rate_filing_check():
"""Run daily – scrape new rate filings from state DOIs."""
conn = init_insurance_db()
for state, url in state_doi_portals.items():
try:
filings = scrape_rate_filings(state, url)
new = store_filings(conn, filings)
if new > 0:
print(f"📋 {state}: {new} new rate filings")
except Exception as e:
print(f"Error scraping {state}: {e}")
# Detect anomalies and alert
anomalies = detect_insurance_anomalies(conn)
if anomalies:
events = scrape_catastrophe_events()
all_filings = [] # Retrieve from DB for context
analysis = analyze_insurance_market(anomalies, all_filings, events)
send_insurance_alert(anomalies, analysis)
conn.close()
def catastrophe_check():
"""Run every 2 hours – monitor active cat events."""
conn = init_insurance_db()
events = scrape_catastrophe_events()
for e in events:
try:
conn.execute("""INSERT OR REPLACE INTO catastrophe_events
(event_name, event_type, date, location,
estimated_insured_loss, estimated_economic_loss,
pcs_number, affected_lines, status, source)
VALUES (?, ?, ?, ?, ?, ?, ?, ?, ?, ?)""",
(e.event_name, e.event_type, e.date, e.location,
e.estimated_insured_loss, e.estimated_economic_loss,
e.pcs_number, json.dumps(e.affected_lines), e.status, e.source))
except sqlite3.IntegrityError:
pass
conn.commit()
# Alert on developing events with large loss estimates
developing = [e for e in events if e.status == "developing"]
if developing:
print(f"🌪️ {len(developing)} developing catastrophe events")
conn.close()
def competitor_scan():
"""Run weekly – scan for new competitor products and InsurTech launches."""
conn = init_insurance_db()
products = scrape_competitor_products()
new_products = 0
for p in products:
try:
conn.execute("""INSERT OR IGNORE INTO competitor_products
(company_name, product_name, line_of_business,
target_market, key_features, coverage_highlights,
pricing_model, distribution, launch_date,
states_available, url)
VALUES (?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?)""",
(p.company_name, p.product_name, p.line_of_business,
p.target_market, json.dumps(p.key_features),
p.coverage_highlights, p.pricing_model, p.distribution,
p.launch_date, json.dumps(p.states_available), p.url))
if conn.execute("SELECT changes()").fetchone()[0] > 0:
new_products += 1
except sqlite3.IntegrityError:
pass
conn.commit()
if new_products > 0:
print(f"🆕 {new_products} new competitor products detected")
conn.close()
# Schedule the pipeline
schedule.every().day.at("07:00").do(rate_filing_check)
schedule.every().day.at("19:00").do(rate_filing_check)
schedule.every(2).hours.do(catastrophe_check)
schedule.every().monday.at("09:00").do(competitor_scan)
print("🛡️ Insurance intelligence pipeline running...")
while True:
schedule.run_pending()
time.sleep(60)
Cost Comparison: Traditional vs. AI Agent Approach
| Platform | Monthly Cost | Data Coverage | Real-Time | AI Analysis |
|---|---|---|---|---|
| Verisk / ISO | $5,000–$30,000 | Comprehensive actuarial | Daily | Basic |
| LexisNexis Risk | $3,000–$20,000 | Claims + risk scoring | Yes | Rules-based |
| AM Best | $2,000–$10,000 | Ratings + financials | Weekly | Manual reports |
| Guidewire / Duck Creek | $10,000–$50,000 | Policy admin + data | Yes | Platform-dependent |
| AI Agent + Mantis | $29–$299 | Customizable | Yes (2-hr cat) | GPT-4o powered |
Use Cases: Who Benefits?
1. Carriers & Underwriters
Monitor competitor rate filings across all 50 states to inform pricing strategy. When State Farm files for a 15% homeowners increase in Florida, your AI agent detects it within hours, not weeks. Track combined ratios across your competitive set to identify carriers under pressure who may exit markets, creating growth opportunities.
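The watchlist idea above boils down to a single query against the `rate_filings` table from Step 6. A sketch (carrier names here are placeholders):

```python
import sqlite3

def competitor_filings(conn: sqlite3.Connection,
                       competitors: list[str],
                       min_change: float = 5.0) -> list[tuple]:
    """Return filings by watched competitors at or above a rate-change floor."""
    placeholders = ",".join("?" for _ in competitors)
    return conn.execute(f"""
        SELECT state, company_name, line_of_business, rate_change_pct
        FROM rate_filings
        WHERE company_name IN ({placeholders})
          AND rate_change_pct >= ?
        ORDER BY rate_change_pct DESC""",
        (*competitors, min_change)).fetchall()
```

Building the `IN (...)` clause from bound parameters, rather than string-formatting the names, keeps the query safe even when company names come from scraped data.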
2. Managing General Agents (MGAs)
MGAs need to stay ahead of market capacity shifts. An AI agent monitoring catastrophe events, reinsurance market signals, and carrier financial health can predict which carriers might restrict appetite, giving you time to secure alternative capacity. Track rate adequacy across your programs by comparing your pricing to state filing trends.
3. InsurTech Startups
InsurTech companies building embedded insurance, parametric products, or usage-based models need competitive intelligence at startup speed. An AI agent scrapes competitor product launches, funding announcements, and distribution partnerships, building a real-time competitive landscape that would cost $50K+ from a consulting firm.
4. Reinsurance Brokers
Track catastrophe loss development in real-time to advise clients on treaty structure and pricing. Monitor primary carrier rate filings to forecast cedant premium growth and loss trends. An AI agent aggregating data from PCS, NOAA, Artemis, and state DOIs provides the intelligence layer that supports treaty placement discussions.
Advanced: Market Cycle Detection
Use historical rate filing data to detect hard/soft market inflection points:
def detect_market_cycle(conn, line_of_business: str) -> dict:
"""Analyze rate filing trends to detect market cycle position."""
c = conn.cursor()
# Get quarterly average rate changes for the past 2 years
c.execute("""
SELECT
strftime('%Y-Q' || ((CAST(strftime('%m', filing_date) AS INTEGER)-1)/3 + 1), filing_date) as quarter,
AVG(rate_change_pct) as avg_change,
COUNT(*) as filing_count,
SUM(CASE WHEN rate_change_pct > 0 THEN 1 ELSE 0 END) as increases,
SUM(CASE WHEN rate_change_pct < 0 THEN 1 ELSE 0 END) as decreases
FROM rate_filings
WHERE line_of_business = ?
AND filing_date > date('now', '-2 years')
AND rate_change_pct IS NOT NULL
GROUP BY quarter
ORDER BY quarter
""", (line_of_business,))
quarterly_data = c.fetchall()
# Use GPT-4o to interpret the cycle
analysis = client.chat.completions.create(
model="gpt-4o",
messages=[{
"role": "system",
"content": """You are an insurance market cycle analyst. Based on the
quarterly rate filing trends, determine:
1. Current market position: HARD, HARDENING, STABLE, SOFTENING, or SOFT
2. Direction: rates accelerating, decelerating, or flat
3. Inflection signals: any signs of cycle turning
4. Forecast: expected rate direction for next 2 quarters
5. Strategic recommendation: grow, hold, or contract in this line"""
}, {
"role": "user",
"content": f"""Rate filing trends for {line_of_business}:
{json.dumps([{
'quarter': q[0], 'avg_change': q[1],
'filings': q[2], 'increases': q[3], 'decreases': q[4]
} for q in quarterly_data], indent=2)}"""
}]
)
return {
"line": line_of_business,
"quarterly_data": quarterly_data,
"analysis": analysis.choices[0].message.content
}
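Before spending an LLM call, a cheap heuristic can pre-flag inflection points: a sign change in the quarter-over-quarter direction of average rate changes. A sketch in plain Python (the flip rule is a simplifying assumption; real cycle analysis would smooth out filing noise first):

```python
def find_inflections(quarterly_avg: list[tuple[str, float]]) -> list[str]:
    """Return quarters where the direction of average rate change flips.

    quarterly_avg: (quarter_label, avg_rate_change_pct) pairs in
    chronological order, e.g. from the quarterly SQL query above.
    """
    inflections = []
    for i in range(2, len(quarterly_avg)):
        prev_delta = quarterly_avg[i - 1][1] - quarterly_avg[i - 2][1]
        curr_delta = quarterly_avg[i][1] - quarterly_avg[i - 1][1]
        if prev_delta * curr_delta < 0:  # direction flipped between quarters
            inflections.append(quarterly_avg[i][0])
    return inflections
```

Feeding only the flagged quarters (plus surrounding context) to GPT-4o keeps the prompt focused on the part of the series that actually changed.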
Compliance & Best Practices
Insurance data scraping comes with specific considerations:
- State DOI filings are public records: Rate filings submitted to state insurance departments are public by law in most states. SERFF (System for Electronic Rates & Forms Filing) is the standard platform and allows public searches.
- NAIC data is partially public: The NAIC publishes aggregate industry statistics freely. Individual company financial statement data requires a subscription; respect those terms.
- AM Best ratings: Published ratings are public, but detailed financial analysis and credit reports are subscription content. Scrape only publicly available rating information.
- Catastrophe data: NOAA, NWS, and FEMA data are public domain. Verisk PCS estimates may have redistribution restrictions; use them as directional intelligence, not for republication.
- Competitor websites: Product descriptions and publicly listed rates are fair game. Agent portals, quoting engines, and login-protected areas should not be scraped without authorization.
- Consumer data: Never scrape individual policyholder information, claims details with PII, or medical records. This violates state privacy laws and potentially HIPAA.
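Part of this diligence can be automated. A sketch that checks a site's robots.txt rules before scraping, using Python's stdlib `urllib.robotparser` (fetching the robots.txt body is left to the caller so the check stays offline-testable; note that respecting robots.txt does not substitute for reading a site's terms of service):

```python
from urllib import robotparser

def allowed_to_fetch(url: str, robots_txt: str,
                     user_agent: str = "insurance-intel-bot") -> bool:
    """Check a site's robots.txt rules before scraping a URL.

    robots_txt is the already-fetched robots.txt body for the site;
    the user_agent string here is a placeholder for your own bot name.
    """
    rp = robotparser.RobotFileParser()
    rp.parse(robots_txt.splitlines())
    return rp.can_fetch(user_agent, url)
```

Gating every `scrape` call behind a check like this makes the compliance posture auditable rather than ad hoc.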
Getting Started
Ready to build your insurance intelligence system? Here's the quick start:
- Get a Mantis API key at mantisapi.com – the free tier includes 100 API calls/month
- Start with one state – pick your largest premium state and scrape its DOI portal for recent rate filings
- Add catastrophe monitoring – NOAA and III provide the most accessible cat event data
- Set up anomaly detection – even simple threshold alerts (rate change >10%) catch the most important signals
- Layer in AI analysis – GPT-4o turns rate filing data into underwriting intelligence
- Scale to all 50 states – once your single-state agent works, SERFF provides a consistent interface across most states
🛡️ Start Monitoring Insurance Markets Today
Build your first insurance intelligence agent in under 30 minutes. Free tier includes 100 API calls/month.
Get Your API Key →
Further Reading
- The Complete Guide to Web Scraping with AI Agents in 2026
- Web Scraping for Financial Data: Track Stocks, Earnings & Market Signals
- Web Scraping for Legal & Compliance: Track Regulations & Court Cases
- Web Scraping for Market Research: Analyze Competitors, Trends & Opportunities
- Web Scraping for Price Monitoring: Build an AI-Powered Price Tracker
- Structured Data Extraction with AI: Extracting Clean Data from Any Page