Building Reliable Proxy Infrastructure for AI Agents

By Nicholas St. Germain

Your agent works perfectly in development. It logs into the user's account, performs the task, returns the result. Ship it.

Then you hit production. The first user's bank flags the login as suspicious. Gmail demands re-verification. Shopify blocks the session entirely. Your agent isn't doing anything malicious, but the infrastructure you're running it on looks indistinguishable from a credential-stuffing attack.

This is the core infrastructure problem for AI agents that access privileged resources: websites have spent a decade building systems to detect and block exactly the kind of access patterns your agent produces.

This guide covers what those detection systems look for, why common infrastructure choices fail, and how to build agent systems that maintain reliable access at scale.

What Services Actually Detect

Before solving the problem, understand what you're up against. Modern fraud detection analyzes requests across multiple dimensions simultaneously.

IP Reputation

Every IP address carries a reputation score. Services like MaxMind and IPQualityScore categorize IPs by:

  • Type: Residential, mobile, datacenter, VPN, proxy
  • History: Previous abuse patterns, spam reports, fraud associations
  • ASN ownership: Consumer ISP vs cloud provider vs hosting company

When your agent runs from an EC2 instance, the IP is immediately flagged as datacenter infrastructure. Many services apply stricter scrutiny or block the request outright, before examining anything else about it.
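As a rough illustration of the ASN/range check described above, the sketch below tests whether an address falls inside a list of known hosting ranges. The CIDR blocks are placeholders taken from the RFC 5737 documentation ranges, not real provider ranges, and `classify_ip` is a hypothetical helper; commercial reputation services combine many more signals than this.

```python
import ipaddress

# Placeholder CIDR blocks (RFC 5737 documentation ranges), NOT real provider ranges.
# A real check would load ranges published by cloud providers or a reputation feed.
DATACENTER_RANGES = [
    ipaddress.ip_network("203.0.113.0/24"),
    ipaddress.ip_network("198.51.100.0/24"),
]

def classify_ip(ip: str) -> str:
    """Return 'datacenter' if the address is in a known hosting range, else 'unknown'."""
    addr = ipaddress.ip_address(ip)
    if any(addr in network for network in DATACENTER_RANGES):
        return "datacenter"
    return "unknown"
```

A lookup like this is cheap enough to run before any other analysis, which is one reason datacenter traffic can be rejected so early.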

Session Consistency

Services track session behavior over time:

Login from 73.162.xx.xx (Comcast, Chicago) at 9:00 AM
API call from 73.162.xx.xx at 9:01 AM
API call from 73.162.xx.xx at 9:15 AM
API call from 184.72.xx.xx (AWS, Virginia) at 9:16 AM  ← Suspicious

The IP change mid-session triggers review. If your infrastructure rotates IPs, whether intentionally or because of proxy pool behavior, you'll hit this constantly.
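To make the check concrete, here is a minimal sketch of how a service might spot a mid-session IP change in its own event log. The `Event` type and `find_ip_changes` function are hypothetical names for illustration; real systems weigh the change against other signals instead of flagging on it alone.

```python
from dataclasses import dataclass

@dataclass
class Event:
    session_id: str
    ip: str
    network_type: str  # e.g. "residential" or "datacenter"

def find_ip_changes(events: list[Event]) -> list[tuple[Event, Event]]:
    """Return (previous, current) pairs where a session's IP changed between events."""
    last_seen: dict[str, Event] = {}
    changes: list[tuple[Event, Event]] = []
    for event in events:
        previous = last_seen.get(event.session_id)
        if previous is not None and previous.ip != event.ip:
            changes.append((previous, event))
        last_seen[event.session_id] = event
    return changes
```

Running the same function over your own agent's outbound request log is a cheap way to confirm that a session never changes IP on your side.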

Geographic Impossibility

Fraud systems flag physically impossible travel:

9:00 AM - Login from New York
9:05 AM - Login from Los Angeles  ← roughly 2,450 miles in 5 minutes

Rotating proxy pools often span geographic regions. Your agent might get a New York IP for one request and an LA IP for the next. The service sees impossible teleportation.
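An impossible-travel check can be approximated with a great-circle distance and a speed threshold. The sketch below uses the haversine formula; the 1,000 km/h cutoff (roughly airliner speed) is an assumed threshold for illustration, and real systems also have to allow for geolocation error.

```python
import math

EARTH_RADIUS_KM = 6371.0

def haversine_km(lat1: float, lon1: float, lat2: float, lon2: float) -> float:
    """Great-circle distance between two (latitude, longitude) points, in kilometres."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))

def is_impossible_travel(loc_a: tuple[float, float], loc_b: tuple[float, float],
                         minutes_apart: float, max_kmh: float = 1000.0) -> bool:
    """True if moving from loc_a to loc_b in the given time exceeds max_kmh (assumed cutoff)."""
    distance = haversine_km(*loc_a, *loc_b)
    if minutes_apart <= 0:
        return distance > 0
    return distance / (minutes_apart / 60) > max_kmh

NEW_YORK = (40.7128, -74.0060)
LOS_ANGELES = (34.0522, -118.2437)
print(is_impossible_travel(NEW_YORK, LOS_ANGELES, minutes_apart=5))
```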

Browser Fingerprinting

For agents using browser automation (Playwright, Puppeteer), services collect:

  • Canvas and WebGL fingerprints
  • Installed fonts and plugins
  • Screen dimensions and color depth
  • Timezone and language settings

Headless browsers have detectable signatures. Services like Cloudflare and PerimeterX specifically look for automation markers.

Why Common Infrastructure Fails

Most teams try one of these approaches first. All have significant problems for authenticated access.

Datacenter IPs (EC2, GCP, etc.)

Running agents from cloud VMs is the obvious first choice. For protected services, it fails almost immediately.

Cloud IP ranges are well-documented and widely blocked. Even if not blocked outright, datacenter IPs face:

  • Elevated CAPTCHA rates
  • Stricter rate limits
  • Required additional verification
  • Session flags for manual review

Some services work fine from datacenter IPs. Banking, healthcare, and e-commerce rarely do.

Rotating Residential Proxies

Residential proxies provide IPs from real ISPs, solving the reputation problem. But rotation creates new ones.

Session binding breaks: Many services tie sessions to IP addresses. When your proxy rotates mid-session, you're logged out or, worse, flagged.

Reputation inheritance: Rotating pools share IPs across customers. You might receive an IP that was just used for abuse, inheriting its damaged reputation.

Unpredictable behavior: When an IP change can happen any moment, your agent needs complex retry logic for every authenticated operation.

VPNs

Commercial VPN IPs are catalogued and often blocked. They also frequently rotate or share IPs across users, creating the same session consistency problems.

Static ISP Proxies

Static ISP proxies combine the trust of residential IPs with the reliability of dedicated infrastructure.

Residential classification: The IP belongs to a consumer ISP (Comcast, Verizon, AT&T). IP reputation databases classify it as a normal residential connection.

Dedicated assignment: The IP is yours exclusively. No rotation, no sharing, no inherited reputation from other users.

Consistent identity: Your agent uses the same IP across sessions, building positive reputation over time.

The tradeoff is cost: static ISP proxies are more expensive than rotating alternatives. For authenticated access to protected services, the reliability difference justifies it.

Implementation

Here's how to build agent infrastructure using static ISP proxies. Examples use Python with httpx for HTTP and playwright for browser automation.

Basic Session Management

For API-based access:

import httpx
from dataclasses import dataclass

@dataclass
class AgentConfig:
    proxy_url: str  # http://user:pass@proxy.example.com:3128
    user_agent: str = "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/122.0.0.0 Safari/537.36"

class AgentSession:
    def __init__(self, config: AgentConfig):
        self.config = config
        self.client = httpx.Client(
            proxy=config.proxy_url,
            headers={"User-Agent": config.user_agent},
            timeout=30.0,
            follow_redirects=True,
        )

    def login(self, url: str, credentials: dict) -> httpx.Response:
        return self.client.post(url, data=credentials)

    def get(self, url: str) -> httpx.Response:
        return self.client.get(url)

    def post(self, url: str, data: dict) -> httpx.Response:
        return self.client.post(url, json=data)

    def close(self):
        self.client.close()

    def __enter__(self):
        return self

    def __exit__(self, *args):
        self.close()

Usage:

config = AgentConfig(proxy_url="http://user:pass@us-east.statproxies.com:3128")

with AgentSession(config) as session:
    session.login("https://service.com/login", {"email": email, "password": password})
    response = session.get("https://service.com/api/data")
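Before trusting a proxy with a real login, it can help to confirm that the egress IP really is static. The sketch below is one possible check: `egress_is_stable` is a hypothetical helper that takes any zero-argument function returning the current public IP, so it makes no assumption about which IP echo service you use.

```python
from typing import Callable

def egress_is_stable(fetch_ip: Callable[[], str], samples: int = 3) -> bool:
    """Call fetch_ip several times and report whether every call saw the same IP."""
    if samples < 1:
        raise ValueError("samples must be at least 1")
    seen = {fetch_ip() for _ in range(samples)}
    return len(seen) == 1

# Possible usage with the AgentSession above, assuming an IP echo endpoint that
# returns the caller's address as plain text (the URL here is an assumption):
#   egress_is_stable(lambda: session.get("https://api.ipify.org").text.strip())
```

If the check fails, the proxy is rotating despite being sold as static, and it is better to find that out before the first authenticated request.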

Browser Automation

For services requiring full browser interaction:

from urllib.parse import urlparse

from playwright.async_api import async_playwright

async def create_browser_session(proxy_url: str):
    """Create a browser session routed through the proxy."""
    playwright = await async_playwright().start()

    # Split the proxy URL (http://user:pass@host:port) into the parts Playwright expects
    parsed = urlparse(proxy_url)

    browser = await playwright.chromium.launch(
        headless=True,
        args=[
            "--disable-blink-features=AutomationControlled",
        ]
    )

    context = await browser.new_context(
        proxy={
            "server": f"{parsed.scheme}://{parsed.hostname}:{parsed.port}",
            "username": parsed.username,
            "password": parsed.password,
        },
        user_agent="Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/122.0.0.0 Safari/537.36",
        viewport={"width": 1920, "height": 1080},
        locale="en-US",
        timezone_id="America/New_York",
    )

    return playwright, browser, context

Note: Headless browser detection is a separate problem. The proxy handles IP reputation; you'll still need to address browser fingerprinting for heavily protected sites.

Multi-User Proxy Assignment

When your platform serves multiple users, each needs a consistent proxy:

import hashlib

class ProxyPool:
    def __init__(self, proxies: list[str]):
        self.proxies = proxies
        self._assignments: dict[str, str] = {}

    def get_proxy(self, user_id: str) -> str:
        """
        Deterministically assign a proxy to a user.
        Same user always receives the same proxy.
        """
        if user_id in self._assignments:
            return self._assignments[user_id]

        # Hash user_id to select proxy
        hash_bytes = hashlib.sha256(user_id.encode()).digest()
        index = int.from_bytes(hash_bytes[:4], "big") % len(self.proxies)

        proxy = self.proxies[index]
        self._assignments[user_id] = proxy
        return proxy

This ensures:

  • User A always gets the same proxy across sessions
  • Proxy assignment is deterministic (survives restarts, as long as the proxy list keeps the same order and length)
  • Users are distributed across available proxies
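One caveat with modulo-based selection is that changing the length of the proxy list remaps most users. If the pool is expected to grow or shrink, rendezvous (highest-random-weight) hashing is one alternative worth considering: each user goes to the proxy with the highest hash score, so removing a proxy only moves the users who were on it. This is a sketch of that alternative technique, not the approach used in the class above, and `rendezvous_proxy` is a hypothetical name.

```python
import hashlib

def rendezvous_proxy(user_id: str, proxies: list[str]) -> str:
    """Pick the proxy with the highest hash score for this user (rendezvous hashing)."""
    if not proxies:
        raise ValueError("proxy list is empty")

    def score(proxy: str) -> int:
        digest = hashlib.sha256(f"{user_id}|{proxy}".encode()).digest()
        return int.from_bytes(digest[:8], "big")

    return max(proxies, key=score)
```

For assignments that must never move, persisting the user-to-proxy mapping in a database is the more direct option; hashing only limits how many users are disturbed when the pool changes.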

Geographic Matching

Assign proxies that match user location to avoid geographic anomalies:

import hashlib
from dataclasses import dataclass

@dataclass
class RegionalPool:
    region: str
    proxies: list[str]

class GeoAwareProxyPool:
    def __init__(self, pools: list[RegionalPool]):
        self.pools = {pool.region: pool.proxies for pool in pools}
        self._assignments: dict[str, str] = {}

    def get_proxy(self, user_id: str, user_region: str) -> str:
        key = f"{user_id}:{user_region}"

        if key in self._assignments:
            return self._assignments[key]

        # Get regional pool, fall back to default
        proxies = self.pools.get(user_region, self.pools.get("us-east", []))
        if not proxies:
            raise ValueError(f"No proxies available for region {user_region}")

        # Deterministic assignment within region
        hash_bytes = hashlib.sha256(user_id.encode()).digest()
        index = int.from_bytes(hash_bytes[:4], "big") % len(proxies)

        proxy = proxies[index]
        self._assignments[key] = proxy
        return proxy

# Setup
pool = GeoAwareProxyPool([
    RegionalPool("us-east", ["http://user:pass@us-east-1.example.com:3128"]),
    RegionalPool("us-west", ["http://user:pass@us-west-1.example.com:3128"]),
    RegionalPool("eu-west", ["http://user:pass@eu-west-1.example.com:3128"]),
])

# User in California gets US West proxy
proxy = pool.get_proxy(user_id="user_123", user_region="us-west")
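Geographic matching also applies to browser settings: a US West proxy paired with an East Coast timezone is its own small anomaly. The sketch below keeps timezone and locale alongside the region name. The mapping values are assumptions for illustration (in particular, which city a region such as "eu-west" corresponds to depends on your provider), and `context_options_for` is a hypothetical helper.

```python
# Assumed region-to-settings mapping; adjust to where your proxies actually exit.
REGION_BROWSER_SETTINGS = {
    "us-east": {"timezone_id": "America/New_York", "locale": "en-US"},
    "us-west": {"timezone_id": "America/Los_Angeles", "locale": "en-US"},
    "eu-west": {"timezone_id": "Europe/Dublin", "locale": "en-IE"},
}

def context_options_for(region: str) -> dict[str, str]:
    """Return browser context keyword arguments that match the proxy's region."""
    try:
        return dict(REGION_BROWSER_SETTINGS[region])
    except KeyError:
        raise ValueError(f"No browser settings defined for region {region}") from None

# Possible usage with Playwright, merging into the new_context() call:
#   context = await browser.new_context(proxy=..., **context_options_for("us-west"))
```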

Handling Failures

Even with proper infrastructure, you'll encounter blocks, challenges, and failures. Build systems that handle them gracefully.

Retry with Fallback

When a proxy fails, retry before falling back:

import httpx
from tenacity import retry, stop_after_attempt, wait_exponential

class ResilientSession:
    def __init__(self, primary_proxy: str, fallback_proxy: str | None = None):
        self.primary = primary_proxy
        self.fallback = fallback_proxy

    # reraise=True surfaces the last httpx error instead of tenacity's RetryError
    @retry(stop=stop_after_attempt(3), wait=wait_exponential(min=1, max=10), reraise=True)
    async def request(self, method: str, url: str, **kwargs) -> httpx.Response:
        async with httpx.AsyncClient(proxy=self.primary, timeout=30.0) as client:
            response = await client.request(method, url, **kwargs)
            response.raise_for_status()
            return response

    async def request_with_fallback(self, method: str, url: str, **kwargs) -> httpx.Response:
        try:
            return await self.request(method, url, **kwargs)
        except httpx.HTTPError:
            if not self.fallback:
                raise

            # Try the fallback proxy. This changes the egress IP, so expect to
            # re-authenticate rather than reuse a session bound to the primary IP.
            async with httpx.AsyncClient(proxy=self.fallback, timeout=30.0) as client:
                response = await client.request(method, url, **kwargs)
                response.raise_for_status()
                return response
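Not every failure deserves the same response: retrying a block from the same IP tends to make things worse, while a transient server error is usually safe to retry. The sketch below shows one way to separate the cases by status code. The `Action` categories and the mapping are assumptions for illustration; individual services signal blocks in their own ways, sometimes with a 200 response and a challenge page.

```python
from enum import Enum

class Action(Enum):
    OK = "ok"              # no failure to handle
    RETRY = "retry"        # transient; retry with backoff
    BACK_OFF = "back_off"  # rate limited; slow down before retrying
    ESCALATE = "escalate"  # likely blocked or logged out; stop and involve the user
    FAIL = "fail"          # client error; retrying will not help

def classify_status(status_code: int) -> Action:
    """Map an HTTP status code to a handling strategy (assumed mapping)."""
    if status_code < 400:
        return Action.OK
    if status_code == 429:
        return Action.BACK_OFF
    if status_code in (401, 403):
        return Action.ESCALATE
    if status_code >= 500:
        return Action.RETRY
    return Action.FAIL
```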

Scaling

A few considerations as you grow.

Pool Sizing

Estimate proxy needs:

Required = Peak concurrent users × Safety margin

Example:
- 10,000 daily active users
- 20% concurrent at peak = 2,000 concurrent
- 1.5x safety margin = 3,000 proxies

If users access multiple services simultaneously, multiply accordingly.
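The sizing arithmetic above can be written as a small function. The parameter names are hypothetical, and `services_per_user` is an assumed way to express the "multiply accordingly" step for users who access several services at once.

```python
import math

def required_proxies(daily_active_users: int, peak_concurrency: float,
                     safety_margin: float, services_per_user: int = 1) -> int:
    """Estimate pool size: peak concurrent users x safety margin x services per user."""
    peak_concurrent = daily_active_users * peak_concurrency
    estimate = peak_concurrent * safety_margin * services_per_user
    # Round away floating-point noise before taking the ceiling
    return math.ceil(round(estimate, 6))

# The worked example from the text: 10,000 DAU, 20% concurrent at peak, 1.5x margin
print(required_proxies(10_000, 0.20, 1.5))  # 3000
```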

Cost Management

Static ISP proxies typically run up to roughly $5/IP/month, which adds up at scale.

Compare against: engineering time debugging rotating proxy failures, user churn from unreliable access, support costs from blocked accounts.
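To put a number on the proxy side of that comparison, the sketch below multiplies pool size by a per-IP price and spreads the result across users. The price is a parameter, not a quoted rate: the example uses the 5-per-IP upper figure mentioned above together with the 3,000-proxy, 10,000-user example from Pool Sizing, and the function names are hypothetical.

```python
def monthly_proxy_cost(proxy_count: int, price_per_ip: float) -> float:
    """Total monthly proxy spend for a pool of dedicated IPs."""
    return proxy_count * price_per_ip

def cost_per_user(proxy_count: int, price_per_ip: float, active_users: int) -> float:
    """Monthly proxy spend divided across active users."""
    if active_users <= 0:
        raise ValueError("active_users must be positive")
    return monthly_proxy_cost(proxy_count, price_per_ip) / active_users

# 3,000 proxies at an assumed 5 per IP per month, spread over 10,000 users
print(monthly_proxy_cost(3_000, 5.0))     # 15000.0
print(cost_per_user(3_000, 5.0, 10_000))  # 1.5
```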

Observability

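Track metrics that matter, per proxy and per target service, so that a degrading IP or a newly hostile service shows up early.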

What This Doesn't Solve

Static ISP proxies handle IP reputation. They don't solve:

Browser fingerprinting: Heavily protected sites (banking, major e-commerce) run JavaScript fingerprinting. You'll need additional tooling for fingerprint management.

CAPTCHAs: When you do hit challenges, you'll need solving infrastructure or user escalation paths.

Terms of service: Automated access violates ToS for many services. Understand the legal and business risks for your use case.

Rate limits: Having trusted IPs doesn't mean unlimited requests. Respect service-specific rate limits.
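For the rate-limit point, a simple client-side guard is to enforce a minimum gap between requests to each service. The sketch below is one minimal way to do that; `MinIntervalLimiter` is a hypothetical class, the interval is something you would set per service, and the clock and sleep functions are injectable so the behavior can be checked without waiting.

```python
import time
from typing import Callable

class MinIntervalLimiter:
    """Enforce a minimum interval between requests to the same service."""

    def __init__(self, min_interval_s: float,
                 clock: Callable[[], float] = time.monotonic,
                 sleep: Callable[[float], None] = time.sleep) -> None:
        self._min_interval = min_interval_s
        self._clock = clock
        self._sleep = sleep
        self._next_allowed: dict[str, float] = {}

    def wait(self, service: str) -> float:
        """Block until a request to this service is allowed; return seconds waited."""
        now = self._clock()
        delay = max(0.0, self._next_allowed.get(service, now) - now)
        if delay > 0:
            self._sleep(delay)
        self._next_allowed[service] = now + delay + self._min_interval
        return delay

# Possible usage before each request:
#   limiter = MinIntervalLimiter(min_interval_s=2.0)
#   limiter.wait("service.com")
```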

Summary

AI agents accessing user accounts need infrastructure that services trust. The path from "works in development" to "reliable in production" requires:

  1. Residential IP classification - datacenter IPs face immediate scrutiny
  2. Session consistency - the same IP across the entire session
  3. Geographic alignment - proxies that match user locations
  4. Failure handling - health checks, retries, and fallbacks

Static ISP proxies provide the foundation. The rest is engineering the systems around them to handle the edge cases, failures, and scale that production demands.