Building Reliable Proxy Infrastructure for AI Agents
By Nicholas St. Germain
Your agent works perfectly in development. It logs into the user's account, performs the task, returns the result. Ship it.
Then you hit production. The first user's bank flags the login as suspicious. Gmail demands re-verification. Shopify blocks the session entirely. Your agent isn't doing anything malicious, but the infrastructure you're running it on looks indistinguishable from a credential-stuffing attack.
This is the core infrastructure problem for AI agents that access privileged resources: websites have spent a decade building systems to detect and block exactly the kind of access patterns your agent produces.
This guide covers what those detection systems look for, why common infrastructure choices fail, and how to build agent systems that maintain reliable access at scale.
What Services Actually Detect
Before solving the problem, understand what you're up against. Modern fraud detection analyzes requests across multiple dimensions simultaneously.
IP Reputation
Every IP address carries a reputation score. Services like MaxMind and IPQualityScore categorize IPs by:
- Type: Residential, mobile, datacenter, VPN, proxy
- History: Previous abuse patterns, spam reports, fraud associations
- ASN ownership: Consumer ISP vs cloud provider vs hosting company
When your agent runs from an EC2 instance, the IP is immediately flagged as datacenter infrastructure. Many services apply stricter scrutiny or block the request outright, before examining anything else about it.
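As a rough illustration of that first check, the sketch below classifies an address by matching it against a table of network ranges. The table and the `classify_ip` helper are hypothetical, and the ranges are reserved documentation blocks standing in for real allocations; an actual reputation service works from far richer data (ASN ownership, abuse history) than a static list.

```python
import ipaddress

# Hypothetical range table. These are reserved documentation blocks
# (RFC 5737), used only as stand-ins for real cloud or ISP allocations.
KNOWN_RANGES = {
    "datacenter": [ipaddress.ip_network("198.51.100.0/24")],
    "residential": [ipaddress.ip_network("203.0.113.0/24")],
}


def classify_ip(address: str) -> str:
    """Return the label of the first range that contains the address."""
    ip = ipaddress.ip_address(address)
    for label, networks in KNOWN_RANGES.items():
        if any(ip in network for network in networks):
            return label
    return "unknown"


print(classify_ip("198.51.100.7"))  # datacenter
print(classify_ip("203.0.113.42"))  # residential
print(classify_ip("192.0.2.1"))     # unknown
```

The point of the sketch is the ordering: this lookup is cheap, so it can run before any behavioral analysis, which is why a datacenter address can be rejected on the first request.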
Session Consistency
Services track session behavior over time:
Login from 73.162.xx.xx (Comcast, Chicago) at 9:00 AM
API call from 73.162.xx.xx at 9:01 AM
API call from 73.162.xx.xx at 9:15 AM
API call from 184.72.xx.xx (AWS, Virginia) at 9:16 AM ← Suspicious
The IP change mid-session triggers review. If your infrastructure rotates IPs, whether intentionally or because of proxy pool behavior, you'll hit this constantly.
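The check behind that log is simple to picture. Here is a minimal sketch, assuming a hypothetical `Event` record and `find_ip_changes` helper; real systems weigh many more signals, and the addresses are documentation placeholders rather than real ISP or cloud IPs.

```python
from dataclasses import dataclass


@dataclass
class Event:
    session_id: str
    ip: str
    time: str  # kept as plain text for this sketch


def find_ip_changes(events: list[Event]) -> list[Event]:
    """Return events whose IP differs from the previous event in the same session."""
    last_ip: dict[str, str] = {}
    flagged: list[Event] = []
    for event in events:
        previous = last_ip.get(event.session_id)
        if previous is not None and previous != event.ip:
            flagged.append(event)
        last_ip[event.session_id] = event.ip
    return flagged


events = [
    Event("s1", "203.0.113.10", "9:00 AM"),   # login
    Event("s1", "203.0.113.10", "9:01 AM"),
    Event("s1", "203.0.113.10", "9:15 AM"),
    Event("s1", "198.51.100.20", "9:16 AM"),  # different network mid-session
]
for event in find_ip_changes(events):
    print(f"IP changed mid-session at {event.time}: {event.ip}")
```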
Geographic Impossibility
Fraud systems flag physically impossible travel:
9:00 AM - Login from New York
9:05 AM - Login from Los Angeles ← roughly 2,450 miles in 5 minutes
Rotating proxy pools often span geographic regions. Your agent might get a New York IP for one request and an LA IP for the next. The service sees impossible teleportation.
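An impossible-travel check usually comes down to distance divided by elapsed time. The sketch below uses the haversine great-circle formula; the 1,000 km/h threshold and the function names are assumptions chosen for illustration, not values taken from any particular fraud system.

```python
import math

EARTH_RADIUS_KM = 6371.0
MAX_PLAUSIBLE_KMH = 1000.0  # assumed threshold, a little above airliner cruising speed


def haversine_km(lat1: float, lon1: float, lat2: float, lon2: float) -> float:
    """Great-circle distance between two points given in decimal degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def is_impossible_travel(loc_a, loc_b, minutes_apart: float,
                         max_kmh: float = MAX_PLAUSIBLE_KMH) -> bool:
    """True when covering the distance in the given time would exceed max_kmh."""
    distance = haversine_km(*loc_a, *loc_b)
    if minutes_apart <= 0:
        return distance > 0
    return distance / (minutes_apart / 60) > max_kmh


NEW_YORK = (40.7128, -74.0060)
LOS_ANGELES = (34.0522, -118.2437)

print(is_impossible_travel(NEW_YORK, LOS_ANGELES, minutes_apart=5))    # True
print(is_impossible_travel(NEW_YORK, LOS_ANGELES, minutes_apart=360))  # False
```

Six hours between the two logins is consistent with a flight, so only the five-minute case is flagged.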
Browser Fingerprinting
For agents using browser automation (Playwright, Puppeteer), services collect:
- Canvas and WebGL fingerprints
- Installed fonts and plugins
- Screen dimensions and color depth
- Timezone and language settings
Headless browsers have detectable signatures. Services like Cloudflare and PerimeterX specifically look for automation markers.
Why Common Infrastructure Fails
Most teams try one of these approaches first. All have significant problems for authenticated access.
Datacenter IPs (EC2, GCP, etc.)
Running agents from cloud VMs is the obvious first choice. Against protected services, it often fails immediately.
Cloud IP ranges are well-documented and widely blocked. Even if not blocked outright, datacenter IPs face:
- Elevated CAPTCHA rates
- Stricter rate limits
- Required additional verification
- Session flags for manual review
Some services work fine from datacenter IPs. Banking, healthcare, and e-commerce rarely do.
Rotating Residential Proxies
Residential proxies provide IPs from real ISPs, solving the reputation problem. But rotation creates new ones.
Session binding breaks: Many services tie sessions to IP addresses. When your proxy rotates mid-session, you're logged out or, worse, flagged.
Reputation inheritance: Rotating pools share IPs across customers. You might receive an IP that was just used for abuse, inheriting its damaged reputation.
Unpredictable behavior: When an IP change can happen any moment, your agent needs complex retry logic for every authenticated operation.
VPNs
Commercial VPN IPs are catalogued and often blocked. They also frequently rotate or share IPs across users, creating the same session consistency problems.
Static ISP Proxies
Static ISP proxies avoid these failure modes by combining the trust of residential IPs with the reliability of dedicated infrastructure.
Residential classification: The IP belongs to a consumer ISP (Comcast, Verizon, AT&T). IP reputation databases classify it as a normal residential connection.
Dedicated assignment: The IP is yours exclusively. No rotation, no sharing, no inherited reputation from other users.
Consistent identity: Your agent uses the same IP across sessions, building positive reputation over time.
The tradeoff is cost: static ISP proxies are more expensive than rotating alternatives. For authenticated access to protected services, the reliability difference justifies it.
Implementation
Here's how to build agent infrastructure using static ISP proxies. Examples use Python with httpx for HTTP and playwright for browser automation.
Basic Session Management
For API-based access:
import httpx
from dataclasses import dataclass


@dataclass
class AgentConfig:
    proxy_url: str  # http://user:pass@proxy.example.com:3128
    user_agent: str = "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/122.0.0.0 Safari/537.36"


class AgentSession:
    """One HTTP client per agent, pinned to a single proxy for its lifetime."""

    def __init__(self, config: AgentConfig):
        self.config = config
        # The client keeps cookies between requests, so the login persists.
        # The `proxy` argument needs httpx 0.26+; older releases use `proxies`.
        self.client = httpx.Client(
            proxy=config.proxy_url,
            headers={"User-Agent": config.user_agent},
            timeout=30.0,
            follow_redirects=True,
        )

    def login(self, url: str, credentials: dict) -> httpx.Response:
        # Sent as a form-encoded body, as a typical login form would be.
        return self.client.post(url, data=credentials)

    def get(self, url: str) -> httpx.Response:
        return self.client.get(url)

    def post(self, url: str, data: dict) -> httpx.Response:
        # Sent as a JSON body.
        return self.client.post(url, json=data)

    def close(self):
        self.client.close()

    def __enter__(self):
        return self

    def __exit__(self, *args):
        self.close()
Usage:
# email and password are the user's credentials, loaded from your own secret store
config = AgentConfig(proxy_url="http://user:pass@us-east.statproxies.com:3128")

with AgentSession(config) as session:
    session.login("https://service.com/login", {"email": email, "password": password})
    response = session.get("https://service.com/api/data")
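One practical detail with the `http://user:pass@host:port` format: a password containing characters such as `@`, `:` or `/` will break URL parsing unless it is percent-encoded. A minimal sketch, assuming a hypothetical `build_proxy_url` helper and assuming the HTTP client decodes percent-encoded credentials in the proxy URL:

```python
from urllib.parse import quote, urlsplit


def build_proxy_url(username: str, password: str, host: str, port: int,
                    scheme: str = "http") -> str:
    """Assemble a proxy URL, percent-encoding the credentials."""
    user = quote(username, safe="")
    secret = quote(password, safe="")
    return f"{scheme}://{user}:{secret}@{host}:{port}"


# Placeholder credentials and host, for illustration only.
url = build_proxy_url("agent", "p@ss:w/rd", "proxy.example.com", 3128)
print(url)

# The encoded URL still splits into the intended host and port.
parts = urlsplit(url)
print(parts.hostname, parts.port)
```

Code that splits the URL apart again with `urllib.parse` gets the credentials back still encoded, so it would need to pass them through `urllib.parse.unquote` before handing them to another library.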
Browser Automation
For services requiring full browser interaction:
from urllib.parse import urlparse

from playwright.async_api import async_playwright


async def create_browser_session(proxy_url: str):
    """Create a browser session routed through the proxy."""
    playwright = await async_playwright().start()

    # Split the proxy URL (http://user:pass@host:port) into the parts Playwright expects.
    parsed = urlparse(proxy_url)

    browser = await playwright.chromium.launch(
        headless=True,
        args=[
            "--disable-blink-features=AutomationControlled",
        ],
    )
    context = await browser.new_context(
        proxy={
            "server": f"{parsed.scheme}://{parsed.hostname}:{parsed.port}",
            "username": parsed.username,
            "password": parsed.password,
        },
        user_agent="Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/122.0.0.0 Safari/537.36",
        viewport={"width": 1920, "height": 1080},
        locale="en-US",
        timezone_id="America/New_York",
    )
    # The caller owns all three objects and must close the context and browser
    # and stop Playwright when the session ends.
    return playwright, browser, context
Note: Headless browser detection is a separate problem. The proxy handles IP reputation; you'll still need to address browser fingerprinting for heavily protected sites.
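One part of the fingerprint that is easy to keep consistent is the locale and timezone, which should agree with where the proxy is located. A minimal sketch, assuming a hypothetical `REGION_PROFILES` table with region names of your own choosing; the keys mirror the `locale` and `timezone_id` arguments passed to `new_context` above, and the specific region-to-timezone pairings are illustrative assumptions.

```python
# Hypothetical mapping from a proxy's region to matching browser settings.
REGION_PROFILES = {
    "us-east": {"locale": "en-US", "timezone_id": "America/New_York"},
    "us-west": {"locale": "en-US", "timezone_id": "America/Los_Angeles"},
    "eu-west": {"locale": "en-GB", "timezone_id": "Europe/London"},
}
DEFAULT_REGION = "us-east"


def context_options(region: str) -> dict:
    """Return locale and timezone settings that match the proxy's region."""
    profile = REGION_PROFILES.get(region, REGION_PROFILES[DEFAULT_REGION])
    return dict(profile)  # copy, so callers cannot mutate the shared table


# Intended use (not executed here):
#   context = await browser.new_context(proxy=..., **context_options("us-west"))
print(context_options("us-west"))
```

A West Coast IP paired with an East Coast timezone is exactly the kind of mismatch a fingerprinting system can pick up, so deriving both from one region value removes a class of avoidable flags.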
Multi-User Proxy Assignment
When your platform serves multiple users, each needs a consistent proxy:
import hashlib


class ProxyPool:
    def __init__(self, proxies: list[str]):
        self.proxies = proxies
        self._assignments: dict[str, str] = {}

    def get_proxy(self, user_id: str) -> str:
        """
        Deterministically assign a proxy to a user.
        Same user always receives the same proxy.
        """
        if user_id in self._assignments:
            return self._assignments[user_id]

        # Hash user_id to select proxy
        hash_bytes = hashlib.sha256(user_id.encode()).digest()
        index = int.from_bytes(hash_bytes[:4], "big") % len(self.proxies)
        proxy = self.proxies[index]

        self._assignments[user_id] = proxy
        return proxy
This ensures:
- User A always gets the same proxy across sessions
- Proxy assignment is deterministic (survives restarts, as long as the proxy list itself is unchanged)
- Users are distributed across available proxies
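One caveat with modulo-based selection: when the proxy list grows or shrinks, the index changes for most users, so most of them land on a different IP. If that matters for your deployment, rendezvous (highest-random-weight) hashing is one alternative. It is swapped in here as an illustration and is not part of the pool above; `rendezvous_proxy` and the proxy hostnames are hypothetical.

```python
import hashlib


def rendezvous_proxy(user_id: str, proxies: list[str]) -> str:
    """Pick the proxy with the highest hash score for this user."""
    if not proxies:
        raise ValueError("proxy list is empty")

    def score(proxy: str) -> int:
        # Score each (user, proxy) pair independently of list order and size.
        digest = hashlib.sha256(f"{user_id}|{proxy}".encode()).digest()
        return int.from_bytes(digest[:8], "big")

    return max(proxies, key=score)


proxies = [f"http://user:pass@proxy-{n}.example.com:3128" for n in range(4)]
before = {f"user_{i}": rendezvous_proxy(f"user_{i}", proxies) for i in range(200)}

# Retire one proxy: only users who were assigned to it should move.
retired = proxies[1]
remaining = [p for p in proxies if p != retired]
moved = sum(
    1 for user, proxy in before.items()
    if rendezvous_proxy(user, remaining) != proxy
)
print(f"{moved} of {len(before)} users reassigned")
```

Because each user's choice depends only on per-proxy scores, removing a proxy affects only the users who were on it, and adding one moves only the users for whom the new proxy scores highest.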
Geographic Matching
Assign proxies that match user location to avoid geographic anomalies:
import hashlib
from dataclasses import dataclass


@dataclass
class RegionalPool:
    region: str
    proxies: list[str]


class GeoAwareProxyPool:
    def __init__(self, pools: list[RegionalPool]):
        self.pools = {pool.region: pool.proxies for pool in pools}
        self._assignments: dict[str, str] = {}

    def get_proxy(self, user_id: str, user_region: str) -> str:
        key = f"{user_id}:{user_region}"
        if key in self._assignments:
            return self._assignments[key]

        # Get regional pool, fall back to default
        proxies = self.pools.get(user_region, self.pools.get("us-east", []))
        if not proxies:
            raise ValueError(f"No proxies available for region {user_region}")

        # Deterministic assignment within region
        hash_bytes = hashlib.sha256(user_id.encode()).digest()
        index = int.from_bytes(hash_bytes[:4], "big") % len(proxies)
        proxy = proxies[index]

        self._assignments[key] = proxy
        return proxy


# Setup
pool = GeoAwareProxyPool([
    RegionalPool("us-east", ["http://user:pass@us-east-1.example.com:3128"]),
    RegionalPool("us-west", ["http://user:pass@us-west-1.example.com:3128"]),
    RegionalPool("eu-west", ["http://user:pass@eu-west-1.example.com:3128"]),
])

# User in California gets US West proxy
proxy = pool.get_proxy(user_id="user_123", user_region="us-west")
Handling Failures
Even with proper infrastructure, you'll encounter blocks, challenges, and failures. Build systems that handle them gracefully.
Retry with Fallback
When a proxy fails, retry before falling back:
import httpx
from tenacity import retry, stop_after_attempt, wait_exponential


class ResilientSession:
    def __init__(self, primary_proxy: str, fallback_proxy: str | None = None):
        self.primary = primary_proxy
        self.fallback = fallback_proxy

    # Up to 3 attempts on the primary proxy with exponential backoff.
    # reraise=True surfaces the last underlying error instead of tenacity's RetryError.
    @retry(stop=stop_after_attempt(3), wait=wait_exponential(min=1, max=10), reraise=True)
    async def request(self, method: str, url: str, **kwargs) -> httpx.Response:
        async with httpx.AsyncClient(proxy=self.primary, timeout=30.0) as client:
            response = await client.request(method, url, **kwargs)
            response.raise_for_status()
            return response

    async def request_with_fallback(self, method: str, url: str, **kwargs) -> httpx.Response:
        try:
            return await self.request(method, url, **kwargs)
        except Exception:
            if not self.fallback:
                raise
            # Primary retries are exhausted; make one attempt through the fallback.
            # This changes the egress IP, so the service may require re-authentication.
            async with httpx.AsyncClient(proxy=self.fallback, timeout=30.0) as client:
                response = await client.request(method, url, **kwargs)
                response.raise_for_status()
                return response
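Not every failure deserves the same treatment: retrying a rate limit after a pause is reasonable, while retrying a block on the same IP rarely helps. The sketch below shows one way to separate the cases by status code. The `Outcome` categories and the mapping from codes to categories are assumptions for illustration; services differ in how they signal blocks and challenges.

```python
from enum import Enum


class Outcome(Enum):
    OK = "ok"
    RETRY = "retry"      # transient: try again on the same proxy after a delay
    BLOCKED = "blocked"  # retrying on the same IP is unlikely to help
    FAILED = "failed"    # other client error: do not retry


def classify_status(status_code: int) -> Outcome:
    """Map an HTTP status code to a handling decision (assumed policy)."""
    if 200 <= status_code < 400:
        return Outcome.OK
    if status_code in (408, 429) or status_code >= 500:
        return Outcome.RETRY
    if status_code in (401, 403):
        return Outcome.BLOCKED
    return Outcome.FAILED


def retry_delay(headers: dict[str, str], default: float = 1.0) -> float:
    """Use a numeric Retry-After header when present, else the default delay."""
    value = headers.get("Retry-After", "")
    return float(value) if value.isdigit() else default


print(classify_status(429), retry_delay({"Retry-After": "30"}))
print(classify_status(403))
```

The header lookup here is case-sensitive because it uses a plain dict; an HTTP client's own headers object typically handles case for you.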
Scaling
A few considerations as you grow.
Pool Sizing
Estimate proxy needs:
Required = Peak concurrent users × Safety margin
Example:
- 10,000 daily active users
- 20% concurrent at peak = 2,000 concurrent
- 1.5x safety margin = 3,000 proxies
If users access multiple services simultaneously, multiply accordingly.
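The same estimate as a small function, so the inputs can be varied. This is a sketch of the arithmetic above; the `required_proxies` name and the `services_per_user` parameter are introduced here for illustration.

```python
import math


def required_proxies(daily_active_users: int, peak_concurrency: float,
                     safety_margin: float, services_per_user: int = 1) -> int:
    """Peak concurrent users x safety margin x simultaneous services per user."""
    peak_users = daily_active_users * peak_concurrency
    total = peak_users * safety_margin * services_per_user
    # Round away floating-point noise before taking the ceiling.
    return math.ceil(round(total, 6))


print(required_proxies(10_000, 0.20, 1.5))                       # 3000
print(required_proxies(10_000, 0.20, 1.5, services_per_user=2))  # 6000
```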
Cost Management
Static ISP proxies typically run