
Debugging Private Networks When Public Queries Are Blocked: Azure SRE Agent + MCP

January 15, 2026 · 29 min read
Tags: Azure, SRE, Private Link, VNet, MCP, Security, Observability

Debugging Private Network Issues When Public Queries Are Blocked


⚑ TL;DR

When Azure Private Link blocks public queries to your Log Analytics workspace, Azure SRE Agent can still investigate issues by delegating queries to an MCP server running inside your VNet. The MCP server queries Log Analytics from within the private network, and SRE Agent coordinates the investigation remotely.


🔐 The Private Network Security Challenge

Modern cloud architecture often requires strict network isolation:

🛡️ Security teams say: "No public access to Log Analytics. Private Link with query access blocked from outside the VNet."

🔧 SRE teams say: "How is our AI agent supposed to query logs if it can't reach Log Analytics?"

This creates a real problem. When you configure Azure Monitor Private Link Scope (AMPLS) with queryAccessMode: PrivateOnly and disable public query access on your Log Analytics workspace, all external queries are blocked, including those from Azure SRE Agent.

Try querying from outside the VNet and you'll see:

❌ InsufficientAccessError: The query was blocked due to private link configuration. 
   Access is denied because this request was not made through a private endpoint.

The question: How can an AI agent investigate issues in a workspace it can't query?


🌉 The Solution: MCP Server Inside the VNet

The answer is the Model Context Protocol (MCP). Instead of SRE Agent querying Log Analytics directly, we deploy an MCP server inside the VNet that can query Log Analytics on the agent's behalf.

Architecture diagram showing MCP server inside VNet querying Log Analytics

Architecture diagram (text version):
┌────────────────────────────────────────────────────────────────────────────┐
│  Private VNet (VNet-Integrated Environment)                                │
│                                                                            │
│  ┌──────────────────────────────────────────────────────────────────────┐  │
│  │  Azure Container Apps Environment (VNet-Integrated)                  │  │
│  │                                                                      │  │
│  │  ┌────────────────────────────────────────────────────────────────┐  │  │
│  │  │  Log Analytics MCP Server                                      │  │  │
│  │  │  (log-analytics-mcp-vnet.*.azurecontainerapps.io)              │  │  │
│  │  │                                                                │  │  │
│  │  │  Tools:                                                        │  │  │
│  │  │  ├── query_logs          Execute any KQL query                 │  │  │
│  │  │  ├── list_tables         Discover available tables             │  │  │
│  │  │  ├── get_workspace_info  Data volume and usage stats           │  │  │
│  │  │  ├── analyze_errors      Pre-built error pattern analysis      │  │  │
│  │  │  └── check_vm_health     Heartbeat, CPU, memory, error count   │  │  │
│  │  │                                                                │  │  │
│  │  │  Authentication: Managed Identity (Log Analytics Reader)       │  │  │
│  │  └────────────────────────────┬───────────────────────────────────┘  │  │
│  │                               │                                      │  │
│  └───────────────────────────────┼──────────────────────────────────────┘  │
│                                  │ Internal query via Private Endpoint     │
│                                  ▼                                         │
│  ┌──────────────────────────────────────────────────────────────────────┐  │
│  │  Log Analytics Workspace                                             │  │
│  │                                                                      │  │
│  │  Settings:                                                           │  │
│  │  • publicNetworkAccessForQuery: Disabled                             │  │
│  │  • AMPLS queryAccessMode: PrivateOnly                                │  │
│  │                                                                      │  │
│  │  Tables: Syslog, Perf, Heartbeat, ContainerLog, etc.                 │  │
│  └──────────────────────────────────────────────────────────────────────┘  │
└────────────────────────────────────────────────────────────────────────────┘
                                   │
                                   │ Streamable HTTP (MCP Protocol)
                                   ▼
┌────────────────────────────────────────────────────────────────────────────┐
│  Azure SRE Agent                                                           │
│  (Outside the VNet)                                                        │
│                                                                            │
│  "Investigate the errors in my private Log Analytics workspace"            │
│                                                                            │
│  ✓ Calls MCP server tools over Streamable HTTP                             │
│  ✓ MCP server queries Log Analytics internally                             │
│  ✓ Results returned to agent for analysis                                  │
│  ✓ No direct VNet access required                                          │
└────────────────────────────────────────────────────────────────────────────┘

💡 Key Insight

The MCP server acts as a trusted bridge:

| Capability | Description |
|------------|-------------|
| 🏠 Runs inside the VNet | Can query the Private Link-protected workspace |
| 🌐 Exposes HTTP endpoint | SRE Agent can call it from anywhere |
| 🔑 Uses Managed Identity | No secrets needed for authentication |
| 🗣️ Natural language to KQL | Transforms your questions into queries |

🚨 The 3 AM Incident

Let's walk through a real scenario.

πŸ“± The Alert

It's 3:17 AM. Your phone buzzes:

⚠️ CRITICAL: Payment Service - High Error Rate
πŸ“ Affected: app-vm (10.0.1.10)
πŸ“Š Error rate: 47% (threshold: 5%)

Alert notification

In the old world, you'd be scrambling for VPN credentials. But you have Azure SRE Agent.


🔍 The Investigation

You open SRE Agent from your phone (or laptop, no VPN needed) and type:

"I got an alert about high error rates on app-vm in my private network. The VM has no public IP, but logs should be in Log Analytics. Can you investigate?"

Within seconds, SRE Agent responds:

SRE Agent investigating the issue

SRE Agent investigation output:
I'll query Log Analytics for recent errors from app-vm. Let me check 
the Syslog and performance data...

Found 47 error entries in the last 15 minutes from app-vm. 
Here's what I'm seeing:

📊 Error Pattern Analysis:
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
• 23 × "Connection to database timed out after 30000ms"
• 12 × "Failed to process transaction: Connection refused"
• 8 × "Circuit breaker OPEN for database connection pool"
• 4 × "OutOfMemoryError in transaction handler thread"

🔍 Root Cause Indicators:
The errors suggest a cascade failure:
1. Database connectivity issues started first
2. Connection pool exhausted, circuit breaker opened
3. Memory pressure built up from queued requests
4. Application became unresponsive

Let me check the performance metrics and correlate with db-vm...

🔗 Cross-VM Correlation

Without any additional prompting, SRE Agent digs deeper:

Cross-VM correlation analysis

Performance analysis output:
📈 Performance Analysis (Last 30 minutes):

app-vm:
├── Memory: 94% used (up from 45% baseline) ⚠️
├── CPU: 78% (elevated)
└── First error: 03:02 AM

db-vm:
├── Disk I/O: 450 MB/s writes (unusual spike) ⚠️
├── MySQL errors: "Too many connections", "Lock wait timeout"
└── First error: 02:58 AM

web-vm:
├── nginx: "upstream timed out", "502 Bad Gateway"
└── First error: 03:05 AM

🔗 Correlation Timeline:
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
02:58 - db-vm: High disk I/O, MySQL connection issues begin
03:02 - app-vm: Database timeouts start, memory climbing
03:05 - web-vm: Upstream timeouts, 502 errors to users
03:17 - Alert triggered (47% error rate)

💡 Root Cause:
The issue originated on db-vm with abnormal disk I/O causing 
MySQL connection failures. This cascaded to app-vm (connection 
timeouts, memory pressure) and finally to web-vm (502 errors).

Recommendation: Investigate what caused the disk I/O spike on 
db-vm. Check for runaway queries, backup jobs, or disk issues.
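
The ordering step behind a correlation timeline like this one can be sketched in a few lines of Python. This is a minimal illustration, not SRE Agent's actual logic: the `first_errors` dictionary is hand-copied from the output above, `order_by_first_error` is a hypothetical helper, and all timestamps are assumed to fall on the same day.

```python
from datetime import datetime

# Hypothetical input: first-error time per VM, copied from the timeline above.
first_errors = {
    "app-vm": "03:02",
    "db-vm": "02:58",
    "web-vm": "03:05",
}


def order_by_first_error(first_seen: dict[str, str]) -> list[tuple[str, datetime]]:
    """Return (host, time) pairs sorted by when each host first logged an error."""
    parsed = {host: datetime.strptime(ts, "%H:%M") for host, ts in first_seen.items()}
    return sorted(parsed.items(), key=lambda item: item[1])


timeline = order_by_first_error(first_errors)
origin_host = timeline[0][0]

for host, ts in timeline:
    print(f"{ts:%H:%M}  {host}")
print(f"Earliest errors: {origin_host}")
```

The host with the earliest error is a candidate origin to investigate first, not proof of root cause.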

✅ The Resolution

Based on SRE Agent's analysis, you know exactly where to focus. The issue isn't the app-vm memory (a symptom); it's the db-vm disk I/O (the root cause).

You can even ask SRE Agent for specific KQL queries to run:

"Give me a KQL query to find the exact MySQL statements that caused the lock wait timeouts"

Syslog
| where TimeGenerated > ago(1h)
| where Computer == "db-vm"
| where SyslogMessage contains "Lock wait" or SyslogMessage contains "Deadlock"
| project TimeGenerated, SyslogMessage
| order by TimeGenerated desc
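
If you want to reuse a query like this for other hosts or search terms, it can be generated from parameters. The sketch below is only an illustration under stated assumptions: `kql_string` and `build_syslog_query` are hypothetical helpers, the escaping covers just backslashes and double quotes in KQL string literals, and the lookback is restricted to simple values such as `30m` or `1h`.

```python
import re


def kql_string(value: str) -> str:
    """Quote a value as a KQL string literal, escaping backslashes and quotes."""
    escaped = value.replace("\\", "\\\\").replace('"', '\\"')
    return f'"{escaped}"'


def build_syslog_query(computer: str, terms: list[str], lookback: str = "1h") -> str:
    """Build a Syslog query for one computer that matches any of the given terms."""
    if not terms:
        raise ValueError("at least one search term is required")
    # Only allow simple timespans (digits plus s/m/h/d) since this value is not quoted.
    if not re.fullmatch(r"\d+[smhd]", lookback):
        raise ValueError(f"unsupported lookback: {lookback!r}")
    term_filter = " or ".join(f"SyslogMessage contains {kql_string(t)}" for t in terms)
    return "\n".join([
        "Syslog",
        f"| where TimeGenerated > ago({lookback})",
        f"| where Computer == {kql_string(computer)}",
        f"| where {term_filter}",
        "| project TimeGenerated, SyslogMessage",
        "| order by TimeGenerated desc",
    ])


print(build_syslog_query("db-vm", ["Lock wait", "Deadlock"]))
```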

SRE Agent providing KQL query

⏱️ Time from alert to root cause identification: Under 5 minutes.

🚫 No VPN. 🚫 No Bastion. 🚫 No SSH. ✅ Just answers.


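🛡️ Why This Matters

Security Benefits

| Concern | How This Architecture Addresses It |
|---------|------------------------------------|
| 🔒 Data in transit | Private Link keeps traffic between the MCP server and Log Analytics on the Azure backbone; agent-to-MCP traffic is encrypted with TLS |
| 🌐 Public exposure | VMs have zero public IPs |
| 🔑 Access control | RBAC on Log Analytics, no direct VM access needed |
| 📝 Audit trail | MCP requests are logged by Container Apps; workspace queries can be recorded in the LAQueryLogs table when query auditing is enabled |
| ✅ Compliance | Log data stays in your workspace; only query results are returned to the agent |

Operational Benefits

| 🐒 Old Way | ⚑ New Way |
|-----------|-----------|
| VPN + Bastion + SSH | Open browser, ask a question |
| grep across 10 VMs | Single query, all VMs at once |
| Requires network expertise | Natural language queries |
| 30+ minutes to understand | Under 5 minutes to root cause |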

🔐 Security Deep Dive: MCP Bridge vs. Pure Private Link

For security-minded readers, let's be clear about the architecture, the tradeoffs, and how this pattern compares to pure Private Link solutions.

Understanding the Two Patterns

🔗 Pure Private Link (Multitenant SaaS Pattern)

The Azure Private Link in multitenant solutions pattern is designed for SaaS providers exposing services to tenants:

| Aspect | How It Works |
|--------|--------------|
| Direction | Tenant VNet → SaaS Provider (inbound to provider) |
| Purpose | Allow tenants to access a SaaS service privately from their VNet |
| Network Path | Tenant VNet → Private Endpoint → Provider's Load Balancer/PaaS |
| Traffic | Entirely on Azure backbone, no public internet |
| Use Case | Enterprise customers accessing SaaS from their private networks |

┌─────────────────┐         ┌─────────────────┐         ┌─────────────────┐
│   Tenant A      │         │   Tenant B      │         │   Your SaaS     │
│     VNet        │         │     VNet        │         │   Service       │
│                 │         │                 │         │                 │
│ ┌─────────────┐ │         │ ┌─────────────┐ │         │                 │
│ │  Private    ├─┼─────────┼─┤  Private    ├─┼────────►│                 │
│ │  Endpoint   │ │  Azure  │ │  Endpoint   │ │  Azure  │                 │
│ └─────────────┘ │ Backbone│ └─────────────┘ │ Backbone│                 │
└─────────────────┘         └─────────────────┘         └─────────────────┘

✅ Entire path is private, with no public internet exposure

🌉 MCP Bridge Pattern (Our Approach)

Our pattern is designed for a different problem: enabling an external AI agent to query resources that are (correctly!) protected by Private Link:

| Aspect | How It Works |
|--------|--------------|
| Direction | External AI Agent → MCP Server (in VNet) → Private Resource |
| Purpose | Allow external AI to query customer's own private resources |
| Network Path | Public HTTPS → MCP Server → Private Endpoint → Log Analytics |
| Traffic | Public on ingress, private on the data plane |
| Use Case | AI-assisted investigation of Private Link-protected resources |

┌──────────────────────────────────────────────────────────────────────┐
│                        Customer's VNet                               │
│                                                                      │
│  ┌─────────────────┐  Private   ┌─────────────────┐                  │
│  │  MCP Server     │  Endpoint  │  Log Analytics  │                  │
│  │  (Container App)│───────────►│  (Private Only) │                  │
│  └────────▲────────┘            └─────────────────┘                  │
│           │                                                          │
└───────────┼──────────────────────────────────────────────────────────┘
            │ Public HTTPS
            │ (API Key + TLS)
    ┌───────┴───────┐
    │  SRE Agent    │
    │  (external)   │
    └───────────────┘

⚠️ Public ingress point exists, but Log Analytics remains fully private

Key Difference: The Public Ingress Point

Let's be explicit about the tradeoff:

| Aspect | Pure Private Link | MCP Bridge Pattern |
|--------|-------------------|--------------------|
| Ingress from client | Private (VNet/VPN/ExpressRoute) | Public HTTPS |
| Data plane access | Private | Private (via Private Endpoint) |
| Works with external AI | ❌ No, client must be in VNet | ✅ Yes, that's the purpose |
| Zero public exposure | ✅ Yes | ⚠️ MCP server has public endpoint |

Why the public endpoint?
SRE Agent runs as a cloud service; it is not deployed in your VNet. To enable AI-assisted investigation, we need a bridge between the external agent and the private resources. The MCP server is that bridge.

What's Protected, What's Exposed

| Component | Access Level | Why |
|-----------|--------------|-----|
| Log Analytics Workspace | 🔒 Private Only | queryAccessMode: PrivateOnly enforced, no public queries |
| MCP Server Ingress | 🌐 Public HTTPS | Required for SRE Agent to reach it |
| MCP Server → Log Analytics | 🔒 Private | Uses Private Endpoint inside VNet |
| Query Results | 🔒 Encrypted | Returned over TLS, processed by MCP server |

💡 Key insight: Log Analytics is never exposed publicly. The MCP server acts as a controlled gateway that only exposes specific operations (the MCP tools), not raw Log Analytics access.
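
One way to picture the controlled-gateway idea is a dispatch table: only tool names registered in the table can be invoked, and anything else is rejected before it gets near Log Analytics. The Python sketch below is illustrative only. The handler bodies are stubs, and `TOOLS` and `call_tool` are hypothetical names rather than the actual server code.

```python
from typing import Any, Callable


# Stub handlers; a real server would query Log Analytics here.
def list_tables(arguments: dict[str, Any]) -> str:
    return "stub: table list"


def query_logs(arguments: dict[str, Any]) -> str:
    return f"stub: would run {arguments['query']!r}"


# Only names registered in this table are reachable from outside.
TOOLS: dict[str, Callable[[dict[str, Any]], str]] = {
    "list_tables": list_tables,
    "query_logs": query_logs,
}


def call_tool(name: str, arguments: dict[str, Any]) -> str:
    """Dispatch a tool call, rejecting any name that is not registered."""
    handler = TOOLS.get(name)
    if handler is None:
        raise ValueError(f"Unknown tool: {name}")
    return handler(arguments)
```

With this shape, the exposed surface is exactly the set of keys in `TOOLS`, which makes it easy to review what an external caller can do.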

How We Secure the Public Ingress

The MCP server's public endpoint is protected by multiple layers:

| Control | Implementation | Purpose |
|---------|----------------|---------|
| 🔑 API Key | X-API-Key header required on all requests | Only authorized clients can call the MCP server |
| 🔐 TLS | HTTPS encryption for all traffic | Data in transit is encrypted |
| 🪪 Managed Identity | MCP server uses Azure MI for Log Analytics | No Log Analytics credentials in code or config |
| 🛡️ RBAC | MCP server has only Log Analytics Reader role | Minimal permissions |
| 📦 Scoped Tools | MCP exposes only specific operations | Can't run arbitrary commands; query_logs accepts any KQL but is limited to read-only access by the Reader role |
| 📝 Audit Logging | All queries logged in Container Apps | Full audit trail |
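
The article does not show how the server validates the X-API-Key header, so the following is one possible approach rather than the actual implementation. It is a minimal Python sketch that compares the presented key with the expected one in constant time; `is_authorized` is a hypothetical helper, and in practice the expected key would come from configuration such as the MCP_API_KEY environment variable.

```python
import hmac


def is_authorized(headers: dict[str, str], expected_key: str) -> bool:
    """Check the X-API-Key header using a constant-time comparison."""
    # HTTP header names are case-insensitive, so normalize before lookup.
    normalized = {name.lower(): value for name, value in headers.items()}
    presented = normalized.get("x-api-key", "")
    # Reject empty keys on either side so a missing config never matches.
    if not expected_key or not presented:
        return False
    return hmac.compare_digest(presented.encode(), expected_key.encode())
```

Using `hmac.compare_digest` instead of `==` avoids leaking how many leading characters matched through response timing.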

Production Security Hardening

For production deployments, add these additional controls:

| Security Layer | How to Implement | Impact |
|----------------|------------------|--------|
| 🪪 Entra ID Authentication | Enable Container Apps Easy Auth with Entra ID | Only authenticated users/apps can call MCP |
| 🌐 IP Allowlisting | Restrict ingress to known SRE Agent IP ranges | Limits who can reach the endpoint |
| 🔒 Key Vault for API Keys | Store MCP_API_KEY in Azure Key Vault | Secrets rotation, no plaintext keys |
| 📊 Azure Monitor Alerts | Alert on unusual query patterns | Detect potential abuse |
| 🚦 Rate Limiting | Configure Container Apps scaling limits | Prevent denial of service |
| 🔍 Diagnostic Logging | Enable Container Apps system logs | Forensic investigation capability |

Adding Entra ID authentication:
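# Configure the Microsoft Entra ID provider for Easy Auth on the Container App.
# Flag names can differ between CLI versions; confirm with
# `az containerapp auth microsoft update --help` before running.
az containerapp auth microsoft update \
  --name log-analytics-mcp-vnet \
  --resource-group vnet-test-rg \
  --client-id "your-app-registration-client-id" \
  --client-secret "your-app-registration-secret" \
  --issuer "https://login.microsoftonline.com/your-tenant-id/v2.0" \
  --yes

# Turn authentication on and reject unauthenticated requests
az containerapp auth update \
  --name log-analytics-mcp-vnet \
  --resource-group vnet-test-rg \
  --enabled true \
  --unauthenticated-client-action Return401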

IP restriction configuration:
# Restrict ingress to specific IP ranges
az containerapp ingress access-restriction set \
  --name log-analytics-mcp-vnet \
  --resource-group vnet-test-rg \
  --action Allow \
  --ip-address 52.x.x.x/32 \
  --rule-name "SREAgentIP" \
  --description "Allow SRE Agent service IPs"

When to Use Each Pattern

Use Pure Private Link When:

| Scenario | Example |
|----------|---------|
| All clients are in Azure VNets | Internal microservices |
| Clients connect via VPN/ExpressRoute | Enterprise users on corporate network |
| Zero public exposure is mandated | Compliance requirement for no public endpoints |
| You're a SaaS provider serving enterprise tenants | B2B API where all partners have Azure VNets |
| You control both ends | Your app calling your database |

Use MCP Bridge When:

| Scenario | Example |
|----------|---------|
| External AI agent needs to query private resources | SRE Agent investigating private Log Analytics |
| Can't deploy the AI client inside the VNet | Cloud-hosted AI services |
| Need natural language interface to private data | "Show me errors from my private VMs" |
| Customer wants AI assistance but has Private Link enabled | AI for security-conscious enterprise |
| You want investigation without VPN/Bastion access | 3 AM incident response from phone |

Decision Tree

                    ┌───────────────────────────────────────┐
                    │ Does the client run inside an         │
                    │ Azure VNet (or via VPN/ExpressRoute)? │
                    └────────────────┬──────────────────────┘
                                     │
                    ┌────────────────┴────────────────┐
                    ▼                                 ▼
                   YES                               NO
                    │                                 │
                    ▼                                 ▼
         ┌──────────────────┐              ┌──────────────────┐
         │ Use Private Link │              │ Is the client an │
         │ (end-to-end      │              │ AI agent/tool?   │
         │  private)        │              └────────┬─────────┘
         └──────────────────┘                       │
                                      ┌─────────────┴─────────────┐
                                      ▼                           ▼
                                     YES                          NO
                                      │                           │
                                      ▼                           ▼
                           ┌──────────────────┐        ┌──────────────────┐
                           │ Use MCP Bridge   │        │ Consider:        │
                           │ (proxy pattern)  │        │ - API Gateway    │
                           └──────────────────┘        │ - Azure Front    │
                                                       │   Door + WAF     │
                                                       │ - App Service    │
                                                       │   with auth      │
                                                       └──────────────────┘
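
The same decision tree can be written as a small function, which can be handy in a design checklist or review script. This is a direct transcription of the two questions in the diagram; the function name `choose_pattern` and the returned strings are illustrative.

```python
def choose_pattern(client_in_vnet: bool, client_is_ai_agent: bool) -> str:
    """Map the two decision-tree questions to a recommended pattern."""
    # Question 1: does the client run inside an Azure VNet (or via VPN/ExpressRoute)?
    if client_in_vnet:
        return "Private Link (end-to-end private)"
    # Question 2: is the external client an AI agent or tool?
    if client_is_ai_agent:
        return "MCP Bridge (proxy pattern)"
    return "Consider API Gateway, Azure Front Door + WAF, or App Service with auth"


# Example: SRE Agent runs outside the VNet and is an AI agent.
print(choose_pattern(client_in_vnet=False, client_is_ai_agent=True))
```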

Risk Assessment

| Risk | Likelihood | Impact | Mitigation |
|------|------------|--------|------------|
| API key exposure | Low | High | Store in Key Vault, rotate regularly, use Entra ID instead |
| Unauthorized MCP access | Low | Medium | IP allowlisting, Entra ID auth, rate limiting |
| Overly permissive queries | Low | Medium | MCP tools are scoped and the identity is read-only (Log Analytics Reader), so queries can read data but can't change it |
| Data exfiltration via MCP | Very Low | High | Audit logging, query result size limits, alerting |
| MCP server compromise | Very Low | High | Container scanning, regular updates, network policies |
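
A query result size limit can be as simple as capping the number of rows a tool returns and telling the caller that truncation happened. The sketch below assumes results arrive as a list of dictionaries; the `cap_rows` helper and the default of 500 rows are hypothetical choices, not values taken from the demo server.

```python
def cap_rows(rows: list[dict], max_rows: int = 500) -> dict:
    """Return at most max_rows rows, plus metadata about any truncation."""
    if max_rows < 1:
        raise ValueError("max_rows must be positive")
    return {
        "rows": rows[:max_rows],
        "truncated": len(rows) > max_rows,
        "total_rows": len(rows),
    }


# Example: ten rows capped to three.
result = cap_rows([{"n": i} for i in range(10)], max_rows=3)
print(result["truncated"], len(result["rows"]), result["total_rows"])
```

Reporting `truncated` and `total_rows` lets the agent narrow its query instead of silently reasoning over partial data.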

Bottom Line: Is This Pattern Secure?

Yes, with proper configuration.

| Security Question | Answer |
|-------------------|--------|
| Is Log Analytics exposed publicly? | No, it remains Private Link-protected |
| Can anyone query via the MCP server? | No, API key/Entra ID required |
| Are queries audited? | Yes, via Container Apps logs and, when enabled, Log Analytics query auditing (LAQueryLogs) |
| Is this as secure as pure Private Link? | Almost: public ingress exists but is well-protected |
| Is this secure enough for production? | Yes, with the hardening steps above |

🔐 The pattern is secure for its intended purpose: enabling an external AI agent to query private resources without exposing those resources publicly.


🧪 Proving It Works: Blocked vs. Allowed

Let's prove that Private Link actually blocks queries from outside and that the VNet MCP server can still reach the workspace through its private endpoint.

❌ Test 1: Query from Outside the VNet (BLOCKED)

First, try querying Log Analytics from a server outside the VNet. This simulates what happens when SRE Agent tries to query directly:

cURL command:
# MCP server running OUTSIDE the VNet
curl -X POST "https://log-analytics-mcp.happydune-ad37d82a.eastus2.azurecontainerapps.io/mcp/" \
  -H "Content-Type: application/json" \
  -H "Accept: application/json, text/event-stream" \
  -H "X-API-Key: your-api-key" \
  -d '{
    "jsonrpc": "2.0",
    "method": "tools/call",
    "params": {
      "name": "list_tables",
      "arguments": {}
    },
    "id": 1
  }'

Result: BLOCKED ❌

{
  "jsonrpc": "2.0",
  "id": 1,
  "error": {
    "code": -32603,
    "message": "InsufficientAccessError: The query was blocked due to private link configuration. Access is denied because this request was not made through a private endpoint..."
  }
}

✅ Test 2: Query from Inside the VNet (SUCCESS)

Now try the same query using the MCP server running inside the VNet:

cURL command:
# MCP server running INSIDE the VNet
curl -X POST "https://log-analytics-mcp-vnet.redbay-525c1802.eastus.azurecontainerapps.io/mcp/" \
  -H "Content-Type: application/json" \
  -H "Accept: application/json, text/event-stream" \
  -H "X-API-Key: your-api-key" \
  -d '{
    "jsonrpc": "2.0",
    "method": "tools/call",
    "params": {
      "name": "list_tables",
      "arguments": {}
    },
    "id": 1
  }'

Result: SUCCESS ✅

{
  "jsonrpc": "2.0",
  "id": 1,
  "result": {
    "content": [
      {
        "type": "text",
        "text": "Found 15 tables in the workspace:\n\n1. AzureDiagnostics\n2. ContainerAppConsoleLogs\n3. ContainerAppSystemLogs\n..."
      }
    ]
  }
}

💡 This proves the architecture works: The VNet MCP server can access the Private Link-protected workspace while external access is blocked.
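
To script this comparison instead of reading curl output by eye, the request body and the pass/fail check can be expressed in Python. The sketch below uses only the standard library and makes no network call. `build_tool_call` and `classify_response` are hypothetical helpers, and the classifier assumes the server answered with a single plain JSON body rather than an event stream.

```python
import json


def build_tool_call(name: str, arguments: dict, request_id: int = 1) -> str:
    """Serialize a JSON-RPC 2.0 tools/call request like the one sent with curl."""
    return json.dumps({
        "jsonrpc": "2.0",
        "method": "tools/call",
        "params": {"name": name, "arguments": arguments},
        "id": request_id,
    })


def classify_response(body: str) -> str:
    """Return 'ok', 'blocked', or 'error' for a JSON-RPC response body."""
    response = json.loads(body)
    error = response.get("error")
    if error is None:
        return "ok"
    # The Private Link block shows up as InsufficientAccessError in the message.
    if "InsufficientAccessError" in error.get("message", ""):
        return "blocked"
    return "error"


print(build_tool_call("list_tables", {}))
```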


🚀 Try It Yourself

Want to experience this pattern firsthand? Here's how to deploy the full demo.

📋 Prerequisites

  • ✅ Azure subscription with Contributor access, plus permission to create role assignments (for example, Owner or User Access Administrator)
  • ✅ Azure CLI installed
  • ✅ Docker (for building the MCP server image)

1️⃣ Deploy the VNet-Integrated Environment

Azure CLI commands:
# Create resource group
az group create --name vnet-test-rg --location eastus

# Create VNet
az network vnet create \
  --resource-group vnet-test-rg \
  --name vnet-test-vnet \
  --address-prefix 10.0.0.0/16 \
  --subnet-name infrastructure \
  --subnet-prefix 10.0.0.0/23

# Create VNet-integrated Container Apps environment
az containerapp env create \
  --name vnet-test-env \
  --resource-group vnet-test-rg \
  --location eastus \
  --infrastructure-subnet-resource-id "/subscriptions/{sub}/resourceGroups/vnet-test-rg/providers/Microsoft.Network/virtualNetworks/vnet-test-vnet/subnets/infrastructure" \
  --internal-only false

# Create Log Analytics workspace
az monitor log-analytics workspace create \
  --resource-group vnet-test-rg \
  --workspace-name workspace-vnettest \
  --location eastus

2️⃣ Configure Private Link (Block Public Access)

Azure CLI commands:
# Create Azure Monitor Private Link Scope
az monitor private-link-scope create \
  --name vnet-test-ampls \
  --resource-group vnet-test-rg

# Link workspace to AMPLS
az monitor private-link-scope scoped-resource create \
  --name log-analytics-link \
  --resource-group vnet-test-rg \
  --scope-name vnet-test-ampls \
  --linked-resource "/subscriptions/{sub}/resourceGroups/vnet-test-rg/providers/Microsoft.OperationalInsights/workspaces/workspace-vnettest"

# Set AMPLS to Private Only (block public queries)
az monitor private-link-scope update \
  --name vnet-test-ampls \
  --resource-group vnet-test-rg \
  --query-access PrivateOnly

# Disable public query access on workspace
az monitor log-analytics workspace update \
  --resource-group vnet-test-rg \
  --workspace-name workspace-vnettest \
  --set properties.publicNetworkAccessForQuery=Disabled

# Note: for the MCP server to query privately, the VNet also needs a private
# endpoint connected to the AMPLS (group ID "azuremonitor") and the matching
# Azure Monitor private DNS zones.

3️⃣ Deploy the MCP Server

Azure CLI commands:
# Create an Azure Container Registry to host the MCP server image
az acr create \
  --resource-group vnet-test-rg \
  --name yourregistryname \
  --sku Basic

# Build and push the MCP server image
az acr build \
  --registry yourregistryname \
  --image log-analytics-mcp:latest \
  --file Dockerfile .

# Deploy Container App with Managed Identity
az containerapp create \
  --name log-analytics-mcp-vnet \
  --resource-group vnet-test-rg \
  --environment vnet-test-env \
  --image yourregistryname.azurecr.io/log-analytics-mcp:latest \
  --target-port 8000 \
  --ingress external \
  --env-vars "LOG_ANALYTICS_WORKSPACE_ID=your-workspace-guid" "MCP_API_KEY=your-api-key" \
  --system-assigned \
  --registry-server yourregistryname.azurecr.io

# Grant Log Analytics Reader role to the Container App
az role assignment create \
  --assignee-object-id $(az containerapp show --name log-analytics-mcp-vnet --resource-group vnet-test-rg --query identity.principalId -o tsv) \
  --role "Log Analytics Reader" \
  --scope "/subscriptions/{sub}/resourceGroups/vnet-test-rg/providers/Microsoft.OperationalInsights/workspaces/workspace-vnettest"

4️⃣ Configure SRE Agent

Add the MCP server to your SRE Agent subagent configuration:

name: PrivateVNetObservabilityAgent
description: Investigate issues in private VNet resources via MCP
system_prompt: |
  You investigate issues in private VNet environments where Log Analytics 
  queries are blocked by Private Link. Use the MCP tools to query logs.

mcp_tools:
  - LogAnalyticsMCP

tool_definitions:
  LogAnalyticsMCP:
    type: mcp
    description: Query Log Analytics in private VNet
    transport: streamable_http
    url: https://log-analytics-mcp-vnet.redbay-525c1802.eastus.azurecontainerapps.io/mcp/
    headers:
      X-API-Key: "${MCP_API_KEY}"

🤖 Configuring the Subagent in SRE Agent Portal

To use this MCP server with Azure SRE Agent, you need to configure both an MCP Connection and a Subagent. Here's the complete setup.

Step 1: Configure the MCP Connection

In the Azure SRE Agent portal, navigate to Settings → MCP Connections and add:

| Field | Value |
|-------|-------|
| Name | LogAnalyticsMCP |
| URL | https://log-analytics-mcp-vnet.{your-env}.{region}.azurecontainerapps.io/mcp/ |
| Transport | HTTP (Streamable) |
| Authentication | API Key |
| Header Name | X-API-Key |
| Header Value | Your MCP server API key |

Step 2: Create the Subagent

In the Subagent Builder, create a new subagent with the following configuration:

Subagent Name

PrivateVNetObservabilityAgent

Description (Handoff Description)

This is critical for proper routing. The description tells the meta-agent when to hand off to this subagent:

Hand off to this agent when the user needs to:
- Debug resources in private VNets that can't be accessed directly
- Investigate network connectivity issues in isolated environments
- Query logs from VMs or Container Apps behind Private Link
- Analyze errors from resources without public network access
This agent uses Log Analytics MCP to access logs via Private Link.

System Prompt

You are a specialized Site Reliability Engineer focused on debugging issues with 
resources deployed in private Azure Virtual Networks.

## Your Expertise
- Troubleshooting connectivity issues in private VNets
- Analyzing logs from resources that aren't directly accessible via public internet
- Identifying network configuration problems
- Diagnosing DNS resolution failures in private networks
- Debugging Azure Container Apps deployed in private VNet environments

## Available MCP Tools
You have access to the Log Analytics MCP server which queries logs from resources
connected to Log Analytics workspaces via Private Link:

1. **query_logs** - Execute KQL queries against the Log Analytics workspace
   - Parameters: 
     - query (string, required): The KQL query to execute
     - timespan (string, optional): Time range like "PT1H", "P1D", "P7D"
   - Example: query_logs(query="ContainerAppConsoleLogs_CL | take 10", timespan="PT24H")

2. **list_tables** - Discover available log tables in the workspace
   - No parameters required
   - Returns table names and their schema

3. **get_workspace_info** - Get workspace configuration details
   - No parameters required
   - Returns workspace ID, location, retention settings

4. **analyze_errors** - Quick error trend analysis
   - Parameters: hours (integer, optional, default: 24)
   - Returns error counts and patterns

5. **check_vm_health** - Check VM heartbeat and connectivity status
   - No parameters required
   - Returns VM health status from Heartbeat table

## Troubleshooting Workflow
1. First understand the problem scope (which resources, what symptoms)
2. Use **list_tables** to discover available log sources
3. Check resource health:
   - For VMs: use **check_vm_health** or query Heartbeat table
   - For Container Apps: query ContainerAppSystemLogs_CL
4. Search for errors using **query_logs** with KQL
5. Use **analyze_errors** for quick error pattern detection
6. Correlate errors with recent deployments or changes
7. Provide actionable remediation steps

## Common Log Tables
| Table | Description |
|-------|-------------|
| Heartbeat | VM connectivity status |
| ContainerAppSystemLogs_CL | Container App system events (start, stop, scale) |
| ContainerAppConsoleLogs_CL | Container App console output (stdout/stderr) |
| AppExceptions | Application exceptions |
| AppTraces | Application traces |
| AzureDiagnostics | Azure resource diagnostics |

## Sample KQL Queries

**Recent Container App events:**
ContainerAppSystemLogs_CL 
| where TimeGenerated > ago(1h)
| project TimeGenerated, ContainerAppName_s, Type_s, Reason_s, Log_s
| order by TimeGenerated desc

**Container App errors:**
ContainerAppConsoleLogs_CL
| where TimeGenerated > ago(24h)
| where Log_s contains "error" or Log_s contains "exception"
| project TimeGenerated, ContainerAppName_s, Log_s

**VM connectivity check:**
Heartbeat
| where TimeGenerated > ago(1h)
| summarize LastHeartbeat = max(TimeGenerated) by Computer
| extend Status = iff(LastHeartbeat < ago(5m), "Disconnected", "Connected")

## Best Practices
- Always start with a time-bounded query (use timespan parameter)
- Use summarize and project to keep results manageable
- Correlate errors across multiple tables when investigating
- Provide specific, actionable recommendations
- When no data is found, suggest checking if the resource is sending logs
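
The timespan values named in the prompt ("PT1H", "P1D", "P7D") are ISO 8601 durations. As a rough sketch of how a server could turn such a string into a `timedelta` before running the query, consider the helper below. `parse_timespan` is a hypothetical name, not something taken from the MCP server's code, and it only handles days, hours, minutes, and seconds.

```python
import re
from datetime import timedelta

# Matches a subset of ISO 8601 durations: PnDTnHnMnS (no weeks, months, or years).
_DURATION = re.compile(
    r"^P(?:(?P<days>\d+)D)?"
    r"(?:T(?:(?P<hours>\d+)H)?(?:(?P<minutes>\d+)M)?(?:(?P<seconds>\d+)S)?)?$"
)


def parse_timespan(value: str) -> timedelta:
    """Convert a duration such as 'PT1H' or 'P7D' into a timedelta."""
    match = _DURATION.match(value)
    parts = {}
    if match:
        parts = {k: int(v) for k, v in match.groupdict().items() if v is not None}
    # Reject non-matches, empty durations ("P", "PT"), and a dangling "T" ("P1DT").
    if not parts or value.endswith("T"):
        raise ValueError(f"Unsupported timespan: {value!r}")
    return timedelta(**parts)


for example in ("PT1H", "P1D", "P7D"):
    print(example, "->", parse_timespan(example))
```

Rejecting anything outside this subset keeps a malformed timespan from silently turning into an unbounded query.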

MCP Tools

Select the MCP connection you created:

LogAnalyticsMCP

Handoffs

Enable handoff back to the main agent:

meta_agent

Settings

| Setting | Value |
|---------|-------|
| Temperature | 0.3 |
| Allow Parallel Tool Calls | ✅ Enabled |
| Max Reflection Count | 2 |

Complete YAML Configuration

For reference, here's the complete YAML that you can import into the Subagent Builder:

name: PrivateVNetObservabilityAgent

system_prompt: |
  You are a specialized Site Reliability Engineer focused on debugging issues with 
  resources deployed in private Azure Virtual Networks.
  
  ## Your Expertise
  - Troubleshooting connectivity issues in private VNets
  - Analyzing logs from resources that aren't directly accessible via public internet
  - Identifying network configuration problems
  - Diagnosing DNS resolution failures in private networks
  - Debugging Azure Container Apps deployed in private VNet environments
  
  ## Available MCP Tools
  You have access to the Log Analytics MCP server which queries logs from resources
  connected to Log Analytics workspaces via Private Link:
  
  1. **query_logs** - Execute KQL queries against the Log Analytics workspace
     - Parameters: 
       - query (string, required): The KQL query to execute
       - timespan (string, optional): Time range like "PT1H", "P1D", "P7D"
     - Example: query_logs(query="ContainerAppConsoleLogs_CL | take 10", timespan="PT24H")
  
  2. **list_tables** - Discover available log tables in the workspace
     - No parameters required
     - Returns table names and their schema
  
  3. **get_workspace_info** - Get workspace configuration details
     - No parameters required
     - Returns workspace ID, location, retention settings
  
  4. **analyze_errors** - Quick error trend analysis
     - Parameters: hours (integer, optional, default: 24)
     - Returns error counts and patterns
  
  5. **check_vm_health** - Check VM heartbeat and connectivity status
     - No parameters required
     - Returns VM health status from Heartbeat table
  
  ## Troubleshooting Workflow
  1. First understand the problem scope (which resources, what symptoms)
  2. Use **list_tables** to discover available log sources
  3. Check resource health:
     - For VMs: use **check_vm_health** or query Heartbeat table
     - For Container Apps: query ContainerAppSystemLogs_CL
  4. Search for errors using **query_logs** with KQL
  5. Use **analyze_errors** for quick error pattern detection
  6. Correlate errors with recent deployments or changes
  7. Provide actionable remediation steps
  
  ## Common Log Tables
  | Table | Description |
  |-------|-------------|
  | Heartbeat | VM connectivity status |
  | ContainerAppSystemLogs_CL | Container App system events (start, stop, scale) |
  | ContainerAppConsoleLogs_CL | Container App console output (stdout/stderr) |
  | AppExceptions | Application exceptions |
  | AppTraces | Application traces |
  | AzureDiagnostics | Azure resource diagnostics |
  
  ## Sample KQL Queries
  
  **Recent Container App events:**
  ContainerAppSystemLogs_CL 
  | where TimeGenerated > ago(1h)
  | project TimeGenerated, ContainerAppName_s, Type_s, Reason_s, Log_s
  | order by TimeGenerated desc
  
  **Container App errors:**
  ContainerAppConsoleLogs_CL
  | where TimeGenerated > ago(24h)
  | where Log_s contains "error" or Log_s contains "exception"
  | project TimeGenerated, ContainerAppName_s, Log_s
  
  **VM connectivity check:**
  Heartbeat
  | where TimeGenerated > ago(1h)
  | summarize LastHeartbeat = max(TimeGenerated) by Computer
  | extend Status = iff(LastHeartbeat < ago(5m), "Disconnected", "Connected")
  
  ## Best Practices
  - Always start with a time-bounded query (use timespan parameter)
  - Use summarize and project to keep results manageable
  - Correlate errors across multiple tables when investigating
  - Provide specific, actionable recommendations
  - When no data is found, suggest checking if the resource is sending logs

handoff_description: |
  Hand off to this agent when the user needs to:
  - Debug resources in private VNets that can't be accessed directly
  - Investigate network connectivity issues in isolated environments
  - Query logs from VMs or Container Apps behind Private Link
  - Analyze errors from resources without public network access
  This agent uses Log Analytics MCP to access logs via Private Link.

# MCP tools from the LogAnalyticsMCP connection
mcp_tools:
  - LogAnalyticsMCP

# Allow handoff back to meta_agent for general queries
handoffs:
  - meta_agent

# Agent settings
temperature: 0.3
allow_parallel_tool_calls: true
max_reflection_count: 2

5️⃣ Test the Integration

Ask SRE Agent to investigate:

💬 "Show me the tables available in my private Log Analytics workspace"

💬 "Query the last 10 container logs from my application"

💬 "Check if there are any errors in the system logs"
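
If you want to smoke-test the server without going through the agent, each of the prompts above would plausibly end up as an MCP `tools/call` message. The sketch below builds those JSON-RPC payloads. The mapping from prompt to tool is an assumption (the agent decides which tool to use), `tool_call` is a hypothetical helper, and the tool names and arguments are the ones listed in the system prompt. A real session would first send an `initialize` message.

```python
import itertools
import json

# JSON-RPC request IDs only need to be unique within a session.
_ids = itertools.count(1)


def tool_call(name: str, **arguments) -> dict:
    """Build a JSON-RPC 2.0 'tools/call' message for an MCP tool."""
    return {
        "jsonrpc": "2.0",
        "id": next(_ids),
        "method": "tools/call",
        "params": {"name": name, "arguments": arguments},
    }


# One payload per test prompt (assumed mapping, for illustration only).
smoke_tests = [
    tool_call("list_tables"),
    tool_call("query_logs", query="ContainerAppConsoleLogs_CL | take 10", timespan="PT1H"),
    tool_call("analyze_errors", hours=24),
]

for message in smoke_tests:
    print(json.dumps(message))
```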


🔧 The Architecture in Detail

Why This Works

The key is that data ingestion and query access use different network paths:

| Path | Direction | Network |
|------|-----------|---------|
| 📥 Azure Monitor Agent → Log Analytics | Ingest | Can use Private Endpoint OR public |
| ❌ External Query → Log Analytics | Query | BLOCKED by Private Link |
| ✅ VNet MCP → Log Analytics | Query | ALLOWED via Private Endpoint |
| 🌐 SRE Agent → MCP Server | Query | Public HTTPS (MCP protocol) |

💡 The MCP server acts as a query proxy that runs inside the trusted network boundary.


MCP Server Configuration

The Log Analytics MCP Server is open-source and uses:

| Component | Technology |
|-----------|------------|
| 🐍 Runtime | Python 3.10+ with mcp[cli] framework |
| 📊 SDK | azure-monitor-query for Log Analytics |
| 🌐 Transport | Streamable HTTP for remote access |
| 🔑 Auth | Managed Identity for Azure authentication |

📦 GitHub Repository: github.com/BandaruDheeraj/log-analytics-mcp-server
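
To make the query-proxy idea concrete, here is a minimal sketch of the core step: authenticate with Managed Identity, run a KQL query through azure-monitor-query, and flatten the result into plain dictionaries. This is not the repository's actual code. `make_logs_client`, `run_query`, and the `max_rows` cap are illustrative names and choices, and the Azure imports sit inside the factory so the rest of the sketch runs without those packages installed.

```python
from datetime import timedelta
from typing import Any


def make_logs_client() -> Any:
    """Create a Log Analytics client that authenticates with Managed Identity.

    Requires the azure-identity and azure-monitor-query packages.
    """
    from azure.identity import ManagedIdentityCredential
    from azure.monitor.query import LogsQueryClient

    return LogsQueryClient(ManagedIdentityCredential())


def run_query(client: Any, workspace_id: str, query: str,
              hours: int = 1, max_rows: int = 100) -> list[dict]:
    """Run a KQL query and return at most max_rows rows as column->value dicts."""
    response = client.query_workspace(
        workspace_id, query, timespan=timedelta(hours=hours)
    )
    # Full results expose .tables; partial results expose .partial_data instead.
    tables = response.tables if hasattr(response, "tables") else response.partial_data
    rows: list[dict] = []
    for table in tables:
        for row in table.rows:
            rows.append(dict(zip(table.columns, row)))
            if len(rows) >= max_rows:
                return rows
    return rows
```

Inside the VNet, a call might look like `run_query(make_logs_client(), "<workspace-id>", "Heartbeat | take 5")`. Capping the row count keeps responses small enough for an agent to reason over.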


Available MCP Tools

| Tool | Description |
|------|-------------|
| 🔍 query_logs | Execute any KQL query against the workspace |
| 📋 list_tables | Discover available tables and their schemas |
| 📊 get_workspace_info | Get workspace metadata and data volume stats |
| ⚠️ analyze_errors | Pre-built error pattern analysis |
| 💓 check_vm_health | Heartbeat, CPU, memory, error count summary |

🔒 Security Considerations

When deploying an MCP server as a query proxy, consider:

| Concern | Mitigation |
|---------|------------|
| 🔑 API Authentication | Use API key in header (X-API-Key) |
| 🪪 Managed Identity | Container App uses system-assigned MI for Azure auth |
| 🏠 Network Isolation | MCP server runs in VNet-integrated environment |
| 📝 Query Auditing | All queries logged in Container Apps system logs |
| 🛡️ RBAC | Grant only Log Analytics Reader role to MI |
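
For the API key mitigation, one detail worth getting right is how the key is compared. The sketch below shows a constant-time check on the X-API-Key header. `is_authorized` is a hypothetical helper rather than the server's actual middleware, and it assumes the request headers arrive as a plain dictionary.

```python
import hmac


def is_authorized(headers: dict[str, str], expected_key: str) -> bool:
    """Return True only if the X-API-Key header matches the expected key."""
    # HTTP header names are case-insensitive, so match on the lowercased name.
    supplied = next(
        (value for name, value in headers.items() if name.lower() == "x-api-key"),
        "",
    )
    # Refuse everything if no key is configured, and compare in constant time
    # so response timing does not leak how much of the key was correct.
    return bool(expected_key) and hmac.compare_digest(
        supplied.encode("utf-8"), expected_key.encode("utf-8")
    )


print(is_authorized({"X-API-Key": "s3cret"}, "s3cret"))
print(is_authorized({"X-API-Key": "guess"}, "s3cret"))
```

Failing closed when no key is configured avoids accidentally exposing the proxy if the secret is missing from the environment.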

🔄 Beyond Log Analytics

This MCP proxy pattern extends to any Azure service with Private Link:

| Service | Private Link | MCP Server Potential |
|---------|--------------|----------------------|
| Log Analytics | ✅ | Query logs via KQL |
| Application Insights | ✅ | Query app telemetry |
| Azure Data Explorer | ✅ | Query Kusto clusters |
| Azure SQL | ✅ | Query databases |
| Cosmos DB | ✅ | Query collections |
| Storage Accounts | ✅ | Access blob/table data |

💡 The pattern: Deploy an MCP server inside the private network, grant it access to the resource, and expose it to SRE Agent via Streamable HTTP.


🎯 Key Takeaways

πŸ” Private Link Blocks Queries

When you set publicNetworkAccessForQuery: Disabled and queryAccessMode: PrivateOnly, Log Analytics rejects any query that does not arrive through a private endpoint. This is network-level enforcement, not just encryption of data in transit.

🌉 MCP Servers as Bridges

MCP servers running inside your VNet can act as trusted query proxies. They authenticate with Managed Identity, query the protected resource, and return results over public HTTPS.

🚀 Streamable HTTP for Remote Access

The Streamable HTTP transport (/mcp/ endpoint) lets clients call an MCP server over the network instead of only through local STDIO. This is essential for cloud-hosted MCP servers.

⚑ Zero VPN, Zero Bastion

You don't need VPN access to debug issues in private networks. SRE Agent coordinates with the MCP server, which does the actual querying from inside the trusted boundary.

🔄 Reusable Pattern

This isn't just for Log Analytics. Any Private Link-protected resource can be made accessible to AI agents through a VNet-deployed MCP server.


🏁 Get Started

Ready to bring this pattern to your environment?

| Step | Action |
|------|--------|
| 1️⃣ | Deploy a VNet-integrated Container Apps environment |
| 2️⃣ | Set up Azure Monitor Private Link Scope with queryAccessMode: PrivateOnly |
| 3️⃣ | Disable public query access on your Log Analytics workspace |
| 4️⃣ | Deploy the Log Analytics MCP server with Managed Identity |
| 5️⃣ | Configure SRE Agent to use the MCP server via Streamable HTTP |
| 6️⃣ | Start investigating without ever needing VPN access |

📚 Resources

| Resource | Link |
|----------|------|
| 📦 Log Analytics MCP Server | GitHub |
| 🔗 Azure Monitor Private Link | Documentation |
| 🐳 Container Apps VNet Integration | Documentation |
| 🌐 MCP Streamable HTTP Transport | Specification |
| 📖 Sample Environment | README |

📝 This blog post is based on a real working demo. The MCP servers referenced (log-analytics-mcp-vnet and log-analytics-mcp) are live and demonstrate the blocked vs. allowed query behavior.

💬 Questions or feedback? Join the discussion on the Azure SRE Agent community.