Monitoring System Architecture
A distributed monitoring solution that collects metrics from multiple agent types and routes each message to an Azure service (Service Bus or Event Grid) selected by its message path. The system consists of a central collector service and specialized agents that can be deployed across different environments.
System Architecture
graph TB
subgraph "Azure Cloud Services"
SB[Azure Service Bus dev_sql_events]
EG[Azure Event Grid plain-e]
EG2[Azure Event Grid alerts-e]
end
subgraph "SHIR Collector Service"
COL[SHIR Collector port 9090]
HTTP[HTTP Receiver receive]
ROUTER[Message Router messagePaths]
SB_PUB[Service Bus Publisher]
EG_PUB[Event Grid Publisher]
EG2_PUB[Event Grid Publisher]
HEALTH[Health Server port 8080]
COL --> HTTP
HTTP --> ROUTER
ROUTER --> SB_PUB
ROUTER --> EG_PUB
ROUTER --> EG2_PUB
COL --> HEALTH
end
subgraph "Agent Ecosystem"
SA[SHIR Agent messagePath shir]
AA[Alert Agent messagePath alerts]
CA1[Custom Agent 1]
CA2[Custom Agent 2]
end
subgraph "Monitoring Targets"
VM1[VM Instance 1]
VM2[VM Instance 2]
VM3[VM Instance 3]
APP[Application Services]
DB[Database Systems]
end
SA -->|HTTP POST WorkerMessage| HTTP
AA -->|HTTP POST WorkerMessage| HTTP
CA1 -->|HTTP POST WorkerMessage| HTTP
CA2 -->|HTTP POST WorkerMessage| HTTP
SB_PUB -->|Publish| SB
EG_PUB -->|Publish| EG
EG2_PUB -->|Publish| EG2
SA --> VM1
SA --> VM2
SA --> VM3
AA --> APP
CA1 --> DB
CA2 --> APP
Message Flow
sequenceDiagram
participant Agent as SHIR Agent
participant Collector as SHIR Collector
participant Router as Message Router
participant Publisher as Service Bus or Event Grid
participant Azure as Azure Service
Note over Agent,Azure: Message Path shir
Agent->>Collector: HTTP POST receive
Note right of Agent: WorkerMessage with messagePath shir
Collector->>Router: Route by messagePath
Router->>Router: Lookup publisher for shir
Router->>Publisher: Forward MonitoringPayload
Publisher->>Azure: Publish to Service Bus
Azure-->>Publisher: Acknowledge
Publisher-->>Router: Success
Router-->>Collector: Success
Collector-->>Agent: HTTP 200 OK
Note over Agent,Azure: Different Message Path alerts
participant AlertAgent as Alert Agent
participant EG_Pub as Event Grid Publisher
AlertAgent->>Collector: HTTP POST receive
Note right of AlertAgent: WorkerMessage with messagePath alerts
Collector->>Router: Route by messagePath
Router->>Router: Lookup publisher for alerts
Router->>EG_Pub: Forward MonitoringPayload
EG_Pub->>Azure: Publish to Event Grid
Azure-->>EG_Pub: Acknowledge
EG_Pub-->>Router: Success
Router-->>Collector: Success
Collector-->>AlertAgent: HTTP 200 OK
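The flow above can be sketched in Python as a lookup table from messagePath to a publisher. This is a minimal illustration under stated assumptions, not the collector's actual code: `Router`, `register`, and `handle` are hypothetical names, the publishers are in-memory stand-ins for the Azure clients, and the 404 and 502 status codes for failures are assumptions (the diagram only shows the HTTP 200 success case).

```python
from typing import Callable, Dict, List, Tuple

# A publisher takes the payload and raises an exception on failure.
Publisher = Callable[[dict], None]


class Router:
    """Maps a messagePath to the publisher that should receive the payload."""

    def __init__(self) -> None:
        self._publishers: Dict[str, Publisher] = {}

    def register(self, message_path: str, publisher: Publisher) -> None:
        self._publishers[message_path] = publisher

    def handle(self, worker_message: dict) -> Tuple[int, str]:
        """Route one WorkerMessage and return an (HTTP status, reason) pair."""
        path = worker_message.get("messagePath")
        publisher = self._publishers.get(path) if isinstance(path, str) else None
        if publisher is None:
            # Assumption: unknown paths are rejected rather than silently dropped.
            return 404, f"no publisher for messagePath {path!r}"
        try:
            # The diagram forwards a "MonitoringPayload"; its shape is not shown,
            # so this sketch forwards the message unchanged.
            publisher(worker_message)
        except Exception as exc:
            return 502, f"publish failed: {exc}"
        return 200, "OK"


# In-memory stand-ins for the Service Bus and Event Grid publishers.
service_bus_sent: List[dict] = []
event_grid_sent: List[dict] = []

router = Router()
router.register("shir", service_bus_sent.append)
router.register("alerts", event_grid_sent.append)
```

Calling `router.handle({"messagePath": "shir"})` appends the message to `service_bus_sent`, while a message with `"messagePath": "alerts"` lands in `event_grid_sent`, mirroring the two halves of the sequence diagram.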
Components
SHIR Collector (Distributed Service)
The central hub that receives, processes, and forwards monitoring data from all connected agents.
- Multi-path message routing
- Support for multiple publishers (Service Bus, Event Grid)
- HTTP-based receiver service
- Health monitoring and status tracking
- Configurable message paths and destinations
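As a rough illustration of the HTTP-based receiver, the sketch below uses only Python's standard library. The `/receive` path is an assumption (the diagrams name the endpoint only as "receive"), `handle_post`, `ReceiveHandler`, and `serve` are hypothetical names, and the `received` list stands in for handing the message to the router. The health server on port 8080 is not covered here.

```python
import json
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer

RECEIVE_PATH = "/receive"  # assumption: the source only calls the endpoint "receive"
received = []              # in-memory stand-in for passing messages to the router


def handle_post(path: str, body: bytes):
    """Pure request logic: returns (HTTP status, JSON-serializable body)."""
    if path != RECEIVE_PATH:
        return 404, {"error": "not found"}
    try:
        message = json.loads(body)
    except ValueError:  # covers malformed JSON and undecodable bytes
        return 400, {"error": "invalid JSON"}
    if not isinstance(message, dict) or "messagePath" not in message:
        return 400, {"error": "messagePath is required"}
    received.append(message)
    return 200, {"status": "accepted"}


class ReceiveHandler(BaseHTTPRequestHandler):
    """Thin HTTP wrapper around handle_post."""

    def do_POST(self):
        length = int(self.headers.get("Content-Length", 0))
        status, body = handle_post(self.path, self.rfile.read(length))
        data = json.dumps(body).encode("utf-8")
        self.send_response(status)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(data)))
        self.end_headers()
        self.wfile.write(data)


def serve(port: int = 9090) -> None:
    """Serve the receiver on the collector port until interrupted."""
    ThreadingHTTPServer(("0.0.0.0", port), ReceiveHandler).serve_forever()
```

Keeping the request logic in a plain function makes it easy to exercise without opening a socket; `serve()` is only needed when running the receiver for real.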
SHIR Agents (Worker Services)
Specialized monitoring agents that collect specific types of metrics and send them to the collector.
- SHIR Agent: Monitors VM metrics and SHIR service status
- Alert Agent: Handles alert notifications and events
- Custom Agents: Extensible for specific monitoring needs
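A custom agent could follow the shape sketched below: assemble a WorkerMessage and POST it to the collector. This is a hedged sketch, not the agents' real implementation. The function names are hypothetical, the `/receive` path is assumed, and the `WorkerId` format (`<machineName>-<unix seconds>`) is only inferred from the sample message in Data Models.

```python
import json
import urllib.request
from datetime import datetime, timezone


def build_worker_message(machine_name, vm_metrics, message_path, environment, now=None):
    """Assemble a WorkerMessage dict shaped like the sample in Data Models."""
    now = now or datetime.now(timezone.utc)
    return {
        "machineName": machine_name,
        "timestamp": now.strftime("%Y-%m-%dT%H:%M:%SZ"),
        "vm": vm_metrics,
        "environment": environment,
        # Assumption: WorkerId is "<machineName>-<unix seconds>", inferred from the sample.
        "WorkerId": f"{machine_name}-{int(now.timestamp())}",
        "messagePath": message_path,
    }


def build_request(distributor_url, message):
    """Prepare the HTTP POST; '/receive' is an assumed endpoint path."""
    return urllib.request.Request(
        distributor_url.rstrip("/") + "/receive",
        data=json.dumps(message).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send(distributor_url, message, send_timeout=30):
    """POST the message to the collector and return the HTTP status code."""
    request = build_request(distributor_url, message)
    with urllib.request.urlopen(request, timeout=send_timeout) as response:
        return response.status
```

A custom agent would differ mainly in what it passes as metrics and which `messagePath` it sets, since the path alone decides where the collector publishes the message.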
Configuration
Collector Configuration
messagePaths:
  shir:
    publisher: "servicebus"
    serviceBus:
      connectionString: "..."
      queueName: "dev_sql_events"
  alerts:
    publisher: "eventgrid"
    eventGrid:
      endpoint: "https://..."
      topic: "alerts-e"
      accessKey: "..."
  custom1:
    publisher: "servicebus"
    serviceBus:
      connectionString: "..."
      queueName: "custom-metrics"
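One way a collector might check this configuration at startup is sketched below. It operates on the configuration as a plain dict (as it would look after YAML parsing) and verifies that every message path names a known publisher and carries the settings that publisher needs. `validate_message_paths` and `REQUIRED_KEYS` are hypothetical names, and treating all listed keys as mandatory is an assumption.

```python
from typing import Dict

# For each publisher type: the settings section it needs and that section's keys.
REQUIRED_KEYS = {
    "servicebus": ("serviceBus", ("connectionString", "queueName")),
    "eventgrid": ("eventGrid", ("endpoint", "topic", "accessKey")),
}


def validate_message_paths(config: dict) -> Dict[str, str]:
    """Return {messagePath: publisher type}, raising ValueError on bad entries."""
    routes = {}
    for path, entry in config.get("messagePaths", {}).items():
        publisher = entry.get("publisher")
        if publisher not in REQUIRED_KEYS:
            raise ValueError(f"{path}: unknown publisher {publisher!r}")
        section, keys = REQUIRED_KEYS[publisher]
        settings = entry.get(section) or {}
        missing = [key for key in keys if not settings.get(key)]
        if missing:
            raise ValueError(f"{path}: {section} is missing {', '.join(missing)}")
        routes[path] = publisher
    return routes


# The collector configuration above as a parsed dict (elided values kept as-is).
config = {
    "messagePaths": {
        "shir": {
            "publisher": "servicebus",
            "serviceBus": {"connectionString": "...", "queueName": "dev_sql_events"},
        },
        "alerts": {
            "publisher": "eventgrid",
            "eventGrid": {"endpoint": "https://...", "topic": "alerts-e", "accessKey": "..."},
        },
        "custom1": {
            "publisher": "servicebus",
            "serviceBus": {"connectionString": "...", "queueName": "custom-metrics"},
        },
    }
}

routes = validate_message_paths(config)
```

Failing fast on a misconfigured path keeps routing errors out of the request path, where they would otherwise surface only when the first agent posts to that path.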
Agent Configuration
Worker:
  id: "agent-unique-id"
  DistributorUrl: "http://collector:9090"
  sendTimeout: 30
  messagePath: "shir"
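On the agent side, the parsed `Worker` section could be loaded into a typed object along these lines. `WorkerConfig` and `load_worker_config` are hypothetical names. The source does not state the unit of `sendTimeout` or a default value, so seconds and a default of 30 are assumptions here.

```python
from dataclasses import dataclass
from urllib.parse import urlparse


@dataclass(frozen=True)
class WorkerConfig:
    worker_id: str
    distributor_url: str
    send_timeout: int
    message_path: str


def load_worker_config(raw: dict) -> WorkerConfig:
    """Validate the parsed 'Worker' section and return a typed config."""
    worker = raw["Worker"]
    url = worker["DistributorUrl"]
    parsed = urlparse(url)
    if parsed.scheme not in ("http", "https") or not parsed.netloc:
        raise ValueError(f"DistributorUrl must be an http(s) URL, got {url!r}")
    timeout = int(worker.get("sendTimeout", 30))  # assumed default and unit (seconds)
    if timeout <= 0:
        raise ValueError("sendTimeout must be positive")
    if not worker.get("messagePath"):
        raise ValueError("messagePath is required")
    return WorkerConfig(str(worker["id"]), url, timeout, worker["messagePath"])


config = load_worker_config({
    "Worker": {
        "id": "agent-unique-id",
        "DistributorUrl": "http://collector:9090",
        "sendTimeout": 30,
        "messagePath": "shir",
    }
})
```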
Scalability
- Conservative: 100-500 concurrent agents
- Optimistic: 1,000-2,000 concurrent agents
- With Scaling: 10,000+ concurrent agents
Development endpoints: Collector localhost:9090 | Health :8080 | SHIR Agent :8081 | Alert Agent :8082
Production endpoints: Collector collector.internal.company.com:9090 | Agents agent-*.internal.company.com:8081
Data Models
WorkerMessage (Agent to Collector)
{
  "machineName": "server-01",
  "timestamp": "2024-01-01T12:00:00Z",
  "vm": {
    "cpuPercent": 45.2,
    "memoryPercent": 67.8,
    "diskPercent": 23.1,
    "uptimeSeconds": 86400
  },
  "shir": {
    "serviceStatus": "Running",
    "nodeStatus": "Ready",
    "version": "5.0.0.0"
  },
  "environment": "production",
  "WorkerId": "server-01-1704110400",
  "messagePath": "shir"
}
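A receiver might parse and check this payload roughly as follows. The sketch assumes that the five top-level scalar fields are required and that the `vm` and `shir` blocks are optional (a plausible reading, since agents other than the SHIR Agent may not send them, but the source does not say so). `WorkerMessage` here is a hypothetical Python dataclass, not the system's own type.

```python
import json
from dataclasses import dataclass
from datetime import datetime
from typing import Optional

# Assumption: these fields are mandatory; "vm" and "shir" are treated as optional.
REQUIRED_FIELDS = ("machineName", "timestamp", "environment", "WorkerId", "messagePath")


@dataclass
class WorkerMessage:
    machine_name: str
    timestamp: datetime
    environment: str
    worker_id: str
    message_path: str
    vm: Optional[dict] = None    # VM metrics block, if the agent sends one
    shir: Optional[dict] = None  # SHIR status block, if the agent sends one


def parse_worker_message(raw: str) -> WorkerMessage:
    """Parse a JSON WorkerMessage, raising ValueError if it is malformed."""
    data = json.loads(raw)
    if not isinstance(data, dict):
        raise ValueError("WorkerMessage must be a JSON object")
    missing = [name for name in REQUIRED_FIELDS if name not in data]
    if missing:
        raise ValueError("missing fields: " + ", ".join(missing))
    # datetime.fromisoformat rejects a trailing "Z" before Python 3.11.
    timestamp = datetime.fromisoformat(data["timestamp"].replace("Z", "+00:00"))
    return WorkerMessage(
        machine_name=data["machineName"],
        timestamp=timestamp,
        environment=data["environment"],
        worker_id=data["WorkerId"],
        message_path=data["messagePath"],
        vm=data.get("vm"),
        shir=data.get("shir"),
    )


sample = """
{"machineName": "server-01", "timestamp": "2024-01-01T12:00:00Z",
 "vm": {"cpuPercent": 45.2, "memoryPercent": 67.8, "diskPercent": 23.1, "uptimeSeconds": 86400},
 "shir": {"serviceStatus": "Running", "nodeStatus": "Ready", "version": "5.0.0.0"},
 "environment": "production", "WorkerId": "server-01-1704110400", "messagePath": "shir"}
"""
message = parse_worker_message(sample)
```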