# Groq Observability

## Overview

Set up comprehensive observability for Groq integrations: metrics, distributed tracing, structured logging, and alerting.
## Prerequisites
- Prometheus or compatible metrics backend
- OpenTelemetry SDK installed
- Grafana or similar dashboarding tool
- AlertManager configured
## Metrics Collection

### Key Metrics

| Metric | Type | Description |
|---|---|---|
| `groq_requests_total` | Counter | Total API requests |
| `groq_request_duration_seconds` | Histogram | Request latency |
| `groq_errors_total` | Counter | Error count by type |
| `groq_rate_limit_remaining` | Gauge | Rate-limit headroom |
### Prometheus Metrics

```typescript
import { Registry, Counter, Histogram, Gauge } from 'prom-client';

const registry = new Registry();

const requestCounter = new Counter({
  name: 'groq_requests_total',
  help: 'Total Groq API requests',
  labelNames: ['method', 'status'],
  registers: [registry],
});

const requestDuration = new Histogram({
  name: 'groq_request_duration_seconds',
  help: 'Groq request duration',
  labelNames: ['method'],
  buckets: [0.05, 0.1, 0.25, 0.5, 1, 2.5, 5],
  registers: [registry],
});

const errorCounter = new Counter({
  name: 'groq_errors_total',
  help: 'Groq errors by type',
  labelNames: ['error_type'],
  registers: [registry],
});
```
### Instrumented Client

```typescript
async function instrumentedRequest<T>(
  method: string,
  operation: () => Promise<T>
): Promise<T> {
  const timer = requestDuration.startTimer({ method });
  try {
    const result = await operation();
    requestCounter.inc({ method, status: 'success' });
    return result;
  } catch (error: any) {
    requestCounter.inc({ method, status: 'error' });
    errorCounter.inc({ error_type: error.code || 'unknown' });
    throw error;
  } finally {
    timer();
  }
}
```
## Distributed Tracing

### OpenTelemetry Setup

```typescript
import { trace, SpanStatusCode } from '@opentelemetry/api';

const tracer = trace.getTracer('groq-client');

async function tracedGroqCall<T>(
  operationName: string,
  operation: () => Promise<T>
): Promise<T> {
  return tracer.startActiveSpan(`groq.${operationName}`, async (span) => {
    try {
      const result = await operation();
      span.setStatus({ code: SpanStatusCode.OK });
      return result;
    } catch (error: any) {
      span.setStatus({ code: SpanStatusCode.ERROR, message: error.message });
      span.recordException(error);
      throw error;
    } finally {
      span.end();
    }
  });
}
```
## Logging Strategy

### Structured Logging

```typescript
import pino from 'pino';

const logger = pino({
  name: 'groq',
  level: process.env.LOG_LEVEL || 'info',
});

function logGroqOperation(
  operation: string,
  data: Record<string, any>,
  duration: number
) {
  logger.info({
    service: 'groq',
    operation,
    duration_ms: duration,
    ...data,
  });
}
```
## Alert Configuration

### Prometheus Alerting Rules

These rules are evaluated by Prometheus and routed through Alertmanager.

```yaml
# groq_alerts.yaml
groups:
  - name: groq_alerts
    rules:
      - alert: GroqHighErrorRate
        expr: |
          rate(groq_errors_total[5m])
            / rate(groq_requests_total[5m]) > 0.05
        for: 5m
        labels:
          severity: warning
        annotations:
          summary: "Groq error rate > 5%"
      - alert: GroqHighLatency
        expr: |
          histogram_quantile(0.95,
            rate(groq_request_duration_seconds_bucket[5m])
          ) > 2
        for: 5m
        labels:
          severity: warning
        annotations:
          summary: "Groq P95 latency > 2s"
      - alert: GroqDown
        expr: up{job="groq"} == 0
        for: 1m
        labels:
          severity: critical
        annotations:
          summary: "Groq integration is down"
```
## Dashboard

### Grafana Panel Queries

```json
{
  "panels": [
    {
      "title": "Groq Request Rate",
      "targets": [
        { "expr": "rate(groq_requests_total[5m])" }
      ]
    },
    {
      "title": "Groq Latency P50/P95/P99",
      "targets": [
        { "expr": "histogram_quantile(0.5, rate(groq_request_duration_seconds_bucket[5m]))" },
        { "expr": "histogram_quantile(0.95, rate(groq_request_duration_seconds_bucket[5m]))" },
        { "expr": "histogram_quantile(0.99, rate(groq_request_duration_seconds_bucket[5m]))" }
      ]
    }
  ]
}
```
## Instructions

### Step 1: Set Up Metrics Collection

Implement Prometheus counters, histograms, and gauges for key operations.

### Step 2: Add Distributed Tracing

Integrate OpenTelemetry for end-to-end request tracing.

### Step 3: Configure Structured Logging

Set up JSON logging with consistent field names.

### Step 4: Create Alert Rules

Define Prometheus alerting rules for error rates and latency.
## Output
- Metrics collection enabled
- Distributed tracing configured
- Structured logging implemented
- Alert rules deployed
## Error Handling
| Issue | Cause | Solution |
|---|---|---|
| Missing metrics | No instrumentation | Wrap client calls |
| Trace gaps | Missing propagation | Check context headers |
| Alert storms | Wrong thresholds | Tune alert rules |
| High cardinality | Too many labels | Reduce label values |
## Examples

### Quick Metrics Endpoint

```typescript
import express from 'express';

const app = express();

// Exposes the registry defined in the metrics section above.
app.get('/metrics', async (req, res) => {
  res.set('Content-Type', registry.contentType);
  res.send(await registry.metrics());
});
```
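The endpoint above only matters if Prometheus scrapes it. A minimal scrape job sketch follows; the target host and port are assumptions, while `job_name: 'groq'` matches the `up{job="groq"}` expression in the alert rules above.

```yaml
scrape_configs:
  - job_name: 'groq'               # must match up{job="groq"} in the alert rules
    scrape_interval: 15s
    static_configs:
      - targets: ['localhost:3000'] # host:port of the service exposing /metrics
```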
## Resources

### Next Steps

For incident response, see `groq-incident-runbook`.