monitoring-setup
Provides configuration templates and best practices for setting up Prometheus, Grafana, and alerting systems. Includes practical examples for Kubernetes monitoring, RED/USE methodologies, and SLO definitions. Focuses on operational observability rather than automated deployment.
Packaged view
This page reorganizes the original catalog entry around fit, installability, and workflow context first. The original raw source lives below.
Install command
npx @skill-hub/cli install timequity-plugins-monitoring-setup
Repository
Skill path: craft-coder/infra/monitoring-setup
Best for
Primary workflow: Run DevOps.
Technical facets: DevOps.
Target audience: DevOps engineers and SREs who need to implement monitoring for Kubernetes applications.
License: Unknown.
Original source
Catalog source: SkillHub Club.
Repository owner: timequity.
This is still a mirrored public skill entry. Review the repository before installing into production workflows.
What it helps with
- Install monitoring-setup into Claude Code, Codex CLI, Gemini CLI, or OpenCode workflows
- Review https://github.com/timequity/plugins before adding monitoring-setup to shared team environments
- Use monitoring-setup for DevOps workflows
Works across
Favorites: 0.
Sub-skills: 0.
Aggregator: No.
Original source / Raw SKILL.md
---
name: monitoring-setup
description: Observability stack with Prometheus, Grafana, and alerting.
---
# Monitoring Setup
## The Three Pillars
| Pillar | Tool | Purpose |
|--------|------|---------|
| **Metrics** | Prometheus | Time-series data |
| **Logs** | Loki / ELK | Event records |
| **Traces** | Jaeger / Tempo | Request flow |
## Prometheus
```yaml
# prometheus.yml
global:
  scrape_interval: 15s
scrape_configs:
  - job_name: 'kubernetes-pods'
    kubernetes_sd_configs:
      - role: pod
    relabel_configs:
      - source_labels: [__meta_kubernetes_pod_annotation_prometheus_io_scrape]
        action: keep
        regex: true
```
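The `keep` rule above retains only pods whose `prometheus.io/scrape` annotation equals `true`; Prometheus exposes the annotation as `__meta_kubernetes_pod_annotation_prometheus_io_scrape`, with unsupported characters replaced by underscores. As a minimal sketch, a pod that opts in to scraping might look like the following. The pod name, image, and port are hypothetical placeholders and are not part of the original skill.

```yaml
# Hypothetical pod manifest; name, image, and port are placeholders.
apiVersion: v1
kind: Pod
metadata:
  name: example-app
  annotations:
    # Annotation values must be strings, so "true" is quoted.
    prometheus.io/scrape: "true"
spec:
  containers:
    - name: app
      image: example/app:latest
      ports:
        # With no further relabeling, each declared container port
        # becomes a scrape target at the default /metrics path.
        - containerPort: 8080
```

Pods without the annotation are dropped from the `kubernetes-pods` job, which keeps the scrape set opt-in.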
## Grafana Dashboard
```json
{
  "panels": [
    {
      "title": "Request Rate",
      "targets": [
        {
          "expr": "rate(http_requests_total[5m])",
          "legendFormat": "{{method}} {{path}}"
        }
      ]
    }
  ]
}
```
## Alert Rules
```yaml
groups:
  - name: app
    rules:
      - alert: HighErrorRate
        expr: rate(http_requests_total{status=~"5.."}[5m]) > 0.1
        for: 5m
        labels:
          severity: critical
        annotations:
          summary: "High error rate detected"
      - alert: PodCrashLooping
        expr: rate(kube_pod_container_status_restarts_total[15m]) > 0
        for: 5m
        labels:
          severity: warning
```
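The `HighErrorRate` expression above compares the absolute 5xx rate of each series, in errors per second, against 0.1. If the intent is a share of total traffic, a ratio-based variant is one option. The sketch below assumes the same `http_requests_total` metric and `status` label; the group name, alert name, 10% threshold, and grouping by `job` are illustrative assumptions.

```yaml
groups:
  - name: app-ratio   # hypothetical group name
    rules:
      - alert: HighErrorRatio   # hypothetical alert name
        # Share of requests answered with 5xx over the last 5 minutes, per job
        expr: |
          sum by (job) (rate(http_requests_total{status=~"5.."}[5m]))
            /
          sum by (job) (rate(http_requests_total[5m]))
            > 0.1
        for: 5m
        labels:
          severity: critical
        annotations:
          summary: "More than 10% of requests are failing for {{ $labels.job }}"
```

A ratio scales with traffic, so the same threshold can serve both low-volume and high-volume services.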
## Key Metrics
### RED Method (Services)
- **Rate** - Requests per second
- **Errors** - Failed requests
- **Duration** - Response time
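
The three RED signals could be captured as Prometheus recording rules along these lines. This is a sketch, not the skill's own configuration: it reuses `http_requests_total` from the examples above and assumes a latency histogram named `http_request_duration_seconds`, which the original does not define. The group and record names are also hypothetical.

```yaml
groups:
  - name: red-recording   # hypothetical group name
    rules:
      # Rate: requests per second, per job
      - record: job:http_requests:rate5m
        expr: sum by (job) (rate(http_requests_total[5m]))
      # Errors: fraction of requests answered with 5xx, per job
      - record: job:http_requests_errors:ratio_rate5m
        expr: |
          sum by (job) (rate(http_requests_total{status=~"5.."}[5m]))
            /
          sum by (job) (rate(http_requests_total[5m]))
      # Duration: estimated p99 latency in seconds
      # (assumes a histogram named http_request_duration_seconds)
      - record: job:http_request_duration_seconds:p99_5m
        expr: |
          histogram_quantile(
            0.99,
            sum by (job, le) (rate(http_request_duration_seconds_bucket[5m]))
          )
```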
### USE Method (Resources)
- **Utilization** - % busy
- **Saturation** - Queue depth
- **Errors** - Error count
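
For hosts, the USE signals might be approximated with node_exporter metrics as sketched below. This assumes node_exporter is deployed and scraped, which the original does not state; the group and record names are hypothetical, and CPU plus network are only one possible choice of resources.

```yaml
groups:
  - name: use-recording   # hypothetical group name
    rules:
      # Utilization: fraction of CPU time spent non-idle, per instance
      - record: instance:node_cpu_utilization:ratio_rate5m
        expr: 1 - avg by (instance) (rate(node_cpu_seconds_total{mode="idle"}[5m]))
      # Saturation: 1-minute load average per CPU
      # (values above 1 suggest work is queuing)
      - record: instance:node_load1_per_cpu:ratio
        expr: |
          sum by (instance) (node_load1)
            /
          count by (instance) (node_cpu_seconds_total{mode="idle"})
      # Errors: network receive errors per second, per instance
      - record: instance:node_network_receive_errs:rate5m
        expr: sum by (instance) (rate(node_network_receive_errs_total[5m]))
```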
## SLIs/SLOs
```
SLI: proportion of requests served in under 200ms
SLO: 99.9% of requests meet the 200ms threshold
Error Budget: 0.1% of requests can exceed 200ms
```
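
One way to track a 200ms latency target with a 99.9% objective is to record the share of fast requests and alert when it drops below the objective. The sketch below assumes a histogram named `http_request_duration_seconds` that includes an `le="0.2"` bucket; neither appears in the original. The 1h window, rule names, and `for` duration are illustrative, and a production SLO would typically be evaluated over a longer window such as 30 days.

```yaml
groups:
  - name: slo-latency   # hypothetical group name
    rules:
      # SLI: fraction of requests completed within 200ms
      # (requires a histogram bucket with le="0.2")
      - record: job:http_request_latency_sli:ratio_rate1h
        expr: |
          sum by (job) (rate(http_request_duration_seconds_bucket{le="0.2"}[1h]))
            /
          sum by (job) (rate(http_request_duration_seconds_count[1h]))
      # Fires when the SLI falls below the 99.9% objective,
      # meaning the error budget is being consumed faster than allowed
      - alert: LatencySLOViolation   # hypothetical alert name
        expr: job:http_request_latency_sli:ratio_rate1h < 0.999
        for: 10m
        labels:
          severity: warning
```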