
monitoring-setup

Provides configuration templates and best practices for setting up Prometheus, Grafana, and alerting systems. Includes practical examples for Kubernetes monitoring, RED/USE methodologies, and SLO definitions. Focuses on operational observability rather than automated deployment.

Packaged view

This page reorganizes the original catalog entry around fit, installability, and workflow context first. The original raw source lives below.

Stars: 5.

Hot score: 82.

Updated: March 20, 2026.

Overall rating: A (7.5).

Composite score: 4.8.

Best-practice grade: B (81.2).

Install command

npx @skill-hub/cli install timequity-plugins-monitoring-setup
Tags: monitoring, observability, prometheus, grafana, kubernetes.

Repository

timequity/plugins

Skill path: craft-coder/infra/monitoring-setup


Best for

Primary workflow: Run DevOps.

Technical facets: DevOps.

Target audience: DevOps engineers and SREs who need to implement monitoring for Kubernetes applications.

License: Unknown.

Original source

Catalog source: SkillHub Club.

Repository owner: timequity.

This is still a mirrored public skill entry. Review the repository before installing into production workflows.

What it helps with

  • Install monitoring-setup into Claude Code, Codex CLI, Gemini CLI, or OpenCode workflows
  • Review https://github.com/timequity/plugins before adding monitoring-setup to shared team environments
  • Use monitoring-setup for DevOps workflows

Works across

Claude Code, Codex CLI, Gemini CLI, OpenCode

Favorites: 0.

Sub-skills: 0.

Aggregator: No.

Original source / Raw SKILL.md

---
name: monitoring-setup
description: Observability stack with Prometheus, Grafana, and alerting.
---

# Monitoring Setup

## The Three Pillars

| Pillar | Tool | Purpose |
|--------|------|---------|
| **Metrics** | Prometheus | Time-series data |
| **Logs** | Loki / ELK | Event records |
| **Traces** | Jaeger / Tempo | Request flow |

## Prometheus

```yaml
# prometheus.yml
global:
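  scrape_interval: 15s  # default scrape frequency for every job

scrape_configs:
  - job_name: 'kubernetes-pods'
    # Discover every pod through the Kubernetes API
    kubernetes_sd_configs:
      - role: pod
    relabel_configs:
      # Keep only pods annotated with prometheus.io/scrape: "true"
      - source_labels: [__meta_kubernetes_pod_annotation_prometheus_io_scrape]
        action: keep
        regex: "true"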
```
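
The `keep` rule above matches on the pod annotation `prometheus.io/scrape`, so a pod has to opt in before it is scraped. A minimal sketch of such a pod follows; the pod name, image, and port are hypothetical placeholders, not part of the original skill.

```yaml
# Example pod that opts in to scraping by the 'kubernetes-pods' job
apiVersion: v1
kind: Pod
metadata:
  name: example-app                # hypothetical name
  annotations:
    prometheus.io/scrape: "true"   # annotation values must be strings, so quote it
spec:
  containers:
    - name: app
      image: example/app:1.0       # hypothetical image
      ports:
        - containerPort: 8080      # hypothetical metrics port
```

Pods without this annotation are dropped during relabeling and never become scrape targets.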

## Grafana Dashboard

```json
{
  "panels": [
    {
      "title": "Request Rate",
      "targets": [
        {
          "expr": "rate(http_requests_total[5m])",
          "legendFormat": "{{method}} {{path}}"
        }
      ]
    }
  ]
}
```

## Alert Rules

```yaml
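groups:
  - name: app
    rules:
      # Fires when any series sustains more than 0.1 5xx responses per second
      - alert: HighErrorRate
        expr: rate(http_requests_total{status=~"5.."}[5m]) > 0.1
        for: 5m
        labels:
          severity: critical
        annotations:
          summary: "High error rate detected"

      # Fires when a container keeps restarting
      # (this metric is exported by kube-state-metrics)
      - alert: PodCrashLooping
        expr: rate(kube_pod_container_status_restarts_total[15m]) > 0
        for: 5m
        labels:
          severity: warning
        annotations:
          summary: "Pod {{ $labels.namespace }}/{{ $labels.pod }} is restarting repeatedly"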
```
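
The `severity` labels on these rules only become useful once something routes on them. As a rough sketch, an Alertmanager configuration could send `critical` alerts to a separate receiver. The receiver names `default` and `pager` are hypothetical, and real receivers would need notification integrations (email, webhook, and so on) that the original skill does not specify.

```yaml
# alertmanager.yml (illustrative sketch, receivers left without integrations)
route:
  receiver: default            # fallback for anything not matched below
  group_by: ['alertname']
  routes:
    - matchers:
        - severity="critical"  # matches the HighErrorRate rule's label
      receiver: pager

receivers:
  - name: default              # hypothetical receiver
  - name: pager                # hypothetical receiver
```

With this layout, `warning` alerts such as `PodCrashLooping` fall through to the default receiver.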

## Key Metrics

### RED Method (Services)

- **Rate** - Requests per second
- **Errors** - Failed requests
- **Duration** - Response time
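
The three RED signals can be sketched as Prometheus recording rules. This assumes the `http_requests_total` counter used elsewhere in this document carries a `status` label, and that a histogram named `http_request_duration_seconds` exists; that histogram name and the rule names are assumptions for illustration.

```yaml
groups:
  - name: red-recording
    rules:
      # Rate: requests per second per job
      - record: job:http_requests:rate5m
        expr: sum by (job) (rate(http_requests_total[5m]))

      # Errors: fraction of requests that returned 5xx
      - record: job:http_requests_errors:ratio_rate5m
        expr: |
          sum by (job) (rate(http_requests_total{status=~"5.."}[5m]))
            /
          sum by (job) (rate(http_requests_total[5m]))

      # Duration: 99th percentile latency estimated from histogram buckets
      - record: job:http_request_duration_seconds:p99_5m
        expr: |
          histogram_quantile(0.99,
            sum by (job, le) (rate(http_request_duration_seconds_bucket[5m])))
```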

### USE Method (Resources)

- **Utilization** - % busy
- **Saturation** - Queue depth
- **Errors** - Error count
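
For hosts, one possible mapping of USE onto metrics is sketched below. It assumes node_exporter is deployed and scraped, which the original skill does not state; the rule names are illustrative.

```yaml
groups:
  - name: use-recording
    rules:
      # Utilization: fraction of CPU time spent not idle
      - record: instance:node_cpu_utilization:ratio_rate5m
        expr: 1 - avg by (instance) (rate(node_cpu_seconds_total{mode="idle"}[5m]))

      # Saturation: 1-minute load average divided by CPU count
      - record: instance:node_load1_per_cpu:ratio
        expr: |
          node_load1
            / on (instance)
          count by (instance) (node_cpu_seconds_total{mode="idle"})

      # Errors: network receive and transmit errors per second
      - record: instance:node_network_errors:rate5m
        expr: |
          sum by (instance) (rate(node_network_receive_errs_total[5m]))
            +
          sum by (instance) (rate(node_network_transmit_errs_total[5m]))
```

A load-per-CPU value that stays above 1 suggests work is queuing for the processor, which is the saturation signal in this sketch.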

## SLIs/SLOs

```
SLI: proportion of requests served in under 200ms
SLO: 99.9% of requests meet the 200ms threshold
Error Budget: 0.1% of requests can exceed 200ms
```
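
One way to track a latency objective like this in Prometheus is to record the share of requests that finish within 200ms and alert when it drops below 99.9%. The sketch assumes a histogram named `http_request_duration_seconds` with a bucket boundary at exactly `0.2` seconds; both the metric name and the bucket are assumptions, as are the rule and alert names.

```yaml
groups:
  - name: slo-latency
    rules:
      # SLI: fraction of requests completed within 200ms
      - record: job:http_requests_under_200ms:ratio_rate5m
        expr: |
          sum by (job) (rate(http_request_duration_seconds_bucket{le="0.2"}[5m]))
            /
          sum by (job) (rate(http_request_duration_seconds_count[5m]))

      # SLO check: alert when the SLI stays below the 99.9% target
      - alert: LatencySLOViolation
        expr: job:http_requests_under_200ms:ratio_rate5m < 0.999
        for: 5m
        labels:
          severity: warning
        annotations:
          summary: "Fewer than 99.9% of requests completed within 200ms"
```

A 5-minute window reacts quickly but is noisy; evaluating the same ratio over the full SLO period would show how much of the 0.1% error budget remains.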