Better Stack

Overview

Better Stack comprises two MCP servers: Uptime (monitoring, incidents, on-call, status pages) and Telemetry (logs, metrics, dashboards, error tracking). Both share the same team/org context but require separate auth tokens.

How to Add Better Stack

  1. Add to Civic

    Add the Better Stack Uptime and/or Better Stack Telemetry servers to your Civic environment through the server directory. These are two separate servers — add whichever you need (or both).

  2. Authorize

    On first use, you will be redirected to Better Stack to authorize the connection. There are no API keys or secrets to manage manually.

    info

    Two separate authorizations — Uptime and Telemetry are independent products. You will need to authorize each one separately the first time you use it.

  3. Test Connection
    • For Uptime: try "List all monitors" or "Who is currently on-call?"
    • For Telemetry: try "List all log sources" or "List all dashboards"
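
    For readers curious about what such a prompt turns into, the sketch below builds the JSON-RPC 2.0 message that an MCP client sends for a tools/call request. The tool name list_monitors comes from the tool list on this page; the empty arguments object and the request id are assumptions, since the tool's input schema is not documented here.

```python
import json


def build_tool_call(tool_name: str, arguments: dict, request_id: int = 1) -> str:
    """Serialize an MCP tools/call request as a JSON-RPC 2.0 message."""
    request = {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool_name, "arguments": arguments},
    }
    return json.dumps(request)


# Assumption: list_monitors accepts an empty arguments object.
print(build_tool_call("list_monitors", {}))
```

    In normal use the agent builds and sends this message for you; the sketch is only meant to show what "List all monitors" maps to.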

What You Can Do

Uptime Monitoring

Manage monitors, view response times and availability, track incidents and heartbeats

Incident Management

List, acknowledge, resolve, escalate, and comment on incidents with full timeline access

On-Call & Escalation

View on-call schedules, escalation policies, and severity notification channels

Status Pages

List status pages, manage resources, create incident reports, and post status updates

Log Querying

Query logs via ClickHouse, explore sources, extract JSON fields, and group by patterns

Metrics & Dashboards

List metrics, check cardinality, build queries, create and arrange dashboard charts

Error Tracking

List applications, query errors, resolve or ignore error patterns, and view session replays

Heartbeat Monitoring

Monitor cron jobs and scheduled tasks — incidents trigger automatically on missed pings
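
As a rough illustration of the heartbeat pattern, the sketch below wraps a scheduled job so that a ping is sent only when the job succeeds. The URL is a hypothetical placeholder (use the one shown for your own heartbeat), and the helper names run_with_heartbeat and http_ping are illustrative, not part of any Better Stack API.

```python
import urllib.request
from typing import Callable

# Hypothetical placeholder: replace with the URL shown for your heartbeat.
HEARTBEAT_URL = "https://example.com/heartbeat/YOUR_TOKEN"


def http_ping(url: str) -> None:
    """Send a plain GET request to the heartbeat URL."""
    with urllib.request.urlopen(url, timeout=10) as response:
        response.read()


def run_with_heartbeat(
    job: Callable[[], None],
    ping: Callable[[str], None],
    url: str = HEARTBEAT_URL,
) -> bool:
    """Run the job and ping the heartbeat only if it finishes without raising."""
    try:
        job()
    except Exception:
        # No ping is sent, so the missed heartbeat can trigger an incident.
        return False
    ping(url)
    return True
```

A cron entry would then call something like run_with_heartbeat(nightly_backup, http_ping), where nightly_backup is your own function. Passing the ping function as a parameter keeps the wrapper easy to test without network access.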

Use Cases

Uptime & Incidents

  • Monitor Health: "Show current status of all monitors"
  • Availability Reports: "Get availability for the API monitor for March 2026"
  • Incident Triage: "List all open incidents", "Acknowledge incident 1234", "Resolve incident 1234"
  • Incident Timeline: "Show the timeline for incident 1234"
  • On-Call Check: "Who is currently on-call?"
  • Status Page Updates: "Create an incident report: title 'API Degradation', message 'Investigating elevated error rates'"

Telemetry & Logs

  • Log Search: "Show the last 50 error logs from my-app"
  • Error Patterns: "Find the most common error patterns in the last 24 hours"
  • Slow Requests: "Show slow API requests over 1 second from today"
  • Metrics Exploration: "List available metrics for source 12345"
  • Dashboard Management: "List all dashboards", "Add a line chart showing log volume per hour"
  • Error Tracking: "List recent unresolved errors", "Resolve error pattern NullPointerException in UserService"

Available Tools

Uptime — Monitors & Heartbeats

list_monitors — List all monitors and their current status

get_monitor_response_times — Get response times broken down by region and phase (DNS, connect, TLS, transfer). Returns ~24h of data in 15-min buckets.

get_monitor_availability — Get availability percentage for a monitor over a date range

list_heartbeats — List all heartbeat monitors for cron jobs and scheduled tasks

get_heartbeat_availability — Get availability for a heartbeat over a date range

Uptime — Incidents

list_incidents — List incidents with filtering by status, monitor, heartbeat, and date range

acknowledge_incident — Acknowledge an open incident

resolve_incident — Resolve an incident

reopen_incident — Reopen a resolved incident (within 24 hours of resolution only)

escalate_incident — Escalate to a user, team, schedule, or policy. Call get_incident_escalation_options first to discover valid targets.

add_incident_comment — Add a comment to an incident timeline

get_incident_timeline — Full audit trail of state changes, notifications, and acknowledgements

get_incident_escalation_options — Discover valid escalation targets (User, Team, Schedule, Policy)
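
The "discover targets first, then escalate" order described for escalate_incident could be sketched as follows. The call_tool callable, the argument names (incident_id, target_type, target_id), and the shape of the options result are all assumptions made for illustration; the real tool schemas are not documented on this page.

```python
from typing import Any, Callable

# Hypothetical signature: call_tool(name, arguments) returns the parsed tool result.
ToolCaller = Callable[[str, dict], Any]


def escalate_safely(
    call_tool: ToolCaller, incident_id: str, target_type: str, target_id: str
) -> Any:
    """Escalate only to a target that the options tool reports as valid."""
    options = call_tool(
        "get_incident_escalation_options", {"incident_id": incident_id}
    )
    # Assumed result shape: {"User": [ids], "Team": [ids], "Schedule": [ids], "Policy": [ids]}
    valid_ids = options.get(target_type, [])
    if target_id not in valid_ids:
        raise ValueError(
            f"{target_type} {target_id!r} is not a valid escalation target"
        )
    return call_tool(
        "escalate_incident",
        {
            "incident_id": incident_id,
            "target_type": target_type,
            "target_id": target_id,
        },
    )
```

Checking the target before escalating means an invalid target fails locally with a clear message instead of producing a failed escalation call.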

Uptime — On-Call & Status Pages

list_on_call_calendars — List on-call rotations and events

list_escalation_policies — List all escalation policies

list_status_pages — List all status pages and their resources

get_status_page_resources — Show monitors and resources on a status page

create_incident_report — Create a public incident report with affected resources

post_status_update — Post a status update to an incident report with optional subscriber notification

Telemetry — Sources & Logs

list_sources — List all log sources

get_source_details — Get ingestion token, host URL, table name, and retention settings for a source

get_source_fields — Get available fields for a source (returns nothing if source has no recent data)

telemetry_build_explore_query_tool — Generate a ClickHouse query from plain English

telemetry_query — Execute a direct ClickHouse query (requires cloud connection credentials)

create_source — Create a new log source with a specified platform type

Telemetry — Metrics & Dashboards

list_metrics — List available metrics for a source

get_metrics_and_cardinality — Show cardinality for all metrics

get_metric_query_instructions — Get correct aggregation functions and example queries for a metric

list_dashboards — List all dashboards

get_dashboard_layout — Show the layout and charts of a dashboard

add_chart — Add a chart to a dashboard

edit_chart — Edit an existing chart

move_charts — Rearrange chart positions on a dashboard (validates layout atomically)

export_dashboard — Export a dashboard as JSON

import_dashboard — Import a dashboard from JSON

get_chart_building_instructions — Documentation for supported chart types, units, axes, and layout rules

telemetry_chart — Preview chart queries with automatic error surfacing

Telemetry — Error Tracking

list_applications — List all error-tracking applications

create_application — Create a new error-tracking application with a platform type (e.g. javascript_errors, python_errors)

get_errors_query_instructions — Get the query schema for error tracking (different from logs)

update_error_state — Set error state to resolved, ignored, or unresolved. When ignoring, ignore_next_count must be 10, 100, or 1000.

create_cloud_connection — Create direct ClickHouse credentials for raw SQL queries. Connections expire after 1 hour.
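
The constraints stated for update_error_state (three allowed states, and an ignore_next_count of 10, 100, or 1000 when ignoring) can be checked before the tool is called. The sketch below is one way to do that; the error_id argument name is hypothetical, and treating ignore_next_count as required when ignoring is this sketch's reading of the rule, not a documented behavior.

```python
from typing import Optional

ALLOWED_STATES = {"resolved", "ignored", "unresolved"}
ALLOWED_IGNORE_COUNTS = {10, 100, 1000}


def build_error_state_args(
    error_id: str, state: str, ignore_next_count: Optional[int] = None
) -> dict:
    """Validate and assemble arguments for an update_error_state call."""
    if state not in ALLOWED_STATES:
        raise ValueError(f"state must be one of {sorted(ALLOWED_STATES)}")
    # Assumption: error_id is the name of the identifying argument.
    args = {"error_id": error_id, "state": state}
    if state == "ignored":
        # Assumption: the count is required when ignoring.
        if ignore_next_count not in ALLOWED_IGNORE_COUNTS:
            raise ValueError("ignore_next_count must be 10, 100, or 1000")
        args["ignore_next_count"] = ignore_next_count
    return args
```

For example, build_error_state_args("err_1", "ignored", 100) returns a dictionary containing the count, while a count of 50 raises ValueError before any request is sent.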


note

Two separate auth tokens — Uptime and Telemetry are independent products, each requiring separate authorization.

A team ID is required for most create and list operations. Use list_teams to find it.

Log query tips:

  • Use remote(<table_name>) for recent data (less than 30 minutes old); use s3Cluster(primary, <table_name_s3>) with the filter _row_type = 1 for historical data.
  • All log fields live inside the raw JSON column. Extract with JSONExtract(raw, 'field', 'Nullable(String)').
  • Group noisy logs by _pattern to surface recurring message structures.
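
The tips above can be combined into small query builders. The sketch below is a minimal illustration in Python that only assembles query text: the function names, the value and hits aliases, and the example table names are hypothetical, and the actual table names should come from get_source_details.

```python
def build_field_query(
    table_name: str, table_name_s3: str, field: str, historical: bool, limit: int = 50
) -> str:
    """Build a ClickHouse query that extracts one JSON field from the raw column."""
    if not field.replace("_", "").isalnum():
        # Keep quotes and other special characters out of the query text.
        raise ValueError("field may contain only letters, digits, and underscores")
    if historical:
        source = f"s3Cluster(primary, {table_name_s3})"
        where = "WHERE _row_type = 1"
    else:
        source = f"remote({table_name})"
        where = ""
    select = f"SELECT JSONExtract(raw, '{field}', 'Nullable(String)') AS value"
    parts = [select, f"FROM {source}", where, f"LIMIT {limit}"]
    return "\n".join(part for part in parts if part)


def build_pattern_query(table_name: str, limit: int = 20) -> str:
    """Build a query that counts recent logs per _pattern, most frequent first."""
    return (
        f"SELECT _pattern, count() AS hits\n"
        f"FROM remote({table_name})\n"
        f"GROUP BY _pattern\n"
        f"ORDER BY hits DESC\n"
        f"LIMIT {limit}"
    )


# Hypothetical table names, for illustration only.
print(build_field_query("t1_app_logs", "t1_app_s3", "level", historical=True))
```

The resulting text can be passed to telemetry_query or compared against what telemetry_build_explore_query_tool generates.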

Dashboard grid is 12 columns wide. Use {{source}}, {{start_time}}, {{end_time}}, and {{time}} variables in dashboard queries.
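
To make the 12-column constraint concrete, the sketch below checks a proposed layout before charts are moved. The Chart fields (x, y, w, h) and the overlap rule are assumptions for illustration; the rules that move_charts actually enforces are described by get_chart_building_instructions, not by this page.

```python
from typing import List, NamedTuple

GRID_COLUMNS = 12  # grid width stated in the note above


class Chart(NamedTuple):
    """Hypothetical chart placement: top-left corner plus width and height."""
    name: str
    x: int
    y: int
    w: int
    h: int


def layout_errors(charts: List[Chart]) -> List[str]:
    """Return a list of problems; an empty list means the layout fits the grid."""
    errors = []
    for c in charts:
        if c.w < 1 or c.h < 1 or c.x < 0 or c.y < 0:
            errors.append(f"{c.name}: invalid position or size")
        elif c.x + c.w > GRID_COLUMNS:
            errors.append(f"{c.name}: exceeds the {GRID_COLUMNS}-column grid")
    for i, a in enumerate(charts):
        for b in charts[i + 1:]:
            overlap_x = a.x < b.x + b.w and b.x < a.x + a.w
            overlap_y = a.y < b.y + b.h and b.y < a.y + a.h
            if overlap_x and overlap_y:
                errors.append(f"{a.name} overlaps {b.name}")
    return errors
```

Two charts of width 6 placed at x = 0 and x = 6 fill one row exactly, while a chart of width 6 placed at x = 8 would extend past column 12 and be reported.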

No cross-tool linking — Uptime incidents and Telemetry logs are not automatically correlated.


Guardrails

This server is covered by the 14 universal guardrails. Server-specific guardrails are coming soon.

tip

Configure guardrails via the Civic UI or ask the Configurator Agent: "Add guardrails to my Better Stack server."