How to automate Grafana dashboard analysis

Viacheslav Smirnov

How to automate

  • Generation via jsonnet

  • Visual hints for analysis

Dashboards as code with jsonnet

  • GitOps

  • Query management

  • Design patterns

  • github.com/polarnik/grafana-analysis-automation
  • polarnik.github.io/grafana-analysis-automation/

Dashboards as code

Demo

Visual hints

  • Change points

  • Priorities

Change points will show the most important events

  • Releases

  • Restarts

  • Settings updates

💡 A top-level panel with Versions can show Release moments

💡 Annotations can show Restarts and Settings updates

Next step: use a rollback plan

Next step: make a hotfix

Automation of showing Change points

  • Create a jsonnet library for versions and annotations

  • Include the library into dashboards

  • Regenerate dashboards
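
The three steps above could look roughly like the following sketch. It is not the library from the talk's repository: the `annotation` helper, the `changePoints` mixin, the `${datasource}` dashboard variable, and the use of the `process_start_time_seconds` metric to detect restarts are all assumptions made for illustration.

```jsonnet
// Hypothetical helper: one Grafana annotation query backed by a Prometheus expression.
local annotation(name, expr, color) = {
  name: name,
  // Assumption: the dashboard defines a data source variable named "datasource".
  datasource: { type: 'prometheus', uid: '${datasource}' },
  enable: true,
  expr: expr,
  iconColor: color,
  titleFormat: name,
};

// The "library": a mixin that appends change-point annotations to any dashboard.
local changePoints = {
  annotations+: {
    list+: [
      // Assumption: a restart shows up as a change of process_start_time_seconds.
      annotation(
        'Restarts',
        'changes(process_start_time_seconds{instance=~"$instance"}[5m]) > 0',
        'orange'
      ),
    ],
  },
};

// Including the library is a single `+`; regenerating means re-running jsonnet.
{
  title: 'Process',
  annotations: { list: [] },
  panels: [],
} + changePoints
```

Because the mixin only appends to `annotations.list`, the same library can be added to every dashboard without touching its panels.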

Change points

Demo

Visual hints

  • Change points

  • Priorities

Priorities will highlight bottlenecks

  • Simple dashboards

  • TOPs

  • Gradients

Meta dashboards

Simple dashboards

Simple, small, fast, open dashboards are my favorite

Automation of making simple dashboards

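      // Row header that groups the file descriptor panels
      row.new('🗂 File Descriptors'),

      // Stat panel: a bigger value of open file descriptors is a problem
      panels.combo.stat.a_bigger_value_is_a_problem(
        '🗂 FDS',
        queries.diff_over_time(queries.process.open_fds)
      ),

      // Time series: the current period compared with the previous one
      panels.combo.timeSeries.current_vs_prev(
        '🗂 FDS',
        queries.start_prev_current_diff(queries.process.open_fds),
        queries.process.open_fds.unit
      ),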

Priorities will highlight bottlenecks

  • Simple dashboards

  • TOPs

  • Gradients

TOPs in Tables are my favorite

We can sort by the Total Duration

TOPs in Legends are the easiest way

We can sort by Mean

TOPs in Time Series

Automation of getting TOP series

sort_desc(
  topk($top,
    sum_over_time(
      (
        sum(increase(
          youtrack_Workflow_OnScheduleFull_TotalDuration{
            instance=~"$instance"
          }[2m]
        )) by (script)
      )[$__range:2m]
    )
  )
)
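
One way to automate this query is to generate it from a jsonnet function, so the same TOP pattern can be reused for other metrics and labels. The sketch below is an assumption about how that might be done, not the code from the talk: the function name `topOverRange` and its parameters are hypothetical, while the metric, the `script` label, and the `$top`, `$instance`, and `$__range` variables come from the query above.

```jsonnet
// Hypothetical helper: builds the TOP query for any counter metric and grouping label.
local topOverRange(metric, by, step='2m') = |||
  sort_desc(
    topk($top,
      sum_over_time(
        (
          sum(increase(%(metric)s{instance=~"$instance"}[%(step)s])) by (%(by)s)
        )[$__range:%(step)s]
      )
    )
  )
||| % { metric: metric, by: by, step: step };

// A panel target that reuses the helper for the metric shown on the slide.
{
  targets: [
    {
      expr: topOverRange('youtrack_Workflow_OnScheduleFull_TotalDuration', 'script'),
      legendFormat: '{{script}}',
    },
  ],
}
```

Keeping the window (`2m`) as a single parameter avoids the two places in the query drifting apart.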

Priorities will highlight bottlenecks

  • Simple dashboards

  • TOPs

  • Gradients

Red colors will highlight bottlenecks

  • Blue-White-Red

  • Blue-Red

  • Rainbow

  • White-Rainbow

Automation of changing themes via flags

colors: 
  if (std.extVar("EXT_THEME") == "blue_white_red") then
    self.blue_white_red
  else if (std.extVar("EXT_THEME") == "blue_red") then
    self.blue_red
  else if (std.extVar("EXT_THEME") == "rainbow") then
    self.rainbow
  else if (std.extVar("EXT_THEME") == "white_rainbow") then
    self.white_rainbow
  else
    self.blue_white_red,
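
The same selection can also be written as a lookup table, which may be easier to extend when more themes appear. This is only a sketch of an alternative: the palette values are placeholders, because the real colors are not shown in the slides, and the file name in the comment is hypothetical.

```jsonnet
// Placeholder palettes; the real color values are not shown in the slides.
local themes = {
  blue_white_red: ['blue', 'white', 'red'],
  blue_red: ['blue', 'red'],
  rainbow: ['blue', 'green', 'yellow', 'orange', 'red'],
  white_rainbow: ['white', 'blue', 'green', 'yellow', 'orange', 'red'],
};

// std.extVar fails if EXT_THEME is not passed, for example:
//   jsonnet --ext-str EXT_THEME=rainbow theme.jsonnet
local theme = std.extVar('EXT_THEME');

{
  // Unknown theme names fall back to blue_white_red, as in the original chain of ifs.
  colors: if std.objectHas(themes, theme) then themes[theme] else themes.blue_white_red,
}
```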

Demo

Simple dashboards with TOPs and Gradients

Visual hints

  • Change points

  • Priorities

Navigation will save your time

  • Summary and Templates

  • Schemas

Navigation via Drill-Down

  • Summary dashboard

    • uses Tables and Stat panels

  • Template dashboards

    • use Time series with Text variables

Links will connect relevant dashboards
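
A drill-down link from a Summary table to a Template dashboard might be generated like this. It is a sketch under assumptions: the dashboard UID `workflow-details`, the panel title, and the text variable name `script` are hypothetical; `${__data.fields.script}` and `${__url_time_range}` are Grafana's built-in data link variables.

```jsonnet
// Hypothetical UID of the template dashboard that the summary table links to.
local templateUid = 'workflow-details';

{
  type: 'table',
  title: 'TOP scripts',
  fieldConfig: {
    defaults: {
      links: [
        {
          title: 'Open details for ${__data.fields.script}',
          // The clicked row fills the template's text variable;
          // ${__url_time_range} keeps the time range of the summary dashboard.
          url: '/d/' + templateUid + '?var-script=${__data.fields.script}&${__url_time_range}',
        },
      ],
    },
  },
}
```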

Schemas will visualize connections

  • The Diagram plugin with Mermaid.js Flowchart

  • The Text panel with images of Miro diagrams

Navigation

Demo

Hints for automated analysis

Runbooks

  • Runbook templates

  • Runbook-first alerts

  • Markdown pages: Writerside, Hugo

Hints for automated analysis

Prompts

  • Is it connected with a release?
  • Is it connected with a restart?
  • Is it connected with new settings?
  • Is it connected with the specific item from the TOP?
  • Is it a correct panel and query?

Hints for automated analysis

Grafana Snapshots

  • perfana/perfana-snapshot

  • Sonnet 3.7, Sonnet 4.0

  • Google Gemini 2.5 Pro

Hints for automated analysis

Demo

Smirnov Viacheslav or Slava

  • perfqa (linkedin)

  • perftrack (grafana)

  • smirnovqa (telegram)

  • qapositive (gmail)

  • polarnik (github)