Opened 7 years ago

Last modified 3 months ago

#447 assigned enhancement

Better error detection/handling

Reported by: lkstrand - lars at linpro no
Owned by: janl
Priority: normal
Milestone: Munin 2.2
Component: master
Version:
Severity: normal
Keywords:
Cc:

Description

I would really like to see the following improvement in Munin:

  • More verbosity upon detecting errors when parsing the munin.conf file. As of today, pinning down exactly what is wrong in the config file can be a tiresome exercise when dealing with aggregated graphs. Example: when graph_order lists an unknown field, the graph is simply not generated, with no indication of why. Some sort of verbosity level setting would be nice.
  • Some way of telling on the front page when a host is no longer responsive (no contact, host down for, say, 15 minutes). This could, for example, be denoted by a red star (*) after the hostname. I am aware that we usually have Nagios or other tools for more immediate alerts, but take this example: I checked a host at random today, and Munin had not received any data from it for the last 5 days(!). The host was up and Nagios reported no problems. It turned out that munin-node was not running on that particular host. This could easily have been detected by some form of notice on the front page.
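
To make the two requests concrete, here is a rough sketch of the checks, written in Python for illustration rather than as Munin code. All function names and data structures are hypothetical. It assumes graph_order is a whitespace-separated list of field names, that entries containing "=" borrow data from elsewhere and are skipped, and that the master can supply a last-update timestamp (in seconds) per host.

```python
def unknown_order_fields(graph_order, known_fields):
    """Return the graph_order entries that do not name a known field.

    Assumption: entries of the form "alias=source" define a borrowed
    field, so this sketch does not check them.
    """
    unknown = []
    for entry in graph_order.split():
        if "=" in entry:
            continue  # borrowed-data definition, not validated here
        if entry not in known_fields:
            unknown.append(entry)
    return unknown


def stale_hosts(last_update, now, max_age=15 * 60):
    """Return sorted hostnames with no data for more than max_age seconds.

    last_update is a hypothetical mapping of hostname to the timestamp
    (seconds) of the most recent successful update.
    """
    return sorted(
        host for host, stamp in last_update.items() if now - stamp > max_age
    )


def front_page_label(hostname, is_stale):
    """Append a star to the hostname when the host is considered stale."""
    return hostname + " *" if is_stale else hostname
```

With these definitions, `unknown_order_fields("cpu load memroy", {"cpu", "load"})` returns `["memroy"]`, which is the kind of message that could be logged instead of silently skipping the graph. The 15-minute default for `max_age` is taken from the request above; a real setting would presumably be configurable.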


Change History (4)

comment:1 Changed 7 years ago by janl

  • Milestone set to Munin 1.4
  • Owner changed from nobody to janl
  • Status changed from new to assigned
  • Version 1.2.5 deleted

Pt. 1 is not high on my priority list. Pt. 2 is on my agenda already.

comment:2 Changed 4 years ago by janl

  • Milestone changed from Munin 1.4 to Munin 1.5

comment:3 Changed 17 months ago by snide

  • Milestone changed from Munin 2.0 to Munin 2.1

comment:4 Changed 3 months ago by jwarnier

I second this, if for no other reason than that the munin-graph cron job is generating over 300MB of logs for only 8 servers with a few services each.
Using Munin 2.0.6.
