Opened 7 years ago

Closed 23 months ago

Last modified 19 months ago

#444 closed enhancement (fixed)

RRD performance problems (updated)

Reported by: janl
Owned by: snide
Priority: normal
Milestone: Munin 2.0.0
Component: design
Version:
Severity: normal
Keywords:
Cc:

Description (last modified by janl)

Any reasonably sized university installation of munin will suffer performance problems. Even if they can get munin-update to complete its run in 5 minutes, it still takes considerable I/O to update ~26000 rrd files.

RRDtool 1.4 (1.3?) introduces rrdcached, a caching daemon which collects updates in memory until they are needed on disk (at graph time and at munin-limits time). This consolidates the updates to disk. We must optimize to use this properly.
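
A minimal sketch of what routing updates through the daemon could look like, not what munin actually does. The socket, journal and pid file paths and the helper names `cached_update` and `cached_flush` are made up for illustration; the `--daemon` option and the `flushcached` command are the ones shipped with RRDtool 1.4.

```shell
#!/bin/sh
# Sketch only: the socket address below is an assumption, adjust locally.
RRD_SOCK="unix:/tmp/rrdcached-demo.sock"

# Queue an update with the daemon instead of writing to disk right away.
# usage: cached_update FILE VALUE
cached_update() {
    rrdtool update --daemon "$RRD_SOCK" "$1" "N:$2"
}

# Force pending updates for one file to disk, e.g. before reading it
# with a tool that does not talk to the daemon.
# usage: cached_flush FILE
cached_flush() {
    rrdtool flushcached --daemon "$RRD_SOCK" "$1"
}

# Starting the daemon is left commented out so that sourcing this file
# has no side effects (paths and timings are hypothetical):
# rrdcached -l "$RRD_SOCK" -j /tmp/rrdcached-journal -w 1800 -z 900 \
#           -p /tmp/rrdcached-demo.pid
```

With something like this in place, `cached_update load.rrd 0.42` would only reach the disk when the daemon's write timeout expires or when `cached_flush load.rrd` is called, which is where the I/O saving would come from.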

Another consolidation would be to store all data series values from a plugin in one rrd file, as detailed in a comment on this ticket.

Guessing this is suitable for 3.0.

Change History (12)

comment:1 Changed 7 years ago by janl

  • Milestone changed from Munin 1.6 to Munin 1.4
  • Owner changed from nobody to janl
  • Status changed from new to assigned

comment:2 Changed 6 years ago by janl

Noting:

  • rrd 1.3.late has MMAP based io, doubling update speeds
  • our munin-cgi-graph script is getting pretty nifty, maybe we can switch to on-demand graphing?

comment:3 Changed 4 years ago by janl

  • Owner changed from janl to kjellm
  • Status changed from assigned to new

comment:4 Changed 4 years ago by janl

  • Milestone changed from Munin 1.4 to Munin 1.5

comment:5 Changed 4 years ago by janl

  • Owner changed from kjellm to janl
  • Status changed from new to assigned

comment:6 Changed 4 years ago by ligne

RRDtool 1.4 includes rrdcached, which does exactly this: <http://oss.oetiker.ch/rrdtool-trac/wiki/RRDtool14>. probably more sensible than rolling our own. less work, too :-)

comment:7 Changed 3 years ago by snide

  • Milestone changed from Munin 1.5 to Z-later

I'm pushing this one to "later", since it would require extensive work.

Since the main performance bottleneck is the graphing subsystem, 1.5 will provide better on-demand graphing, which should ease the performance problem.

comment:8 Changed 3 years ago by janl

Before we do this we should consolidate plugin time series to single rrd files. This has better graph performance, and probably also update performance. Jimmy writes:

# one datasource per rrdfile
for i in $(seq -w 1 1000); do rrdtool create single-$i.rrd DS:42:GAUGE:600:U:U RRA:AVERAGE:0.5:1:576 RRA:MIN:0.5:1:576 RRA:MAX:0.5:1:576 RRA:AVERAGE:0.5:6:432 RRA:MIN:0.5:6:432 RRA:MAX:0.5:6:432 RRA:AVERAGE:0.5:24:540 RRA:MIN:0.5:24:540 RRA:MAX:0.5:24:540 RRA:AVERAGE:0.5:288:450 RRA:MIN:0.5:288:450 RRA:MAX:0.5:288:450; done

time for i in $(seq -w 1 1000); do rrdtool update single-$i.rrd N:9492432; done
real    0m10.918s
user    0m4.604s
sys     0m6.440s

# 10 datasources per rrdfile
for i in $(seq -w 1 100); do rrdtool create multi-$i.rrd $(for j in $(seq -w 1 10); do echo -n "DS:$j:GAUGE:600:U:U "; done)  RRA:AVERAGE:0.5:1:576 RRA:MIN:0.5:1:576 RRA:MAX:0.5:1:576 RRA:AVERAGE:0.5:6:432 RRA:MIN:0.5:6:432 RRA:MAX:0.5:6:432 RRA:AVERAGE:0.5:24:540 RRA:MIN:0.5:24:540 RRA:MAX:0.5:24:540 RRA:AVERAGE:0.5:288:450 RRA:MIN:0.5:288:450 RRA:MAX:0.5:288:450; done

time for i in $(seq -w 1 100); do rrdtool update multi-$i.rrd N:9492432:9492432:9492432:9492432:9492432:9492432:9492432:9492432:9492432:9492432; done
real    0m1.178s
user    0m0.508s
sys     0m0.676s

Tenfold better!
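
To put a number on "tenfold", the ratios can be computed from the timings above. This is just arithmetic on the figures already quoted; the helper name `speedup` is made up for this note.

```shell
#!/bin/sh
# speedup BASELINE_SECONDS NEW_SECONDS
# Prints how many times faster the second timing is, to two decimals.
speedup() {
    awk -v a="$1" -v b="$2" 'BEGIN { printf "%.2f\n", a / b }'
}

speedup 10.918 1.178   # real: 1000 single-DS updates vs 100 ten-DS updates
speedup 4.604 0.508    # user
speedup 6.440 0.676    # sys
```

All three ratios come out a little above 9, so the same 1000 values are stored roughly nine times faster when grouped ten per file.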

comment:9 Changed 2 years ago by jorne

Revision 4083 enables very basic rrdcached support.

The optimizations mentioned by janl are still valid, so they should be included in a future release.

comment:10 Changed 23 months ago by janl

  • Description modified (diff)
  • Milestone changed from Z-later to Munin 3.0
  • Summary changed from RRD performance problems to RRD performance problems (updated)

I think that rrdcached + CGI graphing takes care of the main issue stated in the original ticket text. Updated the ticket text.

comment:11 Changed 23 months ago by janl

  • Owner changed from janl to snide
  • Status changed from assigned to new

comment:12 Changed 23 months ago by janl

  • Milestone changed from Munin 3.0 to Munin 2.0
  • Resolution set to fixed
  • Status changed from new to closed

... Brian De Wolf has pointed out that consolidating all the data series of a plugin into one rrd file makes it problematic to add and delete data fields. If we set that idea aside, and note that rrd 1.5 will add an rrd server, we think that the optimizations mentioned here become obsolete. Closing the ticket as fixed. Get your 2.0 today! :-)
