We upgraded from Munin 1.2.5/6 to Munin 1.4.3, using the Packman RPMs available under http://packman.links2linux.de/package/munin to rebuild an RPM for SLES10.
The upgrade itself went fine, but afterwards the master's munin-update did not run (I will report that bug separately) and the clients did not deliver any output to the munin-master.
I have now spent about 3 hours debugging on a "clean machine" and can say the following with confidence:
- All paths are correct
- Userid:groupid gets changed to nobody:munin
- All configured plugins are found and listed in the logfile when munin-node is run with --debug
- Running munin-run <plugin> yields the expected result
- Telnetting either from localhost or from another (allowed) host and calling fetch <plugin> yields nothing except a "."
- Enabling the print statements in Munin::Node::OS::read_from_child shows that no bytes are read from the plugin's stdout when the plugin is run in "TCP/IP" mode (stderr is still read)
- I tried 1.4.3 as well as trunk-r3327; both behave the same way
For example, with the cpu plugin:
startup
Configuring cpu
Atempting to read from plugins stderr
Read 58 bytes from plugin stderr
Atempting to read from plugins stderr
Read 25 bytes from plugin stderr
Atempting to read from plugins stderr
Read 47 bytes from plugin stderr
Atempting to read from plugins stdout
Read 165 bytes from plugin stdout
Atempting to read from plugins stdout
Read 93 bytes from plugin stdout
Atempting to read from plugins stdout
Read 147 bytes from plugin stdout
Atempting to read from plugins stdout
Read 116 bytes from plugin stdout
Atempting to read from plugins stdout
Read 44 bytes from plugin stdout
Atempting to read from plugins stdout
Read 130 bytes from plugin stdout
Atempting to read from plugins stdout
Read 207 bytes from plugin stdout
Atempting to read from plugins stdout
Read 193 bytes from plugin stdout
Atempting to read from plugins stdout
Read 167 bytes from plugin stdout
Atempting to read from plugins stdout
Read 0 bytes from plugin stdout
2010/01/25-15:20:12 [541] Error output from cpu:
2010/01/25-15:20:12 [541] # Set /rgid/ruid/egid/euid/ to /109/65534/109 109 /65534/
2010/01/25-15:20:12 [541] # Setting up environment
2010/01/25-15:20:12 [541] # About to run '/etc/munin/plugins/cpu config'
Adding to node cat-serv-vm4.catworkx.de
startup
---
telnet
2010/01/25-15:21:56 [1039] DEBUG: < config cpu
2010/01/25-15:21:56 [1039] DEBUG: Running command "config cpu^M".
Atempting to read from plugins stderr
Read 58 bytes from plugin stderr
Atempting to read from plugins stderr
Read 25 bytes from plugin stderr
Atempting to read from plugins stderr
Read 47 bytes from plugin stderr
Atempting to read from plugins stdout
Read 0 bytes from plugin stdout
2010/01/25-15:21:57 [1039] Error output from cpu:
2010/01/25-15:21:57 [1039] # Set /rgid/ruid/egid/euid/ to /109/65534/109 109 /65534/
2010/01/25-15:21:57 [1039] # Setting up environment
2010/01/25-15:21:57 [1039] # About to run '/etc/munin/plugins/cpu config'
2010/01/25-15:21:57 [1039] DEBUG: > .
telnet
---
munin-run
# munin-run cpu config
graph_title CPU usage
graph_order system user nice idle iowait irq softirq
graph_args --base 1000 -r --lower-limit 0 --upper-limit 200
graph_vlabel %
graph_scale no
graph_info This graph shows how CPU time is spent.
graph_category system
graph_period second
system.label system
system.draw AREA
system.min 0
system.type DERIVE
system.info CPU time spent by the kernel in system activities
user.label user
user.draw STACK
user.min 0
user.type DERIVE
user.info CPU time spent by normal programs and daemons
nice.label nice
nice.draw STACK
nice.min 0
nice.type DERIVE
nice.info CPU time spent by nice(1)d programs
idle.label idle
idle.draw STACK
idle.min 0
idle.type DERIVE
idle.info Idle CPU time
iowait.label iowait
iowait.draw STACK
iowait.min 0
iowait.type DERIVE
iowait.info CPU time spent waiting for I/O operations to finish when there is nothing else to do.
irq.label irq
irq.draw STACK
irq.min 0
irq.type DERIVE
irq.info CPU time spent handling interrupts
softirq.label softirq
softirq.draw STACK
softirq.min 0
softirq.type DERIVE
softirq.info CPU time spent handling "batched" interrupts
steal.label steal
steal.draw STACK
steal.min 0
steal.type DERIVE
steal.info The time that a virtual CPU had runnable tasks, but the virtual CPU itself was not running
munin-run
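
The telnet check described above can also be scripted, so that it can be repeated for every plugin without typing the commands by hand. The following is only a minimal sketch in Python, not part of Munin. It assumes the node listens on the default port 4949, sends a one-line greeting on connect, and ends each reply with a line containing only ".". The function names read_reply and query_node are made up for this illustration.

```python
import socket


def read_reply(stream):
    """Collect reply lines until the terminating "." line.

    `stream` is any iterable that yields raw byte lines, for example the
    file object returned by socket.makefile().
    """
    lines = []
    for raw in stream:
        line = raw.decode("utf-8", "replace").rstrip("\r\n")
        if line == ".":
            break
        lines.append(line)
    return lines


def query_node(host, command, port=4949, timeout=10.0):
    """Send one command (e.g. "config cpu") to a munin-node and return its reply.

    Assumption: the node sends a single greeting line after the connection
    is opened; that line is read and discarded here.
    """
    with socket.create_connection((host, port), timeout=timeout) as sock:
        with sock.makefile("rwb") as stream:
            stream.readline()  # discard the greeting line
            stream.write(command.encode("ascii") + b"\n")
            stream.flush()
            return read_reply(stream)
```

On the affected node, query_node("localhost", "config cpu") would be expected to return an empty list, matching the lone "." seen in the telnet session, whereas a working node should return the same lines that munin-run cpu config prints.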