Ticket #846 (new defect)

Opened 2 months ago

Last modified 2 months ago

Fetching plugin data or config via telnet yields no output

Reported by: holle Assigned to: nobody
Priority: normal Milestone: Munin 1.4.4
Component: plugins Version: 1.4.3
Severity: blocker Keywords: plugins network server
Cc:

Description

We upgraded from Munin 1.2.5/6 to Munin 1.4.3, using the Packman RPMs available at http://packman.links2linux.de/package/munin to rebuild an RPM for SLES10.

The upgrade itself went fine, but the master's munin-update did not run (I will report this bug separately), and the clients did not deliver any output to the munin-master.

I have now debugged this for about 3 hours on a "clean machine" and can say the following with certainty:

  • All paths are correct
  • Userid:groupid gets changed to nobody:munin
  • All configured plugins are found and listed in the logfile when munin-node is run with --debug
  • Running munin-run <plugin> yields the expected result
  • Telnetting either from localhost or from another (allowed) host and calling fetch <plugin> yields nothing except a "."
  • Enabling the print statements in Munin::Node::OS::read_from_child shows that no bytes are read from the plugin's stdout when the plugin is run in "TCP/IP" mode
  • I tried 1.4.3 as well as trunk-r3327; both behave the same way

e.g. cpu:

startup

Configuring cpu
Atempting to read from plugins stderr
Read 58 bytes from plugin stderr
Atempting to read from plugins stderr
Read 25 bytes from plugin stderr
Atempting to read from plugins stderr
Read 47 bytes from plugin stderr
Atempting to read from plugins stdout
Read 165 bytes from plugin stdout
Atempting to read from plugins stdout
Read 93 bytes from plugin stdout
Atempting to read from plugins stdout
Read 147 bytes from plugin stdout
Atempting to read from plugins stdout
Read 116 bytes from plugin stdout
Atempting to read from plugins stdout
Read 44 bytes from plugin stdout
Atempting to read from plugins stdout
Read 130 bytes from plugin stdout
Atempting to read from plugins stdout
Read 207 bytes from plugin stdout
Atempting to read from plugins stdout
Read 193 bytes from plugin stdout
Atempting to read from plugins stdout
Read 167 bytes from plugin stdout
Atempting to read from plugins stdout
Read 0 bytes from plugin stdout
2010/01/25-15:20:12 [541] Error output from cpu:
2010/01/25-15:20:12 [541]       # Set /rgid/ruid/egid/euid/ to /109/65534/109 109 /65534/
2010/01/25-15:20:12 [541]       # Setting up environment
2010/01/25-15:20:12 [541]       # About to run '/etc/munin/plugins/cpu config'
        Adding to node cat-serv-vm4.catworkx.de

startup

---

telnet

2010/01/25-15:21:56 [1039] DEBUG: < config cpu
2010/01/25-15:21:56 [1039] DEBUG: Running command "config cpu^M".
Atempting to read from plugins stderr
Read 58 bytes from plugin stderr
Atempting to read from plugins stderr
Read 25 bytes from plugin stderr
Atempting to read from plugins stderr
Read 47 bytes from plugin stderr
Atempting to read from plugins stdout
Read 0 bytes from plugin stdout
2010/01/25-15:21:57 [1039] Error output from cpu:
2010/01/25-15:21:57 [1039]      # Set /rgid/ruid/egid/euid/ to /109/65534/109 109 /65534/
2010/01/25-15:21:57 [1039]      # Setting up environment
2010/01/25-15:21:57 [1039]      # About to run '/etc/munin/plugins/cpu config'
2010/01/25-15:21:57 [1039] DEBUG: > .

telnet
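
To reproduce the telnet session above without typing it by hand, something like the following could be used. It is only a minimal sketch: the function name query_munin_node and its parameters are made up for illustration, and it assumes munin-node is listening on its default port 4949 and answers with a banner line followed by a reply terminated by a single "." line. The debug log above shows the command arriving as "config cpu^M", so the sketch lets the line ending be chosen ("\n" or "\r\n") to check whether the trailing carriage return makes any difference. Whether it does is not established here.

```python
import socket


def query_munin_node(command, host="127.0.0.1", port=4949, eol="\n", timeout=5.0):
    """Send one command to a munin-node and return the reply lines.

    Hypothetical helper for this ticket. `eol` is the line ending appended
    to the command: telnet sends "\r\n", a plain client can send "\n".
    """
    with socket.create_connection((host, port), timeout=timeout) as sock, \
            sock.makefile("rwb") as stream:
        stream.readline()  # skip the "# munin node at <host>" banner
        stream.write((command + eol).encode("ascii"))
        stream.flush()
        lines = []
        for raw in stream:
            line = raw.decode("utf-8", "replace").rstrip("\r\n")
            if line == ".":  # a lone "." ends the reply
                break
            lines.append(line)
        return lines
```

Calling query_munin_node("config cpu", eol="\n") and then query_munin_node("config cpu", eol="\r\n") against the affected node and comparing the number of lines returned would show whether both variants come back empty.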

---

munin-run

# munin-run cpu config
graph_title CPU usage
graph_order system user nice idle iowait irq softirq
graph_args --base 1000 -r --lower-limit 0 --upper-limit 200
graph_vlabel %
graph_scale no
graph_info This graph shows how CPU time is spent.
graph_category system
graph_period second
system.label system
system.draw AREA
system.min 0
system.type DERIVE
system.info CPU time spent by the kernel in system activities
user.label user
user.draw STACK
user.min 0
user.type DERIVE
user.info CPU time spent by normal programs and daemons
nice.label nice
nice.draw STACK
nice.min 0
nice.type DERIVE
nice.info CPU time spent by nice(1)d programs
idle.label idle
idle.draw STACK
idle.min 0
idle.type DERIVE
idle.info Idle CPU time
iowait.label iowait
iowait.draw STACK
iowait.min 0
iowait.type DERIVE
iowait.info CPU time spent waiting for I/O operations to finish when there is nothing else to do.
irq.label irq
irq.draw STACK
irq.min 0
irq.type DERIVE
irq.info CPU time spent handling interrupts
softirq.label softirq
softirq.draw STACK
softirq.min 0
softirq.type DERIVE
softirq.info CPU time spent handling "batched" interrupts
steal.label steal
steal.draw STACK
steal.min 0
steal.type DERIVE
steal.info The time that a virtual CPU had runnable tasks, but the virtual CPU itself was not running

munin-run

Change History

01/25/10 15:41:18 changed by holle

Related issue regarding the munin-master component, as mentioned in the report: #847

(follow-up: ↓ 3 ) 01/25/10 16:21:31 changed by holle

OK, I seem to be closing in on the problem: I recompiled the RPM (using the exact same sources as on the SLE10) on my openSUSE 11.1 box, and everything worked.

So, the million-dollar question is: what differs between Perl 5.8.8 (on SLE10) and Perl 5.10.0 (on openSUSE 11.x/SLE11) that makes the behaviour described above appear? And can it be fixed without upgrading the system Perl (we might install Perl 5.10.x under /usr/local or /opt)?

(in reply to: ↑ 2 ) 01/25/10 16:42:10 changed by dkrotil

I have the exact same problem on about 50 servers with no option to upgrade Perl; these are SLES10 SP2 / OES2 SP1 servers. Upgrading to SLES10 SP3 changes nothing, since Perl is still 5.8.8.

David