[Reconnoiter-devel] Memory leak in noitd

Michal Taborsky michal at taborsky.cz
Wed May 2 09:16:36 EDT 2012


No, the console is only used manually, when we're changing configuration.
Nothing logs in periodically or runs anything automated.

I enabled one of the failing checks in our production setup and re-ran
noitd under valgrind for about 5 minutes. Stratcon was running against
this noit, and the resulting leaks file is a mess:
https://gist.github.com/2576462

--
Michal Taborsky
http://www.taborsky.cz



2012/5/2 Theo Schlossnagle <jesus at omniti.com>

> You don't happen to automate anything through the telnet interface, do
> you?  Like regular health checks or configuration stuff?
>
> On Wed, May 2, 2012 at 8:35 AM, Michal Taborsky <michal at taborsky.cz>
> wrote:
> > It leaks, but it's behaving inconsistently.
> >
> > To be 100% sure I did a fresh install on a new CentOS 6 virtual machine,
> > configured a minimal noit.conf to reproduce the issue and re-ran valgrind.
> > The results together with the config are
> > here: https://gist.github.com/2576023
> >
> > On one try, when noitd ran untouched, the leak did not occur. Then I tried
> > again and used the telnet console a bit. There are some leaks visible
> > there, but I do not think that's it. With valgrind running I cannot
> > observe the memory increase, because as far as I know valgrind will eat
> > memory as part of the observation. Then I ran noitd again without valgrind
> > and could see the memory increasing.
> >
> > I know... big help this is.
> >
> > --
> > Michal Taborsky
> > http://www.taborsky.cz
> >
> >
> >
> > 2012/5/2 Theo Schlossnagle <jesus at omniti.com>
> >>
> >> If valgrind doesn't show anything significant, something very subtle
> >> is going on.  Usually valgrind will reveal these sorts of programming
> >> errors quite obviously.  Are you sure it is leaking?
> >>
> >> On Tue, May 1, 2012 at 11:07 AM, Michal Taborsky <michal at taborsky.cz>
> >> wrote:
> >> > Hello Theo,
> >> >
> >> > I am not sure we finished this. I ran the fixed code, and valgrind
> >> > doesn't show anything significant now. But as I wrote earlier, the
> >> > problem is somewhere in the snmp check timeout handling. I can avoid it
> >> > for the moment by disabling the check that currently times out. It
> >> > should be pretty easy to simulate.
> >> >
> >> > --
> >> > Michal Taborsky
> >> > http://www.taborsky.cz
> >> >
> >> >
> >> >
> >> > 2012/4/22 Theo Schlossnagle <jesus at omniti.com>
> >> >>
> >> >> A whole bunch of fixes for mem-related stuff today... the two places
> >> >> that look bad in your leak diagnostics were the noit.gunzip lua wrapper
> >> >> and snmp in the event of timeouts.  However, both were quite small and
> >> >> would take a long time to accumulate to anything noticeable.  In other
> >> >> words, there's likely another more nefarious leak in there.
> >> >>
> >> >> Can you update everything to latest and run under valgrind again --
> >> >> feel free to run it for an hour or so (as long as it performs well
> >> >> enough to stay current).
> >> >>
> >> >>
> >> >> On Sun, Apr 22, 2012 at 11:29 AM, Michal Taborsky <michal at taborsky.cz>
> >> >> wrote:
> >> >>>
> >> >>> I am not familiar with valgrind, but the output is
> >> >>> here https://gist.github.com/2464624
> >> >>>
> >> >>> MT.
> >> >>>
> >> >>> 2012/4/22 Theo Schlossnagle <jesus at omniti.com>
> >> >>>>
> >> >>>> Are you familiar with valgrind?  If you could, compile with debugging
> >> >>>> symbols (-g) and run:
> >> >>>>
> >> >>>> valgrind --log-file=noitd.leaks --leak-check=full ./noitd -D
> >> >>>> (other args like -M or -c if you use those)
> >> >>>>
> >> >>>> Let it run for about two minutes, telnet into noitd and run shutdown
> >> >>>>
> >> >>>> valgrind should spit a ton of junk out in noitd.leaks -- that should
> >> >>>> pinpoint the problem.
> >> >>>>
> >> >>>> 2012/4/22 Michal Taborsky <michal at taborsky.cz>
> >> >>>>>
> >> >>>>> After upgrading to the latest code from master, I am experiencing
> >> >>>>> a memory leak in noitd. This noit runs about 200 checks per minute
> >> >>>>> with various modules and loses about 500k per minute. It's CentOS
> >> >>>>> 6.2 64-bit.
> >> >>>>>
> >> >>>>> [root at server etc]# date;  ps uax | grep noitd
> >> >>>>> Sun Apr 22 13:18:21 CEST 2012
> >> >>>>> root     22405  0.0  0.1 146844  1088 ?        S    12:59   0:00 /usr/local/sbin/noitd -c /usr/local/etc/noit.conf
> >> >>>>> root     22406  1.1  3.9 940652 40108 ?        Sl   12:59   0:13 /usr/local/sbin/noitd -c /usr/local/etc/noit.conf
> >> >>>>> [root at server etc]# date;  ps uax | grep noitd
> >> >>>>> Sun Apr 22 13:42:30 CEST 2012
> >> >>>>> root     22405  0.0  0.1 146844  1088 ?        S    12:59   0:00 /usr/local/sbin/noitd -c /usr/local/etc/noit.conf
> >> >>>>> root     22406  1.1  5.0 951612 51560 ?        Sl   12:59   0:28 /usr/local/sbin/noitd -c /usr/local/etc/noit.conf
> >> >>>>>
> >> >>>>> I know this does not help much and I am ready to provide more info
> >> >>>>> if requested. So far I have not had time to disable the checks one
> >> >>>>> by one to see which one causes it. I'll do it later, if necessary.
> >> >>>>> Right now I am looking for a possible quick fix or advice on how to
> >> >>>>> find it, if there is any.
> >> >>>>>
> >> >>>>> --
> >> >>>>> Michal Taborsky
> >> >>>>> http://www.taborsky.cz
> >> >>>>>
> >> >>>>>
> >> >>>>> _______________________________________________
> >> >>>>> Reconnoiter-devel mailing list
> >> >>>>> Reconnoiter-devel at lists.omniti.com
> >> >>>>> http://lists.omniti.com/mailman/listinfo/reconnoiter-devel
> >> >>>>>
> >> >>>>
> >> >>>>
> >> >>>>
> >> >>>> --
> >> >>>>
> >> >>>> Theo Schlossnagle
> >> >>>>
> >> >>>> http://omniti.com/is/theo-schlossnagle
> >> >>>>
> >> >>>>
> >> >>>
> >> >>
> >> >>
> >> >>
> >> >> --
> >> >>
> >> >> Theo Schlossnagle
> >> >>
> >> >> http://omniti.com/is/theo-schlossnagle
> >> >>
> >> >>
> >> >
> >>
> >>
> >>
> >> --
> >> Theo Schlossnagle
> >>
> >> http://omniti.com/is/theo-schlossnagle
> >
> >
>
>
>
> --
> Theo Schlossnagle
>
> http://omniti.com/is/theo-schlossnagle
>
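
For reference, the growth rate implied by the two ps samples in the
original report (process 22406) can be checked with a few lines of Python.
This is only a rough sketch: the function name is made up for illustration,
and it assumes the sixth ps column (RSS) is in KiB, as it normally is on
Linux, and that both samples were taken on the same day.

```python
from datetime import datetime


def leak_rate_kib_per_min(t0, rss0_kib, t1, rss1_kib):
    """Return average RSS growth in KiB per minute between two samples.

    t0 and t1 are wall-clock times in HH:MM:SS form taken on the same day
    (an assumption; the samples in the report are about 24 minutes apart).
    """
    fmt = "%H:%M:%S"
    elapsed_s = (datetime.strptime(t1, fmt) - datetime.strptime(t0, fmt)).total_seconds()
    if elapsed_s <= 0:
        raise ValueError("second sample must be later than the first")
    return (rss1_kib - rss0_kib) / (elapsed_s / 60.0)


# RSS values for PID 22406 copied from the ps output in the report.
rate = leak_rate_kib_per_min("13:18:21", 40108, "13:42:30", 51560)
print(f"{rate:.0f} KiB/min")  # prints "474 KiB/min"
```

That works out to roughly 474 KiB per minute, which agrees with the
"about 500k per minute" figure in the report.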