Modules HowTo

Theory

There are 4 supported mechanisms for authoring checks in Reconnoiter, in two classes: in-core and external. I've listed them in order from most performant and scalable to least performant and scalable.

  1. in-core in C. Reconnoiter sports a C API extension layer in which generic modules, check modules and even "loader" modules may be authored. This is the most performant way to author modules, but also the most challenging. Reconnoiter's non-blocking design requires some finesse when coding in C, or you risk causing "hiccups" in the core of Reconnoiter. If you know what you are doing and have strong C coding skills, this might be the way for you. Otherwise, use another method.
  2. in-core in lua. Because the C extension system proves to be an unforgiving barrier to entry for in-core checks, we implemented (in C) a lua loader. This lua loader allows modules to be written in lua. These checks run just as the C checks run (right in the internals of reconnoiter), but in a higher-level language. With the complexities of non-blocking, event-driven programming handled for you, it can be EASY! The lua checks are nominally slower than native C checks. The performance difference is so small that we write many of the core, shipped reconnoiter checks in lua, including things like HTTP and NTP. If you are focusing on networked programming, the lua extension system makes it easy to write code that appears to be blocking, but actually has significant "BFM" (black f@#$ing magic) under the hood. You can have tens of thousands of lua checks running concurrently.
  3. external checks in Java. Because so many client libraries are conveniently accessible via Java (think JDBC), it can make a lot of sense to code checks in native Java. While we could have embedded a JVM in noitd (as we did lua), we elected not to for two reasons: (1) JVM heap memory management causes complications like bloating, and (2) the design of threads in Java (as in most other languages) makes it very hard to transparently provide non-blocking plumbing underneath an API. Embedding would require programmers to alter their programming paradigms (crappy), and many of the client libraries you'd use would be incompatible (useless). So, the Java checks are executed in a separate process called Jezebel, which is actually a webserver to which check details are POSTed and from which Resmon XML is returned. Writing Jezebel modules is simple and easy, but not as performant or single-system-scalable as lua. You need a thread for every running check, so tens of thousands of concurrent checks are likely unachievable.
  4. external checks as standalone scripts. The extproc module (shipped with reconnoiter) will fork and execute scripts like those used with Nagios. It respects the return codes and outputs just as Nagios does. Running 500 Nagios checks a second is damn hard to do because of the fork-and-execute design. Because the extproc module adopts the same design, it scales very poorly as well. The extproc module provides a bridge from Nagios, and new modules should not be written to this API. It should be used as a crutch only.
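
To give a feel for how code can "appear to be blocking" while an event loop stays in control (item 2 above), here is a toy sketch using plain lua coroutines. It is only an illustration of the general technique, not Reconnoiter's implementation: fake_read, spawn, run_loop and check_body are all made-up names, and the "I/O" is simulated.

```lua
-- Toy illustration only: none of these names are Reconnoiter APIs.
local pending = {}   -- coroutines parked while waiting for simulated I/O

-- Reads like a blocking call, but actually yields control to the loop.
local function fake_read(source)
  return coroutine.yield(source)
end

-- Start a check body; it runs until its first fake_read, then parks itself.
local function spawn(name, fn, log)
  local co = coroutine.create(fn)
  local ok, waiting_on = coroutine.resume(co, name, log)
  if ok and coroutine.status(co) == "suspended" then
    pending[#pending + 1] = { co = co, source = waiting_on }
  end
end

-- The stand-in "event loop": hand a result to each parked check in turn.
local function run_loop()
  while #pending > 0 do
    local item = table.remove(pending, 1)
    local ok, waiting_on = coroutine.resume(item.co, "data from " .. item.source)
    if ok and coroutine.status(item.co) == "suspended" then
      pending[#pending + 1] = { co = item.co, source = waiting_on }
    end
  end
end

-- A check written in straight-line style, as if the read blocked.
local function check_body(name, log)
  log[#log + 1] = name .. " start"
  local reply = fake_read(name .. "-socket")
  log[#log + 1] = name .. " got " .. reply
end

local log = {}
spawn("a", check_body, log)
spawn("b", check_body, log)
run_loop()
for _, line in ipairs(log) do print(line) end
```

Both checks start before either simulated read completes, even though each body is written as sequential code and no threads are involved.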

Practice

lua

Lua modules are simple text files written in the lua programming language. They are their own packages which must implement 4 functions:

  1. onload(image) - accepts an image and is called once after the module is compiled. It must return 0 on success. The primary use for this function is to call the image.xml_description function with XML documentation for the module you are writing. This allows for automatic documentation generation and online console help.
  2. config(module, options) - run immediately before init and passed a table of options (from the <config> that applies to this module in noit.conf). It must return 0 on success.
  3. init(module) - run as the module is loaded; initial setup may be performed here. Note that state may not be shared between different lua executions. It must return 0 on success.
  4. initiate(check, config) - called each time the check is to be performed. The check contains the attributes of the check itself (target, name, etc.) and metrics, status and availability should be set within this function on the check object.
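
Put together, a minimal module skeleton might look roughly like the sketch below. Only the four function names, their arguments, the "return 0" convention and image.xml_description come from the list above. Everything else is an assumption for illustration: the XML body is a placeholder rather than the real documentation schema, the setter names on the check object (check.metric, check.available, check.good, check.status) should be verified against the lua check API in your Reconnoiter version, and whatever package declaration the loader expects at the top of the file is omitted.

```lua
-- Hypothetical example module. Only the four entry points and
-- image.xml_description are taken from the text; the rest is assumed.

-- Called once after the module is compiled; registers the module docs.
function onload(image)
  -- Placeholder XML, NOT the real documentation schema.
  image.xml_description([=[
<module>
  <name>example</name>
  <description>Placeholder description for an example check.</description>
</module>
]=])
  return 0
end

-- Called immediately before init with the options from noit.conf.
function config(module, options)
  return 0
end

-- Called as the module is loaded; do one-time setup here.
function init(module)
  return 0
end

-- Called each time the check is to be performed.
function initiate(check, config)
  local started = os.clock()
  -- ... the real work of the check would go here ...
  local elapsed_ms = (os.clock() - started) * 1000

  -- ASSUMPTION: these setter names are illustrative; confirm them
  -- against the check object your Reconnoiter version provides.
  check.metric("duration_ms", elapsed_ms)
  check.available()
  check.good()
  check.status("ok")
end
```

Since state may not be shared between lua executions, anything initiate needs should be derived from its arguments rather than stashed in variables set during an earlier call.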