TL;DR:

Supervisord is a powerful tool for daemonizing processes. It lets you start and stop processes with simple commands and offers nice features such as automatic restarts and running a process as a specific user.

The typical usage of supervisord is:

  • log in to a server
  • start certain processes
  • check the process status

This workflow is fine for starting, stopping, and killing processes, but checking the status of a process is cumbersome: you have to go and look for yourself. We start a process once, but we care whether it stays up for much longer, and gathering that status information takes several steps. This becomes a problem in an environment managed with CI/CD, especially when the developers, who are less enthusiastic about the underlying components of the system, do not have access to the servers supervisord is running on (frankly, they shouldn’t have access to anything aside from Slack and a mail client).

One problem we faced in our workflow: the build was successful and the process could be started, but it would die after a few seconds during the boot-up phase. The deployer would not be aware that the process had failed to come up, and would assume there was something wrong with the deployment process.

We had a monitoring problem.

Monitor everything, monitoring is everything.

We have a remote team, so we cannot put a monitoring screen in every home; but we can implement something that will follow you around: events and Slack!

Let’s start with events and continue with Slack integration.

supervisord events

Supervisord provides quite a number of events, from process state changes to process log output and even ticks. In our situation, the state changes were what mattered. So I added an eventlistener section to the supervisord configuration to listen for PROCESS_STATE events:

[eventlistener:x]
events=PROCESS_STATE
command=x
buffer_size=1024

Let’s examine the settings line by line:

  1. events can list multiple event types. I did not list every state event type because PROCESS_STATE is the parent type of all state change events.
  2. command is the program supervisord runs as the listener. As I’ll explain in the following section, I have prepared a binary for this.
  3. buffer_size is the size of the event queue. If your event listener cannot keep up with incoming events, supervisord puts them in a buffer. If the buffer is full, supervisord discards the oldest event. I did not want to lose any events, so I set it to 1024.

event listener program

Supervisord runs the program and communicates with it through standard input and output. The listener writes READY to its standard output when it can accept an event, and answers each event with a result such as RESULT 2 followed by OK to acknowledge it. Supervisord writes a header line to the listener’s standard input, followed by a payload if there is one.

The supervisor-event-handler library takes care of this communication protocol and lets you process events in 5 lines of code (counting only the main function):

package main

import (
	"fmt"
	"os"

	eventhandler "github.com/mtyurt/supervisor-event-handler"
)

func main() {
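	handler := eventhandler.New()
	// Register a callback for PROCESS_STATE events.
	handler.HandleEvent("PROCESS_STATE", func(header eventhandler.HeaderTokens, payload map[string]string) {
		// Log to stderr: stdout is reserved for the listener protocol.
		fmt.Fprintf(os.Stderr, "event: %s, payload: %v\n", header.EventName, payload)
	})
	// Start blocks and processes events until the program is killed.
	handler.Start()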
}

Start is a blocking call, since this program needs to keep using stdin and stdout until it is killed. I also used goroutines to handle events so that the buffer does not overflow.

This program will print something like:

event: PROCESS_STATE_STOPPING, payload: map[groupname:group from_state:RUNNING pid:9 processname:process]

sending messages to slack

So far we can process the events, and the rest is easy. I have also prepared a simple tool to demonstrate the usage of these events: supervisor-event-to-slack. It’s just a template; you can do whatever you want starting from there. But it is ready to be used with the following supervisord configuration:

[eventlistener:status_listener]
command=/path/to/supervisor-event-to-slack
events=PROCESS_STATE
autostart=true
buffer_size=1024
environment=SLACK_TOKEN="your-slack-token",SLACK_CHANNEL="channel-to-post-messages"

Please configure the path to the binary, the Slack token, and the channel; then you are good to go! You will start to receive a Slack message for each process state change.

You can have more meaningful messages if you are not making a demo.

If you want to handle other events, you can add them to the eventlistener configuration as well; the payload is provided as a string-to-string map anyway. Remember that the content of the payload differs for each event type; the documentation specifies each payload right after its event type.

conclusion & outcome

  • It doesn’t solve all problems, but we covered more bases in terms of monitoring.
  • Developers have more insight about the status of processes.
  • I get fewer mentions such as “is this working?”.
  • Everybody is happier.

Happy hacking & monitoring!