Monitoring Supervisord Processes in Slack with Go
TL;DR:
- supervisord has an event listener mechanism for status changes and other events
- It is a little messy
- I have abstracted the communication protocol via a small library: https://github.com/mtyurt/supervisor-event-handler
- I have implemented a small tool to send status change messages to Slack: https://github.com/mtyurt/supervisor-event-to-slack which can be improved for customized workflows / events.
–
Supervisord is a powerful tool to daemonize processes. It provides control over processes, such as starting and stopping them with simple commands, and nice features such as automatic restarts, running as a specific user, etc.
The typical usage of supervisord is:
- log in to a server
- start certain processes
- check the process status
This workflow is fine for starting, stopping, and killing processes, but when it comes to knowing the status of a process, it is cumbersome and entirely manual. We start the process once, but we care whether it stays up for much longer than that. You need to go through several steps to gather status information. This becomes a problem in an environment managed with CI/CD, especially when the developers, who are less enthusiastic about learning the underlying components of the system, do not have access to the servers supervisord is running on (frankly, they shouldn’t have access to anything aside from Slack and a mail client).
One problem we faced in our workflow: the build would succeed and the process would start, but it would die after a few seconds during the boot-up phase. The deployer, unaware that the process never finished starting, would assume that something was wrong with the deployment process itself.
We had a monitoring problem.
We have a remote team, so we cannot put a status monitor in every home; but we can implement something that will follow you around: events and Slack!
Let’s start with events and continue with Slack integration.
supervisord events
Supervisord provides quite a number of events, ranging from process state changes to process log outputs and even ticks. In our situation it was important to know the status changes. So I added an eventlistener configuration to supervisord to listen to PROCESS_STATE changes:
[eventlistener:x]
events=PROCESS_STATE
command=x
buffer_size=1024
Let’s examine line by line:
- events: Multiple events can be listed here. I did not specify all status event types because PROCESS_STATE is a parent event type of all state change events.
- command: As I’ll explain in the following section, I have prepared a binary that will be triggered by the event listener mechanism.
- buffer_size: If your event listener cannot keep up with incoming events, supervisord puts incoming messages in a buffer. If the buffer is full, supervisord discards the oldest message. I did not want to lose any messages, so I set a generous buffer size.
event listener program
Supervisord runs the program and communicates with it through standard input and output. It expects the listener to write tokens such as READY, and RESULT 2 followed by OK, to its standard output, so that supervisord knows the listener is ready to receive an event and, later, that it has processed the event. Supervisord writes a header line to the listener’s standard input, followed by a payload if there is one.
The supervisor-event-handler library takes care of this communication protocol and lets you process events in 5 lines of code (counting only the body of the main function):
package main

import (
	"fmt"
	"os"

	eventhandler "github.com/mtyurt/supervisor-event-handler"
)

func main() {
	handler := eventhandler.New()
	// Register a handler for PROCESS_STATE events.
	handler.HandleEvent("PROCESS_STATE", func(header eventhandler.HeaderTokens, payload map[string]string) {
		// Log to stderr: stdout is reserved for the supervisord protocol.
		fmt.Fprintf(os.Stderr, "event: %s, payload: %v\n", header.EventName, payload)
	})
	// Start blocks until the program is killed.
	handler.Start()
}
Start is a blocking call, since the program needs to keep reading from stdin and writing to stdout until it is killed.
I also used goroutines to handle events, so that a slow handler does not cause the buffer to overflow.
This program will print something like:
event: PROCESS_STATE_STOPPING, payload: map[groupname:group from_state:RUNNING pid:9 processname:process]
sending messages to slack
So far, we can receive and process the events. The rest is easy. I have also prepared a simple tool to demonstrate the usage of these events: supervisor-event-to-slack. It’s just a template; you can build whatever you want starting from there. It is ready to be used with the following supervisord configuration:
[eventlistener:status_listener]
command=/path/to/supervisor-event-to-slack
events=PROCESS_STATE
autostart=true
buffer_size=1024
environment=SLACK_TOKEN="your-slack-token",SLACK_CHANNEL="channel-to-post-messages"
Please configure the path to the binary, the Slack token, and the channel; then you are good to go! You will start to receive a Slack message for each status change.
If you want to handle other events, you can configure them as well in the eventlistener configuration; the payload is provided as a string-to-string map anyway. Remember, the content of the payload changes for each event type; each payload is specified after the event type in the documentation.
conclusion & outcome
- It doesn’t solve all problems, but we covered more bases in terms of monitoring.
- Developers have more insight about the status of processes.
- I get fewer mentions such as "is this working?"
- Everybody is happier.
Happy hacking & monitoring!