Rancher and Grafana – A long case-study to learn some DevOps

Motivation

Cryptotrading is so cool and profitable these days… So I decided to try it out. But if you want to do something well, you need a lot of information and tools. I wanted some visualization with chat (Slack) integration, plus some alerting tools/bots. This post describes the tools I found and started playing with. I found a GitHub project called Gekko, and they said “it can’t be hosted on Heroku, you need Docker!”, so my experimenter instincts kicked in :D I love trying out “new” things, and Docker was already on my list. So this is where I started. (ProTip: if you want to use this post as a tutorial, please read it through before you start; it’s not written as a tutorial.)

I’m going to show you some interesting tools, their configurations, and the options for making them work together. The list is: Docker, Rancher, Prometheus, Grafana, HuBot, PostgreSQL, Gekko.

(If you have close to zero Linux and DevOps/hosting experience, I suggest you read through all the available options of these tools and all the UIs, and search on Google if you get stuck at a step; this will not be a step-by-step tutorial.)

The cost of the unknown

My motivation was to make money. If you spend a lot of money to make a little money, you are doing it wrong. I wanted to host my own containers in the cloud (on my own hosted infra), and that’s relatively costly. So first things first, I needed to buy “cheap” VMs. The two big competitors, AWS and Azure, cost nearly the same: approx. $20/month (2 hosts, 2 vCPUs, 2 GB RAM). It’s not cheap compared to Heroku, and if you have a memory-hungry (e.g. JVM-based) application, it will eat a host with 1 GB RAM in no time… But then I found Scaleway: 3 EUR/month for 2 cores and 2 GB RAM, or, if you can handle the ARM architecture, you can get a bare-metal server with 4 cores and 2 GB RAM. It has limitations of course (e.g. only 10 public IPs per account), but for the scale I want, this price is super good. At the beginning, I tried to find dedicated “container hosts” (with some easy-to-use fancy UI, to save the time of installing and configuring the OS), but I didn’t find anything in this price range (maybe I found a huge market gap?).

I bought a domain too, so I can create subdomains for my containers. It’s really intuitive and fun, and worth the 20 GBP for 2 years :D (If you want to do this too, configure your DNS settings to something like the picture below.)

[Screenshot: DNS config]

The most costly part was the time it took to learn to use all these things. But the Christmas holiday is good for these types of tasks.

Rancher

The terminal is good. A nice UI is better :D (At least in the beginning, while you have no idea how the big picture works.) I found two options for handling containers with a UI.

The first one is Portainer. It didn’t provide me with a great UX. Strange lists, not so logical site structure. If you want to use clusters, you need to figure it out yourself…

The other one is Rancher. Much better UI, built-in host/cluster handling, and some really cool and handy features. The only problem is memory usage, because it’s Java-based.

In the end, I chose Rancher. The install is relatively easy.

  1. Start an SSH connection to your freshly bought (Ubuntu-based) VM. (All of my commands require root privileges or sudo.) Update the packages with sudo apt-get update and sudo apt-get upgrade. Then query the available docker-engine versions with sudo apt-cache policy docker-engine, and install one that is compatible according to the official list (e.g. sudo apt-get install docker-engine=17.03.1~ce-0~ubuntu-xenial). This step is much harder if your VM is ARM-based; see the note at the end of this section.
  2. Start the Rancher server on port 8080 with sudo docker run --name=rancher -d --restart=unless-stopped -p 8080:8080 rancher/server:stable (for more info or options, read the official Rancher docs about installation).
  3. Wait… Refresh http://<vmip>:8080, or follow the logs with sudo docker logs -f <CONTAINER_ID>, until the site loads. (If you use a domain name, use http://<yourdomain>:8080 or http://rancher.<yourdomain>:8080.) (Of course you can use another port. I don’t recommend 80 or 443 (they are needed later for haproxy and subdomains), and some ports (like 3333) may be blocked by your VM host.)
  4. Once the site is up, the first thing to do is secure it. Go to Admin > Access control! I prefer GitHub auth; the UI tells you everything (if you use a domain name, configure the URLs with it). You could choose local auth too. (For more info, read the official Rancher docs about access control.)
  5. The last step is adding the host. Go to Infrastructure > Hosts > Add > Custom! There is a lot of info, but the only thing that matters is the 5th point. (If you are using a domain, check the URL inside the box, and if it does not match your desired domain or subdomain name, go to Admin > Settings, modify it, and come back.) Copy the text from the 5th block and paste it into your VM’s terminal. After you have executed the command and hit the large blue Close button on the UI (and waited a bit for the initialization), you will see a host on your hosts screen.
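
The "wait and refresh" part of step 3 can also be scripted. Below is a minimal sketch, not part of the original setup: it polls a URL until the server answers at all. The function names, the attempt count and the delay are my own assumptions, and any HTTP answer (even an error status once access control is enabled) is treated as "the site is up".

```python
import time
import urllib.error
import urllib.request


def is_up(url, timeout=5.0):
    """Return True if the URL answers with any HTTP response at all."""
    try:
        with urllib.request.urlopen(url, timeout=timeout):
            return True
    except urllib.error.HTTPError:
        # The server answered, just not with a 2xx status (e.g. 401).
        return True
    except OSError:
        # Connection refused, timeout, DNS failure: not up yet.
        return False


def wait_until_up(url, probe=is_up, attempts=60, delay=5.0, sleep=time.sleep):
    """Poll `url` until `probe` succeeds; return the number of tries used."""
    for attempt in range(1, attempts + 1):
        if probe(url):
            return attempt
        if attempt < attempts:
            sleep(delay)
    raise TimeoutError(f"{url} did not come up after {attempts} attempts")
```

Calling wait_until_up("http://<vmip>:8080") with your own address blocks until the Rancher UI responds, or raises TimeoutError after roughly five minutes with the defaults above.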

The UI and the documentation are clear and straightforward! Really nice job!

Adding a second host is really easy too. You need to buy/start a new VM, do the first step (install Docker on the new VM) and the last step (click “Add host” on the Rancher UI, copy the 5th block, paste it into the new VM’s terminal, execute it, and wait for the host to show up on the hosts screen).

With multi-host setups we will have another problem, persistent storage, but I will leave that for a later section.

I don’t want to introduce the full Rancher feature set and UI now. I will write about the features in the next sections as I start new services. But at this point it is worth clicking through all the menus to get a bit familiar with the UI. (Or at least watch the demo :D)

A quick note about ARM. When I needed a second VM, I started an ARM-based one. I totally wasted 2 hours… An ARM-based system needs ARM-based packages, so you need to use the official Docker package repository and install Docker from there (this is inconvenient, but I can live with it). The biggest issue is that an ARM-based host can only run ARM-based containers. So hosting on mixed-architecture VMs is not really a good idea. (But I’m not an expert at this depth of Docker, so help me out in the comments if you don’t agree :D)

Grafana

Okay! We have a host! I want graphs. After Googling a bit, I found Grafana and it seemed promising. Let’s try it out!

One nice thing about Rancher is its catalog. You have predefined stacks (container sets) with easy-to-fill forms. So I scrolled through the catalog, found Grafana and installed it. After I logged in, I found out that it didn’t have PostgreSQL as a data source. I checked the version, and deleted the stack :D

Start again!

I wanted Grafana, so I created a stack with the name “Grafana”. I added a container from the UI, bound an HTTP port on the host, and also made the volume permanent. At first this seemed OK. Problems: if you make a volume permanent, as the pictures show, your app will work on one host. You can configure it on that host, but the next time it is rebalanced it will lose all of its inner data (of course, the permanent data will not travel between hosts :D). So permanent data… I will solve this later. The other problem was the port binding. You don’t need to bind ports on the host. You can proxy!

[Screenshot: Grafana config]

[Screenshot: Grafana add service]

[Screenshot: Grafana volumes]

Then I realized Rancher has a built-in load balancer… Amazing stuff. I started a new stack (HTTP), added a load balancer (if you have multiple hosts and use subdomains like me, it is worth starting it on all hosts), added my container, and it worked as expected. If you have ever configured haproxy by hand, you will know how clever and easy this method is. I want to thank whoever implemented this feature :D (We can update this container at any time if needed.)

[Screenshot: Grafana update container]

I didn’t configure Grafana securely, but you can add env-vars to enable GitHub auth. Maybe I will update this part if somebody is really interested, but I think you can work it out from the Docker page or the Grafana docs if you really want to.

NFS

Persistence with containers. I still have no idea if it’s a good approach or a “just working” approach… But if you have a file server (for example your second VM), you can set up NFS, and then all of your containers can use network-transparent file storage.

I basically followed the official site for this configuration. (For Linux beginners: you can modify the file with sudo nano /etc/exports.) In the exports file, it is worth adding some security: instead of *(rw,sync,no_subtree_check,no_root_squash), use concrete IPs. (For example, for two servers you can write it like: /nfs    <IP1>(rw,sync,no_subtree_check,no_root_squash) <IP2>(rw,sync,no_subtree_check,no_root_squash)) It’s a bit tricky to find out which IPs you need to use (public or internal), and you need to add the host to its own list too :D
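
If you have more than a couple of hosts, typing that line by hand gets error-prone. Here is a small sketch that builds it from a list of IPs. The helper name is hypothetical, the options are the ones quoted above, and the two addresses are documentation-range placeholders standing in for <IP1> and <IP2>.

```python
NFS_OPTIONS = "rw,sync,no_subtree_check,no_root_squash"  # options quoted in the text


def exports_line(path, client_ips, options=NFS_OPTIONS):
    """Build one /etc/exports line that lists every allowed client explicitly."""
    if not client_ips:
        raise ValueError("list at least one client IP instead of falling back to *")
    clients = " ".join(f"{ip}({options})" for ip in client_ips)
    return f"{path}    {clients}"


# Placeholder IPs; remember that the NFS host itself has to be in the list too.
print(exports_line("/nfs", ["192.0.2.10", "192.0.2.11"]))
```

The printed line can be pasted into /etc/exports as it is; the helper only formats text and does not touch the file.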

At this point you can make your Grafana storage persistent on any host if you want. (I hope you listened to me at the beginning and didn’t start configuring it already :D) The upgrade process is really easy too. You modify the params and click Upgrade. The original container will stop, and a new container will start somewhere. You can test the new container, and then accept the upgraded one or revert to the previous one.

[Screenshot: NFS persistence with containers]

[Screenshot: Grafana containers]

Prometheus

So we can now show graphs. But we have no data yet. Also, we have 2 hosts, but 0 monitoring. Let’s solve this problem next!

After a quick Google search, it seemed Prometheus was the monitoring tool I needed. There is an item in the catalog for it. Try it! (We can delete it again, as we did with Grafana before, if we fail…)

It will spin up some containers. Quick review: we don’t need a second Grafana (delete it). We don’t want to expose ports, so update the Prometheus container and drop the port binding. Everything else seems good. (I started this before I added my second host; cAdvisor and the node exporter will start on every new host, which is a big plus!)

Our previously configured Grafana needs a link to this Prometheus thing, so upgrade it (I hope at this point you have some persistence already).

[Screenshot: Grafana containers with persistence]

Let’s play with it a bit! Add our Prometheus data source (after logging in with admin/admin or your configured method: top-left corner > datasource > Add), and import two dashboards, 193 and 4170 (top left > dashboards > import). So easy and good-looking :D (You can try out more dashboards, or you can try to make a new one too. I found a really good article on this topic if you are interested, and a Rancher-specific tutorial as well.)
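
The same data source can also be added without clicking, through Grafana's HTTP API (POST /api/datasources). The sketch below only prepares the request and does not send it. The Grafana address, the link alias "prometheus" and port 9090 (the Prometheus default) are assumptions about your setup, and the helper names are mine.

```python
import base64
import json
import urllib.request


def prometheus_datasource(url, name="Prometheus"):
    """Payload for Grafana's POST /api/datasources endpoint."""
    return {
        "name": name,
        "type": "prometheus",
        "url": url,
        # "proxy": the Grafana server queries Prometheus, not your browser,
        # so Prometheus does not need an exposed port.
        "access": "proxy",
        "isDefault": True,
    }


def build_request(grafana_url, user, password, payload):
    """Prepare (but do not send) the HTTP request, using basic auth."""
    token = base64.b64encode(f"{user}:{password}".encode()).decode()
    return urllib.request.Request(
        grafana_url.rstrip("/") + "/api/datasources",
        data=json.dumps(payload).encode(),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Basic {token}",
        },
        method="POST",
    )


# Hypothetical addresses; replace them with your own domain and link alias.
request = build_request(
    "http://grafana.example.com", "admin", "admin",
    prometheus_datasource("http://prometheus:9090"),
)
print(request.get_method(), request.full_url)
```

Passing the prepared request to urllib.request.urlopen would actually create the data source; I left that out so the sketch has no side effects.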

[Screenshot: Prometheus HTTP settings]

[Screenshot: Grafana dashboard import]

[Screenshot: comparison (Rancher, Grafana)]

HuBot

If I only have my phone with me, it would be good to be able to check the graphs without opening the dashboard when I get an alert (later).

So let’s do some Slack integration too!

This is the only thing that worked on the first run without any config modification, so here I want to thank the hubot-slack-docker creators for this container. All I needed to do was Google the desired container, read the docs, read some Grafana docs, read some Slack docs (that was a waste of time), and add the env-vars correctly.

You can create your own Slack channel. Once you have that, you can go to the integration page and add a new bot integration. (You can name and customize your bot, but the important part is the API token.)

If you have your own domain, you will not need the S3 integration that the Grafana docs mention. You will only need a Grafana API key. (top left > <USER> > api keys > add)
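
Because the env-vars are the only configuration here, a quick check for missing ones can save a restart or two. This is a sketch under an assumption: the three names below are the ones I would expect from the hubot-slack and hubot-grafana scripts, so verify them against the docs of the image you actually use. The values are placeholders.

```python
# Assumed variable names (hubot-slack and hubot-grafana); check your image's docs.
REQUIRED = ("HUBOT_SLACK_TOKEN", "HUBOT_GRAFANA_HOST", "HUBOT_GRAFANA_API_KEY")


def missing_settings(env):
    """Return the required variable names that are absent or empty in `env`."""
    return [name for name in REQUIRED if not env.get(name)]


# Placeholder values; the Grafana API key is left out on purpose.
example = {
    "HUBOT_SLACK_TOKEN": "xoxb-placeholder",
    "HUBOT_GRAFANA_HOST": "http://grafana.example.com",
}
print(missing_settings(example))  # ['HUBOT_GRAFANA_API_KEY']
```

Passing os.environ instead of the example dictionary checks the environment of the current process.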

Then start a container like the picture below (I placed it under my Grafana stack):

[Screenshot: HuBot container environment values]

Your bot will come online after a while, and it can do cool things like this (of course, we have no dashboard for this yet):

[Screenshot: Grafana, no dashboard]

Postgres

Long story short:

I created a new service from the postgres:9.5 base image. No port mappings, no env-vars, and a volume mounted at /var/lib/postgresql/data. (It’s a good question whether having the DB on NFS is a good or a bad thing. My Postgres was created before NFS, so it can’t leave that host now…) I put it in a new stack called Gekko (because it is my Gekko DB).

If you want to manage your new database, you have two options. The first is to simply expose the port and connect with any SQL manager. It’s easy, but because we didn’t configure env-vars, this is really not secure. The second option is to use the terminal in the container. I will show you this method.

In the stacks view, you can click on a specific service name. That takes you to the service page.

[Screenshot: service page with containers]

If you click on the dots, you can open a shell or watch the logs. You can now run psql -U postgres and run any SQL. (It is worth creating a new database for Gekko.)

You can upgrade your Grafana container with a new Postgres link (like we did with Prometheus before). Then, inside the Grafana web UI, you can add a new data source (really straightforward; the Postgres default port is 5432).

Another cool Rancher feature is the “Link graph”. Navigate to a stack page (by clicking on the name of the stack), and right there at the top is a button for it.

[Screenshot: Rancher link graph]

Gekko

This project is an interesting one. The code mostly works, but it has some bugs/problems. The documentation is good if you are a tech person, but really bad if you are a guy who wants easy money with a working solution.

I made this work, but it’s really fragile and hacky. My concept started with this Gekko Docker image. I started a new container, named it Gekko-ETH, added a volume like gekko-src:/usr/src/gekko, and linked Postgres to it too. After the container started, I navigated to the service page, opened a console, and started hacking :D First of all, I did a git pull to get the latest Gekko code. Then I patched plugins/postgres/writer.js (line 66ish) to not use the cache. I made a new folder and copied my conf files there. (If you can mount your NFS on your personal computer too, you can actually make these modifications from your own UI. If you can’t access the NFS, you can copy the files with scp or with echo/cat/copy-paste.) Now you can edit the start script to use your config with vi /service_start_gekko.sh (or if you are a nano person like me, you can apt-get install it and use that :D).

With the log feature you can see whether Gekko can start with your conf or not.

Once one instance is working, you can start a new service and attach the same volume as before (so your code patches will live in the new container too). Override the start script with the other config file.

Why is it fragile? These containers have modified inner data (I actually edit the run script). So when you want to scale or move them, they might break… It works, but it is really not elegant… If I have some time, I will fork the original repo and move the config to env-vars for a better “Docker feel”. (Actually, I did it…)

About my configs:

  • I wanted to watch markets, so I modified that
  • I wanted Slack notifications, so I configured that too
  • I wanted graphs, so I enabled the candle-writer and added “postgresql” as the adapter
  • Then I modified config.adapter to “postgresql” too
  • Of course, I modified config.postgresql to use my DB
    [Screenshot: config.postgresql]
  • The trading advisor and auto trading are your job :D

After my instances started running as I expected (I got data in my pg tables), I made some graphs in Grafana. This is really easy and fun if you ask me. You can do whatever you want with the data. For example, you can easily do 24h graphs like the picture below. You can make dashboards and pin them to old monitors (I recommend an Orange Pi Lite for this).

[Screenshot: Grafana 24h graphs]
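
For reference, a 24h panel like this boils down to one SQL query in the Grafana panel editor. The sketch below builds such a query as text. It is only one possible reading of "24h graph", and the table name candles_usdt_eth and the columns start (Unix epoch seconds) and close are assumptions about the Gekko schema, so check your own tables first. Grafana's PostgreSQL data source expects the time column to be named time.

```python
import re

_IDENTIFIER = re.compile(r"[a-z_][a-z0-9_]*")


def last_24h_query(table, value_column="close", time_column="start"):
    """Build a Grafana-style time series query over the last 24 hours."""
    for name in (table, value_column, time_column):
        # Only plain lowercase identifiers are allowed, to avoid SQL injection.
        if not _IDENTIFIER.fullmatch(name):
            raise ValueError(f"unsafe SQL identifier: {name!r}")
    return (
        f"SELECT {time_column} AS time, {value_column} "
        f"FROM {table} "
        f"WHERE {time_column} >= extract(epoch FROM now() - interval '24 hours') "
        f"ORDER BY {time_column}"
    )


# Hypothetical table name; list your real ones with \dt inside psql.
print(last_24h_query("candles_usdt_eth"))
```

The printed statement goes into the query field of a graph panel that uses the Postgres data source.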

 

I already mentioned the Slack integration with Grafana, but I pimped Gekko’s Slack integration too; this is my “alert” channel :D

Summary

After my winter holidays I’ve got some Docker insight. I love Rancher so far, and really love Grafana too. Gekko is nasty, but I think with some patience it can be tamed. (Don’t worry, I’m not trading with it :D) As I said in the title, this was a case study; I hope it helps introduce some people to these tools a bit. I also made a small Scala service that feeds xchange data to Prometheus, if you are interested :D

As always, if you’d like to read more from me, follow our page on Facebook or Twitter. As I said throughout the post, I welcome comments and any suggestions for improvement. Cheers!

 

Gergő Törcsvári

Software Developer at Wanari
I would love to change the world, but they won’t give me the source code (yet).