Marathon

We have seen previously how to launch an ad-hoc task on Mesos. Most of the time, however, your application will be a long-running task such as a web app or a database server. On a single machine you would rely on tools like Systemd or Daemontools to monitor your application and keep it up and running. Marathon is essentially Systemd/Daemontools for your cluster: it monitors applications running on Mesos and automatically restarts them when they fail. The neat thing about Mesos/Marathon is that it can run heterogeneous workloads, so a failed application can be restarted anywhere in the cluster as long as all resources and constraints (if any) are satisfied. Compared to the traditional approach, where you have to spin up additional app instances for high availability, Marathon/Mesos is much more scalable.

Marathon itself supports HA by running multiple instances. A leader is elected through ZooKeeper, and requests sent to non-leaders are forwarded to the leader. We can run Marathon on the same servers as the Mesos Masters.

On all Mesos Master servers:

yum install marathon
systemctl enable marathon
systemctl start marathon
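
Once Marathon is running on every master, one quick sanity check is to ask each instance which node it considers the leader; they should all agree. The sketch below assumes Marathon listens on its default port 8080 and that GET /v2/leader returns a JSON object of the form {"leader": "host:port"}. The master hostnames in the usage comment are hypothetical placeholders.

```python
import json
import urllib.request


def leader_endpoint(base_url):
    """Build the URL of the leader endpoint for one Marathon instance."""
    return base_url.rstrip("/") + "/v2/leader"


def parse_leader(body):
    """Extract the "host:port" of the leader from a /v2/leader response body."""
    return json.loads(body)["leader"]


def fetch_leader(base_url, timeout=5):
    """Ask a single Marathon instance who the current leader is (needs network access)."""
    with urllib.request.urlopen(leader_endpoint(base_url), timeout=timeout) as resp:
        return parse_leader(resp.read().decode("utf-8"))


def leaders_agree(leaders):
    """True when every instance reported the same, non-empty leader."""
    return len(leaders) > 0 and all(leaders) and len(set(leaders)) == 1


# Example usage (hypothetical hostnames):
# masters = ["http://master-001:8080", "http://master-002:8080", "http://master-003:8080"]
# print(leaders_agree([fetch_leader(m) for m in masters]))
```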

Let's start a web application on Marathon. In the Marathon UI, click the "Create Application" button and create an app with the id simple-python-server and the command python -m SimpleHTTPServer. Wait until the application status changes to RUNNING, then look under the Instances sub-tab for the hostname of the slave where your app is running. Run curl <hostname>:8000 (SimpleHTTPServer listens on port 8000 by default) to verify that your app is running correctly. You will see something like:

<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 3.2 Final//EN"><html>
<title>Directory listing for /</title>
<body>
<h2>Directory listing for /</h2>
<hr>
<ul>
<li><a href="stderr">stderr</a>
<li><a href="stdout">stdout</a>
</ul>
<hr>
</body>
</html>

This is the sandbox directory in which the Mesos agent runs our web app. You can also download the stderr/stdout of our app from Marathon. Now log into the slave that runs the app and kill our application. Marathon should automatically detect that our app has failed and restart it, possibly on another Mesos agent.
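
The hostname shown in the Instances sub-tab is also available from the REST API, which is handy for checking where the app landed after Marathon restarts it. A minimal sketch, assuming GET /v2/apps/<id> returns an object whose app.tasks entries carry host and ports fields; the sample response below is trimmed and illustrative, not captured from a real cluster.

```python
import json
import urllib.request


def app_url(marathon_url, app_id):
    """Build the URL that describes a single app."""
    return marathon_url.rstrip("/") + "/v2/apps/" + app_id.strip("/")


def task_hosts(app_response):
    """Return the agent hostname of every task listed in the response."""
    tasks = app_response.get("app", {}).get("tasks", [])
    return [task["host"] for task in tasks]


def fetch_task_hosts(marathon_url, app_id, timeout=5):
    """Query Marathon and list the hosts running the app (needs network access)."""
    with urllib.request.urlopen(app_url(marathon_url, app_id), timeout=timeout) as resp:
        return task_hosts(json.loads(resp.read().decode("utf-8")))


# Illustrative, trimmed response (the hostname and port are hypothetical).
sample = {
    "app": {
        "id": "/simple-python-server",
        "tasks": [{"host": "slave-001.example.com", "ports": [31045]}],
    }
}
print(task_hosts(sample))
```

Calling fetch_task_hosts before and after killing the app shows whether the replacement task was placed on a different agent.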

Alternatively, we can start our application through the Marathon REST API by POSTing an app definition such as the following to /v2/apps:

{
  "id": "/python-simple-server",
  "cmd": "python -m SimpleHTTPServer",
  "cpus": 0.1,
  "mem": 128,
  "disk": 0,
  "instances": 1,
  "healthChecks": [
    {
      "path": "/",
      "protocol": "HTTP",
      "gracePeriodSeconds": 277,
      "intervalSeconds": 60,
      "timeoutSeconds": 20,
      "maxConsecutiveFailures": 3,
      "ignoreHttp1xx": false,
      "port": 8000
    }
  ],
  "portDefinitions": [
    {
      "port": 8000,
      "protocol": "tcp",
      "name": "http",
      "labels": {}
    }
  ],
  "container": null,
  "constraints": [
    [
      "hostname",
      "LIKE",
      "slave-00[1-2].*"
    ]
  ],
  "user": "www-data"
}
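
One way to submit a definition like this is to POST it to the /v2/apps endpoint. The sketch below assumes Marathon is reachable at http://marathon.example.com:8080 (a placeholder), that the JSON above was saved as app.json (a hypothetical filename), and that no authentication is required.

```python
import json
import urllib.request


def build_create_request(marathon_url, app_definition):
    """Build (but do not send) the POST request that creates an app."""
    body = json.dumps(app_definition).encode("utf-8")
    return urllib.request.Request(
        marathon_url.rstrip("/") + "/v2/apps",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def create_app(marathon_url, app_definition, timeout=10):
    """Send the request and return the decoded JSON response (needs network access)."""
    request = build_create_request(marathon_url, app_definition)
    with urllib.request.urlopen(request, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))


# Example usage (placeholder URL and filename):
# with open("app.json") as f:
#     create_app("http://marathon.example.com:8080", json.load(f))
```

Keeping request construction separate from sending makes it easy to inspect the exact payload before it reaches the cluster.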

There are some important fields that you may find useful:

  • healthChecks: custom health-check conditions. Without them, Marathon only watches for a non-zero exit code and cannot determine whether an app or service is actually healthy (it will still run it). This matters when you want to do blue-green deployments.
  • portDefinitions: the ports exposed by our service.
  • constraints: where we want to run the service. Marathon supports many types of constraints. You can see the full list in the Marathon constraints documentation.
  • user: the user under which the app will run. Note that the mesos-slave daemon needs to run as root to be able to setuid. Also, there was a bug in Marathon that prevented updating the user field after the job was created, so on affected versions you need to set the correct user from the beginning. This has been fixed in mesosphere/marathon#4679.
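
To get a feel for which agents the LIKE constraint in the example would accept, the regular expression can be tried locally. This is only an approximation: it assumes Marathon requires the whole hostname to match the pattern, and it uses Python's regex engine rather than Marathon's own. The hostnames are made up.

```python
import re

PATTERN = "slave-00[1-2].*"  # value taken from the constraints field in the example


def satisfies_like(hostname, pattern=PATTERN):
    """Approximate a hostname LIKE constraint with a full-string regex match."""
    return re.fullmatch(pattern, hostname) is not None


# Hypothetical agent hostnames.
candidates = [
    "slave-001.example.com",
    "slave-002.example.com",
    "slave-003.example.com",
]
eligible = [host for host in candidates if satisfies_like(host)]
print(eligible)
```

Under these assumptions only the first two hosts qualify, so a restarted task would be confined to those agents.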

The Marathon REST API documentation can be found here: https://docs.mesosphere.com/1.8/usage/marathon/rest-api/#!/apps/V2Apps2
