I was playing a bit with the new Compose v3 syntax and found it super intuitive and easy. Let me share some notes on how to deploy an Apache Spark Stack with Compose on a Docker 1.13 Swarm.
Apache Spark is a big-data tool, used to analyze raw data, process structured data at scale, and run distributed compute jobs.
The initial topology of this microservice will consist of 1 Spark master and 3 Spark workers. The master is the cluster controller, while the workers are the workhorses.
First, I usually drain the Swarm managers, a best practice that keeps managers from running containers (and overloading Raft):
$ docker node update --availability drain node-1
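With more than one manager, the same drain can be applied in a loop. This is a sketch: it assumes the manager hostnames are whatever `docker node ls` reports, filtered by role:

```shell
# List the manager nodes by hostname, then set each one's
# availability to drain so no service tasks land on them.
for mgr in $(docker node ls --filter role=manager --format '{{.Hostname}}'); do
  docker node update --availability drain "$mgr"
done
```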
Now, I create an overlay (VXLAN) network for the Spark containers to communicate on, calling it spark. I assign a specific subnet, the default one used by my images:
$ docker network create --driver overlay --subnet 10.0.0.0/22 spark
Now, let's go through the Compose v3 YAML file that models the microservice. It's fairly straightforward, and introduces a nice deploy block, useful for specifying extra Docker Swarm options:
version: '3'
services:
  spark-master:
    image: fsoppelsa/spark-master
    networks:
      - spark
    ports:
      - 8080:8080
      - 7077:7077
      - 6066:6066
    deploy:
      replicas: 1
      update_config:
        parallelism: 1
        delay: 10s
  spark-worker:
    image: fsoppelsa/spark-worker
    networks:
      - spark
    deploy:
      replicas: 3
      update_config:
        parallelism: 1
        delay: 10s
networks:
  spark:
    external: true
Let’s save this file as spark.yml and deploy a Stack:
$ docker stack deploy -c spark.yml spark
After a few minutes, needed to pull the images from the Hub, we can check the status of the services:
$ docker stack ls
NAME   SERVICES
spark  2
$ docker service ls
ID            NAME                MODE        REPLICAS  IMAGE
p08gv8jecnr3  spark_spark-master  replicated  1/1       fsoppelsa/spark-master:latest
shnxps5x2ev5  spark_spark-worker  replicated  3/3       fsoppelsa/spark-worker:latest
These are the nodes running the spark-worker service tasks:
$ docker service ps spark_spark-worker
ID            NAME                  IMAGE                          NODE    DESIRED STATE  CURRENT STATE           ERROR  PORTS
r7skx21oukv9  spark_spark-worker.1  fsoppelsa/spark-worker:latest  node-3  Running        Running 11 minutes ago
ksrck05kf3hr  spark_spark-worker.2  fsoppelsa/spark-worker:latest  node-4  Running        Running 11 minutes ago
pwptx8dm1tkx  spark_spark-worker.3  fsoppelsa/spark-worker:latest  node-2  Running        Running 11 minutes ago
We can now connect to port 8080 of any Swarm node and verify that the 3 workers have joined the master:
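For a scriptable check instead of the browser, the Spark standalone master web UI also serves its state as JSON. A sketch, assuming node-1 resolves to one of the Swarm nodes and that the image's master exposes the standard /json/ endpoint:

```shell
# Fetch the master's state and print the number of alive workers.
curl -s http://node-1:8080/json/ \
  | python3 -c 'import sys, json; print(json.load(sys.stdin)["aliveworkers"])'
```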
Now we can interact with the services directly, for instance to scale up the workers:
$ docker service scale spark_spark-worker=10
spark_spark-worker scaled to 10

$ docker service ls
ID            NAME                MODE        REPLICAS  IMAGE
p08gv8jecnr3  spark_spark-master  replicated  1/1       fsoppelsa/spark-master:latest
shnxps5x2ev5  spark_spark-worker  replicated  10/10     fsoppelsa/spark-worker:latest
And after a few minutes, connect again to the Spark web interface and verify that we have 10 workers:
Of course, with the new Compose v3, rather than interacting with services manually, the typical workflow becomes keeping the Stack YAML files under revision control and updating the Stack with docker stack deploy.
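For instance, scaling the workers declaratively would mean editing the replicas value in spark.yml and re-deploying. A sketch of just the changed service block:

```yaml
# spark.yml (excerpt): only replicas changes, from 3 to 10
  spark-worker:
    image: fsoppelsa/spark-worker
    networks:
      - spark
    deploy:
      replicas: 10
      update_config:
        parallelism: 1
        delay: 10s
```

Re-running docker stack deploy -c spark.yml spark picks up the change and reconciles the running service to the new replica count, respecting the update_config parallelism and delay.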