Bidu Tech: a personal blog covering cloud computing, big data, distributed storage, high concurrency, high performance, artificial intelligence, and other Internet technologies.

Swarm Node Scheduling: Filters, and Swarm Rescheduling of Failed Containers (translation)


Swarm Node Scheduling

Filters tell the Swarm scheduler which nodes a container may be created on. Filters fall into two categories: node filters and container configuration filters. Node filters operate on characteristics of the Docker host or on the Docker daemon's configuration. Container configuration filters operate on characteristics of containers, or on the availability of images on a host.

Each filter has a name.

Node filters:

  • constraint
  • health

Container configuration filters:

  • affinity
  • dependency
  • port

When you start the Swarm manager with swarm manage, all filters are enabled by default. You can restrict the set with the --filter option:

swarm manage --filter=health --filter=dependency

Container configuration filters apply to all containers by default, including stopped containers.

Node constraints

The default node constraints correspond to properties of the Docker host:

  • node -- refer to the node by ID or name
  • storagedriver
  • executiondriver
  • kernelversion
  • operatingsystem

  • Custom labels must be specified when starting the Docker daemon.
  • For example: $ docker daemon --label com.example.environment="production" --label com.example.storage="ssd"
  • When starting a container, select nodes by a default or custom label through the constraint option.
  • For instance, with the host property storage=ssd, Swarm schedules the container onto a matching node.
  • Select a region: constraint:region==us-east.
  • Select an environment: constraint:environment==production.
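The constraint matching above can be sketched in a few lines. This is an illustrative simulation of how `key==value` / `key!=value` expressions narrow the candidate node set; the data structures and function names are assumptions for the example, not Swarm internals.

```python
def satisfies(node_labels, constraint):
    """Check one 'key==value' or 'key!=value' constraint against a node's labels."""
    if "!=" in constraint:
        key, value = constraint.split("!=", 1)
        return node_labels.get(key) != value
    key, value = constraint.split("==", 1)
    return node_labels.get(key) == value

def filter_nodes(nodes, constraints):
    """Keep only the nodes whose labels satisfy every constraint."""
    return [name for name, labels in nodes.items()
            if all(satisfies(labels, c) for c in constraints)]

# Labels mirror the daemon examples above (--label storage=ssd, etc.).
nodes = {
    "node-1": {"storage": "ssd", "region": "us-east"},
    "node-2": {"storage": "disk", "region": "us-east"},
}
print(filter_nodes(nodes, ["storage==ssd"]))                      # ['node-1']
print(filter_nodes(nodes, ["storage==disk", "region==us-east"]))  # ['node-2']
```

A container with `constraint:storage==ssd` thus only ever lands on node-1 in this toy cluster.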

Example:

Attach custom labels to a node's Docker daemon with one or more --label flags:

docker daemon --label com.example.environment="production" --label com.example.storage="ssd"

  • To start node-1 with the storage=ssd label:

docker daemon --label storage=ssd

  • To start node-2 with storage=disk:

docker daemon --label storage=disk

Run containers with constraints:
  • docker -H tcp://<manager_ip:manager_port> run -d -P -e constraint:storage==ssd --name db mysql

This runs mysql on an SSD-backed node, which helps I/O-heavy workloads.

  • docker -H tcp://<manager_ip:manager_port> run -d -P -e constraint:storage==disk --name frontend nginx

This runs nginx on a node backed by a regular disk.

A constraint can also be passed to docker build as a build argument:

$ mkdir sinatra
$ cd sinatra
$ echo "FROM ubuntu:14.04" > Dockerfile
$ echo "MAINTAINER Kate Smith <ksmith@example.com>" >> Dockerfile
$ echo "RUN apt-get update && apt-get install -y ruby ruby-dev" >> Dockerfile
$ echo "RUN gem install sinatra" >> Dockerfile
$ docker build --build-arg=constraint:storage==disk -t ouruser/sinatra:v2 .
Sending build context to Docker daemon 2.048 kB
Step 1 : FROM ubuntu:14.04

The health filter prevents containers from being scheduled onto unhealthy nodes.

Container filters

When creating a container, three container configuration filters can apply:

  • affinity
  • dependency
  • port

affinity

With the affinity filter, Swarm schedules a container onto the same node as the container, image, or label the affinity names.

  • For example, start a frontend container:
  • $ docker -H tcp://<manager_ip:manager_port> run -d -p 80:80 --name frontend nginx
  • 87c4376856a8
  • Then run another container with -e affinity:container==frontend, and Swarm schedules the two onto the same node:
  • docker -H tcp://<manager_ip:manager_port> run -d --name logger -e affinity:container==frontend logger
  • The -e affinity:container==frontend constraint may reference the container by name (frontend) or by ID (87c4376856a8).
  • An image affinity such as -e affinity:image==redis runs the container on a node where that image has already been pulled:

$ docker -H tcp://<manager_ip:manager_port> run -d --name redis1 -e affinity:image==redis redis

  • A label affinity allows filtering on custom container labels:


$ docker tcp:// run -d -p 80:80 --label com.example.type=frontend nginx
 87c4376856a8

$ docker tcp:// ps  --filter "label=com.example.type=frontend"
CONTAINER ID        IMAGE               COMMAND             CREATED                  STATUS              PORTS                           NAMES
87c4376856a8        nginx:latest        "nginx"             Less than a second ago   running             192.168.0.42:80->80/tcp         node-1/trusting_yonath

  • Then start a container with -e affinity:com.example.type==frontend, and Swarm schedules it onto the same node as the com.example.type=frontend container above.
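The container affinity described above boils down to "find the node already running the target, place the new container there". A minimal sketch, with made-up data structures (not Swarm internals):

```python
def node_running(containers_by_node, target):
    """Return the node hosting `target` (matched by name or ID), or None."""
    for node, containers in containers_by_node.items():
        if target in containers:
            return node
    return None

# The frontend container can be matched by name or by its ID, as in the
# examples above.
containers_by_node = {
    "node-1": {"frontend", "87c4376856a8"},
    "node-2": {"db"},
}
# affinity:container==frontend resolves to node-1, so the logger container
# from the example would be placed there too.
print(node_running(containers_by_node, "frontend"))      # node-1
print(node_running(containers_by_node, "87c4376856a8"))  # node-1
```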

Using the dependency filter

The dependency filter schedules a container onto a node that satisfies all of its dependency constraints; if no single node satisfies every dependency, the container is not created. Three kinds of dependency are currently supported:

  • --volumes-from=dependency (shared volumes)
  • --link=dependency:alias (links)
  • --net=container:dependency (shared network stacks)

Combinations of multiple dependencies are honored when possible. For instance, if you specify --volumes-from=A --net=container:B, the scheduler attempts to co-locate the container on the same node as A and B. If those containers are running on different nodes, Swarm does not schedule the container.
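The co-location rule above can be sketched as follows: every dependency must resolve to one and the same node, otherwise there is no valid placement. The data structures are illustrative assumptions, not Swarm internals.

```python
def locate(containers_by_node, name):
    """Return the node hosting container `name`, or None if not found."""
    for node, containers in containers_by_node.items():
        if name in containers:
            return node
    return None

def schedule_with_dependencies(containers_by_node, dependencies):
    """Return the single node hosting all dependencies, or None (no placement)."""
    nodes = {locate(containers_by_node, d) for d in dependencies}
    if len(nodes) == 1 and None not in nodes:
        return nodes.pop()
    return None  # dependencies span nodes, or one is missing: Swarm refuses

# A and B live on different nodes, as in the --volumes-from=A --net=container:B case.
containers_by_node = {"node-1": {"A"}, "node-2": {"B"}}
print(schedule_with_dependencies(containers_by_node, ["A"]))       # node-1
print(schedule_with_dependencies(containers_by_node, ["A", "B"]))  # None
```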

Port filter

A port constraint can be expressed as a port mapping:

docker -H tcp://<manager_ip:manager_port> run -d -p 80:80 nginx

Alternatively, use host networking in the container's configuration and expose a port there.

A container in the exited state still holds its port; to release the port, remove the container.

Node port filter with host networking

--net=host requires ports to be exposed explicitly, either with EXPOSE in the Dockerfile or with --expose on the command line. Swarm combines this information to choose a node for the new host-mode container; nodes where the port is already in use are not selected.

  • For example, the following commands start nginx on a 3-node cluster:

$ docker tcp:// run -d --expose=80 --net=host nginx
640297cb29a7
$ docker tcp:// run -d --expose=80 --net=host nginx
7ecf562b1b3f
$ docker tcp:// run -d --expose=80 --net=host nginx
09a92f582bc2

  • Port binding information for these containers is not available through docker ps:


$ docker tcp:// ps
CONTAINER ID        IMAGE               COMMAND                CREATED                  STATUS              PORTS               NAMES
640297cb29a7        nginx:1             "nginx -g 'daemon of   Less than a second ago   Up 30 seconds                           box3/furious_heisenberg
7ecf562b1b3f        nginx:1             "nginx -g 'daemon of   Less than a second ago   Up 28 seconds                           box2/ecstatic_meitner
09a92f582bc2        nginx:1             "nginx -g 'daemon of   46 seconds ago           Up 27 seconds                           box1/mad_goldstine
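The host-networking placement rule above (a node qualifies only while none of the new container's exposed ports are taken there) can be sketched like this; the data mirrors the 3-node nginx example, and the structures are illustrative, not Swarm internals.

```python
def eligible_nodes(used_ports_by_node, exposed_ports):
    """Nodes where every port the new container exposes is still free."""
    return [node for node, used in used_ports_by_node.items()
            if not (used & exposed_ports)]

used_ports = {"box1": set(), "box2": set(), "box3": set()}

# Start nginx (--expose=80 --net=host) three times: each run consumes
# port 80 on whichever eligible node is picked.
for _ in range(3):
    candidates = eligible_nodes(used_ports, {80})
    node = candidates[0]           # real Swarm applies its strategy here
    used_ports[node].add(80)

# A fourth nginx exposing port 80 has nowhere to go.
print(eligible_nodes(used_ports, {80}))  # []
```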

For how to write filter expressions (which also support regular-expression patterns), see: https://docs.docker.com/swarm/scheduler/filter/

Swarm Rescheduling

You can set a rescheduling policy that determines how Swarm handles containers when a node fails.

Rescheduling policies

Set a container's rescheduling policy when you start it, using either the reschedule environment variable or the com.docker.swarm.reschedule-policies label. If no policy is specified, rescheduling is disabled by default: a container that stops because its node fails is not restarted.

  • Using the reschedule environment variable:

$ docker run -d -e reschedule:on-node-failure redis

  • Using the com.docker.swarm.reschedule-policies label:

$ docker run -d -l 'com.docker.swarm.reschedule-policies=["on-node-failure"]' redis

Viewing rescheduling logs:

Use docker logs on the Swarm manager container to see rescheduling activity: docker logs <swarm_manager_container_id>


[root@07Node ~]# docker logs 68c8f02d5bb2
time="2016-05-24T03:40:11Z" level=info msg="Initializing discovery without TLS" 
time="2016-05-24T03:40:11Z" level=info msg="Listening for HTTP" addr=":2375" proto=tcp 
time="2016-05-24T03:40:11Z" level=info msg="Leader Election: Cluster leadership lost" 
time="2016-05-24T03:40:11Z" level=info msg="New leader elected: 172.16.10.216:2376" 
time="2016-05-24T03:40:11Z" level=info msg="Registered Engine 219Node at 172.16.10.219:2375" 
time="2016-05-24T03:40:11Z" level=info msg="Registered Engine 216Node at 172.16.10.216:2375" 
time="2016-05-24T03:40:12Z" level=info msg="Registered Engine 07Node at 172.16.40.7:2375" 

When a container is rescheduled successfully, the manager logs messages like:

Rescheduled container 2536adb23 from node-1 to node-2 as 2362901cb213da321
Container 2536adb23 was running, starting container 2362901cb213da321

When rescheduling fails, it logs messages like:

Failed to start rescheduled container 2362901cb213da321
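The policy's effect can be sketched as: when a node dies, only containers carrying on-node-failure are restarted elsewhere, matching the log lines above. The event handling and names below are assumptions for illustration, not Swarm internals.

```python
def reschedule_on_failure(placements, policies, failed_node, surviving_nodes):
    """Move eligible containers off `failed_node`; return the moves made."""
    moves = {}
    for container, node in list(placements.items()):
        if node == failed_node and policies.get(container) == "on-node-failure":
            target = surviving_nodes[0]   # real Swarm re-runs its scheduler here
            placements[container] = target
            moves[container] = target
    return moves

placements = {"redis": "node-1", "web": "node-1"}
policies = {"redis": "on-node-failure"}   # web was started without a policy

# node-1 fails: redis is rescheduled onto node-2, web is not restarted.
print(reschedule_on_failure(placements, policies, "node-1", ["node-2"]))
# {'redis': 'node-2'}
```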

Docker Swarm strategies determine how Swarm ranks candidate nodes when scheduling containers.

Specify the scheduling strategy when starting the Swarm manager with the --strategy flag, e.g. swarm manage --strategy "spread". Swarm currently supports three strategies:

  • spread
  • binpack
  • random
  • spread and binpack rank nodes by available CPU, available memory, and the number of containers already running; random ignores resources and picks a node at random.
  • spread, the default, selects the node running the fewest containers, spreading containers as evenly as possible across nodes. When nodes are tied, Swarm tends to pick the node where the most recent container was started.
  • binpack does the opposite: it selects the node running the most containers, packing containers onto as few nodes as possible.
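The contrast between spread and binpack can be sketched by ranking nodes purely on container count (real Swarm also weighs available CPU and memory); this is an illustrative simulation, not Swarm's code.

```python
def pick_node(container_counts, strategy):
    """Choose a node by strategy: spread favors the emptiest node,
    binpack the fullest."""
    if strategy == "spread":
        return min(container_counts, key=container_counts.get)
    if strategy == "binpack":
        return max(container_counts, key=container_counts.get)
    raise ValueError(f"unknown strategy: {strategy}")

counts = {"node-1": 3, "node-2": 1}
print(pick_node(counts, "spread"))   # node-2 (fewest containers)
print(pick_node(counts, "binpack"))  # node-1 (most containers)
```

Under spread, repeated scheduling keeps the cluster balanced; under binpack, node-1 fills up before node-2 receives anything.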

Example:


docker tcp:// run -d -P -m 1G --name db mysql
f8b693db9cd6

$ docker tcp:// ps
CONTAINER ID        IMAGE               COMMAND             CREATED                  STATUS              PORTS                           NAMES
f8b693db9cd6        mysql:latest        "mysqld"            Less than a second ago   running             192.168.0.42:49178->3306/tcp    node-1/db

Reference: https://docs.docker.com/swarm/scheduler/strategy/

swarm manage help output


[root@07Node ~]# docker run --rm swarm  -h
Usage: swarm [OPTIONS] COMMAND [arg...]

A Docker-native clustering system

Version: 1.2.0 (a6c1f14)

Options:
  --debug			debug mode [$DEBUG]
  --log-level, -l "info"	Log level (options: debug, info, warn, error, fatal, panic)
  --experimental		enable experimental features
  --help, -h			show help
  --version, -v			print the version
  
Commands:
  create, c	Create a cluster
  list, l	List nodes in a cluster
  manage, m	Manage a docker cluster
  join, j	Join a docker cluster
  help		Shows a list of commands or help for one command
  
Run 'swarm COMMAND --help' for more information on a command.
[root@07Node ~]# docker run --rm swarm  manage -h
Usage: swarm manage [OPTIONS] 

Manage a docker cluster

Arguments: 
       discovery service to use [$SWARM_DISCOVERY]
                   * token://
                   * consul:///
                   * etcd://,/
                   * file://path/to/file
                   * zk://,/
                   * [nodes://],

Options:
   --strategy "spread"							placement strategy to use [spread, binpack, random]
   --filter, -f [--filter option --filter option]			filter to use [health, port, dependency, affinity, constraint]
   --host, -H [--host option --host option]				ip/socket to listen on [$SWARM_HOST]
   --replication							Enable Swarm manager replication
   --replication-ttl "20s"						Leader lock release time on failure
   --advertise, --addr 							Address of the swarm manager joining the cluster. Other swarm manager(s) MUST be able to reach the swarm manager at this address. [$SWARM_ADVERTISE]
   --tls								use TLS; implied by --tlsverify=true
   --tlscacert 								trust only remotes providing a certificate signed by the CA given here
   --tlscert 								path to TLS certificate file
   --tlskey 								path to TLS key file
   --tlsverify								use TLS and verify the remote
   --engine-refresh-min-interval "30s"					set engine refresh minimum interval
   --engine-refresh-max-interval "60s"					set engine refresh maximum interval
   --engine-failure-retry "3"						set engine failure retry count
   --engine-refresh-retry "3"						deprecated; replaced by --engine-failure-retry
   --heartbeat "60s"							period between each heartbeat
   --api-enable-cors, --cors						enable CORS headers in the remote API
   --cluster-driver, -c "swarm"						cluster driver to use [swarm, mesos-experimental]
   --discovery-opt [--discovery-opt option --discovery-opt option]	discovery options
   --cluster-opt [--cluster-opt option --cluster-opt option]		cluster driver options
   									 * swarm.overcommit=0.05		overcommit to apply on resources
                                    					 * swarm.createretry=0			container create retry count after initial failure
                                    					 * mesos.address=			address to bind on [$SWARM_MESOS_ADDRESS]
                                    					 * mesos.checkpointfailover=false	checkpointing allows a restarted slave to reconnect with old executors and recover status updates, at the cost of disk I/O [$SWARM_MESOS_CHECKPOINT_FAILOVER]
                                    					 * mesos.port=				port to bind on [$SWARM_MESOS_PORT]
                                    					 * mesos.offertimeout=30s		timeout for offers [$SWARM_MESOS_OFFER_TIMEOUT]
                                    					 * mesos.offerrefusetimeout=5s		seconds to consider unused resources refused [$SWARM_MESOS_OFFER_REFUSE_TIMEOUT]
                                    					 * mesos.tasktimeout=5s			timeout for task creation [$SWARM_MESOS_TASK_TIMEOUT]
                                    					 * mesos.user=				framework user [$SWARM_MESOS_USER]
[root@07Node ~]# 
