Pacemaker can't start resources with a single controller online

I have 3 controllers in my environment, deployed by Mirantis Fuel. Today I noticed that when 1 of 3 controllers is down, Pacemaker works perfectly, but when 2 of 3 controllers are down, all pcs resources are stopped.

I can't find the reason why. Maybe I need to change some properties in my cluster? I would be grateful for any help.

# pcs status
Cluster name:
WARNING: corosync and pacemaker node names do not match (IPs used in setup?)
Last updated: Wed Jun 28 15:14:39 2017          Last change: Wed Jun 28 14:55:45 2017 by hacluster via crmd on
Stack: corosync
Current DC: (version 1.1.14-70404b0) - partition WITHOUT quorum
3 nodes and 50 resources configured

Online: [ ]

Full list of resources:

 Clone Set: clone_p_vrouter [p_vrouter]
     Stopped: [ ]
 vip__management        (ocf::fuel:ns_IPaddr2): Stopped
 vip__zbx_vip_mgmt      (ocf::fuel:ns_IPaddr2): Stopped
 vip__vrouter_pub       (ocf::fuel:ns_IPaddr2): Stopped
 vip__vrouter   (ocf::fuel:ns_IPaddr2): Stopped
 vip__public    (ocf::fuel:ns_IPaddr2): Stopped
 Clone Set: clone_p_haproxy [p_haproxy]
     Stopped: [ ]  (ocf::pacemaker:SysInfo):       Stopped
 Master/Slave Set: master_p_conntrackd [p_conntrackd]
     Stopped: [ ]  (ocf::pacemaker:SysInfo):       Stopped
 Master/Slave Set: master_p_rabbitmq-server [p_rabbitmq-server]
     Slaves: [ ]
 Clone Set: clone_p_mysqld [p_mysqld]
     Started: [ ]
 Clone Set: clone_p_dns [p_dns]
     Stopped: [ ]   (ocf::pacemaker:SysInfo):       Stopped
 p_aodh-evaluator       (ocf::fuel:aodh-evaluator):     Stopped
 p_ceilometer-agent-central     (ocf::fuel:ceilometer-agent-central):   Stopped
 Clone Set: clone_p_heat-engine [p_heat-engine]
     Stopped: [ ]
 Clone Set: clone_neutron-openvswitch-agent [neutron-openvswitch-agent]
     Stopped: [ ]
 Clone Set: clone_neutron-l3-agent [neutron-l3-agent]
     Stopped: [ ]
 Clone Set: clone_neutron-metadata-agent [neutron-metadata-agent]
     Stopped: [ ]
 Clone Set: clone_neutron-dhcp-agent [neutron-dhcp-agent]
     Stopped: [ ]
 Clone Set: clone_p_ntp [p_ntp]
     Stopped: [ ]
 p_zabbix-server        (ocf::fuel:zabbix-server):      Stopped
 Clone Set: clone_ping_vip__public [ping_vip__public]
     Stopped: [ ]

PCSD Status:
  *Unknown* ( Offline member ( Offline
  *Unknown* ( Offline

cluster properties:

 cluster-infrastructure: corosync
 cluster-recheck-interval: 190s
 dc-version: 1.1.14-70404b0
 have-watchdog: false
 last-lrm-refresh: 1498654072
 node-health-strategy: migrate-on-red
 start-failure-is-fatal: false
 stonith-enabled: false
 symmetric-cluster: false
UID/GID: uid=hacluster gid=haclient

pcs config:

1 answer

answered 2017-06-28 14:54:37 -0600

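I think this is expected behavior: your pcs status output shows "partition WITHOUT quorum", and with only 1 of 3 nodes online the cluster no longer has a majority, so Pacemaker stops the resources. If you want to change this default behavior, I would advise you to take a look at this. Remember that the auto_tie_breaker parameter is not recommended for production environments.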
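
To make the quorum arithmetic concrete, here is a minimal sketch in Python. It assumes one vote per node and a strict-majority rule, which is how a default votequorum setup behaves. The function names are hypothetical and only for illustration, and only the "stop" (Pacemaker's default) and "ignore" values of no-quorum-policy are modeled.

```python
def quorum_threshold(expected_votes: int) -> int:
    """Votes needed for a strict majority (assumes one vote per node)."""
    return expected_votes // 2 + 1


def is_quorate(online_votes: int, expected_votes: int) -> bool:
    """True if the surviving partition still holds a majority of votes."""
    return online_votes >= quorum_threshold(expected_votes)


def resources_allowed(online_votes: int, expected_votes: int,
                      no_quorum_policy: str = "stop") -> bool:
    """Simplified model of whether resources may run in this partition.

    Only "stop" and "ignore" are modeled; other policies behave
    differently and are deliberately rejected here.
    """
    if no_quorum_policy not in ("stop", "ignore"):
        raise ValueError("policy not modeled in this sketch")
    if is_quorate(online_votes, expected_votes):
        return True
    # Without quorum, "stop" halts resources; "ignore" keeps managing them.
    return no_quorum_policy == "ignore"


if __name__ == "__main__":
    # Walk through the 3-controller case from the question.
    for online in (3, 2, 1):
        print(online, "online ->",
              "quorate" if is_quorate(online, 3) else "no quorum",
              "| resources allowed:", resources_allowed(online, 3))
```

With 3 nodes the threshold is 2 votes, which matches what was observed: losing one controller keeps quorum, losing two does not. Note that running without quorum while stonith-enabled is false (as in the properties above) leaves nothing to prevent a split-brain if the other controllers are merely unreachable rather than down.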

Thank you Antonio, that's exactly what I needed!

Damian Dąbrowski ( 2017-06-28 15:07:44 -0600 )

