Sahara SSHException: Error reading SSH protocol banner

Dear all,

After installing RDO Icehouse successfully, I integrated Sahara. I ran into several issues along the way and resolved them using the debug output in the log file. However, I was not able to provision a Hadoop Vanilla cluster using Sahara and kept getting the same error message. The cluster initially used 1 master and 3 slaves. I then tried 1 master with 2 slaves, and then with 1 slave. The only combination that brought the cluster to the Starting phase in the Dashboard was 1 master and 1 slave, with 2 GB RAM and 2 vCPUs per instance. The host has 14 GB RAM and 8 CPUs. I ended up with the following debug error trace:

2015-10-06 08:37:38.043 3236 ERROR sahara.service.api [-] Can't start services for cluster 'Hadoop-Cluster' (reason: SSHException: Error reading SSH protocol banner)
2015-10-06 08:37:38.043 3236 TRACE sahara.service.api Traceback (most recent call last):
2015-10-06 08:37:38.043 3236 TRACE sahara.service.api   File "/usr/lib/python2.6/site-packages/sahara/service/api.py", line 220, in _provision_cluster
2015-10-06 08:37:38.043 3236 TRACE sahara.service.api     plugin.start_cluster(cluster)
2015-10-06 08:37:38.043 3236 TRACE sahara.service.api   File "/usr/lib/python2.6/site-packages/sahara/plugins/vanilla/plugin.py", line 60, in start_cluster
2015-10-06 08:37:38.043 3236 TRACE sahara.service.api     cluster.hadoop_version).start_cluster(cluster)
2015-10-06 08:37:38.043 3236 TRACE sahara.service.api   File "/usr/lib/python2.6/site-packages/sahara/plugins/vanilla/v1_2_1/versionhandler.py", line 109, in start_cluster
2015-10-06 08:37:38.043 3236 TRACE sahara.service.api     with remote.get_remote(nn_instance) as r:
2015-10-06 08:37:38.043 3236 TRACE sahara.service.api   File "/usr/lib/python2.6/site-packages/sahara/utils/ssh_remote.py", line 288, in __enter__
2015-10-06 08:37:38.043 3236 TRACE sahara.service.api     _release_remote_semaphore()
2015-10-06 08:37:38.043 3236 TRACE sahara.service.api   File "/usr/lib/python2.6/site-packages/sahara/openstack/common/excutils.py", line 68, in __exit__
2015-10-06 08:37:38.043 3236 TRACE sahara.service.api     six.reraise(self.type_, self.value, self.tb)
2015-10-06 08:37:38.043 3236 TRACE sahara.service.api   File "/usr/lib/python2.6/site-packages/sahara/utils/ssh_remote.py", line 284, in __enter__
2015-10-06 08:37:38.043 3236 TRACE sahara.service.api     self.bulk = BulkInstanceInteropHelper(self.instance)
2015-10-06 08:37:38.043 3236 TRACE sahara.service.api   File "/usr/lib/python2.6/site-packages/sahara/utils/ssh_remote.py", line 419, in __init__
2015-10-06 08:37:38.043 3236 TRACE sahara.service.api     procutils.shutdown_subprocess(self.proc, _cleanup)
2015-10-06 08:37:38.043 3236 TRACE sahara.service.api   File "/usr/lib/python2.6/site-packages/sahara/openstack/common/excutils.py", line 68, in __exit__
2015-10-06 08:37:38.043 3236 TRACE sahara.service.api     six.reraise(self.type_, self.value, self.tb)
2015-10-06 08:37:38.043 3236 TRACE sahara.service.api   File "/usr/lib/python2.6/site-packages/sahara/utils/ssh_remote.py", line 416, in __init__
2015-10-06 08:37:38.043 3236 TRACE sahara.service.api     self._get_conn_params())
2015-10-06 08:37:38.043 3236 TRACE sahara.service.api   File "/usr/lib/python2.6/site-packages/sahara/utils/procutils.py", line 52, in run_in_subprocess
2015-10-06 08:37:38.043 3236 TRACE sahara.service.api     raise SubprocessException(result['exception'])
2015-10-06 08:37:38.043 3236 TRACE sahara.service.api SubprocessException: SSHException: Error reading SSH protocol banner
2015-10-06 08:37:38.043 3236 TRACE sahara.service.api 
2015-10-06 08:37:38.144 3236 INFO sahara.service.api [-] Cluster status has been changed: id=c309cd38-dc40-4d83-87a7-8c0178b1648c, New status=Error

The security group allows SSH, and I was able to SSH into each instance from the command line. Sahara, however, was not able to start the Hadoop services on the instances.

Could this be a network latency or resource contention issue?
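One way to test the latency hypothesis: as far as I understand, this exception is raised by the SSH client library when the server's identification line (the "SSH-2.0-..." banner) does not arrive within the library's banner timeout. The sketch below is not part of Sahara; it is a plain-socket probe, written for Python 3, that measures how long an instance takes to send that line. The function name `read_ssh_banner` and the default timeout are my own assumptions for illustration.

```python
import socket
import time


def read_ssh_banner(host, port=22, timeout=15.0):
    """Return (banner, seconds_elapsed) for an SSH server's identification line.

    Hypothetical helper for diagnosis only. The timeout applies to the
    connect and to each individual recv, not to the whole exchange.
    Raises socket.timeout if the server stays silent, or ConnectionError
    if it closes the connection before sending a banner.
    """
    start = time.monotonic()
    sock = socket.create_connection((host, port), timeout=timeout)
    try:
        buf = b""
        while True:
            # A server may send other text lines before the identification
            # line, so skip anything that does not start with "SSH-".
            while b"\n" in buf:
                line, buf = buf.split(b"\n", 1)
                if line.startswith(b"SSH-"):
                    banner = line.decode("ascii", "replace").strip()
                    return banner, time.monotonic() - start
            chunk = sock.recv(256)
            if not chunk:
                raise ConnectionError("connection closed before an SSH banner arrived")
            buf += chunk
    finally:
        sock.close()


# Example call (replace the address with an instance's management IP):
#   banner, seconds = read_ssh_banner("<instance-ip>")
#   print(banner, round(seconds, 2))
```

Running this from the Sahara host against each instance while the cluster is being provisioned should show whether the banner arrives slowly or not at all under load. A manual SSH login that succeeds later, once the instance is idle, would not rule that out.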

Thanks in advance!
