
sahara cluster doesn't start when oozie is selected [closed]

asked 2015-01-21 20:28:21 -0600

belle

For some reason I can't start a cluster when Oozie is selected on the master node. The Sahara log shows the cluster getting this far, and then after a few minutes the operation times out.

2015-01-21 11:23:09.250 8834 INFO sahara.plugins.vanilla.v1_2_1.versionhandler [-] Hadoop services in cluster cluster3 have been started
2015-01-21 11:23:09.650 8834 DEBUG sahara.plugins.vanilla.v1_2_1.run_scripts [-] Starting mysql at cluster3-master3-001 mysql_start /openstack/sahara-master/lib/python2.7/site-packages/sahara/plugins/vanilla/v1_2_1/run_scripts.py:86
2015-01-21 11:23:09.651 8834 DEBUG sahara.utils.ssh_remote [-] [cluster3-master3-001] Executing "/opt/start-mysql.sh" _log_command /openstack/sahara-master/lib/python2.7/site-packages/sahara/utils/ssh_remote.py:622
2015-01-21 11:23:12.018 8834 DEBUG sahara.utils.ssh_remote [-] [cluster3-master3-001] _execute_command took 2.4 seconds to complete _log_command /openstack/sahara-master/lib/python2.7/site-packages/sahara/utils/ssh_remote.py:622
2015-01-21 11:23:12.018 8834 DEBUG sahara.plugins.vanilla.v1_2_1.run_scripts [-] Creating Oozie DB Schema... oozie_create_db /openstack/sahara-master/lib/python2.7/site-packages/sahara/plugins/vanilla/v1_2_1/run_scripts.py:91
2015-01-21 11:23:12.019 8834 DEBUG sahara.utils.ssh_remote [-] [cluster3-master3-001] Writing file "create_oozie_db.sql" _log_command /openstack/sahara-master/lib/python2.7/site-packages/sahara/utils/ssh_remote.py:622
2015-01-21 11:23:12.091 8834 DEBUG sahara.utils.ssh_remote [-] [cluster3-master3-001] _write_file_to took 0.1 seconds to complete _log_command /openstack/sahara-master/lib/python2.7/site-packages/sahara/utils/ssh_remote.py:622
2015-01-21 11:23:12.092 8834 DEBUG sahara.utils.ssh_remote [-] [cluster3-master3-001] Executing "mysql -u root < create_oozie_db.sql && rm create_oozie_db.sql" _log_command /openstack/sahara-master/lib/python2.7/site-packages/sahara/utils/ssh_remote.py:622
2015-01-21 11:23:12.104 8834 DEBUG sahara.utils.ssh_remote [-] [cluster3-master3-001] _execute_command took 0.0 seconds to complete _log_command /openstack/sahara-master/lib/python2.7/site-packages/sahara/utils/ssh_remote.py:622
2015-01-20 00:04:47.025 12251 DEBUG sahara.plugins.vanilla.v1_2_1.run_scripts [-] Sharing Oozie libs to hdfs://test1-master1-001:8020 oozie_share_lib /openstack/sahara-master/lib/python2.7/site-packages/sahara/plugins/vanilla/v1_2_1/run_scripts.py:52
2015-01-20 00:04:47.025 12251 DEBUG sahara.utils.ssh_remote [-] [test1-master1-001] Executing "sudo su - -c "mkdir /tmp/oozielib && tar zxf /opt/oozie/oozie-sharelib-4.0.0.tar.gz -C /tmp/oozielib && hadoop fs -put /tmp/oozielib/share share && rm -rf /tmp/oozielib" hadoop" _log_command /openstack/sahara-master/lib/python2.7/site-packages/sahara/utils/ssh_remote.py:622

I can start a cluster successfully without Oozie. I tested on both the Sahara Juno release and the latest Sahara master, and hit the same problem on both.

I was initially running this across multiple compute nodes, so I decided to test on a single machine with exactly the same OS, OpenStack, and Sahara versions. Surprisingly, it worked on the single machine. Is there anything I'm missing here?

Thanks for your help.


Closed for the following reason: the question is answered, right answer was accepted by belle
close date 2015-01-23 11:52:00.753522

Comments

Are there any other errors in the Sahara-related logs? Did you try executing the last command, 'sudo su - -c "mkdir /tmp/oozielib && tar zxf /opt/oozie/oozie-sharelib-4.0.0.tar.gz -C /tmp/oozielib && hadoop fs -put /tmp/oozielib/share share && rm -rf /tmp/oozielib" hadoop', manually?

9lives ( 2015-01-21 20:37:11 -0600 )
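
One way to check for other errors is to filter the log by level. The following is a minimal sketch, assuming the log uses the same "date time pid LEVEL" layout as the lines quoted in the question; the function name and the log path in the example are assumptions, not something stated in this thread.

```shell
#!/bin/sh
# Sketch: list ERROR/TRACE entries from a Sahara log, with line numbers.
# sahara_errors is a hypothetical helper; the log path below is an assumption.

sahara_errors() {
    # The level is the fourth whitespace-separated field: date, time, pid, LEVEL.
    grep -n -E '^[0-9-]+ [0-9:.]+ [0-9]+ (ERROR|TRACE) ' "$1"
}

# Example:
#   sahara_errors /var/log/sahara/sahara.log
```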

I got the timeout error: ERROR sahara.service.ops [-] Error during operating cluster 'cluster3' (reason: Operation timed out after 300 second(s). When I manually ran that command on the master, this is the error I got: "put: Target share/share is a directory."

belle ( 2015-01-22 10:26:14 -0600 )
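
The "Target share/share is a directory" message suggests that earlier attempts had already left a share directory in HDFS, so the manual re-run was probably not a clean reproduction of the step that timed out. A re-runnable version of the same step could look like the sketch below. It assumes it is run as the hadoop user on the master, that deleting the existing share directory is acceptable, and that the Hadoop 1.x shell syntax (-rmr) applies, as with the vanilla 1.2.1 plugin; reupload_sharelib is a hypothetical helper name.

```shell
#!/bin/sh
# Sketch: re-runnable version of the Oozie sharelib upload from the log above.
# Assumes it runs as the hadoop user; reupload_sharelib is a hypothetical helper.

reupload_sharelib() {
    tarball="$1"    # e.g. /opt/oozie/oozie-sharelib-4.0.0.tar.gz
    rm -rf /tmp/oozielib
    mkdir /tmp/oozielib || return 1
    tar zxf "$tarball" -C /tmp/oozielib || return 1
    # Remove a leftover "share" directory so -put neither nests nor fails.
    if hadoop fs -test -d share; then
        hadoop fs -rmr share || return 1
    fi
    hadoop fs -put /tmp/oozielib/share share || return 1
    rm -rf /tmp/oozielib
}

# Example:
#   reupload_sharelib /opt/oozie/oozie-sharelib-4.0.0.tar.gz
```

If the upload itself takes minutes, that points at the transfer speed between the nodes rather than at Oozie.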

1 answer


answered 2015-01-22 20:53:44 -0600

belle

This issue has been resolved.

I was working on another issue not related to Sahara: VMs were slow when uploading files to and downloading files from Swift. I found that the network segmentation offload feature had to be turned off. I had to turn it off on both the network node and the VM (which meant modifying the image so that it is turned off when the VM boots). After that, I was able to start a cluster with Oozie successfully.
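
For anyone hitting the same symptom, turning off segmentation offload is typically done with ethtool. The following is a minimal sketch and not necessarily the exact commands used here: the interface name eth0 is an assumption, the function names are hypothetical, and which offload features matter (TSO, GSO, GRO) depends on the NIC and driver.

```shell
#!/bin/sh
# Sketch: turn off segmentation offload on one interface.
# "eth0" in the example is an assumed interface name; check yours with `ip link`.

disable_offload() {
    dev="$1"
    # -K (uppercase) changes offload settings for the device.
    ethtool -K "$dev" tso off gso off gro off
}

show_offload() {
    # -k (lowercase) prints the current offload settings.
    ethtool -k "$1" | grep -E 'segmentation-offload|generic-receive-offload'
}

# Example (needs root), on the network node and inside the VM:
#   disable_offload eth0 && show_offload eth0
```

Settings changed with ethtool -K do not survive a reboot, which is why the image has to be modified so that the change is applied each time the VM boots.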

How can I mark this as solved? Should I just close it?


Comments

You can mark this answer as the right one for the original question by clicking the 'tick' sign on the left side of your answer, then close the question by choosing the appropriate reason from the drop-down list.

9lives ( 2015-01-22 21:12:31 -0600 )

How did you turn off TSO in the image?
I'm using diskimage-create.sh from https://github.com/openstack/sahara-image-elements
and I don't see anywhere to add the command that turns it off...

pawel0987 ( 2015-10-28 02:51:20 -0600 )

