
Savanna with nova-network on Grizzly

asked 2013-08-06 10:36:02 -0500

sarita18narwal

I deployed a Hadoop cluster using the Savanna API. After launch, the cluster remains in the Waiting state and after some time goes into the Error state, while the nodes remain in the ACTIVE state.

Suppose I have 3 nodes (1 master and 2 slaves) in the cluster. When I launch the cluster using the respective cluster template, the scenario looks like this.

Cluster

Name      State    Instance Count
Sample    Error    3

Instances

Name        IP          State
1-master    10.0.0.X    ACTIVE
2-slave     10.0.0.X    ACTIVE
3-slave     10.0.0.X    ACTIVE

Log Description:

WARNING savanna.service.instances [-] Can't start cluster 'tstcluster' (reason: Unauthorized (HTTP 401))

ERROR root [-] Original exception being dropped:
Traceback (most recent call last):
  File "/usr/local/lib/python2.7/dist-packages/savanna/service/instances.py", line 38, in create_cluster
    _await_instances(cluster)
  File "/usr/local/lib/python2.7/dist-packages/savanna/service/instances.py", line 206, in _await_instances
    if not _check_if_up(instance):
  File "/usr/local/lib/python2.7/dist-packages/savanna/service/instances.py", line 215, in _check_if_up
    server = instance.nova_info
  File "/usr/local/lib/python2.7/dist-packages/savanna/db/models.py", line 226, in nova_info
    return nova.client().servers.get(self.instance_id)
  File "/usr/local/lib/python2.7/dist-packages/novaclient/v1_1/servers.py", line 350, in get
    return self._get("/servers/%s" % base.getid(server), "server")
  File "/usr/local/lib/python2.7/dist-packages/novaclient/base.py", line 140, in _get
    _resp, body = self.api.client.get(url)
  File "/usr/local/lib/python2.7/dist-packages/novaclient/client.py", line 230, in get
    return self._cs_request(url, 'GET', **kwargs)
  File "/usr/local/lib/python2.7/dist-packages/novaclient/client.py", line 227, in _cs_request
    raise e
Unauthorized: Unauthorized (HTTP 401)

Traceback (most recent call last):
  File "/usr/lib/python2.7/dist-packages/eventlet/hubs/poll.py", line 97, in wait
    readers.get(fileno, noop).cb(fileno)
  File "/usr/lib/python2.7/dist-packages/eventlet/greenthread.py", line 194, in main
    result = function(*args, **kwargs)
  File "/usr/local/lib/python2.7/dist-packages/savanna/context.py", line 127, in wrapper
    func(*args, **kwargs)
  File "/usr/local/lib/python2.7/dist-packages/savanna/service/api.py", line 111, in _provision_cluster
    i.create_cluster(cluster)
  File "/usr/local/lib/python2.7/dist-packages/savanna/service/instances.py", line 51, in create_cluster
    _rollback_cluster_creation(cluster, ex)
  File "/usr/local/lib/python2.7/dist-packages/savanna/service/instances.py", line 274, in _rollback_cluster_creation
    _shutdown_instances(cluster, True)
  File "/usr/local/lib/python2.7/dist-packages/savanna/service/instances.py", line 303, in _shutdown_instances
    _shutdown_instance(instance)
  File "/usr/local/lib/python2.7/dist-packages/savanna/service/instances.py", line 309, in _shutdown_instance
    nova.client().servers.delete(instance.instance_id)
  File "/usr/local/lib/python2.7/dist-packages/novaclient/v1_1/servers.py", line 630, in delete
    self._delete("/servers/%s" % base.getid(server))
  File "/usr/local/lib/python2.7/dist-packages/novaclient/base.py", line 154, in _delete
    _resp, _body = self.api.client.delete(url)
  File "/usr/local/lib/python2.7/dist-packages/novaclient/client.py ...


29 answers


answered 2013-08-16 06:25:52 -0500

sarita18narwal

Thanks Alexander Rubtsov, that solved my question.


answered 2013-08-08 12:06:48 -0500

Sarita,

Please attach the full Savanna log. To create it, launch savanna-api with the flag "--log-file <path>". Also, let's check the state of the instances: attach the output of the command "nova console-log <instance_id>" taken after the cluster goes into the Error state.


answered 2013-08-13 08:56:05 -0500

sarita18narwal

I decreased the value of expiration from 86400 to 864 in /etc/keystone/keystone.conf, then ran the database synchronization and restarted all the nova services and the apache2 server.
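For reference, the expiration value is expressed in seconds, so 86400 corresponds to 24 hours and 864 to a little under 15 minutes. A minimal sketch for reading the value and converting it, assuming the option lives under a `[token]` section of keystone.conf (the sample text and the helper name `token_lifetime` are illustrative and not taken from this deployment):

```python
import configparser
from datetime import timedelta

# Illustrative sample only; a real file would be read from
# /etc/keystone/keystone.conf instead.
SAMPLE_KEYSTONE_CONF = """
[token]
expiration = 864
"""


def token_lifetime(conf_text, default_seconds=86400):
    """Return the configured token lifetime as a timedelta.

    Falls back to default_seconds when the section or option is missing.
    """
    # interpolation=None avoids errors on values that contain '%'.
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(conf_text)
    seconds = parser.getint("token", "expiration", fallback=default_seconds)
    return timedelta(seconds=seconds)


if __name__ == "__main__":
    print(token_lifetime(SAMPLE_KEYSTONE_CONF))  # 0:14:24
    print(token_lifetime("[token]\n"))           # 1 day, 0:00:00
```

A shorter lifetime means tokens expire sooner, which is worth keeping in mind when the symptom is an HTTP 401 during a long-running cluster launch.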

None of this helped. My cluster has been in the Waiting state for the last 4 hours. The nova console-log for an active instance (after the cluster goes into the Error state) is as follows:

Instance1:

...
[ 0.407413] pnp: PnP ACPI: found 8 devices
[ 0.408832] ACPI: ACPI bus type pnp unregistered
[ 0.446954] NET: Registered protocol family 2
[ 0.448505] IP route cache hash table entries: 4096 (order: 3, 32768 bytes)
[ 0.450843] TCP established hash table entries: 16384 (order: 6, 262144 bytes)
[ 0.453757] TCP bind hash table entries: 16384 (order: 6, 262144 bytes)
[ 0.456131] TCP: Hash tables configured (established 16384 bind 16384)
[ 0.459077] TCP: reno registered
[ 0.460269] UDP hash table entries: 256 (order: 1, 8192 bytes)
[ 0.462055] UDP-Lite hash table entries: 256 (order: 1, 8192 bytes)
[ 0.464050] NET: Registered protocol family 1
[ 0.465487] pci 0000:00:00.0: Limiting direct PCI/PCI transfers
[ 0.467332] pci 0000:00:01.0: PIIX3: Enabling Passive Release
[ 0.469206] pci 0000:00:01.0: Activating ISA DMA hang workarounds
[ 0.471212] ACPI: PCI Interrupt Link [LNKD] enabled at IRQ 11
[ 0.473788] audit: initializing netlink socket (disabled)
[ 0.475512] type=2000 audit(1376301101.472:1): initialized
[ 0.477389] Trying to unpack rootfs image as initramfs...
[ 0.529422] HugeTLB registered 2 MB page size, pre-allocated 0 pages
[ 0.533234] VFS: Disk quotas dquot_6.5.2
[ 0.534622] Dquot-cache hash table entries: 512 (order 0, 4096 bytes)
[ 0.544243] fuse init (API version 7.19)
[ 0.545667] msgmni has been set to 939
[ 0.560223] Block layer SCSI generic (bsg) driver version 0.4 loaded (major 252)
[ 0.562686] io scheduler noop registered
[ 0.563987] io scheduler deadline registered (default)
[ 0.572293] io scheduler cfq registered
[ 0.573729] pci_hotplug: PCI Hot Plug PCI Core version: 0.5
[ 0.575454] pciehp: PCI Express Hot Plug Controller Driver version: 0.4
[ 0.577614] input: Power Button as /devices/LNXSYSTM:00/LNXPWRBN:00/input/input0
[ 0.580044] ACPI: Power Button [PWRF]
[ 0.583208] GHES: HEST is not enabled!
[ 0.592651] ACPI: PCI Interrupt Link [LNKA] enabled at IRQ 10
[ 0.600195] Serial: 8250/16550 driver, 32 ports, IRQ sharing enabled
[ 0.624673] serial8250: ttyS0 at I/O 0x3f8 (irq = 4) is a 16550A
[ 0.674575] serial8250: ttyS1 at I/O 0x2f8 (irq = 3) is a 16550A
[ 0.734591] 00:05: ttyS0 at I/O 0x3f8 (irq = 4) is a 16550A
[ 0.779096] 00:06: ttyS1 at I/O 0x2f8 (irq = 3) is a 16550A
[ 0.804373] Linux agpgart interface v0.103
[ 0.813653] brd: module loaded
[ 0.815867] loop: module loaded
[ 0.884614] vda: vda1
[ 0.912109] scsi0 : ata_piix
[ 0.913322] scsi1 : ata_piix
[ 0.914465] ata1: PATA max MWDMA2 cmd 0x1f0 ctl 0x3f6 bmdma 0xc180 irq 14
[ 0.916462] ata2: PATA max MWDMA2 cmd 0x170 ctl 0x376 bmdma 0xc188 irq 15
...


answered 2013-08-16 03:35:50 -0500

sarita18narwal

Yes, the cluster created through REST was in the Waiting state.

I had successfully created one instance before launching the cluster. When I created my new cluster through REST with 1 master and 1 worker node, both nodes launched successfully as instances with internal IPs, and the cluster went into the Waiting state. I then launched another instance successfully.

Thus I have 4 successfully launched instances in the ACTIVE state:

1) one created before the cluster
2 & 3) the cluster's master node and slave node
4) one created after the cluster went into the Waiting state

I created instances before and after the cluster just to check whether nova is able to contact keystone.

The nova console-log of the cluster's master node is as follows:

...
[ 0.526863] vgaarb: device added: PCI:0000:00:02.0,decodes=io+mem,owns=io+mem,locks=none
[ 0.528036] vgaarb: loaded
[ 0.529064] vgaarb: bridge control possible 0000:00:02.0
[ 0.532400] SCSI subsystem initialized
[ 0.534088] ACPI: bus type usb registered
[ 0.535480] usbcore: registered new interface driver usbfs
[ 0.536051] usbcore: registered new interface driver hub
[ 0.537928] usbcore: registered new device driver usb
[ 0.540386] PCI: Using ACPI for IRQ routing
[ 0.542470] NetLabel: Initializing
[ 0.544043] NetLabel: domain hash size = 128
[ 0.545438] NetLabel: protocols = UNLABELED CIPSOv4
[ 0.547009] NetLabel: unlabeled traffic allowed by default
[ 0.548201] HPET: 3 timers in total, 0 timers will be used for per-cpu timer
[ 0.550288] hpet0: at MMIO 0xfed00000, IRQs 2, 8, 0
[ 0.553013] hpet0: 3 comparators, 64-bit 100.000000 MHz counter
[ 0.560122] Switching to clocksource kvm-clock
[ 0.579695] AppArmor: AppArmor Filesystem Enabled
[ 0.970489] pnp: PnP ACPI init
[ 0.971648] ACPI: bus type pnp registered
[ 0.974063] pnp: PnP ACPI: found 8 devices
[ 0.975446] ACPI: ACPI bus type pnp unregistered
[ 0.983793] NET: Registered protocol family 2
[ 0.985417] IP route cache hash table entries: 65536 (order: 7, 524288 bytes)
[ 0.989392] TCP established hash table entries: 262144 (order: 10, 4194304 bytes)
[ 0.998635] TCP bind hash table entries: 65536 (order: 8, 1048576 bytes)
[ 1.003642] TCP: Hash tables configured (established 262144 bind 65536)
[ 1.005719] TCP: reno registered
[ 1.006884] UDP hash table entries: 1024 (order: 3, 32768 bytes)
[ 1.008784] UDP-Lite hash table entries: 1024 (order: 3, 32768 bytes)
[ 1.010837] NET: Registered protocol family 1
[ 1.012308] pci 0000:00:00.0: Limiting direct PCI/PCI transfers
[ 1.014125] pci 0000:00:01.0: PIIX3: Enabling Passive Release
[ 1.015896] pci 0000:00:01.0: Activating ISA DMA hang workarounds
[ 1.017977] ACPI: PCI Interrupt Link [LNKD] enabled at IRQ 11
[ 1.033502] audit: initializing netlink socket (disabled)
[ 1.035208] type=2000 audit(1376481113.032:1): initialized
[ 1.037043] Trying to unpack rootfs image as initramfs...
[ 1.320896] HugeTLB registered 2 MB page size, pre-allocated 0 pages
[ 1.328782] VFS: Disk quotas dquot_6.5.2
[ 1.330348] Dquot-cache hash table entries: 512 (order 0, 4096 bytes)
[ 1.338159] fuse init (API version 7.19)
[ 1.339641] msgmni has been set to 3963
[ 1 ...


answered 2013-08-16 09:32:00 -0500

You should enable floating IP assignment in your OpenStack environment ("auto_assign_floating_ip=True" in nova.conf). Please make sure that the Savanna host can reach the instances through their floating IPs.
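One way to verify reachability from the Savanna host is a plain TCP connection attempt. The sketch below is only an illustration: the helper name `can_reach` is made up for this example, and probing port 22 assumes the instances accept SSH connections there.

```python
import socket


def can_reach(host, port=22, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        # create_connection resolves the host and performs the TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unreachable networks.
        return False
```

Run on the Savanna host, `can_reach("<floating_ip>")` returning False for an ACTIVE instance would suggest a routing or security group problem rather than a Savanna one.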


answered 2013-08-16 10:24:09 -0500

sarita18narwal

Alexander Rubtsov,

Thanks a lot. Thank you for patiently listening to me and solving my problem.

My cluster is now in the ACTIVE state and I am able to access my instance using the web UI. :) :) :) :)


answered 2013-08-16 03:38:58 -0500

sarita18narwal

The Savanna log for the cluster created using REST is the one for cluster-1.

Please ignore the previous log for "cluster-tt".

Sorry for pasting some irrelevant information in the log file description.


answered 2013-08-16 05:53:56 -0500

Sarita,

Since you can access the instances from the Savanna host through their internal IPs, change this setting in savanna.conf: use_floating_ips=False
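The setting can be double-checked before restarting Savanna. A minimal sketch, assuming the option sits in the `[DEFAULT]` section of savanna.conf and that floating IPs are used when the option is absent (both are assumptions here, as are the sample text and the helper name `management_ip_kind`):

```python
import configparser

# Illustrative sample only; a real check would read the actual savanna.conf.
SAMPLE_SAVANNA_CONF = """
[DEFAULT]
use_floating_ips = False
"""


def management_ip_kind(conf_text):
    """Report which kind of instance IP the configuration selects."""
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(conf_text)
    # Assumed default: floating IPs when the option is not set.
    use_floating = parser.getboolean(
        "DEFAULT", "use_floating_ips", fallback=True
    )
    return "floating" if use_floating else "internal"


if __name__ == "__main__":
    print(management_ip_kind(SAMPLE_SAVANNA_CONF))  # internal
```

With the value set to False, the Savanna host needs a route to the fixed (internal) network, which matches the situation described above.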


answered 2013-08-13 12:13:33 -0500

sarita18narwal

I am using an image downloaded from http://savanna-files.mirantis.com/savanna-0.1.2-hadoop.qcow2.

Yes, I am able to connect via ssh from the host on which Savanna is installed to an instance by its fixed IP address. But it asks for a password, and I do not know the root password. Pinging this IP gets a response.

I also restarted the keystone service after changing the expiration setting, but my cluster is still in the Waiting state. I have now switched from Grizzly to Folsom with nova-network but face the same problem, i.e. "Cluster in waiting state".

I do not know how to attach a document here, so I am pasting the Savanna log:

[....................................
2013-08-13 17:29:17.852 3487 INFO requests.packages.urllib3.connectionpool [-] Starting new HTTP connection (1): 10.208.36.50
2013-08-13 17:29:17.902 3487 DEBUG requests.packages.urllib3.connectionpool [-] "GET /v2/fd6e0af3983444bbaa41124740f373d9/os-networks HTTP/1.1" 200 655 _make_request /usr/local/lib/python2.7/dist-packages/requests/packages/urllib3/connectionpool.py:296
2013-08-13 17:29:18.824 3487 DEBUG keystoneclient.middleware.auth_token [-] Authenticating user token __call__ /usr/local/lib/python2.7/dist-packages/keystoneclient/middleware/auth_token.py:448
2013-08-13 17:29:18.824 3487 DEBUG keystoneclient.middleware.auth_token [-] Removing headers from request environment: X-Identity-Status,X-Domain-Id,X-Domain-Name,X-Project-Id,X-Project-Name,X-Project-Domain-Id,X-Project-Domain-Name,X-User-Id,X-User-Name,X-User-Domain-Id,X-User-Domain-Name,X-Roles,X-Service-Catalog,X-User,X-Tenant-Id,X-Tenant-Name,X-Tenant,X-Role _remove_auth_headers /usr/local/lib/python2.7/dist-packages/keystoneclient/middleware/auth_token.py:506
2013-08-13 17:29:18.825 3487 DEBUG keystoneclient.middleware.auth_token [-] Returning cached token f624dfb4302b456181ea0d6cbe29a013 _cache_get /usr/local/lib/python2.7/dist-packages/keystoneclient/middleware/auth_token.py:893
2013-08-13 17:29:18.826 3487 DEBUG savanna.utils.api [-] Rest.route.decorator.handler, kwargs={'tenant_id': u'fd6e0af3983444bbaa41124740f373d9', 'cluster_id': u'e57a136f-5dbd-4248-be31-bcea7bdd692e'} handler /usr/local/lib/python2.7/dist-packages/savanna/utils/api.py:60
2013-08-13 17:29:18.905 3487 INFO requests.packages.urllib3.connectionpool [-] Starting new HTTP connection (1): 10.208.36.50
2013-08-13 17:29:18.999 3487 DEBUG requests.packages.urllib3.connectionpool [-] "GET /v2/fd6e0af3983444bbaa41124740f373d9/servers/7842bd41-259e-4d30-9458-9d752fa56b71 HTTP/1.1" 200 1477 _make_request /usr/local/lib/python2.7/dist-packages/requests/packages/urllib3/connectionpool.py:296
2013-08-13 17:29:19.001 3487 INFO requests.packages.urllib3.connectionpool [-] Starting new HTTP connection (1): 10.208.36.50
2013-08-13 17:29:19.052 3487 DEBUG requests.packages.urllib3.connectionpool [-] "GET /v2/fd6e0af3983444bbaa41124740f373d9/os-networks HTTP/1.1" 200 655 _make_request /usr/local/lib/python2.7/dist-packages/requests/packages/urllib3/connectionpool.py:296
........................................]


answered 2013-08-13 12:40:14 -0500

What version of Savanna do you use? Your image is intended for an old version (0.1).

There is an image for Savanna 0.2.x: http://savanna-files.mirantis.com/savanna-0.2-vanilla-1.1.2-ubuntu-12.10.qcow2 (Note that when registering this image you should specify username=ubuntu.)


Stats

Asked: 2013-08-06 10:36:02 -0500

Seen: 251 times

Last updated: Nov 04 '13