rlrevell's profile - activity

2015-08-24 18:53:11 -0500 received badge  Famous Question (source)
2015-08-24 18:53:11 -0500 received badge  Notable Question (source)
2015-08-24 18:53:11 -0500 received badge  Popular Question (source)
2015-06-26 06:55:03 -0500 received badge  Popular Question (source)
2015-06-26 06:55:03 -0500 received badge  Notable Question (source)
2015-05-19 09:05:17 -0500 received badge  Notable Question (source)
2015-05-19 09:05:17 -0500 received badge  Famous Question (source)
2015-04-14 14:42:32 -0500 received badge  Self-Learner (source)
2015-04-14 14:21:38 -0500 answered a question devstack install fails with permission denied errors

It worked once I patched the script to actually run all commands requiring root access with sudo:

http://pastebin.com/Zv7w9mvW

2015-04-14 14:20:26 -0500 received badge  Popular Question (source)
2015-04-14 11:10:29 -0500 commented answer devstack install fails with permission denied errors

Pastebin of the changes needed to make it work:

http://pastebin.com/Zv7w9mvW

2015-04-14 11:06:29 -0500 commented answer devstack install fails with permission denied errors

The stack user does have sudo privileges: earlier in the install, ./stack.sh installed a bunch of packages and did other root-level things. I was able to make it work by prepending sudo to about 50 lines in lib/* and functions-common. I don't see how this script could ever work otherwise.

2015-04-14 10:17:46 -0500 asked a question devstack install fails with permission denied errors

Doing everything as user stack, as the documentation recommends, ./stack.sh on Ubuntu 14.04 fails with the errors below. Passwordless sudo works, and the earlier parts of the install that require root access succeed; at this point it seems to be trying to do root-level things as the stack user.

+ configure_keystone
+ sudo install -d -o stack /etc/keystone
+ [[ /etc/keystone != \/\o\p\t\/\s\t\a\c\k\/\k\e\y\s\t\o\n\e\/\e\t\c ]]
+ install -m 600 /opt/stack/keystone/etc/keystone.conf.sample /etc/keystone/keystone.conf
+ cp -p /opt/stack/keystone/etc/policy.json /etc/keystone
cp: cannot create regular file '/etc/keystone/policy.json': Permission denied
+ exit_trap
+ local r=1
++ jobs -p
+ jobs=
+ [[ -n '' ]]
+ kill_spinner
+ '[' '!' -z '' ']'
+ [[ 1 -ne 0 ]]
+ echo 'Error on exit'
Error on exit
+ [[ -z /opt/stack/logs ]]
+ /opt/stack/devstack/tools/worlddump.py -d /opt/stack/logs
World dumping... see /opt/stack/logs/worlddump-2015-04-14-151400.txt for details
df: '/run/user/1000/gvfs': Permission denied
+ exit 1
2015-04-08 10:43:39 -0500 received badge  Famous Question (source)
2015-04-08 10:07:32 -0500 answered a question Device $UUID not defined in plugin, Unexpected vif_type=binding_failed, cannot launch instances

Update: A teardown and reinstall solved this problem.

My working theory is that the issue had something to do with my initially forgetting to install neutron-plugin-ml2 on the controller node before populating the neutron database, as the doc states that "Database population occurs later for Networking because the script requires complete server and plug-in configuration files". I suspect this led to a database inconsistency that dropping and recreating the neutron database did not fix.

2015-04-08 09:57:44 -0500 commented answer Device $UUID not defined in plugin, Unexpected vif_type=binding_failed, cannot launch instances

If no one has any more ideas, time to blow away the cluster and start over.

2015-04-07 15:24:52 -0500 commented answer Device $UUID not defined in plugin, Unexpected vif_type=binding_failed, cannot launch instances

No change after adding the rootwrap line. Yes, Juno. Same problem with vxlan: tag 4095, device not defined in plugin, doesn't work.

2015-04-07 14:35:05 -0500 commented answer Device $UUID not defined in plugin, Unexpected vif_type=binding_failed, cannot launch instances

I captured logs in debug mode, and it appears agent requests tag 4095 from OVS: http://pastebin.com/Zt2bjk1n line 35

Running command: ['sudo', '/usr/bin/neutron-rootwrap', '/etc/neutron/rootwrap.conf', 'ovs-vsctl', '--timeout=10', 'set', 'Port', 'qr-6ef82c25-bf', 'tag=4095']
2015-04-07 13:43:41 -0500 commented answer Device $UUID not defined in plugin, Unexpected vif_type=binding_failed, cannot launch instances

Yes, 'neutron router-interface-add demo-router demo-subnet' is the command that triggers that message on the network node.

2015-04-07 13:20:43 -0500 commented answer Device $UUID not defined in plugin, Unexpected vif_type=binding_failed, cannot launch instances

Yes, neutron-openvswitch-agent is always up and running and controller node sees it when I create the network stuff.

2015-04-07 12:13:26 -0500 commented answer Device $UUID not defined in plugin, Unexpected vif_type=binding_failed, cannot launch instances

(UUID different because I've deleted and recreated networks) http://pastebin.com/3FEwpa7P

2015-04-07 12:13:26 -0500 received badge  Commentator
2015-04-07 11:56:47 -0500 commented answer Device $UUID not defined in plugin, Unexpected vif_type=binding_failed, cannot launch instances

I have seen that report, but it doesn't explain why the binding fails in the first place, and why deleting and recreating everything doesn't help. But yes, the symptoms appear to be identical.

2015-04-07 11:53:48 -0500 received badge  Notable Question (source)
2015-04-07 10:57:21 -0500 commented answer Device $UUID not defined in plugin, Unexpected vif_type=binding_failed, cannot launch instances

The virtualized environment where everything works is also Ubuntu 14.04... I'll see if I can find any difference in the OVS state.

So the agent tells the openvswitch daemon to create a port and openvswitch gives it a nonworking one? What is the best way to trace what exactly is happening?

2015-04-07 10:30:10 -0500 commented answer Device $UUID not defined in plugin, Unexpected vif_type=binding_failed, cannot launch instances

Yes, I get tag 4095 every time. I can share the script that tears down and recreates all the networks, subnets, and the router; it just does the steps in the install guide.

2015-04-07 10:02:03 -0500 commented answer Device $UUID not defined in plugin, Unexpected vif_type=binding_failed, cannot launch instances

I did find one configuration bug on the compute node (wrong controller hostname in neutron.conf), but fixing it did not affect this problem. Current output of the above commands:

output

2015-04-07 10:00:24 -0500 received badge  Popular Question (source)
2015-04-07 08:47:12 -0500 commented answer Device $UUID not defined in plugin, Unexpected vif_type=binding_failed, cannot launch instances

I've recreated everything more than once - all networks, router, even uninstalled and reinstalled openvswitch (on the network node that is). No change. Could there be some state persisting that isn't apparent (like in database or on another node)?

2015-04-06 16:09:45 -0500 asked a question Device $UUID not defined in plugin, Unexpected vif_type=binding_failed, cannot launch instances

I have tried every suggestion I can find to solve this problem: destroy and rebuild the network node, drop and recreate the neutron database, delete and recreate networks, delete and recreate OVS bridges, and it still persists.

I think the problem begins when I create the router:

# neutron router-interface-add demo-router demo-subnet
Added interface ad20d4ed-aa62-4223-9ed5-4d1a7def07dd to router demo-router.

This causes the following on the network node:

2015-04-06 17:00:46.742 2779 WARNING neutron.plugins.openvswitch.agent.ovs_neutron_agent [req-0bf50bd6-1c75-4d5c-9bd8-1bfc87977248 None] Device ad20d4ed-aa62-4223-9ed5-4d1a7def07dd not defined on plugin

# neutron router-gateway-set demo-router ext-net
Set gateway for router demo-router

Causes:

2015-04-06 17:00:46.742 2779 WARNING neutron.plugins.openvswitch.agent.ovs_neutron_agent [req-0bf50bd6-1c75-4d5c-9bd8-1bfc87977248 None] Device ad20d4ed-aa62-4223-9ed5-4d1a7def07dd not defined on plugin

Launching an instance does not work; the network node logs the "not defined on plugin" error, the instance creation fails with "no valid host was found", and in the logs of the compute node I see the dreaded "Unexpected vif_type=binding_failed" message.

vif_plugging_is_fatal=false
vif_plugging_timeout=0

These settings do not help; all instance creations still fail with "No valid host was found".

I have this all working perfectly in a virtualized environment and I have repeatedly checked the configs of controller node, network node, and compute node and they are identical to the working environment except for different IPs and hostnames.

Edit: added requested info

# egrep -v ^#\|^$ /etc/neutron/plugins/ml2/ml2_conf.ini 
[ml2]
type_drivers = flat,gre
tenant_network_types = gre
mechanism_drivers = openvswitch
[ml2_type_flat]
flat_networks = external
[ml2_type_vlan]
[ml2_type_gre]
tunnel_id_ranges = 1:1000
[ml2_type_vxlan]
[securitygroup]
enable_security_group = True
enable_ipset = True
firewall_driver = neutron.agent.linux.iptables_firewall.OVSHybridIptablesFirewallDriver
[ovs]
local_ip = 10.0.0.53
enable_tunneling = True
bridge_mappings = external:br-ex
[agent]
tunnel_types = gre

# ovs-vsctl show
d3798e19-b7d2-4b97-8532-7d1a8ba40389
    Bridge br-ex
        Port br-ex
            Interface br-ex
                type: internal
        Port phy-br-ex
            Interface phy-br-ex
                type: patch
                options: {peer=int-br-ex}
        Port "em3"
            Interface "em3"
        Port "qg-0826cdac-08"
            Interface "qg-0826cdac-08"
                type: internal
    Bridge br-tun
        Port br-tun
            Interface br-tun
                type: internal
        Port "gre-0a000033"
            Interface "gre-0a000033"
                type: gre
                options: {df_default="true", in_key=flow, local_ip="10.0.0.53", out_key=flow, remote_ip="10.0.0.51"}
        Port patch-int
            Interface patch-int
                type: patch
                options: {peer=patch-tun}
    Bridge br-int
        fail_mode: secure
        Port patch-tun
            Interface patch-tun
                type: patch
                options: {peer=patch-int}
        Port br-int
            Interface br-int
                type: internal
        Port int-br-ex
            Interface int-br-ex
                type: patch
                options: {peer=phy-br-ex}
        Port "tap28899b2f-6d"
            tag: 4095
            Interface "tap28899b2f-6d"
                type: internal
        Port "qr-ad20d4ed-aa"
            tag: 4095
            Interface "qr-ad20d4ed-aa"
                type: internal
    ovs_version: "2.0.2"
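
The ports carrying tag 4095 are the ones this thread keeps coming back to. A minimal sketch for picking them out of saved `ovs-vsctl show` output follows. It assumes the text layout shown above; the function name `ports_with_tag` and the sample string are hypothetical and not part of any OpenStack or Open vSwitch tool.

```python
import re

def ports_with_tag(show_output, tag=4095):
    """Return names of ports whose 'tag:' line in `ovs-vsctl show` output equals tag."""
    matches = []
    current_port = None
    for line in show_output.splitlines():
        stripped = line.strip()
        if stripped.startswith("Bridge "):
            current_port = None  # a new bridge starts; forget the previous port
            continue
        port = re.match(r'Port\s+"?([^"\s]+)"?$', stripped)
        if port:
            current_port = port.group(1)
            continue
        tagged = re.match(r'tag:\s*(\d+)$', stripped)
        if tagged and current_port and int(tagged.group(1)) == tag:
            matches.append(current_port)
    return matches

# Sample trimmed from the br-int section above (assumed layout).
SAMPLE = '''
    Bridge br-int
        Port br-int
            Interface br-int
                type: internal
        Port "tap28899b2f-6d"
            tag: 4095
            Interface "tap28899b2f-6d"
                type: internal
        Port "qr-ad20d4ed-aa"
            tag: 4095
            Interface "qr-ad20d4ed-aa"
                type: internal
'''

print(ports_with_tag(SAMPLE))
```

Running this against output captured before and after recreating the networks would show whether the same kinds of ports (tap and qr) end up on tag 4095 each time.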
2015-04-03 15:09:08 -0500 received badge  Enthusiast
2015-04-01 16:34:13 -0500 commented question Openstack silently ignores unknown config file entries?

The problem was solved by correcting the glance_host in cinder.conf; no changes to any endpoints were required. But it took hours longer than it should have to troubleshoot because invalid config file entries are silently ignored, which I think could be considered a bug.

2015-03-31 00:32:12 -0500 received badge  Famous Question (source)
2015-03-26 15:46:15 -0500 received badge  Notable Question (source)
2015-03-24 01:48:10 -0500 received badge  Popular Question (source)
2015-03-23 16:10:51 -0500 asked a question Openstack silently ignores unknown config file entries?

Can anyone explain the reasoning behind this design decision? I believe it would be much easier to troubleshoot config file errors if the daemon in question refused to start, rather than silently ignoring the error.

In my specific case, I typoed glance_host as glance-host on the block storage node. As a result, cinder create commands worked fine unless the --image-id switch was passed, in which case the block storage node tried to connect to glance on its own IP, causing volume creation to fail.
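
As an illustration of the kind of check being asked for, here is a minimal sketch that flags option names missing from a table of known options. It uses only Python's standard `configparser`. The `KNOWN` table, the sample file contents, and the `controller` hostname are assumptions for the example, and this is not how the OpenStack services load their configuration.

```python
import configparser

def unknown_options(conf_text, known):
    """Return (section, option) pairs that do not appear in the known-option table."""
    # Use a dummy default_section so [DEFAULT] is treated as an ordinary section
    # and its options are not inherited by every other section.
    parser = configparser.ConfigParser(default_section="__unused__",
                                       interpolation=None)
    parser.read_string(conf_text)
    unknown = []
    for section in parser.sections():
        allowed = known.get(section, set())
        for option in parser.options(section):
            if option not in allowed:
                unknown.append((section, option))
    return unknown

# Hypothetical table of valid options; a real one would come from the service itself.
KNOWN = {"DEFAULT": {"glance_host", "volume_group"}}

# Hypothetical cinder.conf fragment containing the typo described above.
SAMPLE = """
[DEFAULT]
glance-host = controller
volume_group = cinder-volumes
"""

for section, option in unknown_options(SAMPLE, KNOWN):
    print("unknown option %r in section [%s]" % (option, section))
```

A daemon could run a check like this at startup and either log a warning or refuse to start, which would have pointed straight at the glance-host typo.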

2015-03-23 08:59:39 -0500 answered a question cinder-backup ubuntu package seems to be broken

My mistake, somehow the /etc/apt/sources.list.d/cloudarchive-juno.list file got removed on this node.

2015-03-23 08:09:40 -0500 commented answer cinder-backup ubuntu package seems to be broken

I have the default Ubuntu 14.04 repos plus ceph and openstack:

# cat /etc/apt/sources.list.d/*
deb http://ceph.com/debian-giant/ trusty main
deb http://ubuntu-cloud.archive.canonical.com/ubuntu trusty-updates/juno main
2015-03-20 10:39:53 -0500 asked a question cinder-backup ubuntu package seems to be broken
root@controller-1:~# apt-get install cinder-backup
Reading package lists... Done
Building dependency tree       
Reading state information... Done
Some packages could not be installed. This may mean that you have
requested an impossible situation or if you are using the unstable
distribution that some required packages have not yet been created
or been moved out of Incoming.
The following information may help to resolve the situation:

The following packages have unmet dependencies:
 cinder-backup : Depends: cinder-common (= 1:2014.1.3-0ubuntu1.1) but 1:2014.2.1-0ubuntu1~cloud0 is to be installed
E: Unable to correct problems, you have held broken packages.
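
The "unmet dependencies" line already hints at the cause: the two cinder-common versions come from different release series (2014.1 is Icehouse, 2014.2 is Juno). A minimal sketch for pulling the versions out of such a line follows. The regular expression is an assumption fitted to the line above, not a general parser for apt output, and the helper names are hypothetical.

```python
import re

# Pattern fitted to the apt-get message shown above (assumed format).
UNMET = re.compile(
    r'^\s*(?P<package>\S+) : Depends: (?P<dep>\S+) '
    r'\((?P<op>[<>=]+) (?P<wanted>\S+)\)'
    r' but (?P<candidate>\S+) is to be installed'
)

def parse_unmet(line):
    """Return the fields of one 'unmet dependencies' line, or None if it does not match."""
    m = UNMET.match(line)
    return m.groupdict() if m else None

def upstream_series(version):
    """Drop the epoch and the Debian revision, keep the 'YYYY.N' release series."""
    upstream = version.split(":", 1)[-1].split("-", 1)[0]
    return ".".join(upstream.split(".")[:2])

LINE = (" cinder-backup : Depends: cinder-common (= 1:2014.1.3-0ubuntu1.1) "
        "but 1:2014.2.1-0ubuntu1~cloud0 is to be installed")

info = parse_unmet(LINE)
print(info["package"], "wants", info["dep"], upstream_series(info["wanted"]),
      "but candidate is", upstream_series(info["candidate"]))
```

A mismatch between the two series suggests that the packages are being resolved from different repositories, which is consistent with the missing cloudarchive-juno.list file reported in the answer.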
2015-03-16 02:47:21 -0500 marked best answer Live migrations over ssh fail from nova but work from virsh

This works perfectly in both directions:

nova@compute-1:~$ virsh migrate --live instance-00000025 qemu+ssh://nova@compute-2/system

nova@compute-2:~$ virsh migrate --live instance-00000025 qemu+ssh://nova@compute-1/system

nova.conf contains:

[libvirt]
live_migration_flag=VIR_MIGRATE_UNDEFINE_SOURCE,VIR_MIGRATE_PEER2PEER,VIR_MIGRATE_LIVE, VIR_MIGRATE_TUNNELLED
live_migration_uri = qemu+ssh://nova@%s/system

However migration fails when run from nova:

2015-03-09 16:45:34.605 27561 ERROR nova.virt.libvirt.driver [-] [instance: 76003e69-fcb1-4e62-962e-be4c1257344d] Live Migration failure: operation failed: Failed to connect to remote libvirt URI qemu+ssh://nova@compute-2/system: Cannot recv data: Permission denied, please try again.
Permission denied, please try again.
Permission denied (publickey,password).: Connection reset by peer

But public key authentication between compute-1 and compute-2 is clearly working...

2015-03-16 02:47:21 -0500 received badge  Self-Learner (source)
2015-03-16 02:47:21 -0500 received badge  Teacher (source)