mojavezax's profile - activity

2020-02-17 09:23:47 -0500 received badge  Famous Question
2019-07-03 12:08:39 -0500 received badge  Notable Question
2019-07-03 12:08:39 -0500 received badge  Popular Question
2018-11-26 22:44:07 -0500 received badge  Famous Question
2018-11-26 11:13:00 -0500 answered a question Certain VMs fail to migrate

Finally figured it out: there was a mistake in /etc/ceph/ceph.conf. I had copied and pasted "client.cinder-dev" instead of "client.cinder":

[client.cinder]
keyring = /etc/ceph/ceph.client.openstack.keyring
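
For anyone hitting the same thing, a quick sanity check is sketched below; it assumes the Cinder RBD backend is configured with rbd_user = cinder, which isn't shown in this thread:

# confirm the key actually exists in the Ceph cluster
ceph auth get client.cinder
# confirm the [client.cinder] section and the keyring file authenticate
ceph -n client.cinder -k /etc/ceph/ceph.client.openstack.keyring -s
# the section name in ceph.conf must match rbd_user in the cinder.conf backend section
grep rbd_user /etc/cinder/cinder.conf
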
2018-11-26 11:12:59 -0500 commented question Certain VMs fail to migrate

I do see those two IDs in the Cinder api.log. The log reflects the 60s timeout seen in the nova logs, and then shows a stack trace for cinder.api.middleware.fault.

2018-11-26 11:12:59 -0500 asked a question Certain VMs fail to migrate

Hi All,

Got a sore head after banging it against the wall for over a week. I've seen many similar posts, but none of them are helpful.

I'm using Pike with Ceph Luminous as the back end storage, all on CentOS 7.5. Everything is at the latest patch level.

I have 3 compute hosts. Some of my VMs migrate between all of them just fine. Others won't migrate at all. It may be the larger ones that fail, but at only 2 GB of RAM and a 16 GB disk, one of them isn't large at all.

After I initiate the migration, I see a message in the nova-compute.log on the target node (called Krypton) that lists the CPU capabilities. After one minute I see the error "Initialize connection failed for volume ..." along with an HTTP 500 error. That's followed by a few more messages, then a mile-long stack trace from oslo_messaging.rpc.server.

At that same second, log messages start appearing in the nova-compute.log on the source node (called Xenon), starting with "Pre live migration failed at krypton...: MessagingTimeout: Timed out waiting for a reply to message ID ..." Then a stack trace from nova.compute.manager appears. Lastly I see an INFO message saying "No calling threads waiting for msg_id ..."

Any ideas what's causing this? Are there settings I need to adjust? Any suggestions on how to further debug this?

Thanks!

Command:

openstack server migrate --live krypton.example.com b5f912f5-3c49-466b-ad43-525c0476dbf9

Krypton:/var/log/nova/nova-compute.log (target node):

2018-09-14 11:31:39.939 169568 INFO nova.virt.libvirt.driver [req-6c25aada-06e2-4ab7-bd67-e8e2cf49cf29 b4d3c8b03a8d432c999e101f22f8e19e c17f7f6ae0f44372a25439fe22357500 - default default] Instance launched has CPU info: {"vendor": "Intel", "model": "Broadwell-IBRS", "arch": "x86_64", "features": ["pge", "avx", "xsaveopt", "clflush", "sep", "rtm", "tsc_adjust", "tsc-deadline", "dtes64", "stibp", "invpcid", "tsc", "fsgsbase", "xsave", "smap", "vmx", "erms", "xtpr", "cmov", "hle", "smep", "ssse3", "est", "pat", "monitor", "smx", "pbe", "lm", "msr", "adx", "3dnowprefetch", "nx", "fxsr", "syscall", "tm", "sse4.1", "pae", "sse4.2", "pclmuldq", "cx16", "pcid", "fma", "vme", "popcnt", "mmx", "osxsave", "cx8", "mce", "de", "rdtscp", "ht", "dca", "lahf_lm", "abm", "rdseed", "pdcm", "mca", "pdpe1gb", "apic", "sse", "f16c", "pse", "ds", "invtsc", "pni", "tm2", "avx2", "aes", "sse2", "ss", "ds_cpl", "arat", "bmi1", "bmi2", "acpi", "spec-ctrl", "fpu", "ssbd", "pse36", "mtrr", "movbe", "rdrand", "x2apic"], "topology": {"cores": 8, "cells": 2, "threads": 2, "sockets": 1}}
2018-09-14 11:32:23.276 169568 WARNING nova.compute.resource_tracker [req-6c25aada-06e2-4ab7-bd67-e8e2cf49cf29 b4d3c8b03a8d432c999e101f22f8e19e c17f7f6ae0f44372a25439fe22357500 - default default] Instance b5f912f5-3c49-466b-ad43-525c0476dbf9 has been moved to another host xenon.example.com(xenon.example.com). There are allocations remaining against the source host that might need to be removed: {u'resources': {u'VCPU': 4, u'MEMORY_MB': 8192, u'DISK_GB': 40}}.
2018-09-14 11:32:23.301 169568 INFO nova.compute.resource_tracker [req-6c25aada-06e2-4ab7-bd67-e8e2cf49cf29 b4d3c8b03a8d432c999e101f22f8e19e c17f7f6ae0f44372a25439fe22357500 - default default] Final resource view: name=krypton.example.com phys_ram=196510MB used_ram=14848MB phys_disk=18602GB used_disk=88GB total_vcpus=32 used_vcpus=9 pci_stats=[]
2018-09-14 11:32:40.954 169568 ERROR nova.volume.cinder [req-6c25aada-06e2-4ab7-bd67-e8e2cf49cf29 b4d3c8b03a8d432c999e101f22f8e19e c17f7f6ae0f44372a25439fe22357500 - default default] Initialize connection failed for volume e4e411d7-59e7-463b-8598-54a4838aa898 on host krypton.example.com. Error: The server has either erred or is incapable of ...
(more)
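
To dig further on a setup like this, one approach (a sketch based on the IDs visible in the excerpt above; log paths assume a default CentOS/RDO layout) is to follow the failed initialize_connection call from nova into cinder, and to confirm the cinder services themselves are healthy:

# follow the nova-side request on the target node
grep req-6c25aada-06e2-4ab7-bd67-e8e2cf49cf29 /var/log/nova/nova-compute.log
# look for the same volume on the cinder side (request IDs differ between services)
grep e4e411d7-59e7-463b-8598-54a4838aa898 /var/log/cinder/api.log /var/log/cinder/volume.log
# check that cinder-api, cinder-scheduler and cinder-volume are all up
openstack volume service list
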
2018-11-26 11:12:59 -0500 asked a question Create instance failed even though volume created from image

Hello,

I'm working with OpenStack Pike, with Ceph 12.2.5 as the backing store.

I tried to create an instance in the GUI from a 30 GB image (a qcow2 image converted from VMDK), but got the following error:

Error: Failed to perform requested operation on instance "cvcfisapps", the instance has an error status: Please try again later [Error: Build of instance 28c01330-beb1-4158-b8f7-11e975389cd7 aborted: Volume a8f96b32-541f-45ec-835a-458f23bd592c did not finish being created even after we waited 188 seconds or 61 attempts. And its status is creating.].

I checked, and a volume had indeed been created from the image. I deleted the failed instance and was then able to boot from that volume successfully. My question is: why did creating the instance fail with the above error?

Here are the relevant log entries from /var/log/cinder/volume.log on the compute host:

2018-06-29 08:58:08.748 54792 INFO cinder.volume.flows.manager.create_volume [req-752787bb-b661-4fa3-a05e-d4cd3c4d94c7 fa9368f1ede54e9b84b3657848d0e080 3ec2bba82240472d889d565dfd9b8ff8 - default default] Volume a8f96b32-541f-45ec-835a-458f23bd592c: being created as image with specification: {'status': u'creating', 'image_location': (None, None), 'volume_size': 32, 'volume_name': 'volume-a8f96b32-541f-45ec-835a-458f23bd592c', 'image_id': 'f4352cdb-d4cc-4f6f-afb7-395b099e0795', 'image_service': <cinder.image.glance.GlanceImageService object at 0x7f0e3a79dd90>, 'image_meta': {u'status': u'active', u'name': u'cvcfisapps', u'tags': [], u'container_format': u'bare', u'created_at': datetime.datetime(2018, 6, 26, 15, 11, 51, tzinfo=<iso8601.Utc>), u'disk_format': u'qcow2', u'updated_at': datetime.datetime(2018, 6, 26, 15, 13, 49, tzinfo=<iso8601.Utc>), u'visibility': u'public', 'properties': {}, u'owner': u'c17f7f6ae0f44372a25439fe22357500', u'protected': False, u'id': u'f4352cdb-d4cc-4f6f-afb7-395b099e0795', u'file': u'/v2/images/f4352cdb-d4cc-4f6f-afb7-395b099e0795/file', u'checksum': u'54c6579afadf40e29cde2ebdebfcaffb', u'min_disk': 0, u'virtual_size': None, u'min_ram': 0, u'size': 31927369728}}
2018-06-29 08:58:23.593 54792 WARNING oslo.service.loopingcall [req-7dfe2be7-8362-4a99-8268-438f5ec6f834 - - - - -] Function 'cinder.service.Service.report_state' run outlasted interval by 0.54 sec
2018-06-29 08:59:23.716 54792 WARNING oslo.service.loopingcall [req-7dfe2be7-8362-4a99-8268-438f5ec6f834 - - - - -] Function 'cinder.service.Service.report_state' run outlasted interval by 40.12 sec
2018-06-29 09:00:02.470 54792 WARNING oslo.service.loopingcall [req-7dfe2be7-8362-4a99-8268-438f5ec6f834 - - - - -] Function 'cinder.service.Service.report_state' run outlasted interval by 28.75 sec
2018-06-29 09:00:23.750 54792 WARNING oslo.service.loopingcall [req-7dfe2be7-8362-4a99-8268-438f5ec6f834 - - - - -] Function 'cinder.service.Service.report_state' run outlasted interval by 1.28 sec
2018-06-29 09:01:23.497 54792 WARNING oslo.service.loopingcall [req-7dfe2be7-8362-4a99-8268-438f5ec6f834 - - - - -] Function 'cinder.service.Service.report_state' run outlasted interval by 49.75 sec
2018-06-29 09:01:51.354 54792 WARNING oslo.service.loopingcall [req-7dfe2be7-8362-4a99-8268-438f5ec6f834 - - - - -] Function 'cinder.service.Service.report_state' run outlasted interval by 7.86 sec
2018-06-29 09:02:42.669 54792 INFO cinder.image.image_utils [req-752787bb-b661-4fa3-a05e-d4cd3c4d94c7 fa9368f1ede54e9b84b3657848d0e080 3ec2bba82240472d889d565dfd9b8ff8 - default default] Image download 30448.00 MB at 111.16 MB/s
2018-06-29 09:02:42.678 54792 WARNING oslo.service.loopingcall [req-7dfe2be7-8362-4a99-8268-438f5ec6f834 - - - - -] Function 'cinder.service.Service.report_state' run outlasted interval by 31.33 sec
2018-06-29 09:03:06.201 54792 INFO cinder.image.image_utils [req-752787bb-b661-4fa3-a05e-d4cd3c4d94c7 fa9368f1ede54e9b84b3657848d0e080 3ec2bba82240472d889d565dfd9b8ff8 - default default] Converted 30720.00 MB image at 1449.66 MB/s
2018-06-29 09:07:50.629 54792 INFO cinder.volume.flows.manager.create_volume [req-752787bb-b661-4fa3-a05e-d4cd3c4d94c7 fa9368f1ede54e9b84b3657848d0e080 3ec2bba82240472d889d565dfd9b8ff8 - default default] Volume volume-a8f96b32-541f-45ec-835a-458f23bd592c (a8f96b32-541f-45ec-835a-458f23bd592c): created successfully ...
(more)
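
Reading the timestamps above, the likely cause is simply timing: the volume build started at 08:58:08, but cinder only logged "created successfully" at 09:07:50, almost ten minutes later, while nova gave up after 61 attempts (roughly 188 seconds). If that reading is right, one workaround is to raise nova's volume-allocation wait on the compute nodes; the values below are only an example, and crudini is assumed to be available:

# raise the number of "is the volume ready yet?" polls nova-compute makes
crudini --set /etc/nova/nova.conf DEFAULT block_device_allocate_retries 200
# seconds between polls (default is 3)
crudini --set /etc/nova/nova.conf DEFAULT block_device_allocate_retries_interval 3
systemctl restart openstack-nova-compute

With 200 polls at 3-second intervals nova would wait up to 600 seconds, which would have covered this particular ~10-minute volume build.
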
2018-10-18 22:12:33 -0500 received badge  Notable Question
2018-10-10 03:52:55 -0500 received badge  Popular Question
2018-09-17 09:38:21 -0500 asked a question Certain VMs fail to migrate

2018-09-17 09:38:18 -0500 commented answer failed to delete a volume as the backend lv is shown open

Thank you sunzen.wang! This is exactly the same problem I was having, and your fix worked!

2018-09-12 12:40:06 -0500 received badge  Supporter
2018-07-12 10:56:59 -0500 received badge  Fan