kenneth-jiang's profile - activity

2020-09-22 16:08:26 -0600 received badge  Notable Question (source)
2020-09-22 16:08:26 -0600 received badge  Famous Question (source)
2020-09-22 16:08:26 -0600 received badge  Popular Question (source)
2017-04-18 21:46:40 -0600 received badge  Notable Question (source)
2017-04-18 21:46:40 -0600 received badge  Famous Question (source)
2017-03-21 04:54:48 -0600 received badge  Notable Question (source)
2017-03-21 04:54:48 -0600 received badge  Popular Question (source)
2015-06-29 02:00:54 -0600 received badge  Popular Question (source)
2012-02-01 22:35:27 -0600 answered a question how to debug swift

Thanks Sybrand. Winpdb looks like a cute little tool. Will find a chance to try it sometime.

2011-12-17 17:23:49 -0600 answered a question Changes to nova.conf not being picked up by nova-network

Thanks a lot Vish. After wrestling with MySQL tables I was able to switch from VLAN to FlatDHCP.

2011-12-16 22:48:41 -0600 answered a question Changes to nova.conf not being picked up by nova-network

I dug into nova-network.log and confirmed that my changes found their way into the log:

2011-12-16 14:21:18,204 DEBUG nova [-] Full set of FLAGS: from (pid=4297) wait /usr/lib/python2.7/dist-packages/nova/service.py:352 .....

2011-12-16 14:21:18,208 DEBUG nova [-] public_interface : br100 from (pid=4297) wait /usr/lib/python2.7/dist-packages/nova/service.py:355
...
2011-12-16 14:21:18,218 DEBUG nova [-] flat_interface : eth0.1728 from (pid=4297) wait /usr/lib/python2.7/dist-packages/nova/service.py:355
...
2011-12-16 14:21:18,229 DEBUG nova [-] network_manager : nova.network.manager.FlatDHCPManager from (pid=4297) wait /usr/lib/python2.7/dist-packages/nova/service.py:355

However, further down in the log, nova-network for some reason stubbornly insisted on the old behavior:

...
2011-12-16 14:21:18,519 DEBUG nova.utils [aca5807e-1a5d-4d31-9295-d03b546f75fd None None] Running cmd (subprocess): ip link show dev vlan100 from (pid=4297) execute /usr/lib/python2.7/dist-packages/nova/utils.py:166
...
2011-12-16 14:21:18,526 DEBUG nova.utils [aca5807e-1a5d-4d31-9295-d03b546f75fd None None] Running cmd (subprocess): ip link show dev br100 from (pid=4297) execute /usr/lib/python2.7/dist-packages/nova/utils.py:166
2011-12-16 14:21:18,532 DEBUG nova.utils [aca5807e-1a5d-4d31-9295-d03b546f75fd None None] Running cmd (subprocess): sudo brctl addif br100 vlan100 from (pid=4297) execute /usr/lib/python2.7/dist-packages/nova/utils.py:166
...
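To see at a glance which commands nova-network actually ran, the "Running cmd (subprocess):" entries can be pulled out of the log. This is only a rough Python sketch that assumes the line layout shown in the excerpts above; the function names are made up for illustration and are not part of nova.

```python
import re

# Layout assumed from the excerpts above:
#   "... Running cmd (subprocess): <command> from (pid=NNNN) execute ..."
CMD_RE = re.compile(r"Running cmd \(subprocess\): (.+?) from \(pid=\d+\)")


def extract_commands(log_text):
    """Return every command logged by nova.utils, in the order it appears."""
    return CMD_RE.findall(log_text)


def commands_touching(log_text, device):
    """Keep only the commands that name the given interface as an argument."""
    return [cmd for cmd in extract_commands(log_text) if device in cmd.split()]


PREFIX = "DEBUG nova.utils [aca5807e-1a5d-4d31-9295-d03b546f75fd None None] "
SUFFIX = " from (pid=4297) execute /usr/lib/python2.7/dist-packages/nova/utils.py:166"
SAMPLE_LOG = "\n".join([
    "2011-12-16 14:21:18,519 " + PREFIX
    + "Running cmd (subprocess): ip link show dev vlan100" + SUFFIX,
    "2011-12-16 14:21:18,526 " + PREFIX
    + "Running cmd (subprocess): ip link show dev br100" + SUFFIX,
    "2011-12-16 14:21:18,532 " + PREFIX
    + "Running cmd (subprocess): sudo brctl addif br100 vlan100" + SUFFIX,
])

if __name__ == "__main__":
    # Shows that vlan100 is still being used while eth0.1728 never appears.
    print(commands_touching(SAMPLE_LOG, "vlan100"))
    print(commands_touching(SAMPLE_LOG, "eth0.1728"))
```

On a real host the same functions could be fed the contents of nova-network.log instead of the sample string.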

Again maybe I missed something obvious here?

2011-12-16 22:30:50 -0600 asked a question Changes to nova.conf not being picked up by nova-network

I made several changes to /etc/nova/nova.conf but they didn't seem to be picked up by nova-network. And I not only restarted nova-network, but also rebooted the machine!

In /etc/nova/nova.conf, I made the following changes:

... --public_interface=eth0 --vlan_interface=eth1 ...

---->

..... --network_manager=nova.network.manager.FlatDHCPManager --public_interface=br100 --flat_interface=eth0.1728 ...
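For the record, the before/after difference can be spelled out mechanically. A small Python sketch, assuming the flagfile only contains "--name=value" tokens like the ones quoted above (the helper names are hypothetical, and the elided "..." parts are simply skipped):

```python
def parse_flags(text):
    """Parse '--name=value' tokens into a dict, ignoring anything else."""
    flags = {}
    for token in text.split():
        if not token.startswith("--"):
            continue  # skips elisions such as "..."
        name, _, value = token[2:].partition("=")
        flags[name] = value
    return flags


def diff_flags(old, new):
    """Return (added, removed, changed) between two flag dicts."""
    added = {k: new[k] for k in new if k not in old}
    removed = {k: old[k] for k in old if k not in new}
    changed = {k: (old[k], new[k]) for k in old if k in new and old[k] != new[k]}
    return added, removed, changed


OLD = "... --public_interface=eth0 --vlan_interface=eth1 ..."
NEW = ("..... --network_manager=nova.network.manager.FlatDHCPManager "
       "--public_interface=br100 --flat_interface=eth0.1728 ...")

if __name__ == "__main__":
    added, removed, changed = diff_flags(parse_flags(OLD), parse_flags(NEW))
    print("added:", added)
    print("removed:", removed)
    print("changed:", changed)
```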

I checked and the right conf path was given to nova-network

ps -ef | grep nova-network

nova 4012 1 0 14:09 ? 00:00:00 su -c nova-network --flagfile=/etc/nova/nova.conf nova
nova 4019 4012 0 14:09 ? 00:00:02 /usr/bin/python /usr/bin/nova-network --flagfile=/etc/nova/nova.conf

However after rebooting the machine, nova-network still bridged vlan100 to br100, and completely ignored eth0.1728:

brctl show

bridge name     bridge id               STP enabled     interfaces
br100           8000.02163e3e4a7e       no              vlan100
virbr0          8000.000000000000       yes

I also checked to make sure eth0.1728 was brought up correctly:

ip link show eth0.1728

4: eth0.1728@eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP
    link/ether 98:4b:e1:5f:e7:e2 brd ff:ff:ff:ff:ff:ff

I probably missed something obvious here. Any hints why that happened?

2011-12-09 21:59:04 -0600 answered a question 2 dnsmasq processes. Is it a concern?

Thanks Vish Ishaya, that solved my question.

2011-12-09 14:36:49 -0600 answered a question Some nova services didn't started

To check whether all nova services are running, I use "ps -ef | grep nova" and confirm that every expected process is there. If one is missing, look in /var/log/nova for errors in its log.

Also, if your nova is otherwise functioning correctly, the fact that not all services show up in the nova-manage output becomes a smaller issue.
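The "ps -ef | grep nova" check can also be scripted. Below is a rough Python sketch that works on captured `ps -ef` output; the list of expected services is only an assumption for illustration and should be adjusted to whatever a given node is supposed to run.

```python
import os


def running_services(ps_output, expected):
    """Return the expected service names that appear in `ps -ef` output."""
    found = set()
    for line in ps_output.splitlines():
        fields = line.split(None, 7)  # UID PID PPID C STIME TTY TIME CMD
        if len(fields) < 8 or fields[7].startswith("grep"):
            continue  # skip short lines and the grep process itself
        for token in fields[7].split():
            name = os.path.basename(token)
            if name in expected:
                found.add(name)
    return found


def missing_services(ps_output, expected):
    """Return the expected services with no matching process, sorted."""
    return sorted(set(expected) - running_services(ps_output, expected))


# Assumed service list, for illustration only.
EXPECTED = ["nova-network", "nova-compute"]

SAMPLE_PS = "\n".join([
    "nova 4012 1 0 14:09 ? 00:00:00 su -c nova-network --flagfile=/etc/nova/nova.conf nova",
    "nova 4019 4012 0 14:09 ? 00:00:02 /usr/bin/python /usr/bin/nova-network --flagfile=/etc/nova/nova.conf",
])

if __name__ == "__main__":
    print("missing:", missing_services(SAMPLE_PS, EXPECTED))
```

Any name reported as missing is the one whose log under /var/log/nova is worth reading first.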

2011-12-09 14:23:21 -0600 asked a question 2 dnsmasq processes. Is it a concern?

My nova installation is working fine but I'm quite bothered by an observation: there are always 2 dnsmasq processes running at the same time, one being the sub-process of the other:

ps -ef | grep dns

nobody 11435 1 0 06:09 ? 00:00:00 dnsmasq --strict-order --bind-interfaces --conf-file= --domain=novalocal --pid-file=/var/lib/nova/networks/nova-br100.pid --listen-address=172.16.0.4 --except-interface=lo --dhcp-range=172.16.0.3,static,120s --dhcp-lease-max=256 --dhcp-hostsfile=/var/lib/nova/networks/nova-br100.conf --dhcp-script=/usr/bin/nova-dhcpbridge --leasefile-ro
root 11436 11435 0 06:09 ? 00:00:00 dnsmasq --strict-order --bind-interfaces --conf-file= --domain=novalocal --pid-file=/var/lib/nova/networks/nova-br100.pid --listen-address=172.16.0.4 --except-interface=lo --dhcp-range=172.16.0.3,static,120s --dhcp-lease-max=256 --dhcp-hostsfile=/var/lib/nova/networks/nova-br100.conf --dhcp-script=/usr/bin/nova-dhcpbridge --leasefile-ro

I tried "killall dnsmasq; service nova-network restart" numerous times and the observation remained.

The Nova admin guide describes this as a problem that will prevent VMs from receiving IP addresses. I also experienced a failed DHCP handshake with a Windows VM, which I suspect is related to the 2 dnsmasq processes (see my previous question: https://answers.launchpad.net/nova/+question/167297).

So my questions are:
- Is it possible that 2 dnsmasq processes will cause sporadic DHCP handshake failure?
- If so, what can I do to fix it?

Again your help is highly appreciated! Kenneth
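To tell a parent/child pair apart from two independent dnsmasq instances, the PID and PPID columns of `ps -ef` can be compared. A minimal Python sketch, assuming the usual eight-column `ps -ef` layout; the function names are hypothetical and the sample command lines are shortened:

```python
def parse_ps(ps_output):
    """Yield (pid, ppid, uid, cmd) for each process line of `ps -ef` output."""
    for line in ps_output.splitlines():
        fields = line.split(None, 7)  # UID PID PPID C STIME TTY TIME CMD
        if len(fields) < 8 or not (fields[1].isdigit() and fields[2].isdigit()):
            continue  # header or malformed line
        yield int(fields[1]), int(fields[2]), fields[0], fields[7]


def dnsmasq_parent_child_pairs(ps_output):
    """Return (parent_pid, child_pid) pairs where both processes are dnsmasq."""
    procs = [p for p in parse_ps(ps_output)
             if p[3].split()[0].endswith("dnsmasq")]
    pids = {pid for pid, _, _, _ in procs}
    return [(ppid, pid) for pid, ppid, _, _ in procs if ppid in pids]


# Command lines shortened; PIDs and users are the ones from the output above.
SAMPLE_PS = "\n".join([
    "nobody 11435 1 0 06:09 ? 00:00:00 dnsmasq --strict-order --bind-interfaces",
    "root 11436 11435 0 06:09 ? 00:00:00 dnsmasq --strict-order --bind-interfaces",
])

if __name__ == "__main__":
    print(dnsmasq_parent_child_pairs(SAMPLE_PS))
```

An empty result with two dnsmasq lines present would mean two unrelated instances rather than one process and its child.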

2011-12-09 14:09:04 -0600 answered a question Windows VM unable to accept IP address from dnsmasq

Problem resolved by "killall dnsmasq; service nova-network restart". Now syslog spits out DHCPACK (as opposed to previous DHCPOFFER) for Windows VM, the same as it does for Linux.

2011-12-08 14:34:12 -0600 asked a question Windows VM unable to accept IP address from dnsmasq

When I launched a Windows VM (image created by following http://docs.openstack.org/diablo/openstack-compute/admin/content/creating-a-windows-image.html), the VM was unable to get an IP assigned:

Administrator> ipconfig /renew

Windows IP Configuration

An error occurred while renewing interface Local Area Connection : unable to contact your DHCP server. Request has timed out.

The OpenStack Dashboard shows that this instance is assigned IP 172.16.0.8, and I found the following in syslog:

Dec 8 05:59:54 cloud-ProLiant-DL160-G6 dnsmasq-dhcp[3140]: DHCPDISCOVER(br100) 02:16:3e:11:59:7a
Dec 8 05:59:54 cloud-ProLiant-DL160-G6 dnsmasq-dhcp[3140]: DHCPOFFER(br100) 172.16.0.8 02:16:3e:11:59:7a
Dec 8 05:59:57 cloud-ProLiant-DL160-G6 dnsmasq-dhcp[3140]: DHCPDISCOVER(br100) 02:16:3e:11:59:7a
Dec 8 05:59:57 cloud-ProLiant-DL160-G6 dnsmasq-dhcp[3140]: DHCPOFFER(br100) 172.16.0.8 02:16:3e:11:59:7a
Dec 8 06:00:05 cloud-ProLiant-DL160-G6 dnsmasq-dhcp[3140]: DHCPDISCOVER(br100) 02:16:3e:11:59:7a
Dec 8 06:00:05 cloud-ProLiant-DL160-G6 dnsmasq-dhcp[3140]: DHCPOFFER(br100) 172.16.0.8 02:16:3e:11:59:7a
Dec 8 06:00:20 cloud-ProLiant-DL160-G6 dnsmasq-dhcp[3140]: DHCPDISCOVER(br100) 02:16:3e:11:59:7a
Dec 8 06:00:20 cloud-ProLiant-DL160-G6 dnsmasq-dhcp[3140]: DHCPOFFER(br100) 172.16.0.8 02:16:3e:11:59:7a

Obviously the Windows VM was firing quite a few DHCP requests in 30 seconds, and dnsmasq responded each time with a DHCPOFFER, but somehow that IP address didn't get picked up by Windows VM.

I have a few Linux VMs (Ubuntu) running in the same nova installation and they are running fine. One observation, however, is that when Linux VMs sent DHCPDISCOVER, dnsmasq responded with DHCPACK, rather than DHCPOFFER. An example is:

Dec 8 06:00:10 cloud-ProLiant-DL160-G6 dnsmasq-dhcp[3140]: DHCPREQUEST(br100) 172.16.0.6 02:16:3e:73:4d:54
Dec 8 06:00:10 cloud-ProLiant-DL160-G6 dnsmasq-dhcp[3140]: DHCPACK(br100) 172.16.0.6 02:16:3e:73:4d:54 test1

I'm wondering why dnsmasq gave "special treatment" to Windows VM, and if that was the cause of my problem.

Your help is really appreciated!
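For anyone comparing similar logs: in a complete DHCP exchange the client follows a DHCPOFFER with a DHCPREQUEST, and the server then answers with DHCPACK. The syslog excerpts above show the Windows VM's MAC stopping at the offer, while the Linux VM's MAC shows the request/ack half. A rough Python sketch that groups the dnsmasq-dhcp lines by MAC address and lists the clients that were offered an address but never acknowledged (function names are hypothetical; the line layout is assumed from the excerpts above):

```python
import re
from collections import defaultdict

# Matches e.g. "dnsmasq-dhcp[3140]: DHCPOFFER(br100) 172.16.0.8 02:16:3e:11:59:7a"
DHCP_RE = re.compile(
    r"dnsmasq-dhcp\[\d+\]: (DHCP[A-Z]+)\(\w+\) .*?((?:[0-9a-f]{2}:){5}[0-9a-f]{2})"
)


def messages_by_mac(syslog_text):
    """Map each client MAC to the set of DHCP message types logged for it."""
    seen = defaultdict(set)
    for line in syslog_text.splitlines():
        match = DHCP_RE.search(line)
        if match:
            seen[match.group(2)].add(match.group(1))
    return dict(seen)


def offered_but_never_acked(syslog_text):
    """Return MACs that received a DHCPOFFER but no DHCPACK, sorted."""
    return sorted(
        mac for mac, kinds in messages_by_mac(syslog_text).items()
        if "DHCPOFFER" in kinds and "DHCPACK" not in kinds
    )


HOST = "cloud-ProLiant-DL160-G6 dnsmasq-dhcp[3140]: "
SAMPLE_SYSLOG = "\n".join([
    "Dec 8 05:59:54 " + HOST + "DHCPDISCOVER(br100) 02:16:3e:11:59:7a",
    "Dec 8 05:59:54 " + HOST + "DHCPOFFER(br100) 172.16.0.8 02:16:3e:11:59:7a",
    "Dec 8 06:00:10 " + HOST + "DHCPREQUEST(br100) 172.16.0.6 02:16:3e:73:4d:54",
    "Dec 8 06:00:10 " + HOST + "DHCPACK(br100) 172.16.0.6 02:16:3e:73:4d:54 test1",
])

if __name__ == "__main__":
    print(offered_but_never_acked(SAMPLE_SYSLOG))
```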

2011-12-04 17:00:36 -0600 answered a question failed to initialize KVM: Operation not permitted

Problem solved by rebooting the server. My gut feeling is that it has something to do with my manually running "modprobe kvm kvm_intel".

If anyone can educate me (and probably other people who come across this post) with more details it'll be highly appreciated!

2011-12-04 14:41:33 -0600 answered a question failed to initialize KVM: Operation not permitted

And the CPUs do support virtualization:

grep --color vmx /proc/cpuinfo

flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx rdtscp lm constant_tsc arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc aperfmperf pni dtes64 monitor ds_cpl vmx est tm2 ssse3 cx16 xtpr pdcm dca sse4_1 sse4_2 popcnt lahf_lm dts tpr_shadow vnmi flexpriority ept vpid
flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx rdtscp lm constant_tsc arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc aperfmperf pni dtes64 monitor ds_cpl vmx est tm2 ssse3 cx16 xtpr pdcm dca sse4_1 sse4_2 popcnt lahf_lm dts tpr_shadow vnmi flexpriority ept vpid
flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx rdtscp lm constant_tsc arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc aperfmperf pni dtes64 monitor ds_cpl vmx est tm2 ssse3 cx16 xtpr pdcm dca sse4_1 sse4_2 popcnt lahf_lm dts tpr_shadow vnmi flexpriority ept vpid
flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx rdtscp lm constant_tsc arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc aperfmperf pni dtes64 monitor ds_cpl vmx est tm2 ssse3 cx16 xtpr pdcm dca sse4_1 sse4_2 popcnt lahf_lm dts tpr_shadow vnmi flexpriority ept vpid

2011-12-04 14:35:07 -0600 answered a question failed to initialize KVM: Operation not permitted

As the server is in a data center, I can't check whether virtualization is enabled in the BIOS. But I assume it is, since "kvm-ok" runs fine.

2011-12-04 14:28:42 -0600 asked a question failed to initialize KVM: Operation not permitted

When I tried to spawn an instance, I got this error in nova-compute.log

2011-12-04 06:16:06,297 ERROR nova.compute.manager [-] Instance '3' failed to spawn. Is virtualization enabled in the BIOS? Details: internal error Process exited while reading console log output: char device redirected to /dev/pts/3
open /dev/kvm: Permission denied
failed to initialize KVM: Operation not permitted
(nova.compute.manager): TRACE: Traceback (most recent call last):
(nova.compute.manager): TRACE: File "/usr/lib/python2.7/dist-packages/nova/compute/manager.py", line 438, in _run_instance
(nova.compute.manager): TRACE: network_info, block_device_info)
(nova.compute.manager): TRACE: File "/usr/lib/python2.7/dist-packages/nova/exception.py", line 131, in wrapped
(nova.compute.manager): TRACE: raise Error(str(e))
(nova.compute.manager): TRACE: Error: internal error Process exited while reading console log output: char device redirected to /dev/pts/3
(nova.compute.manager): TRACE: open /dev/kvm: Permission denied
(nova.compute.manager): TRACE: failed to initialize KVM: Operation not permitted
(nova.compute.manager): TRACE:
(nova.compute.manager): TRACE:

I chgrp'ed /dev/kvm to "kvm", and even chmod'ed to 777:

ls -l /dev/kvm

crwxrwxrwx+ 1 root kvm 10, 232 2011-11-28 14:14 /dev/kvm

Also

kvm-ok

INFO: /dev/kvm exists
KVM acceleration can be used

What else can I do to further investigate this problem?

Your help is really appreciated!
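One more check that goes beyond reading the mode bits is to attempt the same read/write open that fails in the log, while running as the user the hypervisor process runs under. Which user that is depends on the setup, so treat that part as an assumption. A minimal Python sketch (the function name is made up):

```python
import errno
import os


def try_open_rw(path):
    """Try to open `path` read/write; return (ok, errno_name_or_None)."""
    try:
        fd = os.open(path, os.O_RDWR)
    except OSError as exc:
        # e.g. "EACCES" for "Permission denied", "ENOENT" if the node is absent
        return False, errno.errorcode.get(exc.errno, str(exc.errno))
    os.close(fd)
    return True, None


if __name__ == "__main__":
    # Run this as the same user that launches the VM process.
    print(try_open_rw("/dev/kvm"))
```

If this succeeds for one user and fails for another, the difference in who opens the device is the thing to chase, rather than the file mode shown by "ls -l".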

2011-10-19 17:40:46 -0600 answered a question 0-byte proxy.log

I just found out why: it never got a chance to hit any code that would generate a log entry other than "notice", which goes to proxy.error.

But the reason it didn't hit any real code probably indicates a bug: from the command line I used "curl" to send an HTTP, as opposed to HTTPS, request. This generated an exception at /usr/lib/pymodules/python2.6/eventlet/wsgi.py:606. However, for some reason that I didn't have time to dig into, eventlet swallowed that exception, never sent back a single byte of response, and never closed the connection.

To make it worse, at /usr/lib/pymodules/python2.6/swift/common/wsgi.py:222, a NullLogger() was passed to wsgi.server, therefore eventlet didn't get any chance to log what happened, which made an otherwise-straightforward problem very hard to trace.
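To illustrate why a null logger makes this kind of failure invisible, here is a toy Python sketch. It is not eventlet's or swift's actual code: `serve_once`, `bad_request`, and the error message are all made up, and the only point is that an exception reported solely to a log object disappears when that object discards its input.

```python
import io
import traceback


class NullLogger:
    """File-like object that silently discards everything written to it."""

    def write(self, *args):
        pass


def serve_once(handler, log):
    """Toy stand-in for a server loop: failures are reported only to `log`."""
    try:
        handler()
    except Exception:
        log.write(traceback.format_exc())


def bad_request():
    # Hypothetical failure standing in for the plain-HTTP-to-SSL-port case.
    raise ValueError("could not complete SSL handshake")


if __name__ == "__main__":
    visible = io.StringIO()
    serve_once(bad_request, visible)       # traceback ends up in `visible`
    serve_once(bad_request, NullLogger())  # same failure, nothing recorded
    print(visible.getvalue())
```

Swapping the null logger for any object with a working write() method while debugging is what makes the traceback show up.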

2011-10-19 16:46:32 -0600 answered a question how to debug swift

Thanks amwang, that solved my question.

2011-10-18 21:53:28 -0600 asked a question how to debug swift

I ran into a problem with swift and I was trying to debug it. When I did

python -m pdb /usr/bin/swift-proxy-server --verbose /etc/swift/proxy-server.conf

the debugging session would exit after this: capture_stdio(logger)

I commented out this line but ran into another problem: swift forks processes to handle requests. If I break in the middle of a sub-process, pdb input/output is messed up (it doesn't echo and shows other strange behaviors).

I'd appreciate it if someone could help me with instructions on debugging swift.

2011-10-18 21:07:44 -0600 asked a question 0-byte proxy.log

I installed swift 1.4.3 on Ubuntu 11.04, but after proxy-server is started, I only see a 0-byte /var/log/swift/proxy.log.

Meanwhile, /var/log/swift/proxy.error contains only:

Oct 18 03:49:05 kens-lab2 proxy-server Started child 17276
Oct 18 03:49:05 kens-lab2 proxy-server Started child 17277
Oct 18 03:49:05 kens-lab2 proxy-server Started child 17278
...

When swift-proxy-server was started, it printed out a warning message that I'm not sure is significant:

/usr/lib/pymodules/python2.6/paste/deploy/loadwsgi.py:8: UserWarning: Module netifaces was already imported from /usr/lib/pymodules/python2.6/netifaces.so, but /usr/lib/pymodules/python2.6 is being added to sys.path

My configuration is below:

cat /etc/swift/proxy-server.conf

[DEFAULT]
# Enter these next two values if using SSL certifications
cert_file = /etc/swift/cert.crt
key_file = /etc/swift/cert.key
bind_port = 9080
workers = 8
user = swift
set log_level = DEBUG
set log_facility = LOG_LOCAL1

[pipeline:main]
# keep swauth in the line below if you plan to use swauth for authentication
pipeline = healthcheck cache tempauth proxy-server

[app:proxy-server]
use = egg:swift#proxy
allow_account_management = true
account_autocreate = true

[filter:tempauth]
use = egg:swift#tempauth
user_admin_admin = admin .admin .reseller_admin
user_test_tester = testing .admin
user_test2_tester2 = testing2 .admin
user_test_tester3 = testing3

[filter:healthcheck]
use = egg:swift#healthcheck

[filter:cache]
use = egg:swift#memcache
memcache_servers = 10.244.196.230:11211
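As a quick sanity check on a file like this, each name in the pipeline should have a matching section. A rough Python sketch using the standard configparser module; it assumes the last pipeline entry is declared as [app:NAME] and the others as [filter:NAME], which holds for the config above but is only a simplification of what the real loader accepts. The function name is made up.

```python
import configparser

# Trimmed copy of the proxy-server.conf shown above.
SAMPLE_CONF = """\
[DEFAULT]
bind_port = 9080
workers = 8

[pipeline:main]
pipeline = healthcheck cache tempauth proxy-server

[app:proxy-server]
use = egg:swift#proxy

[filter:tempauth]
use = egg:swift#tempauth

[filter:healthcheck]
use = egg:swift#healthcheck

[filter:cache]
use = egg:swift#memcache
"""


def check_pipeline(conf_text):
    """Return (pipeline_names, missing_sections) for a paste-style config."""
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(conf_text)
    names = parser.get("pipeline:main", "pipeline").split()
    missing = []
    for index, name in enumerate(names):
        # Assumption: last entry is the app, everything before it is a filter.
        kind = "app" if index == len(names) - 1 else "filter"
        section = "{}:{}".format(kind, name)
        if not parser.has_section(section):
            missing.append(section)
    return names, missing


if __name__ == "__main__":
    print(check_pipeline(SAMPLE_CONF))
```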

cat /etc/rsyslog.d/10-swift.conf

# Uncomment the following to have a log containing all logs together
#local1,local2,local3,local4,local5.* /var/log/swift/all.log

# Uncomment the following to have hourly proxy logs for stats processing
#$template HourlyProxyLog,"/var/log/swift/hourly/%$YEAR%%$MONTH%%$DAY%%$HOUR%"
#local1.*;local1.!notice ?HourlyProxyLog

local1.*;local1.!notice /var/log/swift/proxy.log
local1.notice /var/log/swift/proxy.error
local1.* ~

local2.*;local2.!notice /var/log/swift/storage1.log
local2.notice /var/log/swift/storage1.error
local2.* ~

local3.*;local3.!notice /var/log/swift/storage2.log
local3.notice /var/log/swift/storage2.error
local3.* ~

local4.*;local4.!notice /var/log/swift/storage3.log
local4.notice /var/log/swift/storage3.error
local4.* ~

local5.*;local5.!notice /var/log/swift/storage4.log
local5.notice /var/log/swift/storage4.error
local5.* ~
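For reference, here is how the local1 rules split messages between the two files, assuming the selectors are "local1.*;local1.!notice" for proxy.log and "local1.notice" for proxy.error. In syslog selector syntax a bare priority matches that priority and anything more severe, and "!" excludes the same range. The Python sketch below only models that rule; it is not rsyslog code and the function name is made up.

```python
# Standard syslog severities, most severe first.
SEVERITIES = ["emerg", "alert", "crit", "err", "warning", "notice", "info", "debug"]


def proxy_destination(severity):
    """Return the file a local1 message of this severity would be written to."""
    level = SEVERITIES.index(severity)
    notice = SEVERITIES.index("notice")
    if level <= notice:
        # "local1.notice" matches notice and everything more severe.
        return "/var/log/swift/proxy.error"
    # "local1.*;local1.!notice" keeps only what is less severe than notice.
    return "/var/log/swift/proxy.log"


if __name__ == "__main__":
    for name in ("err", "notice", "info", "debug"):
        print(name, "->", proxy_destination(name))
```

So a proxy that has only emitted notice-level startup messages would leave proxy.log empty while proxy.error grows.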

Your comments/suggestions are appreciated!