
Can Ping but Can't SSH to Instances

asked 2014-06-20 17:10:09 -0500

Nastooh

updated 2014-06-20 17:37:22 -0500

I have tried all the suggested solutions to this problem, to no avail. The following is a summary of what I've done:

 1. Generated a key pair, both via nova and via the dashboard.
 2. Ran chmod 0600 key and ssh-add key. ssh-ing from either the public address or through ip netns qrxxxxx hangs on the following lines:

        debug1: sending SSH2_MSG_KEXDH_INIT
        debug1: expecting SSH2_MSG_KEXDH_REPLY

    while the cirros guest reports:

        Jun 20 14:29:21 cir3 authpriv.info dropbear[364]: Child connection from GatewayPrivateIp:32818
        Jun 20 14:29:21 cir3 authpriv.info dropbear[364]: Exit before auth: Timeout before auth

 3. Wireshark captures show that after the key exchange, repeated ACKs are sent from the source to the VM guest, and the VM eventually closes the session by sending [FIN, ACK].
 4. Added the metadata server to the compute node's nova.conf file, and am seeing the following during startup:

        cirros-ds 'net' up at 1.56
        checking http://169.254.169.254/2009-04-04/instance-id
        successful after 1/20 tries: up 1.59. iid=i-00000013
        failed to get http://169.254.169.254/2009-04-04/user-data
        warning: no ec2 metadata for user-data
        found datasource (ec2, net)
        cirros-apply-net already run per instance
        check-version already run per instance
        Starting dropbear sshd: OK

 5. The security group settings seem OK, as a connection to port 22 is possible and I can ping the VMs from the public network. I am also able to ssh between the 2 cirros VMs running on the same private subnet.
 6. tcpdump on the router's namespace shows the following:
root@node1:~/.ssh# ip netns exec qrouter-b2282f17-7685-4b58-a498-3330b4c21e87 tcpdump -v -i qr-93b34601-f8 
tcpdump: listening on qr-93b34601-f8, link-type EN10MB (Ethernet), capture size 65535 bytes
21:46:07.665408 IP (tos 0x0, ttl 63, id 35372, offset 0, flags [DF], proto TCP (6), length 60)
    10.0.10.100.45917 > 192.168.1.11.ssh: Flags [S], cksum 0x6dfb (correct), seq 2817118475, win 29200, options [mss 1460,sackOK,TS val 172841767 ecr 0,nop,wscale 7], length 0
21:46:07.666632 IP (tos 0x10, ttl 64, id 0, offset 0, flags [DF], proto TCP (6), length 60)
    192.168.1.11.ssh > 10.0.10.100.45917: Flags [S.], cksum 0xa334 (correct), seq 230720134, ack 2817118476, win 14480, options [mss 1460,sackOK,TS val 2126803 ecr 172841767,nop,wscale 3], length 0
21:46:07.672105 IP (tos 0x0, ttl 63, id 35373, offset 0, flags [DF], proto TCP (6), length 52)
    10.0.10.100.45917 > 192.168.1.11.ssh: Flags [.], cksum 0x09a7 (correct), ack 1, win 229, options [nop,nop,TS val 172841768 ecr 2126803], length 0
21:46:07.677115 IP (tos 0x0, ttl 63, id 35374, offset 0, flags [DF], proto TCP (6), length 93)
    10.0.10.100.45917 > 192.168.1.11.ssh: Flags [P.], cksum 0xe44f (correct), seq 1:42, ack 1, win 229, options [nop,nop,TS val 172841768 ecr 2126803], length 41
21:46:07.677481 IP (tos 0x10, ttl 64, id 61778, offset 0, flags [DF], proto TCP (6), length 78)
    192.168.1.11.ssh ...
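
For anyone working through a capture like the one above, here is a minimal Python sketch that pulls the IP total length and the DF flag out of the per-packet header lines printed by tcpdump -v. With DF set, a packet larger than some hop's MTU cannot be fragmented, so listing the large DF packets is a quick way to see whether the stall coincides with the first big packets of the SSH key exchange. The helper names (parse_header, large_df_packets) and the threshold are hypothetical, and the regex is written only against the line format shown in this capture.

```python
import re

# Matches a `tcpdump -v` IPv4 header line such as:
# 21:46:07.665408 IP (tos 0x0, ttl 63, id 35372, offset 0, flags [DF], proto TCP (6), length 60)
HEADER_RE = re.compile(
    r"^(?P<ts>\d{2}:\d{2}:\d{2}\.\d+) IP \(.*?flags \[(?P<flags>[^\]]*)\],"
    r" proto (?P<proto>\w+) \(\d+\), length (?P<length>\d+)\)"
)


def parse_header(line):
    """Return timestamp, DF flag, protocol and IP length, or None if the line is not a header."""
    match = HEADER_RE.match(line.strip())
    if match is None:
        return None
    flags = [flag.strip() for flag in match["flags"].split(",")]
    return {
        "ts": match["ts"],
        "df": "DF" in flags,
        "proto": match["proto"],
        "length": int(match["length"]),
    }


def large_df_packets(lines, threshold):
    """Return parsed headers that have DF set and an IP length >= threshold bytes."""
    parsed = (parse_header(line) for line in lines)
    return [p for p in parsed if p and p["df"] and p["length"] >= threshold]


if __name__ == "__main__":
    sample = [
        "21:46:07.665408 IP (tos 0x0, ttl 63, id 35372, offset 0, flags [DF], proto TCP (6), length 60)",
        "21:46:07.677115 IP (tos 0x0, ttl 63, id 35374, offset 0, flags [DF], proto TCP (6), length 93)",
    ]
    for packet in large_df_packets(sample, threshold=90):
        print(packet)
```

The handshake packets in the excerpt are all small (60 to 93 bytes), which fits the observation that the TCP connection opens fine and only the later, larger packets go missing.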

Comments

For the private key, set the permissions to 400 and regenerate the key.

subha ( 2014-06-21 00:14:20 -0500 )

I take it that by "regenerate" you mean running ssh-add keyname after chmod 400 keyname? If so, it made no difference to the outcome: ssh is still stuck, waiting to receive the fingerprint from the VM.

Nastooh ( 2014-06-23 11:04:36 -0500 )

3 answers


answered 2014-06-24 16:49:37 -0500

Nastooh

In our multi-node environment, the problem turned out to be packet fragmentation. The workaround is to increase the MTU to 1700 on the management NICs of both the compute and neutron nodes, e.g., ifconfig ethxxx mtu 1700
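
To check whether an MTU mismatch is really the cause before or after applying the workaround above, one option is to ping with the Don't Fragment bit set and a payload sized to exactly fill a given MTU. The sketch below only does the arithmetic and builds the command line (Linux iputils ping, flags -M do, -c, -s); it does not run anything. The 50-byte tunnel overhead in the example is an assumption for illustration (roughly what VXLAN over IPv4 adds); the thread does not say which encapsulation, if any, was in use, and the function names are hypothetical.

```python
IPV4_HEADER_BYTES = 20  # IPv4 header without options
ICMP_HEADER_BYTES = 8   # ICMP echo header


def icmp_payload_for_mtu(mtu):
    """Largest ICMP echo payload that fits in one unfragmented IPv4 packet of size `mtu`."""
    overhead = IPV4_HEADER_BYTES + ICMP_HEADER_BYTES
    if mtu <= overhead:
        raise ValueError("MTU too small to carry an ICMP echo")
    return mtu - overhead


def df_ping_command(host, mtu, count=1):
    """Build a Linux ping command that sends DF-flagged echoes filling `mtu` bytes."""
    return ["ping", "-M", "do", "-c", str(count),
            "-s", str(icmp_payload_for_mtu(mtu)), host]


def required_underlay_mtu(instance_mtu, tunnel_overhead):
    """MTU the physical NIC needs so encapsulated instance packets are not fragmented."""
    return instance_mtu + tunnel_overhead


if __name__ == "__main__":
    # Assumed overhead of 50 bytes; adjust to the encapsulation actually deployed.
    print(required_underlay_mtu(1500, 50))
    print(" ".join(df_ping_command("192.168.1.11", 1500)))
```

If the DF ping with a 1472-byte payload fails while smaller payloads succeed, some hop on the path has an effective MTU below 1500, which would be consistent with the hang during the SSH key exchange.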


answered 2016-05-31 05:41:22 -0500

I had the same issue, and changing the MTU size fixed it, i.e. ifconfig <interface> mtu <mtu_size>
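
Before changing anything, it can help to record what each interface is currently set to. A small sketch, assuming a Linux host where the kernel exposes the value at /sys/class/net/<interface>/mtu; the function names are hypothetical and the sysfs_root parameter exists only so the code can be pointed at a different directory.

```python
from pathlib import Path


def read_mtu(interface, sysfs_root="/sys/class/net"):
    """Return the current MTU of `interface` as reported by sysfs."""
    return int((Path(sysfs_root) / interface / "mtu").read_text().strip())


def interfaces_below(mtu, sysfs_root="/sys/class/net"):
    """Map interface name to MTU for every interface whose MTU is below `mtu`."""
    result = {}
    for entry in sorted(Path(sysfs_root).iterdir()):
        if (entry / "mtu").is_file():
            value = read_mtu(entry.name, sysfs_root)
            if value < mtu:
                result[entry.name] = value
    return result


if __name__ == "__main__":
    for name, value in interfaces_below(1700).items():
        print(f"{name}: {value}")
```

Running this on the compute and network nodes gives a list of interfaces to compare against the MTU you intend to set.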


answered 2014-06-25 00:54:57 -0500

sdimber

I had a similar issue. I recreated the bridge and the network using nova net create. Also, make sure you use the net-id when you create the instances. It works for me now.


Stats

Asked: 2014-06-20 17:10:09 -0500

Seen: 2,198 times

Last updated: May 31 '16