Vanilla Plugin with default configuration throws "Connection Refused"

asked 2013-12-18 04:59:14 -0600

Hi,

My setup is:

Ubuntu 12.04, OpenStack Havana with the Vanilla plugin

I have deployed a cluster with the following node groups:

1 x master:

- Uses 1 Cinder volume: 2TB

- Processes: namenode, secondarynamenode, oozie, datanode, jobtracker, tasktracker

2 x slaves:

- Uses 1 Cinder volume: 2TB

- Processes: datanode, tasktracker

Both node groups used the following flavor:

VCPUs: 32 RAM: 250000 Root disk: 300GB Ephemeral: 300GB Swap: 0

They also use the default Ubuntu Hadoop Vanilla image, downloadable from https://savanna.readthedocs.org/en/latest/userdoc/vanilla_plugin.html

The /etc/hosts file on all nodes is:

127.0.0.1 localhost
10.0.0.2 test-master2T-001.novalocal test-master2T-001
10.0.0.3 test-slave2T-001.novalocal test-slave2T-001
10.0.0.4 test-slave2T-002.novalocal test-slave2T-002
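Since the hosts entries matter for how the nodes find each other, a quick consistency check can rule out a mismatch between nodes. The following is only a sketch: `parse_hosts` is a hypothetical helper (not part of Hadoop or Savanna), and `HOSTS_TEXT` simply repeats the entries listed in this question. It does not diagnose the failure by itself.

```python
# Entries copied from the question; on a real node, read /etc/hosts instead.
HOSTS_TEXT = """\
127.0.0.1 localhost
10.0.0.2 test-master2T-001.novalocal test-master2T-001
10.0.0.3 test-slave2T-001.novalocal test-slave2T-001
10.0.0.4 test-slave2T-002.novalocal test-slave2T-002
"""


def parse_hosts(text):
    """Map every host name (and alias) in hosts-file text to its IP address."""
    mapping = {}
    for line in text.splitlines():
        # Drop comments and surrounding whitespace.
        line = line.split("#", 1)[0].strip()
        if not line:
            continue
        ip, *names = line.split()
        for name in names:
            mapping[name] = ip
    return mapping


hosts = parse_hosts(HOSTS_TEXT)
```

Running the same parse on each node and comparing the resulting dictionaries would show whether any node maps a cluster name to a different address.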

Without changing any of the default configuration, the cluster boots correctly.

The problem is that when running a job (for example, teragen with 100GB), many map tasks fail and have to be retried, which increases the job time. The failures appear random: they occur on one slave or the other, varying from run to run.

Checking the datanode logs on the slaves, I can see this error:

WARN org.apache.hadoop.hdfs.server.datanode.DataNode: java.net.ConnectException: Call to test-master2T-001/10.0.0.2:8020 failed on connection exception: java.net.ConnectException: Connection refused

Full error: http://pastebin.com/DDp39yqt
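To see whether the namenode port is reachable at the moment the error occurs, a plain TCP probe from a slave can help. This is a minimal sketch, not a Hadoop tool: `can_connect` is a hypothetical helper, and the host name and port 8020 are taken from the log line above.

```python
import socket


def can_connect(host, port, timeout=3.0):
    """Return True if a TCP connection to (host, port) succeeds within timeout."""
    try:
        # create_connection resolves the name and performs the TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, timeouts, and name resolution failures.
        return False
```

Calling `can_connect("test-master2T-001", 8020)` repeatedly on each slave while a job runs would show whether the refusals are intermittent or constant. A `False` result only says the connection failed, not why.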

The datanode log on the master shows this error:

WARN org.apache.hadoop.hdfs.server.datanode.DataNode: checkDiskError: exception: java.net.SocketException: Original Exception : java.io.IOException: Connection reset by peer

Full error: http://pastebin.com/NXYXELQX

I have tried changing hadoop.tmp.dir to point to the 2TB cinder volume /volumes/disk1/lib/hadoop/hdfs/tmp, but nothing changed.

Thank you in advance.


1 answer

answered 2013-12-20 14:35:41 -0600

dmitrymex

The question was answered on the mailing list: http://lists.openstack.org/pipermail/openstack/2013-December/004039.html

Stats

Asked: 2013-12-18 04:59:14 -0600

Seen: 290 times

Last updated: Dec 20 '13