
Swift containers disappear in less than 1 minute

asked 2014-12-18 07:21:43 -0500 by T u l

Hi,

I had an RDO Icehouse installation of OpenStack on Fedora 20. It had no Swift installed, so I was trying to install it. The installation was done ~6-8 months ago, and since then some additional configuration changes had been made, so running packstack again was a bit risky.

Mainly following that document, I succeeded in getting Swift running.

But the problem is that when I create a container (from the dashboard, for example), it disappears within 5-30 seconds.

    $ swift stat
       Account: AUTH_df715cfea8e240e3be22ba7bd56d148a
    Containers: 1
       Objects: 0
         Bytes: 0
 Accept-Ranges: bytes
   X-Timestamp: 1418907580.14515
    X-Trans-Id: tx811daed7a0d846d8b7ad7-005492cfcb
  Content-Type: text/plain; charset=utf-8

    $ swift stat
       Account: AUTH_df715cfea8e240e3be22ba7bd56d148a
    Containers: 0
       Objects: 0
         Bytes: 0
X-Put-Timestamp: 1418907615.35912
   X-Timestamp: 1418907615.35912
    X-Trans-Id: tx62147245b42340419681f-005492cfdf
  Content-Type: text/plain; charset=utf-8
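It is easy to watch this happen with a small loop like the following (a sketch; it assumes the usual OS_* credentials are exported for python-swiftclient, and "testcontainer" is just an example name):

    # Create a container, then poll the account for a minute; assumes
    # OS_AUTH_URL, OS_USERNAME, OS_PASSWORD and OS_TENANT_NAME are exported.
    swift post testcontainer
    for i in $(seq 1 12); do
        date +%T
        swift list        # the new container drops out of this listing
        sleep 5
    done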

Has anybody faced this problem?

I have these rings:

swift-ring-builder account.builder create 18 1 1
swift-ring-builder container.builder create 18 1 1
swift-ring-builder object.builder create 18 1 1

and one device per ring (which I named partition1), with a single replica:

swift-ring-builder account.builder add z1-$swiftstorage:6202/partition1 100
swift-ring-builder container.builder add z1-$swiftstorage:6201/partition1 100
swift-ring-builder object.builder add z1-$swiftstorage:6200/partition1 100
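For context: in "create 18 1 1" the arguments are the partition power, the replica count and min_part_hours. After the add commands, each ring still needs a rebalance, and every Swift service has to see identical ring files; a quick consistency check (assuming the rings live in /etc/swift):

    # Rebalance writes the actual .ring.gz files the servers read.
    swift-ring-builder account.builder rebalance
    swift-ring-builder container.builder rebalance
    swift-ring-builder object.builder rebalance
    # The checksums must match on every proxy and storage node:
    md5sum /etc/swift/*.ring.gz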

It looks like the account-replicator removes things. Here is a piece of the log from tail -f /var/log/messages | grep -i "swift\|accoun\|storage\|objec":

Dec 18 14:15:27 celeste account-replicator: Beginning replication run
Dec 18 14:15:27 celeste account-replicator: Replication run OVER
Dec 18 14:15:27 celeste account-replicator: Attempted to replicate 1 dbs in 0.01274 seconds (78.51059/s)
Dec 18 14:15:27 celeste account-replicator: Removed 0 dbs
Dec 18 14:15:27 celeste account-replicator: 0 successes, 0 failures
Dec 18 14:15:27 celeste account-replicator: no_change:0 ts_repl:0 diff:0 rsync:0 diff_capped:0 hashmatch:0 empty:0
Dec 18 14:15:30 celeste object-replicator: Starting object replication pass.
Dec 18 14:15:30 celeste object-replicator: 3/3 (100.00%) partitions replicated in 0.01s (245.38/sec, 0s remaining)
Dec 18 14:15:30 celeste object-replicator: 3 suffixes checked - 0.00% hashed, 0.00% synced
Dec 18 14:15:30 celeste object-replicator: Partition times: max 0.0009s, min 0.0005s, med 0.0006s
Dec 18 14:15:30 celeste object-replicator: Object replication complete. (0.00 minutes)
Dec 18 14:15:33 celeste object-replicator: Starting object replication pass.
Dec 18 14:15:33 celeste object-replicator: 3/3 (100.00%) partitions replicated in 0.01s (244.64/sec, 0s remaining)
Dec 18 14:15:33 celeste object-replicator: 3 suffixes checked - 0.00% hashed, 0.00% synced
Dec 18 14:15:33 celeste object-replicator: Partition times: max 0.0012s, min 0.0005s, med 0.0005s
Dec 18 14:15:33 celeste object-replicator: Object replication complete. (0.00 minutes)

Comments

I increased the number of replicas for all rings (from 1 to 2); it did not help. I found a couple of similar issues, but never with a reply, except here: http://toster.ru/q/108791 (in Russian), where the author says that updating the distribution on the proxy node helped. I updated my Fedora 20; no changes...

T u l (2014-12-18 16:13:51 -0500)

Can anyone suggest where else I could ask this question?

T u l (2014-12-19 06:33:30 -0500)

1 answer


answered 2014-12-20 08:02:45 -0500 by T u l (updated 2014-12-20 08:29:25 -0500)

OK, it seems that it is fixed now.

Possible reason: in parallel with the regular services (openstack-swift-...), I was running swift-init (which starts the same processes), so the configuration changes I made were not taken into account.
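A quick way to spot such duplicates is to compare what is actually running against what the init scripts think they started (a sketch; service names may vary per install):

    # List every running swift daemon with its parent PID; daemons launched
    # by swift-init will not match any openstack-swift-* service.
    ps -eo pid,ppid,cmd | grep '[s]wift'
    service openstack-swift-account-replicator status

With the duplicates identified, here is the full procedure that fixed it for me: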

  1. Stop everything started with swift-init: swift-init all kill. (Note: I tried doing all the following steps without this one first; it did not work, so this step was crucial.)
  2. Stop all services (I run everything on one node):

    for service in openstack-swift-object openstack-swift-object-replicator openstack-swift-object-updater openstack-swift-object-auditor openstack-swift-container openstack-swift-container-replicator openstack-swift-container-updater openstack-swift-container-auditor openstack-swift-account openstack-swift-account-replicator openstack-swift-account-reaper openstack-swift-account-auditor openstack-swift-proxy; do service $service stop; done

  3. Remove all data files from the storage device (mine was at /srv/node/partition1).

  4. In /etc/swift, remove {account,container,object}{.builder,.ring.gz} (I also removed everything from /etc/swift/backup).
  5. Recreate the rings:

    cd /etc/swift

    swift-ring-builder account.builder create 18 1 1
    swift-ring-builder container.builder create 18 1 1
    swift-ring-builder object.builder create 18 1 1

    swiftstorage=ip-of-your-storage-node

    swift-ring-builder account.builder add z1-$swiftstorage:6202/partition1 100
    swift-ring-builder container.builder add z1-$swiftstorage:6201/partition1 100
    swift-ring-builder object.builder add z1-$swiftstorage:6200/partition1 100

    swift-ring-builder account.builder rebalance
    swift-ring-builder container.builder rebalance
    swift-ring-builder object.builder rebalance

    chown -R swift:swift .

  6. Restart the services:

    for service in openstack-swift-object openstack-swift-object-replicator openstack-swift-object-updater openstack-swift-object-auditor openstack-swift-container openstack-swift-container-replicator openstack-swift-container-updater openstack-swift-container-auditor openstack-swift-account openstack-swift-account-replicator openstack-swift-account-reaper openstack-swift-account-auditor openstack-swift-proxy; do service $service start; done
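A quick sanity check after the restart (a sketch; "testcontainer" is just an example name): a new container should now outlive the old 5-30 second window:

    # Create a container and confirm it survives for more than a minute.
    swift post testcontainer
    sleep 60
    swift stat        # "Containers: 1" should still be reported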

Now everything started working... I hope this helps someone.


Comments

Well, that could do it. My personal guess was that the time was off on one of the nodes, but ring inconsistency could be responsible too.

zaitcev (2014-12-29 21:30:27 -0500)

All components were running on one node... but I had a wrong path for [object] in rsyncd.conf (that, by the way, is where the unexpected /srv/node came from).

T u l (2014-12-30 12:05:09 -0500)
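For reference, the [object] section of /etc/rsyncd.conf has to point at the directory that holds the device directories; a minimal sketch for a layout like the one above (the exact path and lock file are assumptions for this setup):

    [object]
    max connections = 4
    path = /srv/node/
    read only = false
    lock file = /var/run/object.lock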

The rsync process removes my containers. Is there something I don't understand?

flovax (2015-01-28 08:05:35 -0500)

I had the same issue, and I found that if I stop the replicator services for all three (container, object and account), I don't see this issue any longer. Does anyone know what the replicator service is supposed to do? I am running Juno on CentOS 7.

foster (2015-02-24 08:53:43 -0500)

UPDATE: zaitcev was correct. It was the time on the system that was off.

foster (2015-03-01 04:49:15 -0500)
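For anyone else hitting this: Swift resolves conflicts by timestamp, so a skewed or jumping clock can make a freshly created container row look older than an existing delete marker, and the replicators will then remove it. A quick clock check on each node (a sketch; which tool is available depends on the distro):

    # Compare clocks across nodes and confirm NTP synchronization.
    date -u
    ntpq -p             # if ntpd is in use
    chronyc tracking    # if chrony is in use (the CentOS 7 default)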
