
Using DB indexes instead of checking for duplicate entities by hand.

asked 2012-11-01 14:31:01 -0500

boris-42

It would be good to use DB indexes instead of checking for duplicate entities by hand in the sqlalchemy DB API.

For example, this code in nova/db/sqlalchemy/api.py:

    @require_admin_context
    def instance_type_create(context, values):
        """Create a new instance type. In order to pass in extra specs,
        the values dict should contain a 'extra_specs' key/value pair:

        {'extra_specs' : {'k1': 'v1', 'k2': 'v2', ...}}

        """
        session = get_session()
        with session.begin():
            try:
                instance_type_get_by_name(context, values['name'], session)
                raise exception.InstanceTypeExists(name=values['name'])
            except exception.InstanceTypeNotFoundByName:
                pass
            try:
                instance_type_get_by_flavor_id(context, values['flavorid'],
                                               session)
                raise exception.InstanceTypeExists(name=values['name'])
            except exception.FlavorNotFound:
                pass
            try:
                specs = values.get('extra_specs')
                specs_refs = []
                if specs:
                    for k, v in specs.iteritems():
                        specs_ref = models.InstanceTypeExtraSpecs()
                        specs_ref['key'] = k
                        specs_ref['value'] = v
                        specs_refs.append(specs_ref)
                values['extra_specs'] = specs_refs

                instance_type_ref = models.InstanceTypes()
                instance_type_ref.update(values)
                instance_type_ref.save(session=session)
            except Exception, e:
                raise exception.DBError(e)
            return _dict_with_extra_specs(instance_type_ref)

So before we can create a new instance_type in the DB, we must first check that an instance type with the same name or flavorid doesn't already exist. As a result we make 3 requests to the DB instead of 1, which is not a good approach.
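With a unique index, the database itself rejects the duplicate during the INSERT, so the preliminary lookups disappear. A minimal sketch of the idea, using the stdlib sqlite3 module instead of nova's models purely for illustration (the table and function here are simplified stand-ins, not the real nova code):

```python
import sqlite3

# Unique indexes on name and flavorid let the single INSERT detect
# duplicates, replacing the two hand-made SELECT checks above.
conn = sqlite3.connect(":memory:")
conn.execute("""CREATE TABLE instance_types (
                    id INTEGER PRIMARY KEY,
                    name TEXT UNIQUE,
                    flavorid TEXT UNIQUE)""")

def instance_type_create(name, flavorid):
    """One round trip: the index raises IntegrityError on a duplicate."""
    try:
        conn.execute("INSERT INTO instance_types (name, flavorid) "
                     "VALUES (?, ?)", (name, flavorid))
        return True
    except sqlite3.IntegrityError:
        return False  # same name or flavorid already exists

created = instance_type_create("m1.tiny", "1")
duplicate = instance_type_create("m1.tiny", "2")  # rejected by the index
```

Here IntegrityError plays the role of exception.InstanceTypeExists: one statement performs both the existence check and the insert, atomically.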

Also, other entities have both create and update methods in the API, so we have to repeat such checks in both methods, and it is easy to make a mistake. For example, I've found this code in the sqlalchemy DB API:

    @require_admin_context
    def sm_backend_conf_create(context, values):
        session = get_session()
        with session.begin():
            config_params = values['config_params']
            backend_conf = model_query(context, models.SMBackendConf,
                                       session=session,
                                       read_deleted="yes").\
                               filter_by(config_params=config_params).\
                               first()

            if backend_conf:
                raise exception.Duplicate(_('Backend exists'))
            else:
                backend_conf = models.SMBackendConf()
                backend_conf.update(values)
                backend_conf.save(session=session)
        return backend_conf

    @require_admin_context
    def sm_backend_conf_update(context, sm_backend_id, values):
        session = get_session()
        with session.begin():
            backend_conf = model_query(context, models.SMBackendConf,
                                       session=session,
                                       read_deleted="yes").\
                               filter_by(id=sm_backend_id).\
                               first()

            if not backend_conf:
                raise exception.NotFound(
                    _("No backend config with id %(sm_backend_id)s") % locals())

            backend_conf.update(values)
            backend_conf.save(session=session)
        return backend_conf

In sm_backend_conf_update there is no code that checks that the update will not produce duplicate entities in the DB. This is an obvious mistake.
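A DB-level unique index would cover the update path for free, since the same constraint fires on UPDATE as well as on INSERT. A hedged sketch of this with stdlib sqlite3 (the table shape is a simplified stand-in for models.SMBackendConf):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""CREATE TABLE sm_backend_config (
                    id INTEGER PRIMARY KEY,
                    config_params TEXT UNIQUE)""")
conn.execute("INSERT INTO sm_backend_config (config_params) VALUES ('a')")
conn.execute("INSERT INTO sm_backend_config (config_params) VALUES ('b')")

# An UPDATE that would create a duplicate is rejected by the same index,
# so no hand-made check needs to be repeated in the update method.
try:
    conn.execute("UPDATE sm_backend_config SET config_params = 'a' "
                 "WHERE id = 2")
    update_rejected = False
except sqlite3.IntegrityError:
    update_rejected = True
```

With the constraint in the schema, forgetting the check in one of the two methods can no longer silently corrupt the data.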

It is not so easy to use indexes here, because entities are not physically deleted from the DB (they are only marked as "deleted").

The best way that I've found is to use composite unique indexes like this one: (column1, column2, ..., columnN, deleted_at). When an entity is deleted, its deleted_at column is set to some date, so we can create a new entity with the same values of (column1, column2, ..., columnN): since the deleted_at column of the new entity is NULL by default, the unique constraint is satisfied. And if a not-deleted entity with the same parameters already exists, we cannot create another one, because deleted_at in both entities would be NULL.

Unfortunately it doesn't work, because in MySQL NULL is not equal to NULL. To fix this issue we can set the default value of the deleted_at column to, for example, '1970-1-1'. This will not break existing code, because the deleted_at column isn't ...
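Both behaviours can be demonstrated with stdlib sqlite3, which treats NULLs in a unique index the same way MySQL does (every NULL counts as distinct); the table names here are made up for illustration:

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Problem: with deleted_at defaulting to NULL, two live rows with the
# same name both pass the composite unique index, because NULL != NULL.
conn.execute("""CREATE TABLE t_null (name TEXT, deleted_at TEXT,
                                     UNIQUE (name, deleted_at))""")
conn.execute("INSERT INTO t_null (name) VALUES ('m1.tiny')")
conn.execute("INSERT INTO t_null (name) VALUES ('m1.tiny')")  # not rejected!

# Fix: default deleted_at to a sentinel date instead of NULL.
conn.execute("""CREATE TABLE t_fixed (name TEXT,
                                      deleted_at TEXT DEFAULT '1970-1-1',
                                      UNIQUE (name, deleted_at))""")
conn.execute("INSERT INTO t_fixed (name) VALUES ('m1.tiny')")
try:
    conn.execute("INSERT INTO t_fixed (name) VALUES ('m1.tiny')")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True  # live duplicate is now caught

# Soft-deleting the old row frees the name for reuse.
conn.execute("UPDATE t_fixed SET deleted_at = '2012-11-01' "
             "WHERE name = 'm1.tiny'")
conn.execute("INSERT INTO t_fixed (name) VALUES ('m1.tiny')")
```

So with the sentinel default, the index rejects a second live row but still allows re-creating an entity after its predecessor is soft-deleted.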


3 answers


answered 2012-11-08 15:14:13 -0500

dims-v

Boris, please help us by submitting changes; here's the usual process for contributions: http://wiki.openstack.org/GerritWorkflow


answered 2012-11-08 15:52:18 -0500

boris-42

Thanks for the reply.

I'm a newbie in OpenStack, and I'm wondering why indexes were not used in the case I described (for checking duplicates). Maybe there are some reasons for that?

I will submit an example soon.


answered 2012-11-15 09:43:50 -0500

boris-42

I've submitted an example for review: https://review.openstack.org/#/c/16163/


Stats

Seen: 80 times

Last updated: Nov 15 '12