A Python Library-Call Problem Caused by the SQLite Version

Published 2019-06-04


Problem

Two test cases failed when I ran the OpenStack functional tests.

  • nova.tests.functional.db.test_resource_provider.ResourceClassTestCase.test_create_duplicate_id_retry
  • nova.tests.functional.db.test_resource_provider.ResourceClassTestCase.test_create_duplicate_id_retry_failing

Debugging led to the following code:

# /opt/stack/queens/nova/nova/objects/resource_provider.py

    def create(self):
        if 'id' in self:
            raise exception.ObjectActionError(action='create',
                                              reason='already created')
        if 'name' not in self:
            raise exception.ObjectActionError(action='create',
                                              reason='name is required')
        if self.name in fields.ResourceClass.STANDARD:
            raise exception.ResourceClassExists(resource_class=self.name)

        if not self.name.startswith(fields.ResourceClass.CUSTOM_NAMESPACE):
            raise exception.ObjectActionError(
                action='create',
                reason='name must start with ' +
                        fields.ResourceClass.CUSTOM_NAMESPACE)

        updates = self.obj_get_changes()
        # There is the possibility of a race when adding resource classes, as
        # the ID is generated locally. This loop catches that exception, and
        # retries until either it succeeds, or a different exception is
        # encountered.
        retries = self.RESOURCE_CREATE_RETRY_COUNT
        while retries:
            retries -= 1
            try:
                rc = self._create_in_db(self._context, updates)
                self._from_db_object(self._context, self, rc)
                break
            except db_exc.DBDuplicateEntry as e:
                # NOTE: e.columns is empty here, so the check below is skipped
                # and the exception after it is raised directly
                if 'id' in e.columns:
                    # Race condition for ID creation; try again
                    continue
                # The duplication is on the other unique column, 'name'. So do
                # not retry; raise the exception immediately.
                raise exception.ResourceClassExists(resource_class=self.name)
        else:
            # We have no idea how common it will be in practice for the retry
            # limit to be exceeded. We set it high in the hope that we never
            # hit this point, but added this log message so we know that this
            # specific situation occurred.
            LOG.warning("Exceeded retry limit on ID generation while "
                        "creating ResourceClass %(name)s",
                        {'name': self.name})
            msg = _("creating resource class %s") % self.name
            raise exception.MaxDBRetriesExceeded(action=msg)

Next, look at the implementation of db_exc.DBDuplicateEntry:

    # /usr/lib/python2.7/site-packages/oslo_db/exception.py
    class DBDuplicateEntry(DBError):
        """Duplicate entry at unique column error.
    
        Raised when made an attempt to write to a unique column the same entry as
        existing one. :attr: `columns` available on an instance of the exception
        and could be used at error handling::
    
           try:
               instance_type_ref.save()
           except DBDuplicateEntry as e:
               if 'colname' in e.columns:
                   # Handle error.
    
        :kwarg columns: a list of unique columns have been attempted to write a
            duplicate entry.
        :type columns: list
        :kwarg value: a value which has been attempted to write. The value will
            be None, if we can't extract it for a particular database backend. Only
            MySQL and PostgreSQL 9.x are supported right now.
        """
        def __init__(self, columns=None, inner_exception=None, value=None):
            # Normally, raising DBDuplicateEntry carries the conflicting
            # columns, so the caller can easily decide what to do next
            self.columns = columns or []
            self.value = value
            super(DBDuplicateEntry, self).__init__(inner_exception)
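
To see why an empty columns list makes the retry tests fail, the loop in create() can be reduced to a small sketch. The exception classes and the create_with_retry and make_insert helpers below are hypothetical stand-ins written for illustration; they are not the Nova or oslo.db implementations.

```python
class DBDuplicateEntry(Exception):
    """Stand-in for oslo_db.exception.DBDuplicateEntry (illustration only)."""

    def __init__(self, columns=None):
        super(DBDuplicateEntry, self).__init__("duplicate entry")
        self.columns = columns or []


class ResourceClassExists(Exception):
    """Stand-in for nova.exception.ResourceClassExists."""


class MaxDBRetriesExceeded(Exception):
    """Stand-in for nova.exception.MaxDBRetriesExceeded."""


def create_with_retry(insert, retries=3):
    """Mimic the retry loop of ResourceClass.create() shown above."""
    while retries:
        retries -= 1
        try:
            return insert()
        except DBDuplicateEntry as e:
            if 'id' in e.columns:
                # Conflict on the locally generated ID: try again
                continue
            # Conflict reported on anything else (or on nothing): give up
            raise ResourceClassExists()
    raise MaxDBRetriesExceeded()


def make_insert(columns, fail_times):
    """Return a fake insert that raises DBDuplicateEntry `fail_times` times."""
    state = {'calls': 0}

    def insert():
        state['calls'] += 1
        if state['calls'] <= fail_times:
            raise DBDuplicateEntry(columns)
        return state['calls']

    return insert
```

With columns=['id'] the loop retries as the tests expect. With columns=[] the very first conflict is turned into ResourceClassExists, which matches the behavior observed in the failing test cases.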
    

Locate where the conflicting columns are generated:

    # /opt/stack/queens/nova/.tox/functional/lib/python2.7/site-packages/oslo_db/sqlalchemy/exc_filters.py
    
    @filters("sqlite", sqla_exc.IntegrityError,
             (r"^.*columns?(?P<columns>[^)]+)(is|are)\s+not\s+unique$",
              r"^.*UNIQUE\s+constraint\s+failed:\s+(?P<columns>.+)$",
              r"^.*PRIMARY\s+KEY\s+must\s+be\s+unique.*$"))
    def _sqlite_dupe_key_error(integrity_error, match, engine_name, is_disconnect):
        """Filter for SQLite duplicate key error.
    
        note(boris-42): In current versions of DB backends unique constraint
        violation messages follow the structure:
    
        sqlite:
        1 column - (IntegrityError) column c1 is not unique
        N columns - (IntegrityError) column c1, c2, ..., N are not unique
    
        sqlite since 3.7.16:
        1 column - (IntegrityError) UNIQUE constraint failed: tbl.k1
        N columns - (IntegrityError) UNIQUE constraint failed: tbl.k1, tbl.k2
    
        sqlite since 3.8.2:
        (IntegrityError) PRIMARY KEY must be unique
    
        """
        columns = []
        # NOTE(ochuprykov): We can get here by last filter in which there are no
        #                   groups. Trying to access the substring that matched by
        #                   the group will lead to IndexError. In this case just
        #                   pass empty list to exception.DBDuplicateEntry
        try:
            columns = match.group('columns')
            columns = [c.split('.')[-1] for c in columns.strip().split(", ")]
        except IndexError:
            pass
    
        raise exception.DBDuplicateEntry(columns, integrity_error)
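
The behavior of the three patterns can be checked in isolation. The sketch below copies the regular expressions from the filter above and applies them with plain re.match; extract_columns is a hypothetical helper, and the real oslo.db dispatch logic is more involved than this.

```python
import re

# Patterns copied from the @filters decorator of _sqlite_dupe_key_error
SQLITE_DUPE_PATTERNS = (
    r"^.*columns?(?P<columns>[^)]+)(is|are)\s+not\s+unique$",
    r"^.*UNIQUE\s+constraint\s+failed:\s+(?P<columns>.+)$",
    r"^.*PRIMARY\s+KEY\s+must\s+be\s+unique.*$",
)


def extract_columns(message):
    """Return the column names found in `message`, or None if nothing matches."""
    for pattern in SQLITE_DUPE_PATTERNS:
        match = re.match(pattern, message)
        if match is None:
            continue
        try:
            columns = match.group('columns')
        except IndexError:
            # The last pattern has no 'columns' group
            return []
        return [c.split('.')[-1] for c in columns.strip().split(", ")]
    return None


old_style = "(sqlite3.IntegrityError) PRIMARY KEY must be unique"
new_style = ("(sqlite3.IntegrityError) UNIQUE constraint failed: "
             "resource_classes.id")
print(extract_columns(old_style))
print(extract_columns(new_style))
```

Only the message that names the column yields a non-empty list, so only that message lets the caller tell an ID conflict apart from a name conflict.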
    

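The conflicting columns are not populated because the message string returned by the underlying DB engine does not match either of the patterns that capture columns. It only matches the last pattern, which has no columns group. E.g.: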

    2013-05-20 (wrong):   ('(sqlite3.IntegrityError) PRIMARY KEY must be unique',)
    2019-04-16 (correct): ('(sqlite3.IntegrityError) UNIQUE constraint failed: resource_classes.id',)
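
To find out which of the two messages a given environment produces, a duplicate primary key can be provoked against an in-memory database. This is a minimal sketch using only the standard sqlite3 module; the table layout is a simplified assumption and not the real Nova schema.

```python
import sqlite3


def duplicate_key_message():
    """Insert the same primary key twice and return the error text."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute("CREATE TABLE resource_classes "
                     "(id INTEGER PRIMARY KEY, name TEXT UNIQUE)")
        conn.execute("INSERT INTO resource_classes (id, name) "
                     "VALUES (1, 'CUSTOM_A')")
        try:
            conn.execute("INSERT INTO resource_classes (id, name) "
                         "VALUES (1, 'CUSTOM_B')")
        except sqlite3.IntegrityError as exc:
            return str(exc)
    finally:
        conn.close()
    return None


# sqlite3.sqlite_version is the version of the SQLite library loaded at run time
print("SQLite library: %s" % sqlite3.sqlite_version)
print("Message: %s" % duplicate_key_message())
```

Running this with the same interpreter that runs the tests (here, the one inside the tox virtualenv) shows the message format that oslo.db will actually have to parse.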
    

This is a problem caused by a SQLite3 version mismatch. The Nova project does not explicitly pin the SQLite3 version, however, so the only option is to fix it manually.

Solution

Compile and upgrade SQLite3 manually:

    wget https://www.sqlite.org/2019/sqlite-autoconf-3280000.tar.gz
    tar -xvf sqlite-autoconf-3280000.tar.gz
    cd sqlite-autoconf-3280000
    mkdir /opt/sqlite3
    ./configure --prefix=/opt/sqlite3
    make && make install
    

Upgrading SQLite3 did not fix the problem right away. The key point is how Python calls into a C shared library (.so file), and that is the crux of the fix.

• First, find the location of the SQLite3 Python client (API)
    $ python -c "import sqlite3; print(sqlite3.__file__)"
    /usr/lib64/python2.7/sqlite3/__init__.pyc
    
• Look at the SQLite3 API implementation and find the statement that imports the shared library
    # /usr/lib64/python2.7/sqlite3/dbapi2.py
    
    from _sqlite3 import *
    
• Find the location of the _sqlite3 shared library
    $ python -c 'import _sqlite3; print(_sqlite3)'
    <module '_sqlite3' from '/opt/stack/queens/nova/.tox/functional/lib64/python2.7/lib-dynload/_sqlite3.so'>
    
• List the shared libraries that the _sqlite3 shared library links against
    $ ldd /opt/stack/queens/nova/.tox/functional/lib64/python2.7/lib-dynload/_sqlite3.so
    	linux-vdso.so.1 =>  (0x00007ffc4defb000)
    	libsqlite3.so.0 => /lib64/libsqlite3.so.0 (0x00007f708ba42000)
    	libpython2.7.so.1.0 => /lib64/libpython2.7.so.1.0 (0x00007f708b676000)
    	libpthread.so.0 => /lib64/libpthread.so.0 (0x00007f708b45a000)
    	libc.so.6 => /lib64/libc.so.6 (0x00007f708b08d000)
    	libz.so.1 => /lib64/libz.so.1 (0x00007f708ae77000)
    	libm.so.6 => /lib64/libm.so.6 (0x00007f708ab75000)
    	libdl.so.2 => /lib64/libdl.so.2 (0x00007f708a971000)
    	libutil.so.1 => /lib64/libutil.so.1 (0x00007f708a76e000)
    	/lib64/ld-linux-x86-64.so.2 (0x00007f708bf62000)
    
• Intuitively, libsqlite3.so.0 is the library to look at first
    $ ls -alh /lib64/libsqlite3.so.0
    lrwxrwxrwx. 1 root root 19 May 14 05:13 /lib64/libsqlite3.so.0 -> libsqlite3.so.0.8.6
    
    $ ls -alh /lib64/libsqlite3.so.0.8.6
    -rwxr-xr-x. 1 root root 5.1M Jun  4 05:51 /lib64/libsqlite3.so.0.8.6
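
The lookup steps above can also be done from inside the running interpreter. The sketch below is a Linux-only illustration that reads /proc/self/maps after importing sqlite3; loaded_sqlite_libraries is a hypothetical helper, and it returns an empty list on systems without /proc or on Python builds that link SQLite statically.

```python
import os
import sqlite3  # importing sqlite3 loads _sqlite3 and, with it, libsqlite3


def loaded_sqlite_libraries(maps_path="/proc/self/maps"):
    """Return resolved paths of libsqlite3 files mapped into this process."""
    if not os.path.exists(maps_path):
        return []
    paths = set()
    with open(maps_path) as maps:
        for line in maps:
            # Fields: address perms offset dev inode pathname
            fields = line.strip().split(None, 5)
            if len(fields) < 6:
                continue
            pathname = fields[5]
            if "libsqlite3" in os.path.basename(pathname):
                # Resolve symlinks such as libsqlite3.so.0 -> libsqlite3.so.0.8.6
                paths.add(os.path.realpath(pathname))
    return sorted(paths)


for path in loaded_sqlite_libraries():
    print(path)
```

Unlike ldd, which resolves dependencies for a file on disk, this reports what the current process has actually mapped, so it reflects the environment the tests run in.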
    

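This explains why upgrading SQLite3 did not solve the problem: the shared library that the Python program loads had not been updated. Replacing the old .so file with the newly installed one is therefore enough.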

    mv /usr/lib64/libsqlite3.so.0.8.6 /usr/lib64/libsqlite3.so.0.8.6.bk
    cp /opt/sqlite3/lib/libsqlite3.so.0.8.6 /usr/lib64/libsqlite3.so.0.8.6
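
After the replacement, it is worth confirming that Python now sees the new library. The sketch below compares the run-time SQLite version against 3.8.2, the version this article associates with the new message format; the threshold constant and the uses_new_constraint_messages helper are assumptions made for illustration.

```python
import sqlite3

# Assumption: threshold taken from this article, which attributes the
# "$TYPE constraint failed: $DETAIL" message format to version 3.8.2
NEW_MESSAGE_FORMAT_SINCE = (3, 8, 2)


def uses_new_constraint_messages(version_info=None):
    """Return True if the given (or run-time) SQLite version is new enough."""
    if version_info is None:
        # Version of the SQLite library loaded by this interpreter
        version_info = sqlite3.sqlite_version_info
    return tuple(version_info) >= NEW_MESSAGE_FORMAT_SINCE


print("SQLite library in use: %s" % sqlite3.sqlite_version)
print("New message format expected: %s" % uses_new_constraint_messages())
```

A new interpreter process has to be started for the check, because a process that was already running keeps the old library mapped.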
    

Closing

Here is the SQLite3 commit that changed the message format:

    This issue is related to the following commit, which was introduced in version 3.8.2
    
    ...
    commit eb743f01b125bebd8736ceb2873b69f27721b0ae
    Author: D. Richard Hipp <drh@hwaci.com>
    Date:   Tue Nov 5 13:33:55 2013 +0000
    
        Standardize the error messages generated by constraint failures to a format
        of "$TYPE constraint failed: $DETAIL".  This involves many changes to the
        expected output of test cases.
    ...
    

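The main takeaway is how a Python program and a C program are tied together. When the two communicate through a shared library file rather than over TCP, we need to pay attention to how the C program's files are installed on Linux. Upgrading the C program alone does not immediately take effect in the Python program; the bridge between them (the shared library that gets loaded) must be upgraded as well.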
