Packaging Flask with PyInstaller (Part 2)

Posted on 2017-10-08 | Category: Python

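While using PyInstaller I ran into several problems. After reading the source code I found that they all share a similar cause, which I thought was rather interesting. As mentioned before, PyInstaller essentially analyzes the import statements in your Python files and bundles the modules they pull in, together with library files such as .pyd and .dll. However, it cannot automatically detect code that imports dynamically or loads files from a computed path; in other words, cases where the imported modules or the DLLs in use can only be determined at run time. All three problems below are caused by this kind of dynamic loading, and solving them calls for some of PyInstaller's more advanced features.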
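To make the limitation concrete, here is a minimal sketch of what a purely static scan can and cannot see. It is not PyInstaller's actual analysis code; it only uses the standard `ast` module, and the sample source and the helper name `static_imports` are made up for illustration.

```python
import ast

# Sample source that is only parsed, never executed
SAMPLE = '''
import os
from json import dumps
import importlib

name = "sqlite3"
mod = importlib.import_module(name)   # module name decided at run time
exec("from csv import reader")        # import hidden inside a string
'''

def static_imports(source):
    """Return the module names that appear in literal import statements."""
    found = set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            found.update(alias.name for alias in node.names)
        elif isinstance(node, ast.ImportFrom) and node.module:
            found.add(node.module)
    return found

print(sorted(static_imports(SAMPLE)))  # ['importlib', 'json', 'os']
```

The scan reports `os`, `json` and `importlib`, but neither `sqlite3` nor `csv`, because those names only exist as strings until the program runs.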

Problems

Problem 1: No executor by the name "threadpool" was found

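From the previous post on job stores we know that the default executor in APScheduler is threadpool, and that a scheduler can be configured in three different ways; see the official APScheduler documentation for details. Flask-APScheduler's default configuration style is similarly JSON-like: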

class Config(object):
    JOBS = [
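        # On the 15th of each month at 23:30, delete awr.db history collected more than 32 days ago.
        # One-click report generation needs the current month's data, so history is kept for a month; hence 32 days.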
        {
            'id': 'truncatedb',
            'func': 'dbfunc:truncatedb',
            'args': (32,),
            'trigger': CronTrigger(day=15,hour=23,minute=30),
            'replace_existing': True
        }
    ]
    SCHEDULER_JOBSTORES = {
        'default': SQLAlchemyJobStore(url='sqlite:///' + SQLITE_DB)
    }
    SCHEDULER_EXECUTORS = {
        'default': {'type': 'threadpool', 'max_workers': 20}
    }
    SCHEDULER_JOB_DEFAULTS = {
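        # If a job missed several runs, setting this to True runs it only once
        'coalesce': True,
        # Allowed lateness in seconds: a job that missed its scheduled time by less than this is still run
        'misfire_grace_time': 20,
        # Maximum number of instances of each job that may run concurrently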
        'max_instances': 10
    }
    SCHEDULER_API_ENABLED = True

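The configuration above sets the default executor to threadpool. According to the Flask-APScheduler user guide this is perfectly valid, and it does work when run as a standalone script. But once it is integrated into the Flask web application and packaged with PyInstaller, it fails with the error quoted in the heading: the threadpool executor cannot be found.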

The Flask-APScheduler source contains the following function:

def _load_config(self):
        """
        Load the configuration from the Flask configuration.
        """
        options = dict()

        job_stores = self.app.config.get('SCHEDULER_JOBSTORES')
        if job_stores:
            options['jobstores'] = job_stores

        executors = self.app.config.get('SCHEDULER_EXECUTORS')
        if executors:
            options['executors'] = executors

        job_defaults = self.app.config.get('SCHEDULER_JOB_DEFAULTS')
        if job_defaults:
            options['job_defaults'] = job_defaults

        timezone = self.app.config.get('SCHEDULER_TIMEZONE')
        if timezone:
            options['timezone'] = timezone
        self._scheduler.configure(**options)
        ...

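This function reads the configuration from the app and passes it to _scheduler as a dictionary. _scheduler is a BackgroundScheduler() object, which in turn inherits from BaseScheduler(). Nothing in the BaseScheduler class looks wrong at first glance, so why can't threadpool be found?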
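The likely answer lies in how a string alias such as 'threadpool' is turned into a class. As far as I can tell from the APScheduler 3.x source, aliases are resolved at run time through setuptools entry points, and that package metadata is not necessarily present inside a PyInstaller bundle. The sketch below is a simplified stand-in for that lookup, not APScheduler's code: `FakeThreadPool`, `create_executor` and the two registries are hypothetical names.

```python
class FakeThreadPool:
    """Stand-in for a real executor class (hypothetical)."""
    def __init__(self, max_workers=10):
        self.max_workers = max_workers

def create_executor(spec, plugins):
    """Resolve an executor from an instance or a {'type': alias, ...} dict."""
    if not isinstance(spec, dict):
        return spec                      # already an instance: no lookup needed
    options = dict(spec)
    alias = options.pop('type')
    if alias not in plugins:             # registry normally filled from package metadata
        raise LookupError('No executor by the name "%s" was found' % alias)
    return plugins[alias](**options)

installed = {'threadpool': FakeThreadPool}   # normal install: alias is registered
frozen = {}                                  # bundle where the metadata is missing

print(create_executor({'type': 'threadpool', 'max_workers': 20}, installed).max_workers)
try:
    create_executor({'type': 'threadpool', 'max_workers': 20}, frozen)
except LookupError as exc:
    print(exc)
print(create_executor(FakeThreadPool(20), frozen).max_workers)
```

Under these assumptions the alias lookup fails once the registry is empty, while passing a ready-made instance bypasses the lookup entirely.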

Problem 2: No module named 'reportlab.graphics.barcode.common'

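This is the same kind of problem: running the code as a standalone script works, but after packaging the Flask app into an all-in-one executable with PyInstaller the module cannot be found. At first I assumed reportlab needed some adaptation, so I traced through the reportlab source as well and found nothing wrong with it. The reportlab code where the failure occurs is: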

def _BCW(doc,codeName,attrMap,mod,value,**kwds):
    """factory for Barcode Widgets"""
    _pre_init = kwds.pop('_pre_init','')
    _methods = kwds.pop('_methods','')
    name = 'Barcode'+codeName
    ns = vars().copy()
    code = 'from %s import %s' % (mod,codeName)
    rl_exec(code,ns)
    ns['_BarcodeWidget'] = _BarcodeWidget
    code = '''class %(name)s(_BarcodeWidget,%(codeName)s):
\t_BCC = %(codeName)s
\tcodeName = %(codeName)r
\tdef __init__(self,**kw):%(_pre_init)s
\t\t_BarcodeWidget.__init__(self,%(value)r,**kw)%(_methods)s''' % ns
    rl_exec(code,ns)
    Klass = ns[name]
    if attrMap: Klass._attrMap = attrMap
    if doc: Klass.__doc__ = doc
    for k, v in kwds.items():
        setattr(Klass,k,v)
    return Klass

The code above clearly contains code = 'from %s import %s' % (mod,codeName), which is a dynamic import.
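Reduced to its core, the pattern looks like the sketch below. It uses the standard library's `json` module instead of reportlab so that it runs anywhere, and `load_by_name` is a hypothetical helper, not a reportlab function.

```python
def load_by_name(mod, attr):
    """Import `attr` from `mod`, where both names are only known at run time."""
    ns = {}
    # The import statement is assembled as a string, so no static scan can see it
    exec('from %s import %s' % (mod, attr), ns)
    return ns[attr]

dumps = load_by_name('json', 'dumps')
print(dumps({'a': 1}))  # {"a": 1}
```

This works in a normal interpreter because `json` is on disk. In a bundle, it only works if something else caused the target module to be included.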

Problem 3: dlopen() failed to load a library: cairo / cairo-2

try:
    cairo = ffi.dlopen(os.path.join(os.path.dirname(__file__), 'cairo.dll')) # case1
    #cairo = ffi.dlopen(os.path.join(os.path.dirname(os.path.abspath(__file__)), 'cairo.dll')) # case2
    #cairo = ffi.dlopen('cairo.dll') # case3
except Exception:
    cairo = dlopen(ffi, 'cairo', 'cairo-2')

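The first statement above (case1) is from the cairocffi source. It uses os.path.join(os.path.dirname(__file__), 'cairo.dll'), that is, the cairo.dll file located in the module's own directory. This path, too, can only be determined while the script is running, so the packaged program raises an exception at run time.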

Now for the most important part: how to solve the three problems above.

Solutions

PyInstaller's hiddenimports

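As the name suggests, a hidden import tells PyInstaller to include a specified module regardless of what the code analysis finds. For problem 2, hiddenimports can be used directly to pull in the modules PyInstaller cannot discover on its own: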

hiddenimports = [
'reportlab.graphics.barcode.common',
'reportlab.graphics.barcode.code128',
'reportlab.graphics.barcode.code93',
'reportlab.graphics.barcode.code39',
'reportlab.graphics.barcode.usps',
'reportlab.graphics.barcode.usps4s',
'reportlab.graphics.barcode.ecc200datamatrix'
]

With this, PyInstaller also bundles the modules listed above when packaging, which is quite convenient.
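Maintaining such a list by hand can get tedious. PyInstaller ships a helper for this purpose, `PyInstaller.utils.hooks.collect_submodules`; the sketch below shows the same idea using only the standard library. `list_submodules` is a hypothetical helper, and it is demonstrated on the `json` package so that it runs without reportlab installed. For the case above one would presumably pass 'reportlab.graphics.barcode', assuming the package is importable in the build environment.

```python
import importlib
import pkgutil

def list_submodules(package_name):
    """Return the dotted names of the direct submodules of a package."""
    package = importlib.import_module(package_name)
    prefix = package.__name__ + '.'
    return sorted(info.name for info in pkgutil.iter_modules(package.__path__, prefix))

# Demonstrated on the standard library's json package
hiddenimports = list_submodules('json')
print(hiddenimports)
```

The resulting list can be passed to Analysis(..., hiddenimports=hiddenimports) in the spec file in place of a hand-written one.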

Bundling binary files with PyInstaller

Problem 3 is that PyInstaller cannot find the DLL file. In this case it can be specified with binaries:

a = Analysis(['xxx.py'],
             pathex=['D:\\xxx'],
             hiddenimports=hiddenimports,
             binaries=[('.\\cairo.dll','.')],
             datas=added_files,
             hookspath=[],
             runtime_hooks=[],
             excludes=[],
             win_no_prefer_redirects=False,
             win_private_assemblies=False,
             cipher=block_cipher)

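PyInstaller then looks for cairo.dll in the current directory and bundles it into the exe. When the exe runs, cairo.dll is extracted into PyInstaller's temporary extraction directory. I also changed the source shown in problem 3 to case3, so the DLL can be found without going through os.path.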
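An alternative to editing the library source is to compute the path in a way that works both as a script and inside a bundle. PyInstaller's bootloader sets `sys.frozen` and, for one-file builds, stores the extraction directory in `sys._MEIPASS`. The sketch below relies on those two attributes; `bundle_dir` and `dll_path` are hypothetical helper names, and 'app.py' is a placeholder script name.

```python
import os
import sys

def bundle_dir(script_file):
    """Directory holding bundled files: the PyInstaller extraction directory
    when frozen, otherwise the directory of the given script."""
    if getattr(sys, 'frozen', False) and hasattr(sys, '_MEIPASS'):
        return sys._MEIPASS
    return os.path.dirname(os.path.abspath(script_file))

def dll_path(script_file, name='cairo.dll'):
    """Full path at which a bundled DLL is expected."""
    return os.path.join(bundle_dir(script_file), name)

# In a real module one would pass __file__ here
print(dll_path(os.path.join(os.getcwd(), 'app.py')))
```

Since the binaries entry above places cairo.dll in '.', the root of the extraction directory, the path returned by dll_path would point at it when the program runs from the exe.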

Problem 1 requires changing the configuration to the following:

from apscheduler.jobstores.sqlalchemy import SQLAlchemyJobStore
from apscheduler.executors.pool import ThreadPoolExecutor
from apscheduler.triggers.cron import CronTrigger
from apscheduler.triggers.interval import IntervalTrigger
from apscheduler.events import EVENT_JOB_EXECUTED,EVENT_JOB_ERROR

class Config(object):
    JOBS = [
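        # On the 15th of each month at 23:30, delete awr.db history collected more than 32 days ago.
        # One-click report generation needs the current month's data, so history is kept for a month; hence 32 days.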
        {
            'id': 'truncatedb',
            'func': 'dbfunc:truncatedb',
            'args': (32,),
            'trigger': CronTrigger(day=15,hour=23,minute=30),
            'replace_existing': True
        }
    ]
    SCHEDULER_JOBSTORES = {
        'default': SQLAlchemyJobStore(url='sqlite:///' + SQLITE_DB)
    }
    SCHEDULER_EXECUTORS = {
        # When packaging with PyInstaller only the second form works;
        # the first form raises: No executor by the name "threadpool" was found
        #'default': {'type': 'threadpool', 'max_workers': 20}
        'default':ThreadPoolExecutor(20)
    }
    SCHEDULER_JOB_DEFAULTS = {
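        # If a job missed several runs, setting this to True runs it only once
        'coalesce': True,
        # Allowed lateness in seconds: a job that missed its scheduled time by less than this is still run
        'misfire_grace_time': 20,
        # Maximum number of instances of each job that may run concurrently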
        'max_instances': 10
    }
    SCHEDULER_API_ENABLED = True

Summary

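In one sentence: everything the packaged exe needs at run time must exist under PyInstaller's extraction directory. Modules that can only be determined at run time need extra handling. You can specify a path (although not every machine has the environment installed, otherwise one-click packaging would be pointless), use hiddenimports, or use hooks. Hooks are another PyInstaller mechanism for locating modules, which I will look into next time.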

References:

PyInstaller official website