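Downloading a file and saving it under a specific filename with the Yii 2 HTTP client extension (fetching a remote file onto the server): cutting memory usage from 400 MB to 7 MB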

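1. Already implemented: copy the source's asset files into the channel-publishing asset directory and return the relative paths (synchronous). Code: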

    /**
     * Copy the source's asset files into the channel-publishing asset directory and return the relative paths (synchronous)
     * @param string $source the source
     * Example: spider
     * @param array $assets absolute URLs of the source's asset files
     * Format:
     * [
     *     [
     *         'type' => 'image',
     *         'absolute_url' => 'http://localhost/spider/storage/spider/images/1.png',
     *     ],
     *     [
     *         'type' => 'video',
     *         'absolute_url' => 'http://127.0.0.1/channel-pub-api/storage/spider/videos/7月份北上广深等十大城市租金环比上涨 看东方 20180820 高清_高清.mp4',
     *     ],
     * ]
     *
     * @return array $channelPubApiAssetAbsolutePaths relative paths of the channel-publishing asset files
     * Format:
     * [
     *     [
     *         'type' => 'image',
     *         'relative_path' => '/2018/09/20/1537439889.2333.1441541478.png',
     *     ],
     *     [
     *         'type' => 'video',
     *         'relative_path' => '/2018/09/20/1537439889.2403.62871268.mp4',
     *     ],
     * ]
     *
     * @throws UnprocessableEntityHttpException (HTTP 422) if a source asset URL does not start with the source's HOME URL + BASE URL
     * @throws NotFoundHttpException (HTTP 404) if a source asset file does not exist
     * @throws ServerErrorHttpException (HTTP 500) if creating the directory fails
     * @throws ServerErrorHttpException (HTTP 500) if copying a source asset file into the channel-publishing asset directory fails
     * @throws \yii\base\Exception if the directory could not be created (i.e. php error due to parallel changes)
     */
    public static function copyAssetsSync($source, $assets)
    {
        // file_put_contents('/mcloud/www/channel-pub-api/console/runtime/copy-assets-sync-source-' . $source . '-' . time() . '.txt', $source);
        // file_put_contents('C:/phpStudy/PHPTutorial/WWW/channel-pub-api/douyin/runtime/copy-assets-sync-assets-' . $source . '-' . time() . '.txt', print_r($assets, true));
        // Source asset URLs that do not start with the source's HOME URL + BASE URL
        $notAbsoluteUrlStartKeys = [];
        // Source asset files that do not exist
        $notExistsKeys = [];
        // Relative paths of the channel-publishing asset files
        $channelPubApiAssetAbsolutePaths = [];

        foreach ($assets as $key => $asset) {
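            // Strip the query string; strpos() returns false when there is no "?", so compare strictly
            if (($index = strpos($asset['absolute_url'], '?')) !== false) {
                $absoluteUrl = substr($asset['absolute_url'], 0, $index);
                $asset['absolute_url'] = $absoluteUrl;
                $assets[$key]['absolute_url'] = $absoluteUrl;
            }
            // Check that the source asset's absolute URL starts with the source's HOME URL + BASE URL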
            if (!StringHelper::startsWith($asset['absolute_url'], Yii::$app->params['source']['asset'][$asset['type']]['hostInfo'] . Yii::$app->params['source']['asset'][$asset['type']]['baseUrl'])) {
                $notAbsoluteUrlStartKeys[] = $asset['absolute_url'];
            } else {
                // Get the absolute path of the source asset file
                $sourceAssetAbsolutePath = Yii::$app->params['source']['asset'][$asset['type']]['basePath'] . str_replace(Yii::$app->params['source']['asset'][$asset['type']]['hostInfo'] . Yii::$app->params['source']['asset'][$asset['type']]['baseUrl'], '', $asset['absolute_url']);
                // Check that the source asset file exists
                if (!file_exists($sourceAssetAbsolutePath)) {
                    $notExistsKeys[] = $sourceAssetAbsolutePath;
                }
            }
        }

        if (!empty($notAbsoluteUrlStartKeys)) {
            $notAbsoluteUrlStartKeys = implode(",", $notAbsoluteUrlStartKeys);
            throw new UnprocessableEntityHttpException(Yii::t('error', Yii::t('error', Yii::t('error', '202003'), ['not_absolute_url_start_keys' => $notAbsoluteUrlStartKeys])), 202003);
        }

        if (!empty($notExistsKeys)) {
            $notExistsKeys = implode(",", $notExistsKeys);
            throw new NotFoundHttpException(Yii::t('error', Yii::t('error', Yii::t('error', '202004'), ['not_exists_keys' => $notExistsKeys])), 202004);
        }

        foreach ($assets as $key => $asset) {
            // Get the absolute path of the source asset file
            $sourceAssetAbsolutePath = Yii::$app->params['source']['asset'][$asset['type']]['basePath'] . str_replace(Yii::$app->params['source']['asset'][$asset['type']]['hostInfo'] . Yii::$app->params['source']['asset'][$asset['type']]['baseUrl'], '', $asset['absolute_url']);

            // Get the path info of the source asset file
            $pathInfo = pathinfo($sourceAssetAbsolutePath);
            // Relative path of the channel-publishing asset file
            $directory = date('Y/m/d');
            $channelPubApiAssetRelativePath = '/' . $directory . '/' . microtime(true) . '.' . mt_rand() . '.' . $pathInfo['extension'];
            // Absolute path of the channel-publishing asset file
            $channelPubApiAssetAbsolutePath = Yii::$app->params['channelPubApi']['asset'][$asset['type']]['basePath'] . $channelPubApiAssetRelativePath;
            // Create the directory
            if (!FileHelper::createDirectory(Yii::$app->params['channelPubApi']['asset'][$asset['type']]['basePath'] . '/' . $directory)) {
                throw new ServerErrorHttpException(Yii::t('error', Yii::t('error', Yii::t('error', '202005'), ['directory' => Yii::$app->params['channelPubApi']['asset'][$asset['type']]['basePath'] . '/' . $directory])), 202005);
            }
            // Copy the source asset file into the channel-publishing asset directory
            if (!copy($sourceAssetAbsolutePath, $channelPubApiAssetAbsolutePath)) {
                throw new ServerErrorHttpException(Yii::t('error', Yii::t('error', Yii::t('error', '202006'), ['source_asset_absolute_path' => $sourceAssetAbsolutePath])), 202006);
            }

            $channelPubApiAssetAbsolutePaths[$key] = [
                'type' => $asset['type'],
                'relative_path' => $channelPubApiAssetRelativePath,
            ];
        }

        return $channelPubApiAssetAbsolutePaths;
    }
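
The validation loop above does two things per asset: it drops the query string, then swaps the URL prefix (hostInfo + baseUrl) for the local basePath. The following is a minimal standalone sketch of that mapping, not the service's actual code. The function name `assetUrlToPath` and the sample prefix and base path are hypothetical. Unlike `str_replace()`, it replaces only the leading prefix, which avoids touching a later occurrence of the same string inside the path.

```php
<?php
/**
 * Map an asset URL to a local path: strip the query string, then swap the
 * URL prefix for a base path. Returns null when the prefix does not match.
 * Hypothetical helper, not part of the original AssetService.
 */
function assetUrlToPath(string $absoluteUrl, string $urlPrefix, string $basePath): ?string
{
    // Drop everything from the first "?" onwards.
    $queryStart = strpos($absoluteUrl, '?');
    if ($queryStart !== false) {
        $absoluteUrl = substr($absoluteUrl, 0, $queryStart);
    }

    // The URL must start with the configured prefix (hostInfo + baseUrl).
    if (strncmp($absoluteUrl, $urlPrefix, strlen($urlPrefix)) !== 0) {
        return null;
    }

    // Replace only the leading prefix with the base path.
    return $basePath . substr($absoluteUrl, strlen($urlPrefix));
}

// Sample values below are made up for illustration.
$path = assetUrlToPath(
    'http://localhost/spider/storage/spider/images/1.png?v=2',
    'http://localhost/spider/storage/spider/images',
    '/data/spider/images'
);
echo $path, PHP_EOL; // /data/spider/images/1.png
```

A null result corresponds to the case where the method above collects the URL and later throws the 422 exception.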

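2. This now needs adjusting so that it can also download a file and save it under a specific filename, rather than copying it.

3. First verify the implementation from step 1. Call the method and print the result. Code: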

        $copyAssetsResult = AssetService::copyAssetsSync(
            'spider',
            [
                [
                    'type' => 'image',
                    'absolute_url' => 'http://127.0.0.1/channel-pub-api/storage/spider/images/1.png',
                ],
                [
                    'type' => 'video',
                    'absolute_url' => 'http://127.0.0.1/channel-pub-api/storage/spider/videos/02018684a82381d9c59bb085e18e1a5d.mp4',
                ],
            ]
        );

        print_r($copyAssetsResult);
        exit;

4. The printed result is shown in Figure 1

Figure 1

Array
(
    [0] => Array
        (
            [type] => image
            [relative_path] => /2020/11/24/1606187993.2607.476899553.png
        )

    [1] => Array
        (
            [type] => video
            [relative_path] => /2020/11/24/1606187993.2711.210785146.mp4
        )

)

5. Check the copied files in the channel-publishing asset directory. Their absolute paths are E:\wwwroot\channel-pub-api\storage\channel-pub-api\images\2020\11\24\1606187993.2607.476899553.png and E:\wwwroot\channel-pub-api\storage\channel-pub-api\videos\2020\11\24\1606187993.2711.210785146.mp4. See Figure 2

Figure 2

6. Implement the HTTP model, /common/logics/http/asset_api/Download.php

<?php
/**
 * Created by PhpStorm.
 * User: Qiang Wang
 * Date: 2020/11/24
 * Time: 15:47
 */
namespace common\logics\http\asset_api;

use Yii;
use yii\base\InvalidConfigException;
use yii\httpclient\Client;
use yii\httpclient\Exception;
use yii\web\ServerErrorHttpException;

/**
 * Download via the asset API
 *
 * @author Qiang Wang <shuijingwanwq@163.com>
 * @since 1.0
 */
class Download extends Model
{
    /**
     * HTTP request: download a file from the absolute URL of a source asset file
     *
     * @param string $url absolute URL of the source asset file
     *
     * @return string the response body, i.e. the whole file content held in memory
     *
     * @throws ServerErrorHttpException if the response status code is not 20x
     * @throws InvalidConfigException
     * @throws Exception
     */
    public function download($url)
    {
        $client = new Client([
            'transport' => 'yii\httpclient\CurlTransport' // send the request through the cURL transport
        ]);
        $response = $client->createRequest()
            ->setMethod('GET')
            ->setUrl($url)
            ->send();
        // Check that the response status code is 20x
        if ($response->isOk) {
            return $response->content;
        } else {
            throw new ServerErrorHttpException(Yii::t('error', Yii::t('error', Yii::t('error', '202218'), ['status_code' => $response->getStatusCode()])), 202218);
        }
    }
}

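7. The new implementation: download each source asset file, then write it into the channel-publishing asset directory and return the relative paths (synchronous). Code:

            /* HTTP request: download a file from the absolute URL of a source asset file */
            $httpAssetApiDownload = new HttpAssetApiDownload();

            foreach ($assets as $key => $asset) {
                $downloadAsset = $httpAssetApiDownload->download($asset['absolute_url']);

                // Relative path of the channel-publishing asset file
                $directory = date('Y/m/d');

                $channelPubApiAssetRelativePath = '/' . $directory . '/' . microtime(true) . '.' . mt_rand() . '.' . $asset['extension'];
                // Absolute path of the channel-publishing asset file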
                $channelPubApiAssetAbsolutePath = Yii::$app->params['channelPubApi']['asset'][$asset['type']]['basePath'] . $channelPubApiAssetRelativePath;
                // Create the directory
                if (!FileHelper::createDirectory(Yii::$app->params['channelPubApi']['asset'][$asset['type']]['basePath'] . '/' . $directory)) {
                    throw new ServerErrorHttpException(Yii::t('error', Yii::t('error', Yii::t('error', '202005'), ['directory' => Yii::$app->params['channelPubApi']['asset'][$asset['type']]['basePath'] . '/' . $directory])), 202005);
                }
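                // Write the downloaded source asset file into the channel-publishing asset directory
                // (file_put_contents() returns false on failure; a return of 0 only means an empty file)
                if (file_put_contents($channelPubApiAssetAbsolutePath, $downloadAsset, LOCK_EX) === false) {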
                    throw new ServerErrorHttpException(Yii::t('error', Yii::t('error', Yii::t('error', '202220'), ['source_asset_absolute_url' => $asset['source_asset_absolute_url']])), 202220);
                }

                $channelPubApiAssetAbsolutePaths[$key] = [
                    'type' => $asset['type'],
                    'relative_path' => $channelPubApiAssetRelativePath,
                ];
            }

8. Run the code. The video file is 396 MB, and the run fails with: Allowed memory size of 134217728 bytes exhausted (tried to allocate 62918656 bytes). See Figure 3

Figure 3

{
    "name": "PHP Fatal Error",
    "message": "Allowed memory size of 134217728 bytes exhausted (tried to allocate 62918656 bytes)",
    "code": 1,
    "type": "yii\\base\\ErrorException",
    "file": "E:\\wwwroot\\channel-pub-api\\vendor\\yiisoft\\yii2-httpclient\\src\\CurlTransport.php",
    "line": 40,
    "stack-trace": [
        "#0 [internal function]: yii\\base\\ErrorHandler->handleFatalError()",
        "#1 {main}"
    ]
}

9. Edit php.ini and change memory_limit = 128M to memory_limit = 1024M (with 512M the same error still occurs). The run then succeeds. The log shows a memory usage of 406 MB. See Figure 4

Figure 4
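
The figures in steps 8 and 9 can be cross-checked with a little arithmetic: the 134217728 bytes in the error message is exactly 128M, and a body that is buffered in memory needs at least as many bytes as the file itself. The sketch below is a hypothetical helper (`iniSizeToBytes` is not a PHP or Yii function) that converts the php.ini shorthand to bytes. It assumes only the K, M and G suffixes and a 64-bit PHP build.

```php
<?php
/**
 * Convert a php.ini shorthand size such as "128M" to bytes.
 * Hypothetical helper; handles only the K, M and G suffixes.
 */
function iniSizeToBytes(string $value): int
{
    $value = trim($value);
    if ($value === '-1') {
        return -1; // for memory_limit, -1 means "no limit"
    }
    $number = (int) $value; // the cast reads the leading digits: "128M" gives 128
    switch (strtoupper(substr($value, -1))) {
        case 'G':
            return $number * 1024 ** 3;
        case 'M':
            return $number * 1024 ** 2;
        case 'K':
            return $number * 1024;
        default:
            return $number;
    }
}

$limit = iniSizeToBytes('128M');
$fileSize = 396 * 1024 * 1024; // the 396 MB video from step 8

echo $limit, PHP_EOL;                                       // 134217728
echo $fileSize > $limit ? 'too large' : 'fits', PHP_EOL;    // too large
```

Raising the limit makes this particular file fit, but any larger file moves the threshold again.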

10. Check the downloaded files in the channel-publishing asset directory. Their absolute paths are E:\wwwroot\channel-pub-api\storage\channel-pub-api\images\2020\11\25\1606291709.4136.2030184970.png and E:\wwwroot\channel-pub-api\storage\channel-pub-api\videos\2020\11\25\1606291735.722.2031039128.mp4. This is as expected. See Figure 5

Figure 5

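11. The problem with this approach is that memory usage grows with the size of the downloaded file, so a download fails as soon as the file no longer fits within memory_limit. I decided to look for a better way.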
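
The difference between buffering a body and streaming it can be reproduced without any HTTP at all. The sketch below is an illustration under stated assumptions, not a benchmark of the Yii client: a 16 MB local temporary file stands in for the remote asset, `stream_copy_to_stream()` stands in for a streamed transfer, and `file_get_contents()` plus `file_put_contents()` stand in for the buffered one. The exact numbers printed depend on the PHP version and platform.

```php
<?php
// Build a 16 MB local file that stands in for the remote asset
// (assumption: a local file is enough to show the memory pattern).
$source = tempnam(sys_get_temp_dir(), 'src');
$target = tempnam(sys_get_temp_dir(), 'dst');
$out = fopen($source, 'wb');
for ($i = 0; $i < 16; $i++) {
    fwrite($out, str_repeat('x', 1024 * 1024));
}
fclose($out);

// Streamed copy: the data moves from one stream to the other
// without ever becoming a single PHP string.
$peakStart = memory_get_peak_usage();
$in = fopen($source, 'rb');
$out = fopen($target, 'wb');
$copied = stream_copy_to_stream($in, $out);
fclose($in);
fclose($out);
$streamGrowth = memory_get_peak_usage() - $peakStart;

// Buffered copy: the whole body is loaded into one string first.
$peakStart = memory_get_peak_usage();
$body = file_get_contents($source);
file_put_contents($target, $body);
$bufferGrowth = memory_get_peak_usage() - $peakStart;
unset($body);

printf("streamed: %d bytes copied, peak memory grew by %.2f MB\n", $copied, $streamGrowth / 1048576);
printf("buffered: peak memory grew by %.2f MB\n", $bufferGrowth / 1048576);

unlink($source);
unlink($target);
```

The buffered variant grows with the file size, which is the behaviour described in this step. The streamed variant stays roughly flat regardless of how large the file is.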

12. Refactor the HTTP model, /common/logics/http/asset_api/Download.php, to use setOutputFile(). Used together with yii\httpclient\CurlTransport, this method sets the file that the transfer is written to. See Figure 6

Figure 6

<?php
/**
 * Created by PhpStorm.
 * User: Qiang Wang
 * Date: 2020/11/24
 * Time: 15:47
 */
namespace common\logics\http\asset_api;

use Yii;
use yii\base\InvalidConfigException;
use yii\httpclient\Client;
use yii\httpclient\Exception;
use yii\web\ServerErrorHttpException;

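/**
 * Download via the asset API
 *
 * @author Qiang Wang <shuijingwanwq@163.com>
 * @since 1.0
 */
class Download extends Model
{
    /**
     * HTTP request: download a file from the absolute URL of a source asset file
     *
     * @param string $url absolute URL of the source asset file
     * @param string $absolutePath absolute path of the asset file (the target path to download to)
     *
     * @return bool true on success
     *
     * @throws ServerErrorHttpException if the target file cannot be opened or the response status code is not 20x
     * @throws InvalidConfigException
     * @throws Exception
     */
    public function download($url, $absolutePath)
    {
        // Open the local target file as a stream; 'b' keeps the data binary-safe on Windows
        $handle = fopen($absolutePath, 'wb');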
        if (!$handle) {
            throw new ServerErrorHttpException(Yii::t('error', Yii::t('error', Yii::t('error', '202220'), ['absolute_path' => $absolutePath])), 202220);
        }

        $client = new Client([
            'transport' => 'yii\httpclient\CurlTransport' // setOutputFile() is only supported by the cURL transport
        ]);
        $response = $client->createRequest()
            ->setMethod('GET')
            ->setUrl($url)
            ->setOutputFile($handle)
            ->send();

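        // Close the open file pointer
        fclose($handle);

        // Check that the response status code is 20x
        if ($response->isOk) {
            // The body has already been written to $absolutePath, so report success as documented
            return true;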
        } else {
            throw new ServerErrorHttpException(Yii::t('error', Yii::t('error', Yii::t('error', '202218'), ['status_code' => $response->getStatusCode()])), 202218);
        }
    }
}
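
One side effect of writing straight to the target file is worth noting: the transfer writes whatever body arrives, so when the status check fails, the target path may be left holding an error page or a partial file. A common way around this is to download into a temporary file and rename it only on success. The sketch below shows that pattern in plain PHP. The function `writeAtomically` and the `.part` suffix are hypothetical and not part of the original model. In the model above, the callback would wrap the request that uses `setOutputFile($handle)` and return `$response->isOk`.

```php
<?php
/**
 * Run $fetch against a temporary ".part" file and publish it under
 * $targetPath only if $fetch reports success. Hypothetical helper.
 *
 * @param string   $targetPath final path of the downloaded file
 * @param callable $fetch      receives a writable stream, returns bool
 */
function writeAtomically(string $targetPath, callable $fetch): bool
{
    $partPath = $targetPath . '.part';
    $handle = fopen($partPath, 'wb');
    if ($handle === false) {
        return false;
    }

    try {
        $ok = (bool) $fetch($handle);
    } catch (\Throwable $e) {
        // Clean up the partial file before passing the error on.
        fclose($handle);
        unlink($partPath);
        throw $e;
    }
    fclose($handle);

    if ($ok && rename($partPath, $targetPath)) {
        return true;
    }

    // The fetch failed or the rename failed: remove the leftover if any.
    if (file_exists($partPath)) {
        unlink($partPath);
    }
    return false;
}

// Usage with stand-in callbacks instead of a real HTTP request.
$target = sys_get_temp_dir() . '/asset-demo-' . uniqid() . '.bin';

$saved = writeAtomically($target, function ($handle) {
    fwrite($handle, 'payload');
    return true; // e.g. $response->isOk
});

$rejected = writeAtomically($target . '.err', function ($handle) {
    fwrite($handle, 'error page body');
    return false; // e.g. a 404 response
});

var_dump($saved, $rejected);
```

With this arrangement a failed download leaves nothing at the published path, at the cost of one extra rename per file.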

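13. Refactor the new implementation: download each source asset file and write it into the channel-publishing asset directory as it arrives, then return the relative paths (synchronous). Code:

            /* HTTP request: download a file from the absolute URL of a source asset file */
            $httpAssetApiDownload = new HttpAssetApiDownload();

            foreach ($assets as $key => $asset) {

                // Relative path of the channel-publishing asset file
                $directory = date('Y/m/d');

                $channelPubApiAssetRelativePath = '/' . $directory . '/' . microtime(true) . '.' . mt_rand() . '.' . $asset['extension'];
                // Absolute path of the channel-publishing asset file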
                $channelPubApiAssetAbsolutePath = Yii::$app->params['channelPubApi']['asset'][$asset['type']]['basePath'] . $channelPubApiAssetRelativePath;
                // Create the directory
                if (!FileHelper::createDirectory(Yii::$app->params['channelPubApi']['asset'][$asset['type']]['basePath'] . '/' . $directory)) {
                    throw new ServerErrorHttpException(Yii::t('error', Yii::t('error', Yii::t('error', '202005'), ['directory' => Yii::$app->params['channelPubApi']['asset'][$asset['type']]['basePath'] . '/' . $directory])), 202005);
                }

                // HTTP request: download the file from the source asset's absolute URL straight to the target path
                $httpAssetApiDownload->download($asset['absolute_url'], $channelPubApiAssetAbsolutePath);

                $channelPubApiAssetAbsolutePaths[$key] = [
                    'type' => $asset['type'],
                    'relative_path' => $channelPubApiAssetRelativePath,
                ];
            }

14. The run succeeds. The log shows a memory usage of 6.377 MB. The size of the downloaded file is no longer constrained by memory_limit. See Figure 7

Figure 7

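15. Why cURL? According to the PHP manual page at https://www.php.net/manual/zh/ref.curl.php , cURL performs better than file_get_contents and fopen. I have not tested that performance claim myself. What is certain is that its memory usage is lower than that of file_get_contents. See Figure 8

Figure 8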

永夜