在 PHP 中基于 HTTP 协议检测文件的大小(即不下载文件的情况下,获取远程文件的大小),使用 URL 函数 get_headers — 取得服务器响应一个 HTTP 请求所发送的所有标头

1、参考网址:https://www.shuijingwanwq.com/2021/08/10/5155/ 。在 PHP 中基于 HTTP 协议检测文件的大小(即不下载文件的情况下,获取远程文件的大小),之前的实现,使用文件系统函数 fopen — 打开文件或者 URL。如果文件的网址需要 301 跳转,则不能够获取到文件的真实大小。如图1

图1

<?php
function remote_filesize($url) {
    static $regex = '/^Content-Length: *+\K\d++$/im';
 // 将 $url 指定的名字资源绑定到一个流上,只读方式打开,将文件指针指向文件头
    if (!$fp = @fopen($url, 'rb')) {
        return 0;
    }
 // 执行 fopen() 时可以使用变量 $http_response_header。其中包含响应标头的数组。
    if (
        isset($http_response_header) &&
        preg_match($regex, implode("\n", $http_response_header), $matches)
    ) {
  print_r($http_response_header);
        return (int)$matches[0];
    }
 return 0;
}

$startTime = microtime(true);
$url = 'http://www.shuijingwanwq.com/wp-content/uploads/2021/07/ChinaJoy 2015 上海 梦幻西游.mp4';
echo remote_filesize($url) . "\n";
$endTime = microtime(true);
echo '耗费时间:' . ($endTime - $startTime) . '秒';
?>

Array
(
    [0] => HTTP/1.1 301 Moved Permanently
    [1] => Server: nginx
    [2] => Date: Tue, 10 Aug 2021 05:16:59 GMT
    [3] => Content-Type: text/html
    [4] => Content-Length: 162
    [5] => Connection: close
    [6] => Location: https://www.shuijingwanwq.com/wp-content/uploads/2021/07/ChinaJoy 2015 上海 梦幻西游.mp4
    [7] => Strict-Transport-Security: max-age=15768000
    [8] => HTTP/1.1 200 OK
    [9] => Server: nginx
    [10] => Date: Tue, 10 Aug 2021 05:17:00 GMT
    [11] => Content-Type: video/mp4
    [12] => Content-Length: 123596884
    [13] => Last-Modified: Tue, 04 Dec 2018 06:27:23 GMT
    [14] => Connection: close
    [15] => ETag: "5c061e4b-75df054"
    [16] => Expires: Thu, 09 Sep 2021 05:17:00 GMT
    [17] => Cache-Control: max-age=2592000
    [18] => Strict-Transport-Security: max-age=15768000
    [19] => Accept-Ranges: bytes
)
162
耗费时间:0.34205508232117秒

2、参考网址:https://www.php.net/manual/zh/function.get-headers 。决定使用 URL 函数 get_headers() 返回一个数组,包含有服务器响应一个 HTTP 请求所发送的标头。

3、新的实现方案,其代码更为简洁,且耗费时间更短。不过如果文件的网址需要 301 跳转,则不能够获取到文件的真实大小的问题仍然存在。如图2

图2

<?php
function remote_filesize($url) {
 // 返回一个数组,包含有服务器响应一个 HTTP 请求所发送的标头。
 $headers = get_headers($url, 1);
    if ($headers !== false && isset($headers['Content-Length'][0])) {
  print_r($headers);
        return $headers['Content-Length'][0];
    }
 return 0;
}

$startTime = microtime(true);
$url = 'http://www.shuijingwanwq.com/wp-content/uploads/2021/07/ChinaJoy 2015 上海 梦幻西游.mp4';
echo remote_filesize($url) . "\n";
$endTime = microtime(true);
echo '耗费时间:' . ($endTime - $startTime) . '秒';
?>

Array
(
    [0] => HTTP/1.1 301 Moved Permanently
    [Server] => Array
        (
            [0] => nginx
            [1] => nginx
        )

    [Date] => Array
        (
            [0] => Tue, 10 Aug 2021 05:48:21 GMT
            [1] => Tue, 10 Aug 2021 05:48:21 GMT
        )

    [Content-Type] => Array
        (
            [0] => text/html
            [1] => video/mp4
        )

    [Content-Length] => Array
        (
            [0] => 162
            [1] => 123596884
        )

    [Connection] => Array
        (
            [0] => close
            [1] => close
        )

    [Location] => https://www.shuijingwanwq.com/wp-content/uploads/2021/07/ChinaJoy 2015 上海 梦幻西游.mp4
    [Strict-Transport-Security] => Array
        (
            [0] => max-age=15768000
            [1] => max-age=15768000
        )

    [1] => HTTP/1.1 200 OK
    [Last-Modified] => Tue, 04 Dec 2018 06:27:23 GMT
    [ETag] => "5c061e4b-75df054"
    [Expires] => Thu, 09 Sep 2021 05:48:21 GMT
    [Cache-Control] => max-age=2592000
    [Accept-Ranges] => bytes
)
162
耗费时间:0.27841401100159秒

4、新的实现方案,如果文件的网址不需要 301 跳转,则不存在:Location 。如图3

图3


Array
(
    [0] => HTTP/1.1 200 OK
    [Server] => nginx
    [Date] => Tue, 10 Aug 2021 05:49:20 GMT
    [Content-Type] => video/mp4
    [Content-Length] => 123596884
    [Last-Modified] => Tue, 04 Dec 2018 06:27:23 GMT
    [Connection] => close
    [ETag] => "5c061e4b-75df054"
    [Expires] => Thu, 09 Sep 2021 05:49:20 GMT
    [Cache-Control] => max-age=2592000
    [Strict-Transport-Security] => max-age=15768000
    [Accept-Ranges] => bytes
)
1
耗费时间:0.17363715171814秒

5、如果文件的网址需要 301 跳转,则继续请求 301 跳转的网址,以获取文件的大小。代码实现如下。最终获取到跳转后的文件的真实大小。如图4

图4

<?php
function remote_filesize($url) {
 echo 0 . "\n";
 // 返回一个数组,包含有服务器响应一个 HTTP 请求所发送的标头。
 $filesize = 0;
 $headers = get_headers($url, 1);

 if ($headers !== false) {
  echo 1 . "\n";
  if (!isset($headers['Location']) && isset($headers['Content-Length'])) {
   echo 2 . "\n";
   $filesize = $headers['Content-Length'];
  }
  if (isset($headers['Location'])) {
   echo 3 . "\n";
   return remote_filesize($headers['Location']);
  }
    }

 return $filesize;
}

$startTime = microtime(true);
$url = 'http://www.shuijingwanwq.com/wp-content/uploads/2021/07/ChinaJoy 2015 上海 梦幻西游.mp4';
echo remote_filesize($url) . "\n";
$endTime = microtime(true);
echo '耗费时间:' . ($endTime - $startTime) . '秒';
?>

0
1
3
0
1
2
123596884
耗费时间:0.404137134552秒

6、如果文件的网址不需要 301 跳转,执行顺序如下,仅请求了 1 次。同样的文件:秦时明月第一集荧惑守心.mp4,请求的耗费时间相当,不过现方案略微短于之前的方案。注:时间单位为秒。如图5

图5

<?php
function remote_filesize($url) {
 echo 0 . "\n";
 // 返回一个数组,包含有服务器响应一个 HTTP 请求所发送的标头。
 $filesize = 0;
 $headers = get_headers($url, 1);

 if ($headers !== false) {
  echo 1 . "\n";
  if (!isset($headers['Location']) && isset($headers['Content-Length'])) {
   echo 2 . "\n";
   $filesize = $headers['Content-Length'];
  }
  if (isset($headers['Location'])) {
   echo 3 . "\n";
   return remote_filesize($headers['Location']);
  }
    }

 return $filesize;
}

$startTime = microtime(true);
$url = 'https://www.shuijingwanwq.com/wp-content/uploads/2021/07/秦时明月第一集荧惑守心.mp4';
echo remote_filesize($url) . "\n";
$endTime = microtime(true);
echo '耗费时间:' . ($endTime - $startTime) . '秒';
?>

0
1
2
143162935
耗费时间:0.14869999885559秒

 

永夜