1、设置与获取缓存的代码如下

    const TAG_THEME_EDITOR = 'theme_editor'; // 缓存标签:主题编辑

    const THEME_EDITOR_SESSION_PREFIX = 'theme_editor_session:'; // 缓存前缀:主题编辑会话

    const TTL = 86400; // 缓存的过期时间(秒)

    public static function saveSessionMetadata(string $sessionId, array $data): bool
    {
        return Cache::tags([
                ThemePreviewInterface::TAG_THEME_EDITOR,
                ThemePreviewInterface::TAG_THEME_EDITOR . '_' . $sessionId]
            )
            ->put(ThemePreviewInterface::THEME_EDITOR_SESSION_PREFIX . $sessionId, json_encode($data), ThemePreviewInterface::TTL);
    }

    public static function getSessionMetadata(string $sessionId): array
    {
        $data = Cache::tags([
                ThemePreviewInterface::TAG_THEME_EDITOR,
                ThemePreviewInterface::TAG_THEME_EDITOR . '_' . $sessionId
            ])
            ->get(ThemePreviewInterface::THEME_EDITOR_SESSION_PREFIX . $sessionId);
        if (empty($data)) {
            return [];
        }

        return JsonDecoder::jsonToArray($data);
    }

2、缓存的过期时间(秒)为 86400,但是在生产环境中,有极低概率出现不到 24 小时就过期的情况,甚至有才过去几个小时就无法获取缓存的情况。打印 $data,其值为 NULL。

3、决定先排除掉外部因素的干扰,先从程序本身入手分析。初步怀疑是程序本身的某处操作了缓存,导致了缓存被清空。

4、在 .env 的缓存相关配置如下

CACHE_DRIVER=redis
REDIS_CLIENT=phpredis

5、打印配置项,database.reids.client=phpredis;cache.default=redis。如图1

图1

/object # php artisan tinker
Psy Shell v0.9.12 (PHP 7.4.26 — cli) by Justin Hileman
>>> echo config('database.redis.client');
>>> echo config('database.redis.client');
phpredis⏎
>>> echo config('cache.default');
redis⏎
>>> 

6、在 /config/cache.php 中的配置如下

    'stores' => [

        'redis' => [
            'driver' => 'redis',
            'connection' => 'cache',
        ],



    ],

7、在 /config/database.php 中的配置如下


    'redis' => [

        'client' => env('REDIS_CLIENT', 'phpredis'),

        'options' => [
            'cluster' => env('REDIS_CLUSTER', 'predis'),
        ],

        'cache' => [
            'prefix' => env('REDIS_PREFIX', Str::slug(env('APP_NAME', 'laravel'), '_') . '_database_'),
            'url' => env('REDIS_URL'),
            'host' => env('REDIS_HOST', '127.0.0.1'),
            'password' => env('REDIS_PASSWORD', null),
            'port' => env('REDIS_PORT', 6379),
            'database' => env('REDIS_CACHE_DB', 1),
            'persistent'=>env('REDIS_PERSISTENT', true),
            'persistent_id'=>env('REDIS_PERSISTENT_ID', 'cache'),
            'retry_interval'=>env('REDIS_PERSISTENT_RETRY', 3000),
        ],

    ],

8、在 /config/app.php 配置文件的别名部分,未重命名 Illuminate \ Support \ Facades \ Redis 外观别名。但是,并未抛出异常:Please remove or rename the Redis facade alias in your “app” configuration file in order to avoid collision with the PHP Redis extension.。决定暂时忽略此问题。

    'aliases' => [

        'Redis' => Illuminate\Support\Facades\Redis::class,
    ],

9、仍然决定重命名为其他内容,例如 RedisManager,以避免与 Redis 类发生冲突。

    'aliases' => [

        'RedisManager' => Illuminate\Support\Facades\Redis::class,
    ],

10、决定登录至 Redis 中确认一下,先在本地环境确认一下执行命令的顺序。如图2

图2

127.0.0.1:1>GET object_cache:tag:theme_editor_48wJgWKHnabdr3AznlDhP5onmEVedTz67ULb:key
"s:22:"631ae7ba62b83808649247";"

127.0.0.1:1>SMEMBERS object_cache:631ae7ba62b83808649247:standard_ref
1) "object_cache:c613daaf6b394e496382fd07bc808ea8a497556f:theme_editor_session:9734e008-f539-4f76-9e9f-d8037334b968:48wJgWKHnabdr3AznlDhP5onmEVedTz67ULb:product_detail"
2) "object_cache:c613daaf6b394e496382fd07bc808ea8a497556f:theme_editor_session:48wJgWKHnabdr3AznlDhP5onmEVedTz67ULb"


127.0.0.1:1>GET object_cache:c613daaf6b394e496382fd07bc808ea8a497556f:theme_editor_session:48wJgWKHnabdr3AznlDhP5onmEVedTz67ULb
"s:9329:"{"id":372}";"

127.0.0.1:1>

11、决定登录至 Redis 中确认一下,基于 Tinker 查询。按照如上的执行命令的顺序,以确认本地环境中是否存在对应的 key。如图3

图3

PS E:\wwwroot\object> php artisan tinker
Psy Shell v0.9.12 (PHP 7.4.27 — cli) by Justin Hileman
>>> var_dump(\Illuminate\Support\Facades\Redis::connection('cache')->get('object_cache:tag:theme_
editor_48wJgWKHnabdr3AznlDhP5onmEVedTz67ULb:key'));
string(30) "s:22:"631ae7ba62b83808649247";"
=> null
>>> var_dump(\Illuminate\Support\Facades\Redis::connection('cache')->SMEMBERS('object_cache:631ae
7ba62b83808649247:standard_ref'));
array(2) {
  [0]=>
  string(186) "object_cache:c613daaf6b394e496382fd07bc808ea8a497556f:theme_editor_session:9734e008-f539-4f76-9e9f-d8037334b968:48wJgWKHnabdr3AznlDhP5onmEVedTz67ULb:product_detail"
  [1]=>
  string(134) "object_cache:c613daaf6b394e496382fd07bc808ea8a497556f:theme_editor_session:48wJgWKHnabdr3AznlDhP5onmEVedTz67ULb"
}
=> null
>>> var_dump(\Illuminate\Support\Facades\Redis::connection('cache')->GET('object_cache:c613daaf6b
394e496382fd07bc808ea8a497556f:theme_editor_session:48wJgWKHnabdr3AznlDhP5onmEVedTz67ULb'));
string(9339) "s:9329:"{"id":372}";"
=> null
>>> var_dump(\Illuminate\Support\Facades\Redis::connection('cache')->TTL('object_cache:c613daaf6b
394e496382fd07bc808ea8a497556f:theme_editor_session:48wJgWKHnabdr3AznlDhP5onmEVedTz67ULb'));
int(81060)
=> null
>>>                                                                                                                         

12、TTL key,以秒为单位,返回给定 key 的剩余生存时间(TTL, time to live)。一个 key 是 2022/09/09 15:14 生成的。现在时间是 2022/09/09 16:44 。已经过去了 1 小时 30 分钟。86400 – 1 * 3600 – 30 * 60 = 81000。剩余生存秒数为:int(81060)。符合预期。如图4

图4

13、决定在线上环境重新生成一个,然后查看剩余生存时间秒数。刚生成的一个 key 的剩余生存秒数为:int(85730)。符合预期。如图5

图5

14、决定登录至 Redis 中确认一下,基于 Tinker 查询。按照如上的执行命令的顺序,以确认线上环境中是否存在对应的 key。确定对应的 key 未被完全清空。如果执行了 Cache::tags(ThemePreviewInterface::TAG_THEME_EDITOR . ‘_’ . $sessionId)->flush() 或者 Cache::flush(),则 执行查询 var_dump(\Illuminate\Support\Facades\Redis::connection(‘cache’)->get(‘arrest9_cache:tag:theme_editor_Uq6BLXO7GsbWHQovCXQAshX8SdaS6BXh2lWE:key’)); 时,会返回 null,但是现在是有值的。如图6、图6.1

图6

 

图6.1

>>> var_dump(\Illuminate\Support\Facades\Redis::connection('cache')->get('arrest9_cache:tag:theme_editor_Uq6BLXO7GsbWHQovCXQAshX8SdaS6BXh2lWE:key'));
string(30) "s:22:"63243db5c01db958165221";
=> null
>>> var_dump(\Illuminate\Support\Facades\Redis::connection('cache')->TTL('arrest9_cache:tag:theme_editor_Uq6BLXO7GsbWHQovCXQAshX8SdaS6BXh2lWE:key'));
int(-1)
=> null
>>> var_dump(\Illuminate\Support\Facades\Redis::connection('cache')->SMEMBERS('arrest9_cache:63243db5c01db958165221:standard_ref'));
array(0) {
}
=> null
>>> 
>>> var_dump(\Illuminate\Support\Facades\Redis::connection('cache')->get('object_cache:tag:theme_
editor_ERo5GoALUnDuxcMtW3yPYPKsjwci8DVZR2ij:key'));
NULL
=> null
>>> var_dump(\Illuminate\Support\Facades\Redis::connection('cache')->SMEMBERS('object_cache:632a7
fda56e94065934346:standard_ref'));
array(0) {
}
=> null
>>>  

15、决定基于事件监听,添加相应的日志。新建监听器文件 /app/Listeners/Cache/LogKeyWrittenListener.php ,当写入 key 时。

<?php

namespace App\Listeners\Cache;

use Illuminate\Cache\Events\KeyWritten;
use Illuminate\Support\Facades\Log;
use Modules\OnlineStoreThemeGraphQL\Preview\ThemePreviewInterface;

class LogKeyWrittenListener
{
    /**
     * 处理事件
     *
     * @param  KeyWritten  $event
     * @return void
     */    public function handle(KeyWritten $event)
    {
        if (config('app.debug') === true && strpos($event->key, ThemePreviewInterface::THEME_EDITOR_SESSION_PREFIX) !== false) {
            Log::debug('theme_editor_session_key_written', [
                'key' => $event->key,
                'tags' => $event->tags,
                'value' => $event->value,
                'seconds' => $event->seconds
            ]);
        }
    }
}

16、生成的日志信息如下。如图7

图7

[2022-09-15 17:26:05] local.DEBUG: theme_editor_session_key_written {
    "key": "theme_editor_session:7JivgR1WqXVeE4hsyA335ILc60k153yTynIN",
    "tags": [
        "theme_editor",
        "theme_editor_7JivgR1WqXVeE4hsyA335ILc60k153yTynIN"
    ],
    "value": "",
    "seconds": 86400
} 

17、新建监听器文件 /app/Listeners/Cache/LogCacheHitListener.php ,当命中 key 时。

    /**
     * 处理事件
     *
     * @param  CacheHit  $event
     * @return void
     */    public function handle(CacheHit $event)
    {
        if (config('app.debug') === true && strpos($event->key, ThemePreviewInterface::THEME_EDITOR_SESSION_PREFIX) !== false) {
            Log::debug('theme_editor_session_cache_hit', [
                'key' => $event->key,
                'tags' => $event->tags,
                'value' => $event->value
            ]);
        }
    }
[2022-09-15 18:09:08] local.DEBUG: theme_editor_session_cache_hit {
    "key": "theme_editor_session:KrV0Nf5rMOiS5ZdF6i2uDandZZZWPPpXcQ4D",
    "tags": [
        "theme_editor",
        "theme_editor_KrV0Nf5rMOiS5ZdF6i2uDandZZZWPPpXcQ4D"
    ],
    "value": ""
} 

18、新建监听器文件 /app/Listeners/Cache/LogCacheMissedListener.php ,当未命中 key 时。

    /**
     * 处理事件
     *
     * @param  CacheMissed  $event
     * @return void
     */    public function handle(CacheMissed $event)
    {
        if (config('app.debug') === true && strpos($event->key, ThemePreviewInterface::THEME_EDITOR_SESSION_PREFIX) !== false) {
            Log::debug('theme_editor_session_cache_missed', [
                'key' => $event->key,
                'tags' => $event->tags
            ]);
        }
    }
[2022-09-15 18:10:15] local.DEBUG: theme_editor_session_cache_missed {
    "key": "theme_editor_session:KrV0Nf5rMOiS5ZdF6i2uDandZZZWPPpXcQ4D",
    "tags": [
        "theme_editor",
        "theme_editor_KrV0Nf5rMOiS5ZdF6i2uDandZZZWPPpXcQ4D"
    ]
} 

19、但是,当移除带有标签的缓存数据时,使用 Cache::tags(ThemePreviewInterface::TAG_THEME_EDITOR . ‘_’ . $sessionId)->flush(); ,监听器 /app/Listeners/Cache/LogKeyForgottenListener.php 并未生成日志,当清除 key 时。

    /**
     * 处理事件
     *
     * @param  KeyForgotten  $event
     * @return void
     */    public function handle(KeyForgotten $event)
    {
        if (config('app.debug') === true && strpos($event->key, ThemePreviewInterface::THEME_EDITOR_SESSION_PREFIX) !== false) {
            Log::debug('theme_editor_session_key_forgotten', [
                'key' => $event->key,
                'tags' => $event->tags
            ]);
        }
    }

20、决定使用 flush 方法清空所有的缓存,Cache::flush();。以确认是否会触发清除 key 的事件,确认未触发。但是当执行了清空所有的缓存操作后,已经无法获取带有标签的缓存数据了。不论是否带有标签,所有的缓存数据皆被清空。如图8

图8

21、所以即使有 4 个缓存日志监听器,仍然很难判断 key 是什么时候被清掉的。只能够先在预计会被清掉的地方,记录一下相应的日志。

22、针对第 16 步骤,决定咨询一下运维同事,有时候运维会针对 Redis 的 死 key ,运行 Python 脚本,手动执行清理操作。因此,决定将同样的 Python 脚本在本地环境模拟复现。执行:python batch-remove-rediskey.py /tmp。本地未复现,分析脚本,线上有可能复现。不过,运维已经确定,在最近出现问题的时间段内,并未手动运行过此脚本。如图9

图9

PS E:\wwwroot\object> python batch-remove-rediskey.py /tmp
/tmp*
All done

23、由于设置了2个 tag,其中一个 tag 下的 key 的数量会不断地递增,其数量最终可能在一周内达到几千甚至几万个。最终决定仅设置1个 tag。设置与获取缓存的代码如下

    const TAG_THEME_EDITOR = 'theme_editor'; // 缓存标签:主题编辑

    const THEME_EDITOR_SESSION_PREFIX = 'theme_editor_session:'; // 缓存前缀:主题编辑会话

    const TTL = 86400; // 缓存的过期时间(秒)

    public static function saveSessionMetadata(string $sessionId, array $data): bool
    {
        return Cache::tags([
                ThemePreviewInterface::TAG_THEME_EDITOR . '_' . $sessionId]
            )
            ->put(ThemePreviewInterface::THEME_EDITOR_SESSION_PREFIX . $sessionId, json_encode($data), ThemePreviewInterface::TTL);
    }

    public static function getSessionMetadata(string $sessionId): array
    {
        $data = Cache::tags([
                ThemePreviewInterface::TAG_THEME_EDITOR . '_' . $sessionId
            ])
            ->get(ThemePreviewInterface::THEME_EDITOR_SESSION_PREFIX . $sessionId);
        if (empty($data)) {
            return [];
        }

        return JsonDecoder::jsonToArray($data);
    }

24、在仅设置了1个 tag 后,且这1个 tag 中的 key 的数量一般就维持在 20 个左右,缓存未再出现无法命中的情况,整体运行正常。初步分析得出结论:在使用 Cache::tags 时,其下的 tag 的数量不宜过多,建议仅设置 1 个,具体到每个 tag 下的 key 的数量也不宜过多,建议不要超过 100。

永夜