Nicksxs's Blog

A Look at MySQL's MVCC

Posted on 2020-04-26, updated on 2020-05-02, filed under Mysql, C, data structures, source code, Mysql

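A long time ago an interviewer asked me whether I knew MySQL's transaction isolation levels. "Uh, O__O ..., not really," I said. Afterwards I went looking for articles online and found one explaining MySQL's four transaction isolation levels (MySQL 四种事务隔离级的说明). It is very well written, and after reading it I understood what each isolation level is. Think a little further, though, and a question comes up: how are these seemingly magical isolation levels actually implemented?

InnoDB's transaction isolation relies on the MVCC from the title, Multiversion Concurrency Control. Before getting into what that is, consider how you might implement isolation levels by pure guesswork. My own dull first idea was to copy the table when a transaction starts. That could support the RR (Repeatable Read) level but not RC (Read Committed), and for a very large table it is clearly not feasible.

Thick-skinned as it sounds, the idea is not feasible but the direction is right; making it work takes a whole series of (very many) changes. As I understand it, copying a table becomes copying a single record, but with multiple transactions you would have to copy it multiple times. Version control systems make a good analogy. Before tools like git, the primitive approach was to zip up the code after finishing a feature, name the archive with a timestamp and the like, and fall back on that archive when something had to be recovered. Then came centralized and distributed version control systems such as svn and git, whose key property is that granularity goes down to files and lines of code. Likewise, can't MySQL's transactions be refined from the table level I first imagined down to the row level? As many people already know, a row in the database carries some extra fields besides the user-defined columns. Let's fish them out of the source file data0type.h: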

/* Precise data types for system columns and the length of those columns;
NOTE: the values must run from 0 up in the order given! All codes must
be less than 256 */
#define DATA_ROW_ID 0     /* row id: a 48-bit integer */
#define DATA_ROW_ID_LEN 6 /* stored length for row id */

/** Transaction id: 6 bytes */
constexpr size_t DATA_TRX_ID = 1;

/** Transaction ID type size in bytes. */
constexpr size_t DATA_TRX_ID_LEN = 6;

/** Rollback data pointer: 7 bytes */
constexpr size_t DATA_ROLL_PTR = 2;

/** Rollback data pointer type size in bytes. */
constexpr size_t DATA_ROLL_PTR_LEN = 7;

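The first is DATA_ROW_ID, a hidden row id that is generated when the table has no primary key; if the user defines a primary key, that key is used instead.

The second is DATA_TRX_ID, the transaction id stored with the record.

The third is DATA_ROLL_PTR, a pointer into the rollback segment.

The rollback segment it points to is what we usually call the undo log. Structurally it is a linked list, and MVCC makes use of it. Then there is DATA_TRX_ID: every record stores this transaction id, which tells which transaction wrote the record's current value. Now back to transactions. Read Uncommitted needs no isolation at all; it simply reads the current value. At the Read Committed level a transaction must read only committed values, and for that MySQL uses something called a read view. These are the fields in it worth noting: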

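m_low_limit_id: the high-water mark. A change made by any transaction whose id is greater than or equal to this value is invisible to the view. It is larger than every transaction id that was active when the read view was created.

m_up_limit_id: the low-water mark. A change made by any transaction whose id is smaller than this value is visible. It is the smallest transaction id that was active when the read view was created.

m_ids: the array of ids of all transactions that were active when the read view was created.

m_creator_trx_id: the id of the transaction that created this read view.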

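The main visibility logic goes like this:

If the record's transaction id is smaller than the smallest active transaction id (m_up_limit_id), the change is visible.
If the record's transaction id equals the id of the transaction that owns the view (m_creator_trx_id), it is our own change and is visible.
If the record's transaction id is greater than or equal to m_low_limit_id, the change is not visible.

If the record's transaction id lies between m_up_limit_id (inclusive) and m_low_limit_id (exclusive), check whether it is in m_ids: if it is, the transaction was still active and the change is not visible; if it is not, the transaction had already committed and the change is visible.
Let's pull out the actual code: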

/** Check whether the changes by id are visible.
  @param[in]	id	transaction id to check against the view
  @param[in]	name	table name
  @return whether the view sees the modifications of id. */
  bool changes_visible(trx_id_t id, const table_name_t &name) const
      MY_ATTRIBUTE((warn_unused_result)) {
    ut_ad(id > 0);

    if (id < m_up_limit_id || id == m_creator_trx_id) {
      return (true);
    }

    check_trx_id_sanity(id, name);

    if (id >= m_low_limit_id) {
      return (false);

    } else if (m_ids.empty()) {
      return (true);
    }

    const ids_t::value_type *p = m_ids.data();

    return (!std::binary_search(p, p + m_ids.size(), id));
  }
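To make the four rules concrete, here is a minimal C sketch of the same check. It is a simplification under stated assumptions, not InnoDB's code: the struct `read_view_t` and the helper `id_is_active` are hypothetical, the field names merely mirror the members of the original class, and the sanity check on the transaction id is left out.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

typedef uint64_t trx_id_t;

/* Hypothetical, simplified stand-in for the read view discussed above. */
typedef struct {
    trx_id_t up_limit_id;     /* ids below this are always visible */
    trx_id_t low_limit_id;    /* ids at or above this are never visible */
    trx_id_t creator_trx_id;  /* transaction that created the view */
    const trx_id_t *ids;      /* sorted ids that were active at creation */
    size_t n_ids;
} read_view_t;

/* Binary search over the sorted array of active ids. */
static bool id_is_active(const read_view_t *v, trx_id_t id) {
    size_t lo = 0, hi = v->n_ids;
    while (lo < hi) {
        size_t mid = lo + (hi - lo) / 2;
        if (v->ids[mid] == id) return true;
        if (v->ids[mid] < id) lo = mid + 1; else hi = mid;
    }
    return false;
}

/* Same decision order as changes_visible() in the excerpt above. */
bool changes_visible(const read_view_t *v, trx_id_t id) {
    if (id < v->up_limit_id || id == v->creator_trx_id) return true;
    if (id >= v->low_limit_id) return false;
    return !id_is_active(v, id);  /* in between: visible only if committed */
}
```

With active ids {10, 12, 15}, a creator id of 12 and a high-water mark of 18, ids 7, 11, 12 and 17 are visible, while 10, 15, 18 and anything larger are not.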

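One thing remains: Read Committed and Repeatable Read behave differently, so can the read view described above support both, and how? If the read view were always created at the very start of the transaction, it would seem to support only the RR isolation level. In fact the difference lies in when the read view is created. At the RR level it is created at the transaction's first select statement and reused afterwards; at the RC level a new one is created before every select statement, which guarantees that each statement can read all data committed so far.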
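The effect of that timing can be sketched in C. This is an illustrative model under assumptions, not InnoDB's implementation: `row_version_t`, `read_view_t` and `read_row` are hypothetical names, the version chain stands in for the undo log reached through the roll pointer, and a view holds at most 8 active ids.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

typedef uint64_t trx_id_t;

/* One version of a row; roll_ptr links to the previous version,
 * playing the role of DATA_ROLL_PTR into the undo log. */
typedef struct row_version {
    trx_id_t trx_id;                    /* like DATA_TRX_ID */
    int value;
    const struct row_version *roll_ptr;
} row_version_t;

/* Hypothetical miniature read view. */
typedef struct {
    trx_id_t up_limit_id, low_limit_id, creator_trx_id;
    trx_id_t ids[8];                    /* ids active at view creation */
    size_t n_ids;
} read_view_t;

static bool visible(const read_view_t *v, trx_id_t id) {
    if (id < v->up_limit_id || id == v->creator_trx_id) return true;
    if (id >= v->low_limit_id) return false;
    for (size_t i = 0; i < v->n_ids; i++)
        if (v->ids[i] == id) return false;
    return true;
}

/* Walk from the newest version back through older ones and return the
 * first version the view may see, or NULL if there is none. */
const row_version_t *read_row(const row_version_t *newest,
                              const read_view_t *v) {
    for (const row_version_t *r = newest; r != NULL; r = r->roll_ptr)
        if (visible(v, r->trx_id)) return r;
    return NULL;
}
```

Suppose transaction 20 reads a row that transaction 21 has updated from 100 to 200. A view taken while 21 is still active (active ids {20, 21}) yields 100. After 21 commits, an RC reader builds a fresh view (active ids {20}) and sees 200, while an RR reader keeps using the first view and still sees 100.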

Redis Series, Part 8: Eviction Policies

Posted on 2020-04-18, updated on 2022-06-22, filed under Redis, data structures, C, source code, Redis

LRU

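Having covered expiration, let's turn to eviction. Redis uses an approximated LRU policy. Why approximated? First, what is LRU? The illustration on Wikipedia walks through an example with four storage slots and the access order A B C D E D F.
The first access to D fills the last free slot. Every entry carries a number that grows with each access, and the smaller the number, the sooner the entry is evicted. When E arrives, A has the smallest number of the four, meaning it was stored earliest and accessed least recently, so it is evicted first; E takes A's place with the value 4. Then D is accessed again. It is already cached, so its value is simply updated to 5. Then F arrives, and B is now the oldest entry that has not been accessed recently, so B is evicted. That is LRU in brief, but Redis does not follow it strictly, for the same reason as with expiration. The strictest expiration scheme would give every key its own timer and remove the key the moment it times out, but that costs too much CPU for the memory it saves. Eviction is similar: as in the illustration, strict LRU has to maintain an ordered LRU value for all keys and evict the smallest. Redis samples instead. The original implementation picked 5 random keys from the dict and evicted the one with the smallest lru value. That just about does the job, but the results are not great. Starting with Redis 3.0, plain random sampling was replaced by a pool of candidates. The pool holds 16 entries, kept sorted by idle time. While the pool has free slots any sampled key can go in; once it is full, a sampled key is admitted only if it has been idle longer than the least idle entry in the pool, and that least idle entry is pushed out.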
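For comparison with the sampling approach, the strict LRU walkthrough above can be replayed with a short C sketch. The names `slot_t`, `lru_access` and `lru_replay` are made up for this illustration, and the logical clock simply counts accesses.

```c
#include <stddef.h>

#define SLOTS 4

typedef struct {
    char key;        /* 0 means the slot is empty */
    unsigned stamp;  /* logical time of the most recent access */
} slot_t;

/* Access `key` at logical time `now`; returns the evicted key or 0. */
char lru_access(slot_t cache[SLOTS], char key, unsigned now) {
    size_t victim = 0;
    for (size_t i = 0; i < SLOTS; i++) {
        if (cache[i].key == key) {       /* hit: refresh the stamp */
            cache[i].stamp = now;
            return 0;
        }
    }
    for (size_t i = 0; i < SLOTS; i++) {
        if (cache[i].key == 0) {         /* free slot: no eviction needed */
            cache[i].key = key;
            cache[i].stamp = now;
            return 0;
        }
        if (cache[i].stamp < cache[victim].stamp) victim = i;
    }
    char evicted = cache[victim].key;    /* smallest stamp loses */
    cache[victim].key = key;
    cache[victim].stamp = now;
    return evicted;
}

/* Replays an access string and writes the evicted keys, in order,
 * into `out` (which must have room for them plus a terminator). */
size_t lru_replay(const char *accesses, char *out) {
    slot_t cache[SLOTS] = {{0, 0}};
    size_t n = 0;
    unsigned now = 0;
    for (const char *p = accesses; *p; p++) {
        char evicted = lru_access(cache, *p, ++now);
        if (evicted) out[n++] = evicted;
    }
    out[n] = '\0';
    return n;
}
```

Replaying "ABCDEDF" evicts A and then B, matching the description. Note that every miss on a full cache scans all slots for the smallest stamp, which is the cost Redis avoids by sampling.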
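Redis ran an experiment to evaluate this improvement; the figure from it is borrowed here.

For background: every dot in the figure is a Redis key. The gray keys were added and then accessed once in order, and after that the green keys were added. Under theoretical LRU, shown top left, all of the light gray keys are evicted. Compare that with the top right, bottom left and bottom right panels. Bottom left is version 2.8, which samples 5 random keys and evicts the one with the smallest lru; both gray and light gray keys get evicted. Bottom right is version 3.0 with the same sample size, which does somewhat better. With the sample size raised to 10, version 3.0 comes fairly close to theoretical LRU. Let's walk through the code briefly.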

typedef struct redisObject {
    unsigned type:4;
    unsigned encoding:4;
    unsigned lru:LRU_BITS; /* LRU time (relative to global lru_clock) or
                            * LFU data (least significant 8 bits frequency
                            * and most significant 16 bits access time). */
    int refcount;
    void *ptr;
} robj;

For the LRU policy, the lru field holds the redisObject's LRU time.
Whenever Redis accesses data it calls lookupKey:

/* Low level key lookup API, not actually called directly from commands
 * implementations that should instead rely on lookupKeyRead(),
 * lookupKeyWrite() and lookupKeyReadWithFlags(). */
robj *lookupKey(redisDb *db, robj *key, int flags) {
    dictEntry *de = dictFind(db->dict,key->ptr);
    if (de) {
        robj *val = dictGetVal(de);

        /* Update the access time for the ageing algorithm.
         * Don't do it if we have a saving child, as this will trigger
         * a copy on write madness. */
        if (!hasActiveChildProcess() && !(flags & LOOKUP_NOTOUCH)){
            if (server.maxmemory_policy & MAXMEMORY_FLAG_LFU) {
                // covered in the LFU section further down
                updateLFU(val);
            } else {
                // on this branch every access refreshes the lru value
                val->lru = LRU_CLOCK();
            }
        }
        return val;
    } else {
        return NULL;
    }
}
/* This function is used to obtain the current LRU clock.
 * If the current resolution is lower than the frequency we refresh the
 * LRU clock (as it should be in production servers) we return the
 * precomputed value, otherwise we need to resort to a system call. */
unsigned int LRU_CLOCK(void) {
    unsigned int lruclock;
    if (1000/server.hz <= LRU_CLOCK_RESOLUTION) {
        // with server.hz >= 1 the clock is refreshed at least once per
        // resolution period, so use the precomputed server.lruclock
        lruclock = server.lruclock;
    } else {
        lruclock = getLRUClock();
    }
    return lruclock;
}
/* Return the LRU clock, based on the clock resolution. This is a time
 * in a reduced-bits format that can be used to set and check the
 * object->lru field of redisObject structures. */
unsigned int getLRUClock(void) {
    return (mstime()/LRU_CLOCK_RESOLUTION) & LRU_CLOCK_MAX;
}

Redis processes commands in processCommand:

/* If this function gets called we already read a whole
 * command, arguments are in the client argv/argc fields.
 * processCommand() execute the command or prepare the
 * server for a bulk read from the client.
 *
 * If C_OK is returned the client is still alive and valid and
 * other operations can be performed by the caller. Otherwise
 * if C_ERR is returned the client was destroyed (i.e. after QUIT). */
int processCommand(client *c) {
    moduleCallCommandFilters(c);

    

    /* Handle the maxmemory directive.
     *
     * Note that we do not want to reclaim memory if we are here re-entering
     * the event loop since there is a busy Lua script running in timeout
     * condition, to avoid mixing the propagation of scripts with the
     * propagation of DELs due to eviction. */
    if (server.maxmemory && !server.lua_timedout) {
        int out_of_memory = freeMemoryIfNeededAndSafe() == C_ERR;
        /* freeMemoryIfNeeded may flush slave output buffers. This may result
         * into a slave, that may be the active client, to be freed. */
        if (server.current_client == NULL) return C_ERR;

        /* It was impossible to free enough memory, and the command the client
         * is trying to execute is denied during OOM conditions or the client
         * is in MULTI/EXEC context? Error. */
        if (out_of_memory &&
            (c->cmd->flags & CMD_DENYOOM ||
             (c->flags & CLIENT_MULTI &&
              c->cmd->proc != execCommand &&
              c->cmd->proc != discardCommand)))
        {
            flagTransaction(c);
            addReply(c, shared.oomerr);
            return C_OK;
        }
    }
}

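Only part of the function is excerpted here. When maxmemory is set, it calls freeMemoryIfNeededAndSafe to reclaim memory if needed, which in turn calls freeMemoryIfNeeded: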

/* This is a wrapper for freeMemoryIfNeeded() that only really calls the
 * function if right now there are the conditions to do so safely:
 *
 * - There must be no script in timeout condition.
 * - Nor we are loading data right now.
 *
 */
int freeMemoryIfNeededAndSafe(void) {
    if (server.lua_timedout || server.loading) return C_OK;
    return freeMemoryIfNeeded();
}
/* This function is periodically called to see if there is memory to free
 * according to the current "maxmemory" settings. In case we are over the
 * memory limit, the function will try to free some memory to return back
 * under the limit.
 *
 * The function returns C_OK if we are under the memory limit or if we
 * were over the limit, but the attempt to free memory was successful.
 * Otherwise if we are over the memory limit, but not enough memory
 * was freed to return back under the limit, the function returns C_ERR. */
int freeMemoryIfNeeded(void) {
    int keys_freed = 0;
    /* By default replicas should ignore maxmemory
     * and just be masters exact copies. */
    if (server.masterhost && server.repl_slave_ignore_maxmemory) return C_OK;

    size_t mem_reported, mem_tofree, mem_freed;
    mstime_t latency, eviction_latency;
    long long delta;
    int slaves = listLength(server.slaves);

    /* When clients are paused the dataset should be static not just from the
     * POV of clients not being able to write, but also from the POV of
     * expires and evictions of keys not being performed. */
    if (clientsArePaused()) return C_OK;
    if (getMaxmemoryState(&mem_reported,NULL,&mem_tofree,NULL) == C_OK)
        return C_OK;

    mem_freed = 0;

    if (server.maxmemory_policy == MAXMEMORY_NO_EVICTION)
        goto cant_free; /* We need to free memory, but policy forbids. */

    latencyStartMonitor(latency);
    while (mem_freed < mem_tofree) {
        int j, k, i;
        static unsigned int next_db = 0;
        sds bestkey = NULL;
        int bestdbid;
        redisDb *db;
        dict *dict;
        dictEntry *de;

        if (server.maxmemory_policy & (MAXMEMORY_FLAG_LRU|MAXMEMORY_FLAG_LFU) ||
            server.maxmemory_policy == MAXMEMORY_VOLATILE_TTL)
        {
            struct evictionPoolEntry *pool = EvictionPoolLRU;

            while(bestkey == NULL) {
                unsigned long total_keys = 0, keys;

                /* We don't want to make local-db choices when expiring keys,
                 * so to start populate the eviction pool sampling keys from
                 * every DB. */
                for (i = 0; i < server.dbnum; i++) {
                    db = server.db+i;
                    dict = (server.maxmemory_policy & MAXMEMORY_FLAG_ALLKEYS) ?
                            db->dict : db->expires;
                    if ((keys = dictSize(dict)) != 0) {
                        evictionPoolPopulate(i, dict, db->dict, pool);
                        total_keys += keys;
                    }
                }
                if (!total_keys) break; /* No keys to evict. */

                /* Go backward from best to worst element to evict. */
                for (k = EVPOOL_SIZE-1; k >= 0; k--) {
                    if (pool[k].key == NULL) continue;
                    bestdbid = pool[k].dbid;

                    if (server.maxmemory_policy & MAXMEMORY_FLAG_ALLKEYS) {
                        de = dictFind(server.db[pool[k].dbid].dict,
                            pool[k].key);
                    } else {
                        de = dictFind(server.db[pool[k].dbid].expires,
                            pool[k].key);
                    }

                    /* Remove the entry from the pool. */
                    if (pool[k].key != pool[k].cached)
                        sdsfree(pool[k].key);
                    pool[k].key = NULL;
                    pool[k].idle = 0;

                    /* If the key exists, is our pick. Otherwise it is
                     * a ghost and we need to try the next element. */
                    if (de) {
                        bestkey = dictGetKey(de);
                        break;
                    } else {
                        /* Ghost... Iterate again. */
                    }
                }
            }
        }

        /* volatile-random and allkeys-random policy */
        else if (server.maxmemory_policy == MAXMEMORY_ALLKEYS_RANDOM ||
                 server.maxmemory_policy == MAXMEMORY_VOLATILE_RANDOM)
        {
            /* When evicting a random key, we try to evict a key for
             * each DB, so we use the static 'next_db' variable to
             * incrementally visit all DBs. */
            for (i = 0; i < server.dbnum; i++) {
                j = (++next_db) % server.dbnum;
                db = server.db+j;
                dict = (server.maxmemory_policy == MAXMEMORY_ALLKEYS_RANDOM) ?
                        db->dict : db->expires;
                if (dictSize(dict) != 0) {
                    de = dictGetRandomKey(dict);
                    bestkey = dictGetKey(de);
                    bestdbid = j;
                    break;
                }
            }
        }

        /* Finally remove the selected key. */
        if (bestkey) {
            db = server.db+bestdbid;
            robj *keyobj = createStringObject(bestkey,sdslen(bestkey));
            propagateExpire(db,keyobj,server.lazyfree_lazy_eviction);
            /* We compute the amount of memory freed by db*Delete() alone.
             * It is possible that actually the memory needed to propagate
             * the DEL in AOF and replication link is greater than the one
             * we are freeing removing the key, but we can't account for
             * that otherwise we would never exit the loop.
             *
             * AOF and Output buffer memory will be freed eventually so
             * we only care about memory used by the key space. */
            delta = (long long) zmalloc_used_memory();
            latencyStartMonitor(eviction_latency);
            if (server.lazyfree_lazy_eviction)
                dbAsyncDelete(db,keyobj);
            else
                dbSyncDelete(db,keyobj);
            latencyEndMonitor(eviction_latency);
            latencyAddSampleIfNeeded("eviction-del",eviction_latency);
            latencyRemoveNestedEvent(latency,eviction_latency);
            delta -= (long long) zmalloc_used_memory();
            mem_freed += delta;
            server.stat_evictedkeys++;
            notifyKeyspaceEvent(NOTIFY_EVICTED, "evicted",
                keyobj, db->id);
            decrRefCount(keyobj);
            keys_freed++;

            /* When the memory to free starts to be big enough, we may
             * start spending so much time here that is impossible to
             * deliver data to the slaves fast enough, so we force the
             * transmission here inside the loop. */
            if (slaves) flushSlavesOutputBuffers();

            /* Normally our stop condition is the ability to release
             * a fixed, pre-computed amount of memory. However when we
             * are deleting objects in another thread, it's better to
             * check, from time to time, if we already reached our target
             * memory, since the "mem_freed" amount is computed only
             * across the dbAsyncDelete() call, while the thread can
             * release the memory all the time. */
            if (server.lazyfree_lazy_eviction && !(keys_freed % 16)) {
                if (getMaxmemoryState(NULL,NULL,NULL,NULL) == C_OK) {
                    /* Let's satisfy our stop condition. */
                    mem_freed = mem_tofree;
                }
            }
        } else {
            latencyEndMonitor(latency);
            latencyAddSampleIfNeeded("eviction-cycle",latency);
            goto cant_free; /* nothing to free... */
        }
    }
    latencyEndMonitor(latency);
    latencyAddSampleIfNeeded("eviction-cycle",latency);
    return C_OK;

cant_free:
    /* We are here if we are not able to reclaim memory. There is only one
     * last thing we can try: check if the lazyfree thread has jobs in queue
     * and wait... */
    while(bioPendingJobsOfType(BIO_LAZY_FREE)) {
        if (((mem_reported - zmalloc_used_memory()) + mem_freed) >= mem_tofree)
            break;
        usleep(1000);
    }
    return C_ERR;
}

This is where keys are evicted according to the configured policy. The first step is to refresh the pool with candidate keys, which is done by evictionPoolPopulate:

void evictionPoolPopulate(int dbid, dict *sampledict, dict *keydict, struct evictionPoolEntry *pool) {
    int j, k, count;
    dictEntry *samples[server.maxmemory_samples];

    count = dictGetSomeKeys(sampledict,samples,server.maxmemory_samples);
    for (j = 0; j < count; j++) {
        unsigned long long idle;
        sds key;
        robj *o;
        dictEntry *de;

        de = samples[j];
        key = dictGetKey(de);

        /* If the dictionary we are sampling from is not the main
         * dictionary (but the expires one) we need to lookup the key
         * again in the key dictionary to obtain the value object. */
        if (server.maxmemory_policy != MAXMEMORY_VOLATILE_TTL) {
            if (sampledict != keydict) de = dictFind(keydict, key);
            o = dictGetVal(de);
        }

        /* Calculate the idle time according to the policy. This is called
         * idle just because the code initially handled LRU, but is in fact
         * just a score where an higher score means better candidate. */
        if (server.maxmemory_policy & MAXMEMORY_FLAG_LRU) {
            idle = estimateObjectIdleTime(o);
        } else if (server.maxmemory_policy & MAXMEMORY_FLAG_LFU) {
            /* When we use an LRU policy, we sort the keys by idle time
             * so that we expire keys starting from greater idle time.
             * However when the policy is an LFU one, we have a frequency
             * estimation, and we want to evict keys with lower frequency
             * first. So inside the pool we put objects using the inverted
             * frequency subtracting the actual frequency to the maximum
             * frequency of 255. */
            idle = 255-LFUDecrAndReturn(o);
        } else if (server.maxmemory_policy == MAXMEMORY_VOLATILE_TTL) {
            /* In this case the sooner the expire the better. */
            idle = ULLONG_MAX - (long)dictGetVal(de);
        } else {
            serverPanic("Unknown eviction policy in evictionPoolPopulate()");
        }

        /* Insert the element inside the pool.
         * First, find the first empty bucket or the first populated
         * bucket that has an idle time smaller than our idle time. */
        k = 0;
        while (k < EVPOOL_SIZE &&
               pool[k].key &&
               pool[k].idle < idle) k++;
        if (k == 0 && pool[EVPOOL_SIZE-1].key != NULL) {
            /* Can't insert if the element is < the worst element we have
             * and there are no empty buckets. */
            continue;
        } else if (k < EVPOOL_SIZE && pool[k].key == NULL) {
            /* Inserting into empty position. No setup needed before insert. */
        } else {
            /* Inserting in the middle. Now k points to the first element
             * greater than the element to insert.  */
            if (pool[EVPOOL_SIZE-1].key == NULL) {
                /* Free space on the right? Insert at k shifting
                 * all the elements from k to end to the right. */

                /* Save SDS before overwriting. */
                sds cached = pool[EVPOOL_SIZE-1].cached;
                memmove(pool+k+1,pool+k,
                    sizeof(pool[0])*(EVPOOL_SIZE-k-1));
                pool[k].cached = cached;
            } else {
                /* No free space on right? Insert at k-1 */
                k--;
                /* Shift all elements on the left of k (included) to the
                 * left, so we discard the element with smaller idle time. */
                sds cached = pool[0].cached; /* Save SDS before overwriting. */
                if (pool[0].key != pool[0].cached) sdsfree(pool[0].key);
                memmove(pool,pool+1,sizeof(pool[0])*k);
                pool[k].cached = cached;
            }
        }

        /* Try to reuse the cached SDS string allocated in the pool entry,
         * because allocating and deallocating this object is costly
         * (according to the profiler, not my fantasy. Remember:
         * premature optimizbla bla bla bla. */
        int klen = sdslen(key);
        if (klen > EVPOOL_CACHED_SDS_SIZE) {
            pool[k].key = sdsdup(key);
        } else {
            memcpy(pool[k].cached,key,klen+1);
            sdssetlen(pool[k].cached,klen);
            pool[k].key = pool[k].cached;
        }
        pool[k].idle = idle;
        pool[k].dbid = dbid;
    }
}

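Redis randomly samples maxmemory_samples keys and computes the idle time of each. A key enters the pool when it qualifies (the pool has a free slot, or the key's idle time is larger than that of some key already in the pool). Once the pool has been updated, the key with the largest idle time in the pool is evicted.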
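The insertion rule is easier to follow with the SDS caching stripped away. Below is a reduced C sketch of it, an approximation for illustration only: `pool_entry_t`, `pool_offer` and `pool_take_best` are hypothetical names, keys are plain integers, and the pool has 4 slots instead of 16 so the cases are easy to trace.

```c
#include <stddef.h>
#include <string.h>

#define POOL_SIZE 4   /* smaller than Redis's pool, for readability */

typedef struct {
    int used;                 /* 0 when the slot is empty */
    unsigned long long idle;  /* higher idle = better eviction candidate */
    int key_id;               /* hypothetical stand-in for the key */
} pool_entry_t;

/* Offer a sampled key, keeping used slots sorted by ascending idle time.
 * Returns 1 if the key was inserted, 0 if it was rejected. */
int pool_offer(pool_entry_t pool[POOL_SIZE], int key_id,
               unsigned long long idle) {
    size_t k = 0;
    while (k < POOL_SIZE && pool[k].used && pool[k].idle < idle) k++;

    if (k == 0 && pool[POOL_SIZE - 1].used) {
        return 0;   /* pool is full and nothing in it is less idle */
    } else if (k < POOL_SIZE && !pool[k].used) {
        /* empty slot at k: nothing to shift */
    } else if (!pool[POOL_SIZE - 1].used) {
        /* free space on the right: shift [k, end) one slot right */
        memmove(pool + k + 1, pool + k,
                sizeof(pool[0]) * (POOL_SIZE - k - 1));
    } else {
        /* pool is full: drop slot 0 (least idle) and insert at k-1 */
        k--;
        memmove(pool, pool + 1, sizeof(pool[0]) * k);
    }
    pool[k].used = 1;
    pool[k].idle = idle;
    pool[k].key_id = key_id;
    return 1;
}

/* Take the best candidate, i.e. the rightmost used slot.
 * Returns its key_id, or -1 if the pool is empty. */
int pool_take_best(pool_entry_t pool[POOL_SIZE]) {
    for (int k = POOL_SIZE - 1; k >= 0; k--) {
        if (pool[k].used) {
            pool[k].used = 0;
            return pool[k].key_id;
        }
    }
    return -1;
}
```

Offering idle times 50, 10, 30 and 70 fills the pool as 10, 30, 50, 70. A key with idle time 5 is then rejected, while one with idle time 60 pushes out the entry with idle time 10. Because the pool persists between eviction rounds, good candidates found by earlier samples are not thrown away.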

estimateObjectIdleTime computes the idle time of a Redis object:

/* Given an object returns the min number of milliseconds the object was never
 * requested, using an approximated LRU algorithm. */
unsigned long long estimateObjectIdleTime(robj *o) {
    unsigned long long lruclock = LRU_CLOCK();
    if (lruclock >= o->lru) {
        return (lruclock - o->lru) * LRU_CLOCK_RESOLUTION;
    } else {
        return (lruclock + (LRU_CLOCK_MAX - o->lru)) *
                    LRU_CLOCK_RESOLUTION;
    }
}

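There are two cases. In the first, lruclock is greater than or equal to the object's lru, so the idle time is simply the difference multiplied by the resolution. The second branch handles wraparound: the lru field is only LRU_BITS wide, so the clock eventually wraps back to 0, and an object stamped before the wrap then carries an lru value larger than the current lruclock.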
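A standalone C sketch of the same arithmetic makes the wraparound case easy to try out. The constants below are assumptions for this sketch (a 24-bit field and a tick of 1000 ms); `idle_ms` is a hypothetical helper that takes the current clock as a parameter instead of reading server state.

```c
#include <stdint.h>

#define LRU_BITS 24
#define LRU_CLOCK_MAX ((1u << LRU_BITS) - 1)  /* largest storable clock value */
#define LRU_CLOCK_RESOLUTION 1000             /* length of one tick in ms */

/* Idle time in ms, given the current clock and the clock value stored
 * in the object. Mirrors the two branches of estimateObjectIdleTime(). */
unsigned long long idle_ms(uint32_t now, uint32_t obj_lru) {
    if (now >= obj_lru)
        return (unsigned long long)(now - obj_lru) * LRU_CLOCK_RESOLUTION;
    /* the clock wrapped around since the object was last touched */
    return ((unsigned long long)now + (LRU_CLOCK_MAX - obj_lru)) *
           LRU_CLOCK_RESOLUTION;
}
```

With these constants, an object stamped at tick 40 and inspected at tick 100 has been idle for 60000 ms. An object stamped 10 ticks before the maximum and inspected at tick 5, just after the wrap, comes out as 15000 ms. A 24-bit clock with one-second ticks wraps roughly every 194 days, and an object left untouched for longer than that will have its idle time underestimated.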

LFU

The above covers the LRU algorithm, but consider this scenario:

~~~~~A~~~~~A~~~~~A~~~~A~~~~~A~~~~~A~~|
~~B~~B~~B~~B~~B~~B~~B~~B~~B~~B~~B~~B~|
~~~~~~~~~~C~~~~~~~~~C~~~~~~~~~C~~~~~~|
~~~~~D~~~~~~~~~~D~~~~~~~~~D~~~~~~~~~D|

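With LRU eviction, D was accessed most recently and is therefore treated as the key most worth keeping, yet in practice it deserves that less than A or B. So antirez came up with LFU (Least Frequently Used). Going by the access frequency of the four keys above, the retention priority should clearly be B > A > C = D.
How can LFU be implemented, then? As with LRU, the ideal would be to maintain a linked list and move the most recently accessed entry to the head, but that would slow down access. As the earlier code shows, the lru field of redisObject is dual-purpose: when the policy is LFU the field is put to another use, and its 24 bits are split into two parts: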

      16 bits      8 bits
+----------------+--------+
+ Last decr time | LOG_C  |
+----------------+--------+

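The first 16 bits hold the time of the last decrement, so Redis knows when the counter was last decayed; the last 8 bits hold the counter.
The core idea of LFU is that the more often a key is accessed, the more likely it is to be kept, with recent accesses counting the most. That requires two things: increasing the counter on access, and decaying it when the key goes unaccessed for long enough. Hence the two values: the upper 16 bits record the time of the last decay and the lower 8 bits record the count itself.
Redis 4.0 added two LFU modes to the maxmemory_policy eviction setting:

volatile-lfu: apply LFU eviction to keys that have an expire time
allkeys-lfu: apply LFU eviction to all keys
Two more settings tune the LFU algorithm: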

lfu-log-factor 10
lfu-decay-time 1
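`lfu-log-factor` adjusts how fast the counter grows: the larger lfu-log-factor is, the more slowly the counter grows.

`lfu-decay-time` is a value in minutes that adjusts how fast the counter decays.
One question is whether 8 bits are enough to count with. Adding 1 on every access would indeed not be enough, but antirez would never settle for a plain +1. Read on in the code: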
/* Low level key lookup API, not actually called directly from commands
 * implementations that should instead rely on lookupKeyRead(),
 * lookupKeyWrite() and lookupKeyReadWithFlags(). */
robj *lookupKey(redisDb *db, robj *key, int flags) {
    dictEntry *de = dictFind(db->dict,key->ptr);
    if (de) {
        robj *val = dictGetVal(de);

        /* Update the access time for the ageing algorithm.
         * Don't do it if we have a saving child, as this will trigger
         * a copy on write madness. */
        if (!hasActiveChildProcess() && !(flags & LOOKUP_NOTOUCH)){
            if (server.maxmemory_policy & MAXMEMORY_FLAG_LFU) {
                // when the eviction policy is LFU, updateLFU is called here
                updateLFU(val);
            } else {
                val->lru = LRU_CLOCK();
            }
        }
        return val;
    } else {
        return NULL;
    }
}

updateLFU is really just an entry point; it calls two important functions:

/* Update LFU when an object is accessed.
 * Firstly, decrement the counter if the decrement time is reached.
 * Then logarithmically increment the counter, and update the access time. */
void updateLFU(robj *val) {
    unsigned long counter = LFUDecrAndReturn(val);
    counter = LFULogIncr(counter);
    val->lru = (LFUGetTimeInMinutes()<<8) | counter;
}
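The last line of updateLFU packs both values back into the 24-bit field. A small C sketch of that packing and its inverse, with hypothetical helper names (`lfu_pack`, `lfu_minutes`, `lfu_counter`) chosen for this illustration:

```c
#include <stdint.h>

/* Pack a 16-bit minute stamp and an 8-bit counter into 24 bits:
 * the stamp goes into the upper 16 bits, the counter into the low 8. */
uint32_t lfu_pack(uint16_t minutes, uint8_t counter) {
    return ((uint32_t)minutes << 8) | counter;
}

/* Upper 16 bits: time of the last decrement, in minutes. */
uint16_t lfu_minutes(uint32_t lru) { return (uint16_t)(lru >> 8); }

/* Lower 8 bits: the logarithmic access counter. */
uint8_t lfu_counter(uint32_t lru) { return (uint8_t)(lru & 255); }
```

For example, a minute stamp of 0x1234 and a counter of 0x56 pack to 0x123456, and the two accessors recover the original parts. Reusing the existing field means LFU costs no extra memory per object.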

First, LFUDecrAndReturn. It uses the time of the last decay and the configured lfu-decay-time to work out how much to subtract from the counter:

/* If the object decrement time is reached decrement the LFU counter but
 * do not update LFU fields of the object, we update the access time
 * and counter in an explicit way when the object is really accessed.
 * And we will times halve the counter according to the times of
 * elapsed time than server.lfu_decay_time.
 * Return the object frequency counter.
 *
 * This function is used in order to scan the dataset for the best object
 * to fit: as we check for the candidate, we incrementally decrement the
 * counter of the scanned objects if needed. */
unsigned long LFUDecrAndReturn(robj *o) {
    // shift right by 8 bits to get the time of the last decay
    unsigned long ldt = o->lru >> 8;
    // mask with 255 to get the counter value
    unsigned long counter = o->lru & 255;
    // use lfu_decay_time to work out how many decay periods have passed
    unsigned long num_periods = server.lfu_decay_time ? LFUTimeElapsed(ldt) / server.lfu_decay_time : 0;
    if (num_periods)
        counter = (num_periods > counter) ? 0 : counter - num_periods;
    return counter;
}
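The decay rule can be tried out in isolation with the C sketch below. It is written under assumptions: `minutes_elapsed` and `lfu_decay` are hypothetical helpers that take the current minute stamp as a parameter, and the wraparound handling for the 16-bit stamp is my reading of how such a stamp has to be compared, not a quote from the excerpt.

```c
#include <stdint.h>

/* Minutes between two 16-bit minute stamps, tolerating one wraparound. */
unsigned long minutes_elapsed(uint16_t now, uint16_t ldt) {
    if (now >= ldt) return (unsigned long)(now - ldt);
    return 65535UL - ldt + now;
}

/* Counter after decay: one point is subtracted per full decay period,
 * never going below 0. A decay_time of 0 disables decay. */
uint8_t lfu_decay(uint8_t counter, uint16_t now, uint16_t ldt,
                  unsigned long decay_time) {
    unsigned long periods =
        decay_time ? minutes_elapsed(now, ldt) / decay_time : 0;
    if (periods >= counter) return 0;
    return (uint8_t)(counter - periods);
}
```

A counter of 10 last decayed 3 minutes ago drops to 7 with a decay time of 1 and to 9 with a decay time of 2, since only full periods count. Although the source comment speaks of halving, the code shown subtracts one point per period.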

Then comes the increment, done by LFULogIncr:

/* Logarithmically increment a counter. The greater is the current counter value
 * the less likely is that it gets really incremented. Saturate it at 255. */
uint8_t LFULogIncr(uint8_t counter) {
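    // 255 is the maximum; once there, stop incrementing
    if (counter == 255) return 255;
    // draw a random fraction
    double r = (double)rand()/RAND_MAX;
    // subtract the base value LFU_INIT_VAL = 5, which keeps a key from
    // being evicted right after it is added
    double baseval = counter - LFU_INIT_VAL;
    // clamp negative values to 0
    if (baseval < 0) baseval = 0;
    // when baseval is 0, p is 1 and the counter is simply incremented;
    // otherwise it depends on lfu_log_factor: the larger it is, the
    // smaller p becomes and the less likely counter++ gets. This answers
    // the earlier question: the counter is not simply incremented by 1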
    double p = 1.0/(baseval*server.lfu_log_factor+1);
    if (r < p) counter++;
    return counter;
}

The approximate growth rate can be seen in this table:

+--------+------------+------------+------------+------------+------------+
| factor | 100 hits   | 1000 hits  | 100K hits  | 1M hits    | 10M hits   |
+--------+------------+------------+------------+------------+------------+
| 0      | 104        | 255        | 255        | 255        | 255        |
+--------+------------+------------+------------+------------+------------+
| 1      | 18         | 49         | 255        | 255        | 255        |
+--------+------------+------------+------------+------------+------------+
| 10     | 10         | 18         | 142        | 255        | 255        |
+--------+------------+------------+------------+------------+------------+
| 100    | 8          | 11         | 49         | 143        | 255        |
+--------+------------+------------+------------+------------+------------+

In short, the larger lfu_log_factor is, the more slowly the counter changes.
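To get a feel for those numbers, the increment can be simulated with a few lines of C. This is a rough sketch under assumptions: it uses a small deterministic generator (`next_unit`, a hypothetical helper) instead of rand() so that runs are repeatable, starts the counter at LFU_INIT_VAL, applies no decay, and is not expected to reproduce the table exactly.

```c
#include <stdint.h>

#define LFU_INIT_VAL 5

/* Small deterministic generator so the sketch does not depend on rand(). */
static uint32_t lcg_state = 12345u;
static double next_unit(void) {
    lcg_state = lcg_state * 1103515245u + 12345u;
    return ((lcg_state >> 16) & 0x7fffu) / 32768.0;  /* value in [0, 1) */
}

/* Same probability rule as LFULogIncr, with the factor as a parameter. */
uint8_t log_incr(uint8_t counter, unsigned factor) {
    if (counter == 255) return 255;
    double baseval = (double)counter - LFU_INIT_VAL;
    if (baseval < 0) baseval = 0;
    double p = 1.0 / (baseval * factor + 1);
    if (next_unit() < p) counter++;
    return counter;
}

/* Counter value after `hits` accesses with no decay in between. */
uint8_t simulate_hits(unsigned long hits, unsigned factor) {
    uint8_t counter = LFU_INIT_VAL;
    for (unsigned long i = 0; i < hits; i++)
        counter = log_incr(counter, factor);
    return counter;
}
```

With a factor of 0 the probability is always 1, so the counter climbs by one per hit and saturates at 255 after 250 hits. With a factor of 10 the chance of an increment shrinks as the counter grows, so 1000 hits leave the counter far below saturation.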

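Summary

To summarize: Redis implements an approximated LRU eviction policy, and brings the results closer to true LRU by adding a pool of eviction candidates and by increasing the number of keys sampled each time. That is the LRU side. As the earlier example shows, however, LRU cannot guarantee that keys are evicted the way we would expect, so the LFU policy was introduced later. The LFU design is rather clever: it reuses the lru field of the Redis object and uses the factor parameter to control how fast the counter grows, so that the 8-bit counter does not saturate too early.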

Redis Series, Part 7: Expiration Policies

Posted on 2020-04-12, filed under Redis, data structures, C, source code, Redis

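This post is no longer about data structures, since the main ones have mostly been covered. It is about filling gaps, or rather about concepts that are important and basic yet often overlooked. Many Redis article series skip them, and so do people studying Redis, because reading the source seems to be about learning the main data structures and "algorithms" and nothing more.
Redis is mainly used as a high-performance cache, so what matters for a cache? First, access speed: if reads are no faster than the database, the cache has no reason to exist. Second, the literal meaning of a cache: it only exists to make reads faster, and in most scenarios the data has to be refreshed or expired. That brings me to my first point (phew).

Redis expiration policy

How does Redis expire cached keys? Take a guess. The most naive approach is a timer for every key that has an expire time, deleting the key when the timer fires. That cleans up most promptly but is obviously too expensive. The alternatives, which Redis actually uses, are lazy expiration and periodic expiration.
Lazy expiration checks whether a key has expired at the moment it is used. On a get, for example, Redis first checks whether the key has expired; if so it deletes the key and returns nil, otherwise it returns the value as usual.
The main code is: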

/* This function is called when we are going to perform some operation
 * in a given key, but such key may be already logically expired even if
 * it still exists in the database. The main way this function is called
 * is via lookupKey*() family of functions.
 *
 * The behavior of the function depends on the replication role of the
 * instance, because slave instances do not expire keys, they wait
 * for DELs from the master for consistency matters. However even
 * slaves will try to have a coherent return value for the function,
 * so that read commands executed in the slave side will be able to
 * behave like if the key is expired even if still present (because the
 * master has yet to propagate the DEL).
 *
 * In masters as a side effect of finding a key which is expired, such
 * key will be evicted from the database. Also this may trigger the
 * propagation of a DEL/UNLINK command in AOF / replication stream.
 *
 * The return value of the function is 0 if the key is still valid,
 * otherwise the function returns 1 if the key is expired. */
int expireIfNeeded(redisDb *db, robj *key) {
    if (!keyIsExpired(db,key)) return 0;

    /* If we are running in the context of a slave, instead of
     * evicting the expired key from the database, we return ASAP:
     * the slave key expiration is controlled by the master that will
     * send us synthesized DEL operations for expired keys.
     *
     * Still we try to return the right information to the caller,
     * that is, 0 if we think the key should be still valid, 1 if
     * we think the key is expired at this time. */
    if (server.masterhost != NULL) return 1;

    /* Delete the key */
    server.stat_expiredkeys++;
    propagateExpire(db,key,server.lazyfree_lazy_expire);
    notifyKeyspaceEvent(NOTIFY_EXPIRED,
        "expired",key,db->id);
    return server.lazyfree_lazy_expire ? dbAsyncDelete(db,key) :
                                         dbSyncDelete(db,key);
}

/* Check if the key is expired. */
int keyIsExpired(redisDb *db, robj *key) {
    mstime_t when = getExpire(db,key);
    mstime_t now;

    if (when < 0) return 0; /* No expire for this key */

    /* Don't expire anything while loading. It will be done later. */
    if (server.loading) return 0;

    /* If we are in the context of a Lua script, we pretend that time is
     * blocked to when the Lua script started. This way a key can expire
     * only the first time it is accessed and not in the middle of the
     * script execution, making propagation to slaves / AOF consistent.
     * See issue #1525 on Github for more information. */
    if (server.lua_caller) {
        now = server.lua_time_start;
    }
    /* If we are in the middle of a command execution, we still want to use
     * a reference time that does not change: in that case we just use the
     * cached time, that we update before each call in the call() function.
     * This way we avoid that commands such as RPOPLPUSH or similar, that
     * may re-open the same key multiple times, can invalidate an already
     * open object in a next call, if the next call will see the key expired,
     * while the first did not. */
    else if (server.fixed_time_expire > 0) {
        now = server.mstime;
    }
    /* For the other cases, we want to use the most fresh time we have. */
    else {
        now = mstime();
    }

    /* The key expired if the current (virtual or real) time is greater
     * than the expire time of the key. */
    return now > when;
}
/* Return the expire time of the specified key, or -1 if no expire
 * is associated with this key (i.e. the key is non volatile) */
long long getExpire(redisDb *db, robj *key) {
    dictEntry *de;

    /* No expire? return ASAP */
    if (dictSize(db->expires) == 0 ||
       (de = dictFind(db->expires,key->ptr)) == NULL) return -1;

    /* The entry was found in the expire dict, this means it should also
     * be present in the main dict (safety check). */
    serverAssertWithInfo(NULL,key,dictFind(db->dict,key->ptr) != NULL);
    return dictGetSignedIntegerVal(de);
}

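A few points to note here. First, lazy deletion consults the lazyfree_lazy_expire setting to decide between synchronous and asynchronous deletion. Second, a slave does not perform the deletion itself, because the master sends it a DEL command when the key expires.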
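The check-on-access idea can be reduced to a small C sketch. Everything in it is a hypothetical simplification for illustration: `tiny_db_t` is a fixed array rather than a dict, `lazy_get` takes the current time as a parameter, and the replica branch only mimics the "report expired but do not delete" behavior described above.

```c
#include <stddef.h>
#include <string.h>

#define MAX_KEYS 8

typedef struct {
    const char *key;      /* NULL means the slot is free */
    const char *value;
    long long expire_at;  /* absolute time in ms, -1 for no expiry */
} entry_t;

typedef struct {
    entry_t slots[MAX_KEYS];
    int is_replica;               /* replicas never delete on their own */
    unsigned long expired_count;  /* like server.stat_expiredkeys */
} tiny_db_t;

static entry_t *find(tiny_db_t *db, const char *key) {
    for (size_t i = 0; i < MAX_KEYS; i++)
        if (db->slots[i].key && strcmp(db->slots[i].key, key) == 0)
            return &db->slots[i];
    return NULL;
}

/* Returns the value, or NULL if the key is missing or logically expired. */
const char *lazy_get(tiny_db_t *db, const char *key, long long now_ms) {
    entry_t *e = find(db, key);
    if (e == NULL) return NULL;
    if (e->expire_at >= 0 && now_ms > e->expire_at) {
        if (!db->is_replica) {    /* master: delete the key on access */
            e->key = NULL;
            db->expired_count++;
        }
        return NULL;              /* treated as expired either way */
    }
    return e->value;
}
```

A key with an expire time of 1000 is still returned at time 1000, because the comparison is strictly greater than, and is deleted by the first read at time 1001. A key that is never read again stays in memory, which is the gap periodic expiration has to close.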
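What goes wrong if this is the only policy? Keys that are never accessed again would never be expired and would occupy memory forever. So Redis combines lazy expiration with periodic expiration. How does the periodic part run?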

/* This function handles 'background' operations we are required to do
 * incrementally in Redis databases, such as active key expiring, resizing,
 * rehashing. */
void databasesCron(void) {
    /* Expire keys by random sampling. Not required for slaves
     * as master will synthesize DELs for us. */
    if (server.active_expire_enabled) {
        if (server.masterhost == NULL) {
            activeExpireCycle(ACTIVE_EXPIRE_CYCLE_SLOW);
        } else {
            expireSlaveKeys();
        }
    }

    /* Defrag keys gradually. */
    activeDefragCycle();

    /* Perform hash tables rehashing if needed, but only if there are no
     * other processes saving the DB on disk. Otherwise rehashing is bad
     * as will cause a lot of copy-on-write of memory pages. */
    if (!hasActiveChildProcess()) {
        /* We use global counters so if we stop the computation at a given
         * DB we'll be able to start from the successive in the next
         * cron loop iteration. */
        static unsigned int resize_db = 0;
        static unsigned int rehash_db = 0;
        int dbs_per_call = CRON_DBS_PER_CALL;
        int j;

        /* Don't test more DBs than we have. */
        if (dbs_per_call > server.dbnum) dbs_per_call = server.dbnum;

        /* Resize */
        for (j = 0; j < dbs_per_call; j++) {
            tryResizeHashTables(resize_db % server.dbnum);
            resize_db++;
        }

        /* Rehash */
        if (server.activerehashing) {
            for (j = 0; j < dbs_per_call; j++) {
                int work_done = incrementallyRehash(rehash_db);
                if (work_done) {
                    /* If the function did some work, stop here, we'll do
                     * more at the next cron loop. */
                    break;
                } else {
                    /* If this db didn't need rehash, we'll try the next one. */
                    rehash_db++;
                    rehash_db %= server.dbnum;
                }
            }
        }
    }
}
/* Try to expire a few timed out keys. The algorithm used is adaptive and
 * will use few CPU cycles if there are few expiring keys, otherwise
 * it will get more aggressive to avoid that too much memory is used by
 * keys that can be removed from the keyspace.
 *
 * Every expire cycle tests multiple databases: the next call will start
 * again from the next db, with the exception of exists for time limit: in that
 * case we restart again from the last database we were processing. Anyway
 * no more than CRON_DBS_PER_CALL databases are tested at every iteration.
 *
 * The function can perform more or less work, depending on the "type"
 * argument. It can execute a "fast cycle" or a "slow cycle". The slow
 * cycle is the main way we collect expired cycles: this happens with
 * the "server.hz" frequency (usually 10 hertz).
 *
 * However the slow cycle can exit for timeout, since it used too much time.
 * For this reason the function is also invoked to perform a fast cycle
 * at every event loop cycle, in the beforeSleep() function. The fast cycle
 * will try to perform less work, but will do it much more often.
 *
 * The following are the details of the two expire cycles and their stop
 * conditions:
 *
 * If type is ACTIVE_EXPIRE_CYCLE_FAST the function will try to run a
 * "fast" expire cycle that takes no longer than EXPIRE_FAST_CYCLE_DURATION
 * microseconds, and is not repeated again before the same amount of time.
 * The cycle will also refuse to run at all if the latest slow cycle did not
 * terminate because of a time limit condition.
 *
 * If type is ACTIVE_EXPIRE_CYCLE_SLOW, that normal expire cycle is
 * executed, where the time limit is a percentage of the REDIS_HZ period
 * as specified by the ACTIVE_EXPIRE_CYCLE_SLOW_TIME_PERC define. In the
 * fast cycle, the check of every database is interrupted once the number
 * of already expired keys in the database is estimated to be lower than
 * a given percentage, in order to avoid doing too much work to gain too
 * little memory.
 *
 * The configured expire "effort" will modify the baseline parameters in
 * order to do more work in both the fast and slow expire cycles.
 */

#define ACTIVE_EXPIRE_CYCLE_KEYS_PER_LOOP 20 /* Keys for each DB loop. */
#define ACTIVE_EXPIRE_CYCLE_FAST_DURATION 1000 /* Microseconds. */
#define ACTIVE_EXPIRE_CYCLE_SLOW_TIME_PERC 25 /* Max % of CPU to use. */
#define ACTIVE_EXPIRE_CYCLE_ACCEPTABLE_STALE 10 /* % of stale keys after which
                                                   we do extra efforts. */
void activeExpireCycle(int type) {
    /* Adjust the running parameters according to the configured expire
     * effort. The default effort is 1, and the maximum configurable effort
     * is 10. */
    unsigned long
    effort = server.active_expire_effort-1, /* Rescale from 0 to 9. */
    config_keys_per_loop = ACTIVE_EXPIRE_CYCLE_KEYS_PER_LOOP +
                           ACTIVE_EXPIRE_CYCLE_KEYS_PER_LOOP/4*effort,
    config_cycle_fast_duration = ACTIVE_EXPIRE_CYCLE_FAST_DURATION +
                                 ACTIVE_EXPIRE_CYCLE_FAST_DURATION/4*effort,
    config_cycle_slow_time_perc = ACTIVE_EXPIRE_CYCLE_SLOW_TIME_PERC +
                                  2*effort,
    config_cycle_acceptable_stale = ACTIVE_EXPIRE_CYCLE_ACCEPTABLE_STALE-
                                    effort;

    /* This function has some global state in order to continue the work
     * incrementally across calls. */
    static unsigned int current_db = 0; /* Last DB tested. */
    static int timelimit_exit = 0;      /* Time limit hit in previous call? */
    static long long last_fast_cycle = 0; /* When last fast cycle ran. */

    int j, iteration = 0;
    int dbs_per_call = CRON_DBS_PER_CALL;
    long long start = ustime(), timelimit, elapsed;

    /* When clients are paused the dataset should be static not just from the
     * POV of clients not being able to write, but also from the POV of
     * expires and evictions of keys not being performed. */
    if (clientsArePaused()) return;

    if (type == ACTIVE_EXPIRE_CYCLE_FAST) {
        /* Don't start a fast cycle if the previous cycle did not exit
         * for time limit, unless the percentage of estimated stale keys is
         * too high. Also never repeat a fast cycle for the same period
         * as the fast cycle total duration itself. */
        if (!timelimit_exit &&
            server.stat_expired_stale_perc < config_cycle_acceptable_stale)
            return;

        if (start < last_fast_cycle + (long long)config_cycle_fast_duration*2)
            return;

        last_fast_cycle = start;
    }

    /* We usually should test CRON_DBS_PER_CALL per iteration, with
     * two exceptions:
     *
     * 1) Don't test more DBs than we have.
     * 2) If last time we hit the time limit, we want to scan all DBs
     * in this iteration, as there is work to do in some DB and we don't want
     * expired keys to use memory for too much time. */
    if (dbs_per_call > server.dbnum || timelimit_exit)
        dbs_per_call = server.dbnum;

    /* We can use at max 'config_cycle_slow_time_perc' percentage of CPU
     * time per iteration. Since this function gets called with a frequency of
     * server.hz times per second, the following is the max amount of
     * microseconds we can spend in this function. */
    timelimit = config_cycle_slow_time_perc*1000000/server.hz/100;
    timelimit_exit = 0;
    if (timelimit <= 0) timelimit = 1;

    if (type == ACTIVE_EXPIRE_CYCLE_FAST)
        timelimit = config_cycle_fast_duration; /* in microseconds. */

    /* Accumulate some global stats as we expire keys, to have some idea
     * about the number of keys that are already logically expired, but still
     * existing inside the database. */
    long total_sampled = 0;
    long total_expired = 0;

    for (j = 0; j < dbs_per_call && timelimit_exit == 0; j++) {
        /* Expired and checked in a single loop. */
        unsigned long expired, sampled;

        redisDb *db = server.db+(current_db % server.dbnum);

        /* Increment the DB now so we are sure if we run out of time
         * in the current DB we'll restart from the next. This allows to
         * distribute the time evenly across DBs. */
        current_db++;

        /* Continue to expire if at the end of the cycle more than 25%
         * of the keys were expired. */
        do {
            unsigned long num, slots;
            long long now, ttl_sum;
            int ttl_samples;
            iteration++;

            /* If there is nothing to expire try next DB ASAP. */
            if ((num = dictSize(db->expires)) == 0) {
                db->avg_ttl = 0;
                break;
            }
            slots = dictSlots(db->expires);
            now = mstime();

            /* When there are less than 1% filled slots, sampling the key
             * space is expensive, so stop here waiting for better times...
             * The dictionary will be resized asap. */
            if (num && slots > DICT_HT_INITIAL_SIZE &&
                (num*100/slots < 1)) break;

            /* The main collection cycle. Sample random keys among keys
             * with an expire set, checking for expired ones. */
            expired = 0;
            sampled = 0;
            ttl_sum = 0;
            ttl_samples = 0;

            if (num > config_keys_per_loop)
                num = config_keys_per_loop;

            /* Here we access the low level representation of the hash table
             * for speed concerns: this makes this code coupled with dict.c,
             * but it hardly changed in ten years.
             *
             * Note that certain places of the hash table may be empty,
             * so we want also a stop condition about the number of
             * buckets that we scanned. However scanning for free buckets
             * is very fast: we are in the cache line scanning a sequential
             * array of NULL pointers, so we can scan a lot more buckets
             * than keys in the same time. */
            long max_buckets = num*20;
            long checked_buckets = 0;

            while (sampled < num && checked_buckets < max_buckets) {
                for (int table = 0; table < 2; table++) {
                    if (table == 1 && !dictIsRehashing(db->expires)) break;

                    unsigned long idx = db->expires_cursor;
                    idx &= db->expires->ht[table].sizemask;
                    dictEntry *de = db->expires->ht[table].table[idx];
                    long long ttl;

                    /* Scan the current bucket of the current table. */
                    checked_buckets++;
                    while(de) {
                        /* Get the next entry now since this entry may get
                         * deleted. */
                        dictEntry *e = de;
                        de = de->next;

                        ttl = dictGetSignedIntegerVal(e)-now;
                        if (activeExpireCycleTryExpire(db,e,now)) expired++;
                        if (ttl > 0) {
                            /* We want the average TTL of keys yet
                             * not expired. */
                            ttl_sum += ttl;
                            ttl_samples++;
                        }
                        sampled++;
                    }
                }
                db->expires_cursor++;
            }
            total_expired += expired;
            total_sampled += sampled;

            /* Update the average TTL stats for this database. */
            if (ttl_samples) {
                long long avg_ttl = ttl_sum/ttl_samples;

                /* Do a simple running average with a few samples.
                 * We just use the current estimate with a weight of 2%
                 * and the previous estimate with a weight of 98%. */
                if (db->avg_ttl == 0) db->avg_ttl = avg_ttl;
                db->avg_ttl = (db->avg_ttl/50)*49 + (avg_ttl/50);
            }

            /* We can't block forever here even if there are many keys to
             * expire. So after a given amount of milliseconds return to the
             * caller waiting for the other active expire cycle. */
            if ((iteration & 0xf) == 0) { /* check once every 16 iterations. */
                elapsed = ustime()-start;
                if (elapsed > timelimit) {
                    timelimit_exit = 1;
                    server.stat_expired_time_cap_reached_count++;
                    break;
                }
            }
            /* We don't repeat the cycle for the current database if there are
             * an acceptable amount of stale keys (logically expired but yet
             * not reclaimed). */
        } while ((expired*100/sampled) > config_cycle_acceptable_stale);
    }

    elapsed = ustime()-start;
    server.stat_expire_cycle_time_used += elapsed;
    latencyAddSampleIfNeeded("expire-cycle",elapsed/1000);

    /* Update our estimate of keys existing but yet to be expired.
     * Running average with this sample accounting for 5%. */
    double current_perc;
    if (total_sampled) {
        current_perc = (double)total_expired/total_sampled;
    } else
        current_perc = 0;
    server.stat_expired_stale_perc = (current_perc*0.05)+
                                     (server.stat_expired_stale_perc*0.95);
}

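Periodic cleanup comes in two types, fast and slow, invoked from beforeSleep and databasesCron respectively. The fast cycle has two limits: its run time is capped by ACTIVE_EXPIRE_CYCLE_FAST_DURATION, and the interval between two fast cycles is at least twice ACTIVE_EXPIRE_CYCLE_FAST_DURATION. Both are scaled by the configured server.active_expire_effort parameter, which defaults to 1 and has a maximum of 10: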

config_cycle_fast_duration = ACTIVE_EXPIRE_CYCLE_FAST_DURATION +
                             ACTIVE_EXPIRE_CYCLE_FAST_DURATION/4*effort
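As a quick worked example of the fast-cycle formula, the sketch below recomputes the time budget and the minimum gap between two fast cycles for a given effort. The helper names `fast_cycle_duration_us` and `fast_cycle_min_gap_us` are hypothetical; the constant and the arithmetic, including the integer division and the rescale from 1..10 to 0..9, are copied from the quoted activeExpireCycle.

```c
#include <assert.h>

#define ACTIVE_EXPIRE_CYCLE_FAST_DURATION 1000 /* Microseconds. */

/* Fast-cycle time budget in microseconds for a configured effort (1..10).
 * Hypothetical helper; the formula is the one from activeExpireCycle(). */
static unsigned long fast_cycle_duration_us(unsigned long configured_effort) {
    assert(configured_effort >= 1 && configured_effort <= 10);
    unsigned long effort = configured_effort - 1; /* Rescale from 0 to 9. */
    return ACTIVE_EXPIRE_CYCLE_FAST_DURATION +
           ACTIVE_EXPIRE_CYCLE_FAST_DURATION / 4 * effort;
}

/* Minimum time between the starts of two fast cycles: twice the budget,
 * matching the "last_fast_cycle + duration*2" check in the source. */
static unsigned long fast_cycle_min_gap_us(unsigned long configured_effort) {
    return 2 * fast_cycle_duration_us(configured_effort);
}
```

With the default effort of 1 the budget is 1000 microseconds and the gap 2000 microseconds. At the maximum effort of 10 they grow to 3250 and 6500 microseconds.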

Next, it samples a certain number of keys that have an expire time (stored in expires) from a certain number of dbs. The number of keys sampled per loop is controlled by

config_keys_per_loop = ACTIVE_EXPIRE_CYCLE_KEYS_PER_LOOP +
                       ACTIVE_EXPIRE_CYCLE_KEYS_PER_LOOP/4*effort

and the time limit of the slow cycle is

config_cycle_slow_time_perc = ACTIVE_EXPIRE_CYCLE_SLOW_TIME_PERC +
                              2*effort
timelimit = config_cycle_slow_time_perc*1000000/server.hz/100;

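There is one more exit condition: if the sample taken from the current database shows that the share of expired keys is already within the percentage we tolerate, the loop stops working on the current db and moves on to the next one.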
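The remaining effort-scaled parameters and the exit condition can be recomputed in the same way. This is a sketch under assumptions: the struct `expire_params` and the helpers `derive_expire_params` and `should_resample_db` are hypothetical names, while the constants and formulas come from the quoted source. The quoted loop divides by `sampled` without a guard, so this sketch simply requires `sampled > 0`.

```c
#include <assert.h>
#include <stdbool.h>

#define ACTIVE_EXPIRE_CYCLE_KEYS_PER_LOOP 20    /* Keys for each DB loop. */
#define ACTIVE_EXPIRE_CYCLE_SLOW_TIME_PERC 25   /* Max % of CPU to use. */
#define ACTIVE_EXPIRE_CYCLE_ACCEPTABLE_STALE 10 /* % of stale keys tolerated. */

/* Hypothetical container for the derived values. */
typedef struct {
    unsigned long keys_per_loop;    /* keys sampled per db loop */
    unsigned long slow_time_perc;   /* % of one hz period for the slow cycle */
    unsigned long acceptable_stale; /* tolerated % of expired keys in a sample */
    long long slow_timelimit_us;    /* slow-cycle time limit in microseconds */
} expire_params;

/* Derive the parameters for a configured effort (1..10) and a server hz. */
static expire_params derive_expire_params(unsigned long configured_effort,
                                          int hz) {
    assert(configured_effort >= 1 && configured_effort <= 10);
    assert(hz > 0);
    unsigned long effort = configured_effort - 1; /* Rescale from 0 to 9. */
    expire_params p;
    p.keys_per_loop = ACTIVE_EXPIRE_CYCLE_KEYS_PER_LOOP +
                      ACTIVE_EXPIRE_CYCLE_KEYS_PER_LOOP / 4 * effort;
    p.slow_time_perc = ACTIVE_EXPIRE_CYCLE_SLOW_TIME_PERC + 2 * effort;
    p.acceptable_stale = ACTIVE_EXPIRE_CYCLE_ACCEPTABLE_STALE - effort;
    p.slow_timelimit_us = (long long)p.slow_time_perc * 1000000 / hz / 100;
    if (p.slow_timelimit_us <= 0) p.slow_timelimit_us = 1;
    return p;
}

/* The do/while condition: sample the same db again only while the share
 * of expired keys in the last sample exceeds the tolerated percentage. */
static bool should_resample_db(unsigned long expired, unsigned long sampled,
                               unsigned long acceptable_stale) {
    assert(sampled > 0); /* assumption of this sketch, see the lead-in */
    return expired * 100 / sampled > acceptable_stale;
}
```

For example, with the default effort and hz = 10 the slow cycle may use 25000 microseconds per call and samples 20 keys per loop. If 3 of those 20 are expired (15%), the same db is sampled again. If only 2 are (10%), the loop moves on to the next db.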

A short note on using Comparator

Posted on 2020-04-05, filed under Java, Collections

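Before Java 8 streams, sorting objects usually meant either having the class implement the Comparable interface or writing your own Comparator.

For example, like this.

My object is Entity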

public class Entity {

    private Long id;

    private Long sortValue;

    public Long getId() {
        return id;
    }

    public void setId(Long id) {
        this.id = id;
    }

    public Long getSortValue() {
        return sortValue;
    }

    public void setSortValue(Long sortValue) {
        this.sortValue = sortValue;
    }
}

Comparator

public class MyComparator implements Comparator<Entity> {
    @Override
    public int compare(Entity e1, Entity e2) {
        // Unboxing a null sortValue here throws a NullPointerException
        if (e1.getSortValue() < e2.getSortValue()) {
            return -1;
        } else if (e1.getSortValue().equals(e2.getSortValue())) {
            return 0;
        } else {
            return 1;
        }
    }
}

Sorting code

private static MyComparator myComparator = new MyComparator();

public static void main(String[] args) {
    List<Entity> list = new ArrayList<Entity>();
    Entity e1 = new Entity();
    e1.setId(1L);
    e1.setSortValue(1L);
    list.add(e1);
    Entity e2 = new Entity();
    e2.setId(2L);
    e2.setSortValue(null);
    list.add(e2);
    // e2 has a null sort value, so MyComparator throws a NullPointerException here
    Collections.sort(list, myComparator);
}

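Notice that e2's sort value is null here. For the Comparator to run correctly it would need null checks and the like. I have two requirements: I don't want to write this MyComparator at all, and it is not easy to filter the entries with a null sort value out of the list first. So is there a way to solve this? I have to say Java is really strong in this area: Comparator.nullsFirst wraps another comparator and takes care of the nulls.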

Let's look at how nullsFirst is implemented

final static class NullComparator<T> implements Comparator<T>, Serializable {
    private static final long serialVersionUID = -7569533591570686392L;
    private final boolean nullFirst;
    // if null, non-null Ts are considered equal
    private final Comparator<T> real;

    @SuppressWarnings("unchecked")
    NullComparator(boolean nullFirst, Comparator<? super T> real) {
        this.nullFirst = nullFirst;
        this.real = (Comparator<T>) real;
    }

    @Override
    public int compare(T a, T b) {
        if (a == null) {
            return (b == null) ? 0 : (nullFirst ? -1 : 1);
        } else if (b == null) {
            return nullFirst ? 1: -1;
        } else {
            return (real == null) ? 0 : real.compare(a, b);
        }
    }
}

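The core is the compare method above: it simply does the null checks we would otherwise have to write ourselves. Pretty convenient, isn't it? Just a quick note.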

A small echo trick discovered while using docker, and other notes

Posted on 2020-03-29, updated on 2022-06-20, filed under Linux, Docker, Commands, echo, Distributions

A practical echo trick

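I have been working on a docker series lately and often need to get inside a container. As described in the previous post, these images are generally built on a Linux base image such as ubuntu or alpine. If the package source was not replaced when the image was built, then because of the special network situation, installing an editor like vim inside the container just to edit something is painfully slow, like the "two thousand years later~" card in videos. Changing the source configuration inside the container also needs an editor, so you end up in a chicken-and-egg deadlock.

A Linux expert probably has ten thousand ways to solve this, but a novice like me only came up with this crude method: first cp sources.list to keep a backup, then echo "xxx" > sources.list. That runs into a problem: sources.list usually has more than one line, and a plain echo does not interpret the line breaks. However, echo does support the "\n" escape if you add -e. Have a look at the explanation and examples; I used tldr here, which can be installed with npm install -g tldr, or you can simply use man or --help to see the usage.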

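Checking the image's base distribution

One more point: to install vim or the like at this stage, you need to know which base image you are on. The uname command actually shows the host's system, so use cat /etc/issue instead.

Just noting it down here.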

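Finding a package mirror

The most widely used package mirror in China right now is the Aliyun mirror, but I would also recommend the Tsinghua mirror, the USTC mirror, and the ZJU mirror. That last one is a shameless plug for my alma mater's mirror; it is not very complete yet, so stay tuned.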