MySQL索引对NULL值的处理

谁践踏了优雅 2021-06-22 15:37 865阅读 0赞
  1. # 索引不会包含有NULL值的列
  2. 只要列中包含有NULL值都将不会被包含在索引中,复合索引中只要有一列含有NULL值,那么这一列对于此复合索引就是无效的。所以我们在数据库设计时不要让字段的默认值为NULL

在很多库表设计规范、某某军规的文章中,是不是经常会看到类似这样的内容。小编也经常看到这样的内容,并且在编写规范的时候,准备也把这一条加进去。但在按部就班之余,小编抽空验证了一下,发现事实却并非如此!

小编使用的MySQL版本是社区版 5.7.21

新建测试表 t1,插入不含NULL值得100行数据,然后插入1行带NULL的数据 insert into t1(id) values(101); 表中有主键id,索引a

  1. CREATE TABLE `t1` (
  2. `id` int(11) NOT NULL,
  3. `a` int(11) DEFAULT NULL,
  4. `b` int(11) DEFAULT NULL,
  5. PRIMARY KEY (`id`),
  6. KEY `a` (`a`)
  7. ) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4

测试1,包含NULL单列索引的查询,可以看到即使是查找 IS NULL的行,也是可以用上索引的

  1. 测试1
  2. desc select * from t1 where a > 82;
  3. +----+-------------+-------+------------+-------+---------------+------+---------+------+------+----------+-----------------------+
  4. | id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
  5. +----+-------------+-------+------------+-------+---------------+------+---------+------+------+----------+-----------------------+
  6. | 1 | SIMPLE | t1 | NULL | range | a | a | 5 | NULL | 18 | 100.00 | Using index condition |
  7. +----+-------------+-------+------------+-------+---------------+------+---------+------+------+----------+-----------------------+
  8. desc select * from t1 where a is NULL;
  9. +----+-------------+-------+------------+------+---------------+------+---------+-------+------+----------+-----------------------+
  10. | id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
  11. +----+-------------+-------+------------+------+---------------+------+---------+-------+------+----------+-----------------------+
  12. | 1 | SIMPLE | t1 | NULL | ref | a | a | 5 | const | 1 | 100.00 | Using index condition |
  13. +----+-------------+-------+------------+------+---------------+------+---------+-------+------+----------+-----------------------+
  14. desc select * from t1 where a = 20 or a is null;
  15. +----+-------------+-------+------------+-------------+---------------+--------+---------+-------+------+----------+--------------------------+
  16. | id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
  17. +----+-------------+-------+------------+-------------+---------------+--------+---------+-------+------+----------+--------------------------+
  18. | 1 | SIMPLE | t1 | NULL | ref_or_null | idx_ab | idx_ab | 5 | const | 2 | 100.00 | Using where; Using index |
  19. +----+-------------+-------+------------+-------------+---------------+--------+---------+-------+------+----------+--------------------------+

注意对 NULL 值的检索只能使用 is null / is not null / <=>,不能使用=,<,>这样的运算符(mysql中可以用a <=> NULL 表示查找 a is NULL’的行)

测试2,包含NULL复合索引的查询,首先加一个复合索引 alter table t1 drop index a,add index idx_ab(a,b); 可以看到不管是指定 a is null ,或者指定 b is null ,都可以利用上索引 idx_ab(key_len 可以看出)

  1. 测试2
  2. desc select * from t1 where a=50 and b>20;
  3. +----+-------------+-------+------------+-------+---------------+--------+---------+------+------+----------+--------------------------+
  4. | id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
  5. +----+-------------+-------+------------+-------+---------------+--------+---------+------+------+----------+--------------------------+
  6. | 1 | SIMPLE | t1 | NULL | range | idx_ab | idx_ab | 10 | NULL | 1 | 100.00 | Using where; Using index |
  7. +----+-------------+-------+------------+-------+---------------+--------+---------+------+------+----------+--------------------------+
  8. desc select * from t1 where a=50 and b is null;
  9. +----+-------------+-------+------------+------+---------------+--------+---------+-------------+------+----------+--------------------------+
  10. | id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
  11. +----+-------------+-------+------------+------+---------------+--------+---------+-------------+------+----------+--------------------------+
  12. | 1 | SIMPLE | t1 | NULL | ref | idx_ab | idx_ab | 10 | const,const | 1 | 100.00 | Using where; Using index |
  13. +----+-------------+-------+------------+------+---------------+--------+---------+-------------+------+----------+--------------------------+
  14. desc select * from t1 where a is null and b>20;
  15. +----+-------------+-------+------------+-------+---------------+--------+---------+------+------+----------+--------------------------+
  16. | id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
  17. +----+-------------+-------+------------+-------+---------------+--------+---------+------+------+----------+--------------------------+
  18. | 1 | SIMPLE | t1 | NULL | range | idx_ab | idx_ab | 10 | NULL | 1 | 100.00 | Using where; Using index |
  19. +----+-------------+-------+------------+-------+---------------+--------+---------+------+------+----------+--------------------------+
  20. desc select * from t1 where a is null and b is null;
  21. +----+-------------+-------+------------+------+---------------+--------+---------+-------------+------+----------+--------------------------+
  22. | id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
  23. +----+-------------+-------+------------+------+---------------+--------+---------+-------------+------+----------+--------------------------+
  24. | 1 | SIMPLE | t1 | NULL | ref | idx_ab | idx_ab | 10 | const,const | 1 | 100.00 | Using where; Using index |
  25. +----+-------------+-------+------------+------+---------------+--------+---------+-------------+------+----------+--------------------------+

由此,只要列中包含有NULL值都将不会被包含在索引中,复合索引中只要有一列含有NULL值,那么这一列对于此复合索引就是无效的。所以我们在数据库设计时不要让字段的默认值为NULL。这句的前半句是不对的(可参考官网说明: https://dev.mysql.com/doc/refman/5.7/en/is-null-optimization.html),但是后半句的结论确是可以采纳的。

虽然MySQL可以在含有null的列上使用索引,但不代表null和其他数据在索引中是一样的。不建议列上允许为空,最好限制 not null ,并设置一个默认值,比如0和’’空字符串等,如果是datetime类型,可以设置成’1970-01-01 00:00:00’这样的值。对MySQL来说,null 是一个特殊的值,Conceptually, NULL means “a missing unknown value” and it is treated somewhat differently from other values。 对null做算术运算的结果都是null,count时不会包括null行,null 比空字符串需要更多的存储空间等。

附:上面说到可用通过 key_len 看出使用了索引列的个数,a,b 都是 int 类型,4 byte,为什么 key_len 是 5 byte 和 10 byte 呢?是因为如果索引列定义时允许NULL,其key_len还需要再加 1 bytes. 参考好友王的文章,可以移步我们的站点查看详情: http://www.fordba.com/spend-10-min-to-understand-how-mysql-use-index.html

发表评论

表情:
评论列表 (有 0 条评论,865人围观)

还没有评论,来说两句吧...

相关阅读