2014年12月17日星期三
Cassandra and Aerospike, the one million TPS war
Dec 2nd, 2011, Netflix first put one million writes on 288 EC2 m1 extra large nodes, triple-replicate. $0.078 per million writes.
The wars began!
http://techblog.netflix.com/2011/11/benchmarking-cassandra-scalability-on.html
July 19th, 2012, Netflix tested Cassandra with 12 SSD-based EC2 nodes, triple-replicate. $0.013 per million writes.
http://techblog.netflix.com/2012/07/benchmarking-high-performance-io-with.html
March 20th, 2014. 330 Cassandra nodes with Google's GCE. No hardware spec.
http://googlecloudplatform.blogspot.sg/2014/03/cassandra-hits-one-million-writes-per-second-on-google-compute-engine.html
December 3rd, 2014. Aerospike made it with 50 nodes. $0.01 per million writes.
http://www.aerospike.com/blog/1m-wps-6x-fewer-servers-than-cassandra
To be continued...
2014年8月5日星期二
MySQL in sub query performance pitfall
假如Account表有一个主键UUID, 那么下面这个SQL明显是应该很快返回的对吧
mysql> EXPLAIN select * from ACCOUNT WHERE UUID IN('43d92fe6-2379-4697-9ad9-74e14bb6e3e7');
+----+-------------+---------+-------+---------------+---------+---------+-------+------+-------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+---------+-------+---------------+---------+---------+-------+------+-------+
| 1 | SIMPLE | ACCOUNT | const | PRIMARY | PRIMARY | 110 | const | 1 | |
+----+-------------+---------+-------+---------------+---------+---------+-------+------+-------+
1 row in set (0.01 sec)
加入我要去join另外一个表ACCOUNT_LINK, UUID在这张表上也是有索引的, 那么也应该很快吧?
mysql> EXPLAIN select * from ACCOUNT INNER JOIN ACCOUNT_LINK ON ACCOUNT.UUID=ACCOUNT_LINK.UUID WHERE AUTH_DOMAIN='snda' and ACCOUNT_LINK.LOGINNAME='danieltest001';
+----+-------------+--------------+--------+-----------------------------------------------------+---------------+---------+---------------------------+------+-------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+--------------+--------+-----------------------------------------------------+---------------+---------+---------------------------+------+-------------+
| 1 | SIMPLE | ACCOUNT_LINK | ref | PRIMARY,idx_loginname,IDX_ACC_LINK_DOMAIN_LOGINNAME | idx_loginname | 272 | const | 1 | Using where |
| 1 | SIMPLE | ACCOUNT | eq_ref | PRIMARY | PRIMARY | 110 | account.ACCOUNT_LINK.UUID | 1 | |
+----+-------------+--------------+--------+-----------------------------------------------------+---------------+---------+---------------------------+------+-------------+
2 rows in set (0.05 sec)
如果我只是把Join改成IN, 也应该很快吧? 因为sub query 只返回一条记录, 区分度非常高, 外面的查询正好又可以用索引. 但是, MySQL会先查外面的表, 太逆天了吧?
mysql> EXPLAIN select * from ACCOUNT WHERE UUID IN(select UUID from ACCOUNT_LINK WHERE AUTH_DOMAIN='snda' and LOGINNAME='danieltest001');
+----+--------------------+--------------+-----------------+-----------------------------------------------------+---------+---------+------------+---------+-------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+--------------------+--------------+-----------------+-----------------------------------------------------+---------+---------+------------+---------+-------------+
| 1 | PRIMARY | ACCOUNT | ALL | NULL | NULL | NULL | NULL | 8493047 | Using where |
| 2 | DEPENDENT SUBQUERY | ACCOUNT_LINK | unique_subquery | PRIMARY,idx_loginname,IDX_ACC_LINK_DOMAIN_LOGINNAME | PRIMARY | 292 | func,const | 1 | Using where |
+----+--------------------+--------------+-----------------+-----------------------------------------------------+---------+---------+------------+---------+-------------+
2 rows in set (0.00 sec)
订阅:
博文 (Atom)