Wednesday, December 17, 2014
Cassandra and Aerospike, the one million TPS war
Dec 2nd, 2011: Netflix first hit one million writes per second on 288 EC2 m1.xlarge nodes with triple replication, at $0.078 per million writes.
The war began!
http://techblog.netflix.com/2011/11/benchmarking-cassandra-scalability-on.html
July 19th, 2012: Netflix ran Cassandra on 12 SSD-based EC2 nodes with triple replication, at $0.013 per million writes.
http://techblog.netflix.com/2012/07/benchmarking-high-performance-io-with.html
March 20th, 2014: 330 Cassandra nodes on Google Compute Engine. No hardware spec was published.
http://googlecloudplatform.blogspot.sg/2014/03/cassandra-hits-one-million-writes-per-second-on-google-compute-engine.html
December 3rd, 2014: Aerospike matched it with only 50 nodes, at $0.01 per million writes.
http://www.aerospike.com/blog/1m-wps-6x-fewer-servers-than-cassandra
To be continued...
Tuesday, August 5, 2014
MySQL IN-subquery performance pitfall
Friday, July 26, 2013
Disabling the press-and-hold accent popup in some apps since Mac OS X Lion
Since Lion, holding down a key such as "a" pops up a Latin accent picker in some applications instead of repeating the key. For my purposes it has to be disabled globally:
sudo defaults write -g ApplePressAndHoldEnabled -bool false
This feature is extremely annoying.
Monday, October 8, 2012
Notes on Cassandra operations issues
1. ParNew GC Pause
Three servers frequently produced SocketTimeouts. The logs showed a ParNew GC after every compaction, the longest around 6 seconds; even without compaction there was a ParNew GC every 5-10 minutes on average, though those were relatively short, around 300-500 ms. On the other nodes, a ParNew GC above 200 ms appeared only once every month or two.
ParNew is a stop-the-world GC: when young objects are promoted to the old generation, it pauses the whole VM for a moment. The 6-second worst-case pauses caused SocketTimeouts in the front-end Hector clients.
jmap showed the heap layout, and the startup script confirmed it: a 4 GB heap with a young generation of 200 MB * cores = 800 MB and SurvivorRatio=8, i.e. eden = 640 MB and two 80 MB survivor spaces. Cassandra's startup script sizes the young generation this way to reduce promotion and avoid the more costly frequent old-generation GCs.
Watching GC activity over JMX with jmanage, the heap oscillated between 1.3 GB and 2.6 GB every 5 minutes. These are fairly large young-generation collections, which is why ParNew GCs above 200 ms were so frequent; the heaps of the other nodes did not oscillate nearly as violently.
Digging around turned up that jdk6_u06 can only copy 16 MB at a time when promoting from young to old, a limit removed only after u22. Re-checking production showed the init.d startup script was wrong: the three problematic nodes had been started with jdk6_u06, and promoting out of a young generation this large 16 MB at a time is bound to cause trouble. After switching the startup scripts on the three nodes to jdk7 and restarting, two of the three were fine, but one still GC'd frequently.
For heaps under 32 GB, JDK 7 automatically compresses 64-bit pointers to 32 bits, while some JDK 6 builds need an extra flag to turn this on. Compressed pointers shrink the heap substantially, with two benefits:
1. Memory (DRAM) is cheap, but cache (SRAM) is expensive. Compressed pointers cost some encode/decode work, but they let the L1/L2 caches hold more pointers; if, as the official docs suggest, pointer size is halved, that effectively doubles the L1/L2 cache capacity for pointers.
2. Less pressure on the memory bus. NUMA is already far better than SMP's shared global data bus, but cross-node memory access still consumes bus bandwidth, and compressed pointers consume less of it.
The last problematic node turned out to be old hardware: a 2007-era E5405 processor that does not even support NUMA under CentOS 5. The only option left was VM tuning to lower memory usage and thereby reduce GC activity, so two String-related VM flags were added:
JVM_OPTS="$JVM_OPTS -XX:+UseStringCache"
JVM_OPTS="$JVM_OPTS -XX:+OptimizeStringConcat"
After adding these two flags and restarting, no ParNew GC above 200 ms ever appeared again.
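Pauses like these can also be monitored without trawling GC logs: each collector's cumulative counters are exposed over JMX. A minimal sketch using only the standard java.lang.management API (the class name is my own; a bean named "ParNew" only exists when the JVM is configured with the ParNew/CMS collectors):

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

public class GcWatch {
    public static void main(String[] args) {
        // Print the cumulative collection count and accumulated pause time for
        // every collector in this JVM; on a CMS-configured JVM one of the
        // beans is named "ParNew".
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            System.out.printf("%s: %d collections, %d ms total%n",
                    gc.getName(), gc.getCollectionCount(), gc.getCollectionTime());
        }
    }
}
```

Polling this periodically and diffing the counters gives per-interval pause totals, roughly what jmanage displays.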
2. Dead node appears in the ring
One node had a motherboard failure, so we decommissioned it and removed its token. That was fine at the time, but later the dead node reappeared in the ring whenever other nodes were restarted. This looks like a minor Cassandra bug. The fix is brutal: modify the Gossiper state directly over JMX. See http://mail-archives.apache.org/mod_mbox/cassandra-user/201206.mbox/%3C02CB6332-9EF7-434F-96EE-80F93CC5EB8D@hibnet.org%3E
Things to try next time:
1. How efficient is the G1 GC?
2. How much does heap usage drop with JNA enabled?
Friday, May 11, 2012
A very handy alias for REST web service development on Mac/Linux
It is just one line:
alias jsoncat='python -c "import sys, json; print(json.dumps(json.load(sys.stdin), sort_keys=True, ensure_ascii=False, indent=4))"'
Then you can do:
Daniel-Wus-iMac:~ danielwu$ echo '{"a":"b"}' | jsoncat
{
"a": "b"
}
Thursday, April 28, 2011
A quick map access benchmark
- | Single thread | 1000 threads |
Erlang ETS | 3+ seconds | 2+ seconds |
Java HashMap | 0.18 seconds | N/A (not thread-safe) |
Java ConcurrentHashMap | 0.41 seconds | 0.56 seconds |
High-scale lib NonBlockingHashMap | 0.43 seconds | 0.51 seconds |
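A harness in the spirit of the single-thread test can be sketched as below (class and method names are my own; a serious microbenchmark should use a framework such as JMH to handle warm-up and dead-code elimination, so treat the absolute numbers as rough):

```java
import java.util.concurrent.ConcurrentHashMap;

public class MapBench {
    // Time n reads against the map and return the elapsed nanoseconds.
    static long timeReads(ConcurrentHashMap<Integer, Integer> map, int n) {
        long start = System.nanoTime();
        long sum = 0;
        for (int i = 0; i < n; i++) {
            sum += map.get(i % map.size());
        }
        long elapsed = System.nanoTime() - start;
        if (sum == Long.MIN_VALUE) System.out.println(sum); // defeat dead-code elimination
        return elapsed;
    }

    public static void main(String[] args) {
        ConcurrentHashMap<Integer, Integer> map = new ConcurrentHashMap<>();
        for (int i = 0; i < 100_000; i++) map.put(i, i);
        System.out.printf("100k reads took %.2f ms%n", timeReads(map, 100_000) / 1e6);
    }
}
```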
Wednesday, April 13, 2011
json java libraries benchmark - jsonlib, jackson, gson
Because I don't parse huge JSON strings, I only tested object binding, which is much more convenient. For the same reason jettison and gson-streaming are not in this test: they are much faster, but at the cost of a lower level of encapsulation and OOP.
For Gson and Jackson, the Gson and ObjectMapper instances can be reused. Jackson performs very well because its ObjectMapper caches class-mapping metadata; it is designed for reuse and will be slow if you create a new ObjectMapper each time.
The Scala built-in parser is extremely slow (100x slower!), and lift-json, the de facto library for Scala, is about 15% slower than Jackson.
- | Bean to JSON | JSON to Bean |
Jsonlib 2.4 | 2058 | 3055 |
Gson 1.7 | 1481 | 1472 |
Jackson 1.7.6 | 694 | 667 |
Test env: jdk6 on Mac OS X 10.6.7
unit: milliseconds
Conclusion: Jackson is the best choice for the server side because of its caching. Another win for Jackson is that it has zero dependencies; jsonlib loses badly on both performance and dependencies. Gson is also very fast with zero dependencies. Without a cache it shines in ad-hoc use, so it is best for lighter JSON workloads such as an Android device or a Web Start applet (only 170 KB).
Friday, May 7, 2010
Fix eclipse corrupt index
Close Eclipse, delete the saved index files below (paths from my workspace), then restart:
C:\pogo\p4\.metadata\.plugins\org.eclipse.jdt.core\savedIndexNames.txt
C:\pogo\p4\.metadata\.plugins\org.eclipse.jdt.core\*.index
Monday, April 19, 2010
Syncing Squirrel SQL Client configuration via Dropbox
C:\Documents and Settings\<username>\.squirrel-sql
Back this directory up first, then copy it into your Dropbox folder; mine, for example, is "C:\Documents and Settings\danielwu\Desktop\My Dropbox\Config Files".
In the Squirrel SQL Client startup script, add one parameter where java is finally launched:
start "Squirrel SQL Client" ...... -Duser.home="C:\Documents and Settings\danielwu\Desktop\My Dropbox\Config Files" ......
Start Squirrel SQL and check that it still works. If everything is fine, change the script the same way on every machine you want to keep in sync; once Dropbox finishes syncing, you are done.
Sunday, April 4, 2010
Uninstalling the Microsoft Pinyin 2007 IME
1. MsiExec.exe /X{90120000-0028-0804-0000-0000000FF1CE}
2. Remove mspy 3 in Control Panel
3. In the registry, delete the entries under Windows/Current_Version/IME
Then reboot.
Tuesday, June 23, 2009
Oracle EM Console fails to start after a hostname change
emca -config dbcontrol db
Thursday, May 14, 2009
Three common Oracle join methods
Indexed Nested Loops
The Nested Loops join is an iterative join: for each row in the first (outer, or driving) row source, look up matching rows in the second (inner) row source. If the lookup on the second row source performs a Unique or Range Index Scan, we call this Indexed Nested Loops.
Indexed Nested Loops is used primarily in low volume joins; it is efficient over small volumes and versatile enough to be used in a variety of situations. Although it is fully scalable, Indexed Nested Loops is inefficient over large data volumes.
Hash Join
The hash join is used for high-volume equi-joins (joins with equals predicates). Oracle performs a single read of the smaller row source (call this T1) and builds a hash table in memory. The join key is used as the hash-key of the hash table. Then a single pass of the larger row source (call this T2) is performed, hashing the join key of each row to obtain an address in the hash table where it will find matching T1 rows.
Provided T1 remains small enough to build the hash table in memory, T2 can be scaled up to any arbitrarily large volume without affecting throughput or exceeding temp space. If T1 cannot be hashed in memory, then a portion of the hash-table spills to disk. When the hash table is probed by T2, the rows with join keys that match those parts of the in-memory hash table are joined immediately; the rest are written to TEMP and joined in a second pass. The bigger T1 is, the smaller the proportion of the hash table that can fit in memory, and the larger the proportion of T2 that must be scanned twice. This slows the Hash Join down considerably and also makes the join non-scalable.
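The build-then-probe shape described above can be sketched in plain Java (a toy in-memory version with hypothetical names: rows are int[] of {key, value}, t1 plays the small build side, t2 the large probe side):

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class HashJoin {
    // Build a hash table over the smaller source t1 keyed on the join key,
    // then make a single pass over the larger source t2, probing the table.
    static List<int[]> join(List<int[]> t1, List<int[]> t2) {
        Map<Integer, List<int[]>> build = new HashMap<>();
        for (int[] row : t1) {
            build.computeIfAbsent(row[0], k -> new ArrayList<>()).add(row);
        }
        List<int[]> result = new ArrayList<>();
        for (int[] probe : t2) {
            for (int[] match : build.getOrDefault(probe[0], List.of())) {
                result.add(new int[]{match[0], match[1], probe[1]}); // {key, t1.value, t2.value}
            }
        }
        return result;
    }
}
```

The build table is read once and held in memory, so as long as t1 fits, t2 can grow arbitrarily large while still needing only a single sequential pass — the scalability property described above.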
Sort-Merge
A Sort-Merge join reads each row source in the join separately, sorts both result sets on the join column(s), then works through the two sorted lists concurrently, joining rows with matching keys. Sort-Merge is generally faster than Indexed Nested Loops but slower than a Hash Join for equi-joins. It is used almost exclusively for non-equi joins (>, <, BETWEEN) and occasionally when one of the row sources is pre-sorted (e.g. a GROUP BY inline view).
If both row sources are small they may both be sorted in memory, but large sorts spill to disk, making them non-scalable.
There is no way to make a Sort-Merge join scalable. The only other way to resolve a non-equi join is Nested Loops, which is slower. As volumes increase, Sort-Merge will continue to out-perform Nested Loops, but will eventually run out of temp space. The only solutions are to extend TEMP, or convert the join to Nested Loops (and then wait).
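The sort-then-concurrent-walk can be sketched the same way (again a toy with hypothetical names; rows are int[] of {key, value}):

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

public class SortMergeJoin {
    // Sort both row sources on the join key, then walk the two sorted lists
    // in lock-step, emitting a row for every pair of matching keys.
    static List<int[]> join(List<int[]> left, List<int[]> right) {
        List<int[]> l = new ArrayList<>(left);
        List<int[]> r = new ArrayList<>(right);
        l.sort(Comparator.comparingInt(row -> row[0]));
        r.sort(Comparator.comparingInt(row -> row[0]));
        List<int[]> out = new ArrayList<>();
        int i = 0, j = 0;
        while (i < l.size() && j < r.size()) {
            int cmp = Integer.compare(l.get(i)[0], r.get(j)[0]);
            if (cmp < 0) {
                i++;
            } else if (cmp > 0) {
                j++;
            } else {
                // Emit every right row sharing this key for the current left row;
                // j itself stays put so duplicate left keys see the same block.
                int key = l.get(i)[0];
                for (int j2 = j; j2 < r.size() && r.get(j2)[0] == key; j2++) {
                    out.add(new int[]{key, l.get(i)[1], r.get(j2)[1]});
                }
                i++;
            }
        }
        return out;
    }
}
```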
Monday, March 23, 2009
Don't cache a Singleton in a serializable object
If one of your object's fields holds a reference to a singleton, then after that object is serialized and read back you end up with two instances of the "singleton" in the same VM. Take the example below:
import java.io.*;

public class Test3 {
    public static void main(String[] args) throws IOException, ClassNotFoundException {
        System.out.println(ABC.getInstance());
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        new ObjectOutputStream(out).writeObject(ABC.getInstance());
        ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(out.toByteArray()));
        ABC abc = (ABC) in.readObject();
        System.out.println(abc);
    }
}

class ABC implements Serializable {
    private ABC() {}
    private static ABC instance = new ABC();
    public static ABC getInstance() {
        return instance;
    }
}
You will find the two printed ABC instances are different. The fix is simple: add this method to ABC:
protected Object readResolve() {
    return instance;
}
readResolve() replaces the deserialized copy with the VM's singleton instance; you can also mark the field that references the singleton as transient to save serialization I/O.
But even this is not the best answer. Best of all, if you know ABC is a singleton, never store it in a member field at all. Since getInstance() is a static method the compiler will usually inline it, so frequent calls do not cause frequent stack-frame operations, and caching the reference gains you very little.
Sunday, March 22, 2009
Literal Pitfall
public static final int APRIL = 3;
public static final int MAY = 4;
public static final int JUNE = 5;
...
I believe many people define constants or enumerations this way. It has a serious problem rooted in compiler behavior: VM Spec 2.17.4, describing when class initialization happens, says that accessing a constant field such as ClassA.MAX does not cause ClassA to be initialized.
2.17.4 Initialization
........
A class or interface type T will be initialized immediately before one of the following occurs:
* T is a class and an instance of T is created.
* T is a class and a static method of T is invoked.
* A nonconstant static field of T is used or assigned. A constant field is one that is (explicitly or implicitly) both final and static, and that is initialized with the value of a compile-time constant expression. A reference to such a field must be resolved at compile time to a copy of the compile-time constant value, so uses of such a field never cause initialization.
The reason: if ClassB references ClassA.MAX, the compiler copies the constant value of ClassA.MAX into ClassB's own constant pool, which is obviously more efficient.
Let's run an experiment with two classes:
public class ConstClass {
    public static final int TEST = 5;
}

public class RefClass {
    public static void main(String[] args) {
        System.out.println(ConstClass.TEST);
    }
}
After compiling, running RefClass obviously prints 5. Let's disassemble it with javap:
public class RefClass extends java.lang.Object{
public RefClass();
  Code:
   0:   aload_0
   1:   invokespecial   #1; //Method java/lang/Object."<init>":()V
   4:   return

public static void main(java.lang.String[]);
  Code:
   0:   getstatic       #2; //Field java/lang/System.out:Ljava/io/PrintStream;
   3:   iconst_5
   4:   invokevirtual   #3; //Method java/io/PrintStream.println:(I)V
   7:   return
}
Note the iconst_5: RefClass never makes the VM load ConstClass at all. In fact you can delete ConstClass.class and RefClass still runs.
Here is the problem: if you define constants or enum values this way and later need to change the value of ClassA.MAX, you must recompile every class that ever referenced it. Which classes those are is very hard to predict, especially for a heavily used API.
How can you avoid this? There are two ways. One is to expose a getTEST() method that returns the constant, e.g. change the classes to:
public class ConstClass {
    public static final int TEST = 5;
    public static int getTEST() {
        return TEST;
    }
}

public class RefClass {
    public static void main(String[] args) {
        System.out.println(ConstClass.getTEST());
    }
}
Since getTEST() is static, the compiler can inline it, so the overhead stays low.
The other way is the enum type introduced in JDK 1.5. Let's rewrite the two classes:
public enum ConstClass2 {
    TEST(5);
    private int value;
    ConstClass2(int v) {
        this.value = v;
    }
    public int getValue() {
        return this.value;
    }
}

public class RefClass2 {
    public static void main(String[] args) {
        System.out.println(ConstClass2.TEST.getValue());
    }
}
Run javap on the referencing class again:
public class RefClass2 extends java.lang.Object{
public RefClass2();
  Code:
   0:   aload_0
   1:   invokespecial   #1; //Method java/lang/Object."<init>":()V
   4:   return

public static void main(java.lang.String[]);
  Code:
   0:   getstatic       #2; //Field java/lang/System.out:Ljava/io/PrintStream;
   3:   getstatic       #3; //Field ConstClass2.TEST:LConstClass2;
   6:   invokevirtual   #4; //Method ConstClass2.getValue:()I
   9:   invokevirtual   #5; //Method java/io/PrintStream.println:(I)V
  12:   return
}
This time 5 was not copied into the referencing class's constant pool; getstatic replaces iconst_5, so ClassA.MAX is effectively resolved as if it were ClassA.getMAX(). The effect is much the same as the getTEST() approach above.
Enums have other benefits as well. The JDK guide (http://java.sun.com/j2se/1.5.0/docs/guide/language/enums.html) lists the drawbacks of int enums:
- Not typesafe - Since a season is just an int, you can pass in any other int value where a season is required, or add two seasons together (which makes no sense).
- No namespace - You must prefix constants of an int enum with a string (in this case SEASON_) to avoid collisions with other int enum types.
- Brittleness - Because int enums are compile-time constants, they are compiled into clients that use them. If a new constant is added between two existing constants or the order is changed, clients must be recompiled. If they are not, they will still run, but their behavior will be undefined.
- Printed values are uninformative - Because they are just ints, if you print one out all you get is a number, which tells you nothing about what it represents, or even what type it is.
The third point is exactly the problem described in this post. Type safety is a real issue too: nothing stops you from passing Integer.MAX_VALUE to Calendar.set(Integer.MAX_VALUE, somevalue). And int constants have no namespace, and their printed values tell you nothing.
So, all in all: just use enum.
Sunday, January 18, 2009
Avoiding OutOfMemory with PhantomReference
PhantomReference is weaker than both WeakReference and SoftReference, and PhantomReference.get() always returns null. Why?
PhantomReference is in fact an indispensable reference type. A WeakReference is enqueued on its ReferenceQueue before finalize() runs or the GC reclaims the object, once no strong reference to it remains, and you can then use the queue to do cleanup. But inside finalize() the object can be resurrected: handing it a new reference that makes it reachable again (from some live thread stack or a static field) extends its lifetime, and it will not be reclaimed for now. A PhantomReference, by contrast, is only enqueued after the object has been removed from memory, and get() always returns null (by the time the ReferenceQueue notifies you of a PhantomReference the memory has already been physically freed, so there is no object left to hand back, which is part of the reason get() returns null), so there is no way to make the object reachable again. A PhantomReference simply gives you a way to track an object that once existed, and thereby to know whether it has truly been physically reclaimed.
For example, when an applet has to process very large images, you may want the memory of the previous image released before the next one is processed. If you hold the previous image through a PhantomReference, then when the ReferenceQueue notifies you, you know the previous object has been physically reclaimed and you can move on to the next large object. This lowers the chance of an OutOfMemoryError when the low-priority GC thread has not yet freed the previous big object before the next one is created. A further benefit is that PhantomReference beats a finalize() method by a wide margin, because the VM's handling of finalize is neither as simple nor as reliable as PhantomReference handling; it merely costs you a little more code.
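A minimal sketch of the pattern (class name and buffer size are my own; System.gc() is only a hint, so the wait loop is bounded rather than blocking forever):

```java
import java.lang.ref.PhantomReference;
import java.lang.ref.Reference;
import java.lang.ref.ReferenceQueue;

public class PhantomDemo {
    public static void main(String[] args) throws InterruptedException {
        ReferenceQueue<byte[]> queue = new ReferenceQueue<>();
        byte[] image = new byte[16 * 1024 * 1024]; // stand-in for a large image
        PhantomReference<byte[]> ref = new PhantomReference<>(image, queue);
        System.out.println("get() is always null: " + ref.get());
        image = null; // drop the only strong reference to the buffer

        // Poll until the collector reports the buffer reclaimed (bounded,
        // because System.gc() is only a request).
        Reference<?> reclaimed = null;
        for (int i = 0; i < 50 && (reclaimed = queue.poll()) == null; i++) {
            System.gc();
            Thread.sleep(20);
        }
        if (reclaimed != null) {
            System.out.println("previous image reclaimed; safe to allocate the next one");
        }
    }
}
```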