How can I beyond 600K? #3

Open
Ranler opened this issue Mar 30, 2013 · 9 comments

Comments

@Ranler

Ranler commented Mar 30, 2013

OS: CentOS6.2
Kernel: 2.6.32-279.14.1.el6.x86_64
RAM: 32GB ECC
CPU: Xeon E5645 @2.40GHz * 2
JDK: 1.6.0_31

First run: I followed the settings from http://http-kit.org/600k-concurrent-connection-http-kit.html

The client looked mostly fine:

...
time 200s, concurrency: 547579, total requests: 2560038, thoughput: 26.39M/s, 12786.00 requests/seconds
time 201s, concurrency: 547779, total requests: 2580908, thoughput: 26.35M/s, 12809.68 requests/seconds
time 202s, concurrency: 548083, total requests: 2598713, thoughput: 26.33M/s, 12816.88 requests/seconds
time 203s, concurrency: 548558, total requests: 2625809, thoughput: 26.42M/s, 12886.39 requests/seconds
time 205s, concurrency: 548709, total requests: 2631873, thoughput: 26.29M/s, 12799.57 requests/seconds
remote closed cleanly
remote closed cleanly
remote closed cleanly
remote closed cleanly
remote closed cleanly
remote closed cleanly
remote closed cleanly

Then the server started reporting errors:

Sat Mar 30 10:06:45 CST 2013 [server-loop] ERROR - queue size exceeds the limit 20480, please increase :queue-size when run-server if this happens often
java.util.concurrent.RejectedExecutionException
        at java.util.concurrent.ThreadPoolExecutor$AbortPolicy.rejectedExecution(ThreadPoolExecutor.java:1768)
        at java.util.concurrent.ThreadPoolExecutor.reject(ThreadPoolExecutor.java:767)
        at java.util.concurrent.ThreadPoolExecutor.execute(ThreadPoolExecutor.java:658)
        at java.util.concurrent.AbstractExecutorService.submit(AbstractExecutorService.java:78)
        at org.httpkit.server.RingHandler.handle(RingHandler.java:108)
        at org.httpkit.server.HttpServer.decodeHttp(HttpServer.java:114)
        at org.httpkit.server.HttpServer.doRead(HttpServer.java:168)
        at org.httpkit.server.HttpServer.run(HttpServer.java:239)
        at java.lang.Thread.run(Thread.java:662)
...
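
The stack trace above ends in ThreadPoolExecutor's AbortPolicy, which is what the JDK does when a bounded work queue is full and no further worker can be started. The sketch below reproduces that mechanism in isolation. It is not http-kit's code: the class name `BoundedQueueDemo`, the pool sizes, and the blocking tasks are assumptions chosen only to make the rejection deterministic.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.RejectedExecutionException;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

// Hypothetical demo class; it only illustrates the JDK behaviour seen in the trace.
final class BoundedQueueDemo {

    /** Submits `tasks` blocking jobs and returns how many the pool rejected. */
    static int countRejected(int threads, int queueSize, int tasks) {
        final CountDownLatch release = new CountDownLatch(1);
        // No handler is passed, so the default AbortPolicy applies.
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
                threads, threads, 0L, TimeUnit.MILLISECONDS,
                new ArrayBlockingQueue<Runnable>(queueSize));
        int rejected = 0;
        try {
            for (int i = 0; i < tasks; i++) {
                try {
                    // Each job blocks until released, so nothing drains the queue.
                    pool.submit(() -> {
                        try {
                            release.await();
                        } catch (InterruptedException e) {
                            Thread.currentThread().interrupt();
                        }
                    });
                } catch (RejectedExecutionException e) {
                    rejected++; // queue full and all workers busy
                }
            }
        } finally {
            release.countDown(); // let the accepted jobs finish
            pool.shutdown();
            try {
                pool.awaitTermination(5, TimeUnit.SECONDS);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }
        return rejected;
    }

    public static void main(String[] args) {
        System.out.println("queue of 2, 5 jobs, rejected: " + countRejected(1, 2, 5));
        System.out.println("queue of 8, 5 jobs, rejected: " + countRejected(1, 8, 5));
    }
}
```

With one worker, a larger queue absorbs the same burst without rejections, which is the effect the error message asks for when it suggests raising :queue-size.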

Second run: I changed queue-size in main.clj:

...
(defn -main [& args]
  (run-server (-> handler wrap-keyword-params wrap-params)
              {:port 8000 :queue-size 1024000})
  (println (str "Server started. listen at 0.0.0.0@8000")))

After that the test ran normally. Client output:

...
time 353s, concurrency: 597072, total requests: 5530283, thoughput: 32.15M/s, 15656.77 requests/seconds
time 354s, concurrency: 597072, total requests: 5568259, thoughput: 32.16M/s, 15714.85 requests/seconds
time 355s, concurrency: 597072, total requests: 5586378, thoughput: 32.30M/s, 15720.95 requests/seconds
time 356s, concurrency: 597072, total requests: 5604011, thoughput: 32.34M/s, 15724.11 requests/seconds
time 357s, concurrency: 597072, total requests: 5629155, thoughput: 32.38M/s, 15746.50 requests/seconds
time 358s, concurrency: 597072, total requests: 5651914, thoughput: 32.44M/s, 15765.80 requests/seconds

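Third run: I raised the number of concurrent connections per IP in ConcurrencyBench.java:

 final static int PER_IP = 25000;

Concurrency stalls at about 660K, and the client keeps getting Connection timed out: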

...
time 338s, concurrency: 664873, total requests: 4863842, thoughput: 26.50M/s, 14386.49 requests/seconds
time 339s, concurrency: 664871, total requests: 4867369, thoughput: 26.43M/s, 14354.13 requests/seconds
time 340s, concurrency: 664870, total requests: 4871330, thoughput: 26.35M/s, 14323.02 requests/seconds

Is the CPU the bottleneck now? Or should I add more IP addresses to raise concurrency?

@shenfeng
Member

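That is a nice machine. It should pass 1 million easily. With some TCP parameter tuning, 2 million should also be easy, and with the test client moved to a separate machine, 3 million is reachable.

Suggestions: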

  1. Upgrade to JDK 7.

    public static int randidelTime() {
        // make this number larger
        int ms = 5000 + r.nextInt(45000); // 5s ~ 50s
        return ms;
    }

            if (opened < CONCURENCY) {
                // you can try raising this number
                Thread.sleep(20); // open 5000 per seconds most
            }

    // a single IP can reach about 60K connections
    final static int PER_IP = 20000;
    final static InetSocketAddress ADDRS[] = new InetSocketAddress[30];
    // 600k concurrent connections
    final static int CONCURENCY = PER_IP * ADDRS.length;

    static {
        // for i in `seq 200 240`; do sudo ifconfig eth0:$i 192.168.1.$i up ; done
        final int PORT = 8000;
        final int IP_START = 200;
        ...
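
The snippet above caps concurrency at PER_IP * ADDRS.length, so the ceiling can be raised either by increasing PER_IP (up to roughly 60K per IP, according to the comment) or by adding addresses. A small sketch of that arithmetic follows. The class and method names are hypothetical and not part of ConcurrencyBench, and the 60000 figure is taken from the comment, not measured.

```java
// Hypothetical helper; mirrors CONCURENCY = PER_IP * ADDRS.length from the snippet.
final class AddressPlan {

    /** Upper bound on concurrent connections for a given per-IP limit. */
    static long maxConcurrency(int perIp, int addresses) {
        return (long) perIp * addresses;
    }

    /** Smallest number of addresses such that perIp * addresses >= target. */
    static long addressesNeeded(long targetConnections, int perIp) {
        if (perIp <= 0) {
            throw new IllegalArgumentException("perIp must be positive");
        }
        return (targetConnections + perIp - 1) / perIp; // ceiling division
    }

    public static void main(String[] args) {
        System.out.println("20000 x 30 addresses: " + maxConcurrency(20000, 30));
        System.out.println("addresses for 1M at 20000/IP: " + addressesNeeded(1_000_000L, 20000));
        System.out.println("addresses for 1M at 60000/IP: " + addressesNeeded(1_000_000L, 60000));
    }
}
```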

@Ranler
Author

Ranler commented Mar 30, 2013

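It's a small cluster at my school, haha. I plan to push it to the limit and give http-kit a test case. I'll try moving the client to another machine, though I'm not sure whether network IO will become the bottleneck.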

@shenfeng
Member

I just changed the code a bit. Being at school is great! Try making randidelTime much larger; the network overhead will drop considerably.

    public static int randidelTime() {
        int ms = 10000 + r.nextInt(90000); // 10s ~ 100s
        return ms;
    }
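
To see why a longer idle time lowers the load, here is a rough estimate of the steady-state request rate. It rests on an assumption not stated in the thread: that each open connection sends about one request per idle interval. The class and method names are hypothetical, and the result is an approximation, not a measurement.

```java
// Hypothetical estimate; assumes one request per connection per idle interval.
final class IdleRate {

    /** Mean of base + r.nextInt(bound) in ms; nextInt(bound) is uniform on [0, bound). */
    static double meanIdleMs(int baseMs, int bound) {
        return baseMs + (bound - 1) / 2.0;
    }

    /** Approximate requests per second generated by `connections` idle-looping clients. */
    static double requestsPerSecond(long connections, double meanIdleMs) {
        return connections * 1000.0 / meanIdleMs;
    }

    public static void main(String[] args) {
        double before = meanIdleMs(5000, 45000);  // the 5s ~ 50s setting
        double after = meanIdleMs(10000, 90000);  // the 10s ~ 100s setting
        System.out.printf("mean idle %.1f ms -> about %.0f req/s at 600K connections%n",
                before, requestsPerSecond(600_000L, before));
        System.out.printf("mean idle %.1f ms -> about %.0f req/s at 600K connections%n",
                after, requestsPerSecond(600_000L, after));
    }
}
```

Under that assumption, doubling the idle range roughly halves the request rate for the same number of connections.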

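Try a single machine first; that is the simplest approach.

To move the client to another machine, a small change to ConcurrencyBench is enough. A single IP can handle about 60K connections.
I don't have the resources to run this kind of test, so I'm very interested in the result and hope you can share it.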

@Ranler
Author

Ranler commented Mar 30, 2013

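Of course. These machines would otherwise sit idle, so just tell me if there is anything you want tested.

I'll test on a single machine first.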

@Ranler
Author

Ranler commented Mar 30, 2013

I've hit another bottleneck: at roughly 1 million concurrent connections, the client starts getting Connection reset by peer.

The JDK is now:

$ java -version
java version "1.7.0_17"
Java(TM) SE Runtime Environment (build 1.7.0_17-b02)
Java HotSpot(TM) 64-Bit Server VM (build 23.7-b01, mixed mode)

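Server: queue-size set to 204800 and 309600, with little effect.

Client:

  • In randidelTime I tried random ranges of 90000 (reaching just over 900K), 120000, 150000, 200000, 270000, and 300000 (reaching just under 1.1 million). Concurrency rose slightly, by a few tens of thousands, but not noticeably.
  • I tried Thread.sleep values of 20, 30, and 40; the maximum concurrency barely changed.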

@shenfeng
Copy link
Member

1 million should not be the system's limit; I once reached 1.6 million (this requires reverting this change: shenfeng/dictionary@0b741e8).

http://shenfeng.me/how-far-epoll-can-push-concurrent-socket-connection.html

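You could look at the following:

  • Use jvisualvm to see whether the process is being worn out by GC, and try adjusting the JVM parameters in run_server. Usually, just raising the memory is worth trying first.
  • Use htop or another tool to monitor the system's CPU and memory usage, and check whether the machine is exhausted at that point.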
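
Besides jvisualvm, the same GC and heap figures can be read from inside a JVM through the standard java.lang.management beans. The sketch below is a minimal illustration: the class name `GcSnapshot` is hypothetical, and it reports on the JVM it runs in, so it would have to be called from within the server or client process to say anything about them.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryUsage;

// Hypothetical helper; reports heap usage and cumulative GC activity of the current JVM.
final class GcSnapshot {

    static String snapshot() {
        StringBuilder sb = new StringBuilder();
        MemoryUsage heap = ManagementFactory.getMemoryMXBean().getHeapMemoryUsage();
        // getMax() may be -1 when the maximum is undefined.
        sb.append(String.format("heap used=%dMB max=%dMB%n",
                heap.getUsed() >> 20, heap.getMax() >> 20));
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            // Counts and times are cumulative since JVM start.
            sb.append(String.format("%s: collections=%d time=%dms%n",
                    gc.getName(), gc.getCollectionCount(), gc.getCollectionTime()));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.print(snapshot());
    }
}
```

Printing this every few seconds during a run would show whether collection time grows sharply as concurrency approaches the point where errors begin.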

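I have run into Connection reset by peer too. I'm not entirely sure why; it may be that some resource has run out.
Try accessing the server directly from a browser to see whether it still responds, and with what latency.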

@Ranler
Author

Ranler commented Apr 4, 2013

shenfeng,
Sorry, I haven't touched the machines lately. I continued testing today, and the results are below:

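  1. First, with the same test parameters as last time, I looked at GC:
    [image: jvm_100_1]

At the peak the old generation is full. At that point, with about 1 million concurrent connections, exceptions are thrown frequently. The cause is probably insufficient JVM memory.

  2. I set the JVM to -Xmx6144m -Xms6144m and tested again:
    [image: jvm_150_1]

This time the old generation did not fill up, and exceptions started at about 1.5 million concurrent connections. There is a stretch where GC is constantly busy, and the exceptions are thrown throughout that stretch.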

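The figure below shows the host. The CPU is not busy overall; most of the time is probably spent in IO wait. Memory is nearly exhausted, yet the test process was set to -Xmx4096m and the server process to -Xmx6144m, which adds up to only 10GB, while the host used almost 30GB. Perhaps this is because Java NIO allocated a lot of off-heap memory (or is it the memory the kernel needs to manage that many connections?).
[image: TM 20130404111921]

As you said before, http-kit's maximum concurrency depends on several factors.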

@Ranler
Author

Ranler commented Apr 4, 2013

Also, do you have any recommendation on which GC to use?

@shenfeng
Member

shenfeng commented Apr 4, 2013

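"Perhaps this is because Java NIO allocated a lot of off-heap memory (or is it the memory the kernel needs to manage that many connections?)"

http-kit uses only 64k of off-heap memory (one buffer shared by all connections). The likely cause is that the TCP read/write buffers used up all the memory. Consider googling how to set them smaller. The default is probably around 8k; lowering it to 2k or 4k could double this number.

To maintain one connection, http-kit needs about 2k of memory. 1.5 million x 2k = 3G, so for 1.5 million connections the JVM memory can be set to about 4G.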
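
The two estimates above can be put side by side with a little arithmetic. This sketch assumes the figures quoted in the reply (about 2k of heap per connection, TCP buffers of about 8k) and further assumes, as a worst case, that every connection holds a full read buffer and a full write buffer in the kernel. The class and method names are hypothetical, and actual kernel usage depends on the system's TCP settings.

```java
// Hypothetical back-of-the-envelope estimate; all per-connection sizes are assumptions.
final class MemoryEstimate {

    /** JVM heap needed if each connection costs `bytesPerConnection` on the heap. */
    static long heapBytes(long connections, long bytesPerConnection) {
        return connections * bytesPerConnection;
    }

    /** Worst-case kernel memory if each connection holds full read and write buffers. */
    static long socketBufferBytes(long connections, long readBuf, long writeBuf) {
        return connections * (readBuf + writeBuf);
    }

    static double gib(long bytes) {
        return bytes / (1024.0 * 1024.0 * 1024.0);
    }

    public static void main(String[] args) {
        long conns = 1_500_000L;
        System.out.printf("heap at 2k/conn:        %.2f GiB%n", gib(heapBytes(conns, 2048)));
        System.out.printf("buffers at 8k + 8k:     %.2f GiB%n", gib(socketBufferBytes(conns, 8192, 8192)));
        System.out.printf("buffers at 4k + 4k:     %.2f GiB%n", gib(socketBufferBytes(conns, 4096, 4096)));
    }
}
```

Under these assumptions the socket buffers, not the JVM heap, dominate host memory, and halving the buffer size halves that cost, which is the reasoning behind the suggestion that smaller buffers could double the connection count.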

"Also, do you have any recommendation on which GC to use?"

The default is probably OK. I'm not familiar with configuring it either.
