2014-02-26

Ganglia Web UI Performance Improvement for Large Clusters

There are many transitions people tend to make over time as their Ganglia cluster grows in order to maintain performance. Many of them have been covered in other blogs, but here is a brief summary of the scaling steps I usually take, from the initial cluster up through thousands of hosts and hundreds of thousands of metrics.

All notes in this post apply to gmond and gmetad version 3.5.0 and ganglia-webfrontend version 3.3.5.

The cluster starts out with gmond installed everywhere (configured for unicast) sending all data to the monitoring host (usually also running nagios). gmond listens there and collects all the metrics for my single cluster. gmetad runs there and writes this data to local disk.

Soon I outgrow a single cluster and so spin up multiple gmond processes, each configured to listen on a separate port. gmetad lists each gmond as a separate data_source and I have multiple clusters in the web UI.

Eventually the monitoring host becomes a bit overloaded and so ganglia (both gmetad and gmond) moves to a separate host, leaving nagios and all the other cruft there to itself.

The next step is moving the gmond collector processes to their own host. At this point I usually set up two gmond collector servers for redundancy. Each server has the same configuration: one gmond process per ganglia cluster, each listening on a separate port. The ganglia web UI and gmetad live on the same server, and both gmond collectors are listed on each data_source line. I also create two gmetad/web UI hosts at this point, both to preserve data if the disk dies on one and to separate the host nagios talks to from the one people use as the web UI. Distributing traffic this way keeps the web UI snappy for people while letting nagios hammer the other host.
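As a rough illustration of that layout, the sketch below renders one data_source line per cluster, each listing both collectors on that cluster's port. The cluster names, ports, and hostnames are hypothetical placeholders, and the helper itself is not part of ganglia.

```python
def data_source_lines(clusters, collectors, poll_interval=None):
    """Render one gmetad data_source line per cluster.

    clusters:   mapping of cluster name -> gmond port (one port per cluster)
    collectors: hostnames of the gmond collector servers, listed in order
    """
    lines = []
    for name, port in sorted(clusters.items()):
        parts = ["data_source", '"%s"' % name]
        if poll_interval is not None:
            # Optional polling interval in seconds, placed before the hosts.
            parts.append(str(poll_interval))
        # Every collector runs the same gmond for this cluster on the same port.
        parts.extend("%s:%d" % (host, port) for host in collectors)
        lines.append(" ".join(parts))
    return lines


if __name__ == "__main__":
    # Hypothetical cluster names, ports, and collector hostnames.
    clusters = {"web": 8649, "db": 8650}
    collectors = ["gmond-a.example.com", "gmond-b.example.com"]
    print("\n".join(data_source_lines(clusters, collectors)))
```

The same generated lines would go into gmetad.conf on both gmetad/web UI hosts, since each one polls both collectors.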

As the grid grows and the number of metrics increases, local disk on the gmetad / webui host starts to fail to keep up. I think this is around 70k metrics, but it depends on your disk. The solution at this point is either to install rrdcached or switch to a ramdisk. rrdcached will get you a long way, but I think over ~350k metrics, local disk can't even keep up with rrdcached's write load. There are undoubtedly knobs to tweak to push rrdcached much further, but just using a ramdisk works pretty well.

All of these steps successfully scale the ganglia core very well. With redundant gmond processes, redundant gmetads, and some sort of ram buffer or disk for the RRDs, you can get up to ~300k metrics pretty easily.

On to the meat of this post.

Somewhere between 250k and 350k metrics, the ganglia web UI started to slow down dramatically. By the time we got to ~330k, it would take over 2 minutes to load the front page. gmetad claimed it was using ~130% CPU (over 100% because it's multithreaded and using multiple cores) but there was still plenty of CPU available. Because it was on a ramdisk, there was no iowait slowing everything down. It wasn't clear where the bottleneck was.

Individual host pages would load relatively quickly (2-3s). Cluster views would load acceptably (~10s) but the front page and the Views page were unusable.

Vladimir suggested that even though there was CPU available, the sheer number of RRDs gmetad was managing was making it slow to respond to requests to its ports. He suggested running two gmetad processes, one configured to write out RRDs and the other available for the web UI.

A brief interlude - gmetad has two duties: manage the RRDs and respond to queries about the current status of the grid, clusters, and individual hosts. It listens on two TCP ports: one gives a dump of all current state, and the other is an interactive port for querying specific resources. The web UI queries gmetad for the current values of metrics and to determine which hosts are up. It then reads the RRDs for the up nodes and clusters to present to the end user.
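To see how slowly gmetad answers on either port, a small timing helper can help. This is a minimal sketch, not anything shipped with ganglia: it assumes gmetad writes its XML and then closes the connection, and that the interactive port takes a single path line as the request.

```python
import socket
import time


def query_gmetad(host, port, request=None, timeout=30.0):
    """Return (xml_bytes, seconds) for one query against a gmetad port.

    request=None    -> just connect and read (the full-dump port)
    request="/path" -> send the path first (the interactive port)
    """
    start = time.monotonic()
    chunks = []
    with socket.create_connection((host, port), timeout=timeout) as sock:
        if request is not None:
            sock.sendall(request.encode("ascii") + b"\n")
        while True:
            data = sock.recv(65536)
            if not data:  # the server closed the connection: response done
                break
            chunks.append(data)
    return b"".join(chunks), time.monotonic() - start
```

For example, `query_gmetad("localhost", 8651)` would time a full dump on the default xml_port, and `query_gmetad("localhost", 8652, "/?filter=summary")` would time a summary query on the default interactive_port, assuming gmetad runs locally with default ports. Comparing those timings before and after a change gives a rough measure of how responsive gmetad is.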

By separating the two duties gmetad performs into two separate processes, there is no longer contention between managing the files on disk and responding to the web ui. Making this separation dropped the time needed to load the front page from 2m45s to 0m02s.

Here are the changes necessary to run two gmetad processes in this configuration.
  • duplicate gmetad.conf (eg gmetad-norrds.conf) and make the following changes:
    • use a new xml_port and interactive_port. I use 8661 and 8662 to mirror the defaults 8651 and 8652
    • specify a new rrd_rootdir. The dir must exist and be owned by the ganglia user (or whatever user runs gmetad) but it will remain empty. (this isn't strictly necessary but is a good protection against mistakes.)
    • add the option 'write_rrds off'
  • duplicate the start script or use your config management to make sure two gmetad processes are running, one with each config file
  • edit /etc/ganglia-webfrontend/conf.php (or your local config file) and override the port used to talk to gmetad; specify 8661 instead of the default.
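
The config changes above can be sketched as a small text transformation that derives the second config from the first. This is an illustration only: the function name and the default rrd_rootdir path are hypothetical, and the ports are the ones used in this post. All data_source lines are kept, so the second gmetad polls the same gmond collectors.

```python
def derive_norrds_conf(conf_text, xml_port=8661, interactive_port=8662,
                       rrd_rootdir="/var/lib/ganglia/rrds-norrds"):
    """Return gmetad.conf text adjusted for the second, web-facing gmetad."""
    overrides = {
        "xml_port": str(xml_port),
        "interactive_port": str(interactive_port),
        "rrd_rootdir": '"%s"' % rrd_rootdir,  # hypothetical path; must exist
        "write_rrds": "off",
    }
    kept = []
    for line in conf_text.splitlines():
        tokens = line.split()
        # Drop active lines for the directives being overridden; comments,
        # data_source lines, and everything else pass through untouched.
        if tokens and tokens[0] in overrides:
            continue
        kept.append(line)
    kept.append("")
    kept.append("# Settings for the gmetad that only answers the web UI")
    kept.extend("%s %s" % (key, value) for key, value in overrides.items())
    return "\n".join(kept) + "\n"


if __name__ == "__main__":
    sample = 'data_source "web" gmond-a.example.com:8649\nxml_port 8651\n'
    print(derive_norrds_conf(sample), end="")
```

Writing the result to gmetad-norrds.conf and creating the empty rrd_rootdir are left to the start script or config management.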

That's it!

p.s. There are some claims that the current pre-release gmetad has many improvements in locking and how it hashes hosts, so the performance problems that lead to this solution may vanish with future releases.

6 comments:

Unknown said...

Thank you for the improvement. I've applied it and it works fine.

My question is: Shall we duplicate the data_source directives in the new gmetad instance? Because I am noticing a duplication of network traffic in the gmetad box after the change. Aren't we polling each gmond twice with this improvement?

I tried not to duplicate those directives, but the web UI home page doesn't show the clusters. Maybe the network traffic overhead is a necessary evil.

ben said...

Yes, it does duplicate the network traffic to the gmond collectors. While this is unfortunate, network bandwidth (at least in my infrastructure) is not a bottleneck for ganglia data.

You're correct that if you don't duplicate the data_source directives in each gmetad.conf file, it won't work. Both gmetad instances need all the data for this trick to work.

I've heard rumors that the latest ganglia builds have a number of improvements that might make this approach unnecessary, though I haven't tested them myself.

Unknown said...

Thanks for the clarification.

I've been running gmetad 3.7.0 (pre-release) for a week, and yes, there are fewer gaps in my big clusters. But the slow front page loading is still an issue. In fact, that's the main reason I applied this improvement.

Nova Leo said...

Thanks for this cool article. And I have one question:
>>edit /etc/ganglia-webfrontend/conf.php (or your local config file) and override the port used to talk to gmetad; specify 8661 instead of the default.

In gweb2's conf, it uses port 8652 by default, not 8651. So should we use 8662 instead of 8661 in your case?

If I use 8661, ganglia web will not show summary value like CPUs total/Hosts up/Hosts down.
8662 seems OK.

sid yu said...

Thank you very much, it is really helpful for us.
