Session slides, scripts and recording from my session at the Runecast Virtual Conference are available
We’re always on the lookout for new ways to visualize our performance data. Most often it’s to aid our troubleshooting, but sometimes it’s mostly for fun. This is one of those times. Let’s check out how we can visualize our compute usage across the world!
This post is a (late) follow-up to a previous post about exploring the monitoring endpoints of the vCenter Server Appliance (VCSA), and an addition to the vSphere Performance blog series. Now we will add performance metrics and health status of the VCSA to our monitoring solution. We’ll utilize the REST APIs in vCenter, feed the data into our Influx database and visualize it in Grafana. In vCenter we have the Appliance Management page, also referred to as the VAMI.
Recently there was a new release of Telegraf, a monitoring agent from the guys that built InfluxDB. This new version, 1.8.0, comes with a plugin for vSphere which I’m pretty excited about! Previously I’ve been testing Telegraf for monitoring some Linux VMs and my InfluxDB servers. The agent works as expected, and it’s as easy to use as the other products in the TICK stack from Influx. If you’ve followed my blog series about building a monitoring solution for vSphere and other infrastructure components, you know that I’ve pulled metrics with PowerCLI scripts.
I had the privilege of delivering 3 sessions at VMUG Norway this week in Oslo, Trondheim and Bergen. With the extremely nice weather in Norway this week in mind, the attendance was great, and as always the discussions were valuable. My session on vSphere Performance monitoring was the short version of the blog series I did about how we built our solution for performance monitoring of vSphere with InfluxDB and Grafana, and how we can easily customize it by adding metrics and data sources.
I’ll be speaking at VMUG Norway’s meetings this May. As always there will be “three sessions in three cities”: Oslo on May 29th, Trondheim on May 30th and Bergen on May 31st. The topic for my session will be how we have built our own vSphere Performance monitoring solution, which I’ve also done a blog series about. The VMUG meetings are free. For more information check out https://www.vmug.com/norway. I hope you’re able to join!
At work I have done some monitoring projects which I’ve written many blog posts about. At home I have a small vSphere environment that serves partly as a lab, but it also runs some services we use at home. Of course I monitor this environment as well, and I use both InfluxDB and Grafana as we do at work. One of my VMs runs Plex Media Server, and recently I moved my media library to a separate box running FreeNAS.
In my blog series on building a solution for monitoring vSphere Performance we have scripts for pulling VM and Host performance. I made some changes to those recently, mainly by adding more metrics, for instance for VDI hosts. This post will be about how we included our VSAN environments in the performance monitoring. This has gotten a great deal easier since the Get-VSANStat cmdlet came along in recent versions of PowerCLI.
Last year we started building our own solution for Performance Monitoring of our Infrastructure platform, with the focus on the VMware vSphere environment. The components used for this solution are PowerCLI for extracting the metrics, InfluxDB for storing the metrics, and Grafana for presenting the metrics. I did a blog series on this project which explains in detail what we did when building the solution. The solution has been very well received and is used daily by many of my colleagues, and we frequently update it with new metrics and dashboards.
If you have read my blog you probably know I’ve done a series on performance monitoring of infrastructure with the help of InfluxDB. InfluxDB is a part of the TICK stack delivered by InfluxData. All components are open source and freely available. The TICK stack consists of Telegraf, InfluxDB, Chronograf and Kapacitor. This post will give a quick review of the components and some examples of how I have started exploring them in my Performance monitoring project.