This post was originally published by Kamol Mavlonov, an awesome member of the Sysdig community, on http://blog.microservices.today/. He covers how to get up and running with dashboards and alerts to monitor CPU, memory, and disk utilization in your Mesos environments.
Creating a Disk Utilization Dashboard
- Under the Explore tab, select Server -> Overview.
- Choose Group by host (host.mac).
- In the table columns configuration (gear icon), select the following fields:
  - fs.used.percent: FS Usage %
  - fs.root.used.percent: FS Root Usage %
  - fs.largest.used.percent: FS Largest Usage %
  - fs.bytes.total: FS Size
  - fs.bytes.free: FS Free Space
  - fs.bytes.used: Disk Used Bytes
- Change the color coding by clicking the gear icon of each column (the default is yellow after 50% and red after 80%).
- Pin the tab to the dashboard.
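The default color coding described above can be sketched in a few lines of Python. This is only an illustration of the threshold logic, not Sysdig code: the function name `usage_color`, the `"default"` label for cells below the warning threshold, the sample host names and values, and the choice to treat "after 50%" as strictly greater than 50 are all assumptions.

```python
def usage_color(percent, warn=50.0, crit=80.0):
    """Map a usage percentage to a cell color.

    Assumes "after 50%" / "after 80%" mean strictly greater than the
    threshold; the "default" label for uncolored cells is hypothetical.
    """
    if percent > crit:
        return "red"
    if percent > warn:
        return "yellow"
    return "default"


# Hypothetical fs.used.percent readings, one per host.
sample_usage = {"node-1": 42.0, "node-2": 65.5, "node-3": 91.2}

# Color each host's cell the way the dashboard table would.
colors = {host: usage_color(pct) for host, pct in sample_usage.items()}
print(colors)
```

Passing different `warn` and `crit` values mirrors what changing the thresholds through each column's gear icon would do.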
Creating an alert for 60% disk utilization
The following steps will help you create an alert that fires when usage of the root directory (/) exceeds 60%:
- Under the alert tab, click the add alert button.
- Select the scope as agent.tag.dcosName or region. agent.tag.dcosName is the tag we added for the DC/OS cluster; region specifies the AWS region name. Assign the scope value from the dropdown according to the scope you selected.
- Under Set the condition, choose the type as manual.
- For the Alert when option, choose fs.used.percent as the metric and > 60% as the threshold value.
- For Segment by, choose the second option, then select Any of and the fs.mountDir metric.
- Check the Where option and select fs.mountDir from the dropdown. Assign the value /.
- Choose 1 min as the minimum monitor value.
- Specify the Name, Description and Severity of the alert.
- Enable the notification channel.
- Enable automatic sysdig capture if necessary.
- Click Create button.
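To make the effect of the Segment by and Where settings concrete, here is a minimal Python sketch of the condition this alert expresses. It does not call any Sysdig API. The `Sample` class, the `root_disk_breaches` function, and the sample readings are hypothetical, and the sketch ignores the scope and the 1 min monitoring window.

```python
from dataclasses import dataclass


@dataclass
class Sample:
    """One hypothetical fs.used.percent reading for a host and mount point."""
    host: str
    mount_dir: str
    used_percent: float


def root_disk_breaches(samples, threshold=60.0, mount_dir="/"):
    """Return the sorted hosts whose `mount_dir` usage exceeds `threshold`.

    Filtering on mount_dir plays the role of the Where option
    (fs.mountDir = /); other mount points are ignored.
    """
    return sorted(
        {
            s.host
            for s in samples
            if s.mount_dir == mount_dir and s.used_percent > threshold
        }
    )


readings = [
    Sample("node-1", "/", 72.5),
    Sample("node-1", "/var", 91.0),   # ignored: not the root mount
    Sample("node-2", "/", 41.0),
    Sample("node-3", "/", 60.0),      # exactly 60% does not exceed the threshold
    Sample("node-3", "/data", 99.0),  # ignored: not the root mount
]

print(root_disk_breaches(readings))
```

Without the Where filter, the nearly full /var and /data mounts in the sample data would also count as breaches, which is why the alert restricts fs.mountDir to /.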
Creating an alert for 95% memory utilization
The following steps will help you create an alert that fires when a node's memory utilization exceeds 95%:
- Under the alert tab, click the add alert button.
- Select the scope as agent.tag.dcosName or region. agent.tag.dcosName is the tag we added for the DC/OS cluster; region specifies the AWS region name. Assign the scope value from the dropdown according to the scope you selected.
- Under Set the condition, choose the type as manual.
- For the Alert when option, choose memory.used.percent as the metric and > 95% as the threshold value.
- For Segment by, choose the second option, then select Any of and the host.hostName metric.
- Leave the Where option unchecked.
- Choose 5 min as the minimum monitor value.
- Specify the Name, Description and Severity of the alert.
- Enable the notification channel.
- Enable automatic sysdig capture if necessary.
- Click Create button.
Creating an alert for 90% CPU utilization
The following steps will help you create an alert that fires when a node's CPU utilization exceeds 90%:
- Under the alert tab, click the add alert button.
- Select the scope as agent.tag.dcosName or region. agent.tag.dcosName is the tag we added for the DC/OS cluster; region specifies the AWS region name. Assign the scope value from the dropdown according to the scope you selected.
- Under Set the condition, choose the type as manual.
- For the Alert when option, choose cpu.used.percent as the metric and > 90% as the threshold value.
- For Segment by, choose the second option, then select Any of and the host.hostName metric.
- Leave the Where option unchecked.
- Choose 5 min as the minimum monitor value.
- Specify the Name, Description and Severity of the alert.
- Enable the notification channel.
- Enable automatic sysdig capture if necessary.
- Click Create button.
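The combination of a threshold, a per-host segment, and a 5 min minimum monitor value can be illustrated with a short Python sketch. It rests on an assumption: that the minimum monitor value means the metric must stay above the threshold continuously for that long before the alert fires. The function names, host names, and readings are hypothetical, and nothing here calls Sysdig.

```python
def sustained_breach(samples, threshold, window_s):
    """Return True if the value stays above `threshold` for at least
    `window_s` consecutive seconds.

    `samples` is a list of (timestamp_seconds, value) pairs sorted by time.
    """
    breach_start = None
    for ts, value in samples:
        if value > threshold:
            if breach_start is None:
                breach_start = ts  # first sample of a new breach run
            if ts - breach_start >= window_s:
                return True
        else:
            breach_start = None  # a dip below the threshold resets the run
    return False


def hosts_to_alert(series_by_host, threshold=90.0, window_s=300):
    """Evaluate each host separately, as segmenting by host.hostName implies."""
    return sorted(
        host
        for host, samples in series_by_host.items()
        if sustained_breach(samples, threshold, window_s)
    )


# Hypothetical cpu.used.percent readings, one per minute for six minutes.
cpu_series = {
    "node-1": [(0, 95), (60, 96), (120, 97), (180, 98), (240, 99), (300, 97)],
    "node-2": [(0, 95), (60, 96), (120, 80), (180, 97), (240, 98), (300, 99)],
    "node-3": [(0, 20), (60, 25), (120, 30), (180, 22), (240, 21), (300, 24)],
}

print(hosts_to_alert(cpu_series))
```

In the sample data only node-1 stays above 90% for the full five minutes; node-2 dips at the two-minute mark, so a short spike like that would not trigger the alert under this assumption.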