Out-of-the-box dashboards provide an overview of your Kubernetes environment
The Kubernetes cluster overview gives you immediate insight into cluster health. Here we see that while the cluster nodes are healthy, 116 workloads have recently failed. Dashboards are easily customized to surface the key data you need.
Click ‘Cluster resource overview’ to view the cluster overview.
Quickly see the number of pods running within each namespace and drill down to individual applications.
Get a quick summary of the various types of workloads running to ensure everything is behaving as expected.
Problem
The problem card highlights the precise root cause and the business impact on end users and applications. This information enables you to quickly triage and respond to the problem. In this case, Davis analyzed over 3 billion dependencies in real time to identify the root cause as an issue in the CheckDestination service. Pinpointing the exact problem keeps you from endless data analysis and unnecessary war rooms, so you can meet SLAs.
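If you want to consume these answers programmatically, the same problem feed is also exposed through the Dynatrace REST API. Below is a minimal sketch in Python using the requests library; the environment URL and token are placeholders, and the exact response fields should be verified against your environment's API documentation.

```python
# Minimal sketch: list recent problems and their root-cause entity via the
# Dynatrace Problems API (v2). The environment URL and token are placeholders;
# fields are read defensively because the response shape can vary by version,
# so treat this as an illustration rather than a reference client.
import requests

DT_ENV = "https://YOUR-ENVIRONMENT-ID.live.dynatrace.com"  # placeholder
DT_TOKEN = "YOUR-API-TOKEN"                                # token with problem read scope

resp = requests.get(
    f"{DT_ENV}/api/v2/problems",
    headers={"Authorization": f"Api-Token {DT_TOKEN}"},
    timeout=10,
)
resp.raise_for_status()

for problem in resp.json().get("problems", []):
    root_cause = problem.get("rootCauseEntity") or {}
    print(problem.get("displayId"), problem.get("title"),
          "| root cause:", root_cause.get("name", "n/a"))
```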
Track the availability, health, and utilization of your Kubernetes infrastructure
The Kubernetes cluster page gives you an overview of cluster utilization, workloads, and events. On this cluster we can see a Back-off event has occurred in the ‘online-boutique’ namespace, indicating failed workloads. From this cluster page you can dive into nodes and hosts or explore workloads.
View all workloads
The out-of-the-box summary of cluster utilization and workloads provides a bird's eye view of your Kubernetes clusters
See the summary and details of important events to determine severity and specific actions needed
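For comparison, the same kind of Back-off signal can be pulled straight from the Kubernetes API. The sketch below uses the official kubernetes Python client and assumes a working kubeconfig; Dynatrace collects and correlates these events automatically, so this only shows what the cluster page is summarizing for you.

```python
# List warning events (such as BackOff) in the 'online-boutique' namespace
# using the Kubernetes API directly. Assumes a reachable cluster and a local
# kubeconfig; inside a pod you would use config.load_incluster_config().
from kubernetes import client, config

config.load_kube_config()
v1 = client.CoreV1Api()

for ev in v1.list_namespaced_event("online-boutique").items:
    if ev.type == "Warning":  # BackOff and similar failures surface as warnings
        obj = ev.involved_object
        print(f"{(ev.reason or ''):<15} {obj.kind}/{obj.name}: {ev.message}")
```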
Response Time Analysis
The average response time observed during the problem timeframe is shown distributed across calls to other services, calls to databases, and code-level execution, with the biggest hotspots highlighted. Here, an anomaly in code execution time is the key contributor to the degraded CheckDestination response time. Quickly identifying the origin of a performance bottleneck enables targeted action, not only to address the current issue quickly but also to identify the improvements that will have the greatest impact.
Quickly identify problematic workloads and filter by cluster, namespace or type
The cluster workloads page provides relevant information so you can intuitively find the workload of interest. Here we see there is a problem with the online-boutique ‘frontend’, which has a single service running on it.
Intuitive filtering gets you to the most relevant workloads faster using filters such as cluster, namespace, pod, pod state, and much more
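The same kind of filtering can be approximated against the raw Kubernetes API. As a rough sketch, assuming the official kubernetes Python client and a working kubeconfig, the snippet below lists the Deployments in a namespace and flags any whose ready replica count lags behind the desired count, which is the sort of unhealthy workload the page above surfaces automatically.

```python
# Flag Deployments in a namespace whose ready replicas lag the desired count.
# A rough stand-in for the workload filtering shown above; assumes a working
# kubeconfig and the official `kubernetes` Python client.
from kubernetes import client, config

config.load_kube_config()
apps = client.AppsV1Api()

for dep in apps.list_namespaced_deployment("online-boutique").items:
    desired = dep.spec.replicas or 0
    ready = dep.status.ready_replicas or 0
    if ready < desired:
        print(f"{dep.metadata.name}: {ready}/{desired} replicas ready")
```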
Method Hotspots
Method hotspots analysis inspects all the classes and methods executed in the service's call tree and pinpoints the exact method call contributing to the issue. In this case the method LocationParser.parseSectionIndex was identified as responsible for the problem. This automatic observability shows developers precisely where to look in the codebase and saves hours of manually reading thousands of lines of code to understand what needs to be fixed.
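As a rough local analogy only (this is not how Dynatrace gathers its data), Python's built-in profiler can rank the methods in a call tree by time spent. In the sketch below, parse_section_index is a hypothetical stand-in for a hot method like LocationParser.parseSectionIndex.

```python
# Record a call tree with cProfile, then rank methods by cumulative time to
# see which call dominates, mimicking the idea of a method hotspot view.
import cProfile
import pstats


def parse_section_index(n: int) -> int:
    # Deliberately slow stand-in for the hot method.
    return sum(i * i for i in range(n))


def handle_request() -> int:
    return parse_section_index(200_000)


profiler = cProfile.Profile()
profiler.enable()
for _ in range(50):
    handle_request()
profiler.disable()

# Print the five most expensive calls by cumulative time.
pstats.Stats(profiler).sort_stats("cumtime").print_stats(5)
```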
View the service details of your workloads to understand the broader context
The service overview page shows which applications or services use the service “frontend” and whether there are outgoing calls to other services or databases. Even though the response time appears to be normal, we see a high failure rate of 3.46% that Dynatrace AI has automatically associated with a single problem.
Specifically, the called service "checkout" has a high failure rate
Immediately see any problem associated with the service "frontend"
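The failure-rate figure itself can also be retrieved through the Dynatrace Metrics API (v2). The sketch below is indicative only: the environment URL and token are placeholders, and the metric key and query parameters are assumptions that should be checked against your environment's API documentation.

```python
# Query a service failure-rate metric via the Dynatrace Metrics API (v2).
# The metric key and timeframe syntax below are assumptions; verify them
# against the API documentation for your environment.
import requests

DT_ENV = "https://YOUR-ENVIRONMENT-ID.live.dynatrace.com"  # placeholder
DT_TOKEN = "YOUR-API-TOKEN"                                # token with metrics read scope

resp = requests.get(
    f"{DT_ENV}/api/v2/metrics/query",
    headers={"Authorization": f"Api-Token {DT_TOKEN}"},
    params={
        "metricSelector": "builtin:service.errors.total.rate",  # assumed metric key
        "from": "now-2h",
    },
    timeout=10,
)
resp.raise_for_status()

for series in resp.json().get("result", []):
    for data in series.get("data", []):
        values = [v for v in data.get("values", []) if v is not None]
        if values:
            # Value is assumed to be reported as a percentage.
            print(data.get("dimensions"),
                  f"avg failure rate: {sum(values) / len(values):.2f}%")
```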
Service Details
The service page shows which applications or services use the service and whether the service makes any calls to other services or databases. Problem-specific service summaries quickly highlight the impact to the service and beyond. During this problem timeframe we see that service requests suffered response time and CPU consumption spikes. Tracing how these spikes propagate upstream from this backend service is critical to understanding the business impact.
Analyze backtrace
Gain additional context for failures across relevant timeframes
Diving into the details of the failure rate, we can see failures over time and the types and groups of failed requests. The most relevant timeframe is automatically determined by Dynatrace AI and can easily be adjusted manually as desired. Let’s analyze the backtrace to see if any other application besides ‘online-boutique’ will be impacted.
Backtrace
A backtrace shows the sequence of upstream services that result in a request. In this case, we can see that multiple frontend services call CheckDestination, clearly highlighting the service-level impact of the problem. When troubleshooting or planning changes to a particular service, it is crucial to understand the upstream call chain to continuously ensure application and experience quality.
Trace the call back upstream to identify the impact to application frontends
Details of specific user actions, including the number of users and failed calls, provide context on the impact to the user experience
In this case, we see that only the ‘online-boutique’ application calls the ‘frontend’ service. It’s clear that the increased service failure rate will impact end users and the business, and the details of those impacts are included.
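Conceptually, the backtrace is an upstream walk over the service call graph that Dynatrace builds from its trace data. The toy sketch below illustrates that walk on a hypothetical, hard-coded dependency map; it is not reading anything from Dynatrace.

```python
# Given a map of "who calls whom", walk the call graph upstream from a service
# to find every caller that could be affected. The graph is hypothetical demo
# data loosely matching the services named in this tour.
from collections import deque

# caller -> services it calls
calls = {
    "online-boutique": ["frontend"],
    "frontend": ["checkout", "CheckDestination"],
    "checkout": ["CheckDestination"],
}


def upstream(service: str) -> set[str]:
    """Return every service/application that directly or indirectly calls `service`."""
    callers = {c for c, callees in calls.items() if service in callees}
    queue = deque(callers)
    while queue:
        current = queue.popleft()
        for caller, callees in calls.items():
            if current in callees and caller not in callers:
                callers.add(caller)
                queue.append(caller)
    return callers


# Callers of CheckDestination: frontend, checkout, online-boutique (set order may vary).
print(upstream("CheckDestination"))
```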
Service Flow
Service flow maps the sequence of service calls that are triggered by any service request. Here we can trace the response time degradation through the services, all the way to the CheckDestination service. Beyond troubleshooting, understanding the relationships between services enables you to make migration or architecture decisions more effectively.
Explore application performance and user impact associated with Kubernetes
The out-of-the-box user experience summary shows user demographics, behavior, and experience, as well as associated resources and services consumed by those users
The application summary page shows both performance and user behavior information. In this case the service failures on the application have resulted in 18 errors/minute and appear to impact load time. Again, we see the associated Problem that Dynatrace’s AI engine identified, this time in the context of the application.
PurePath
Dynatrace’s patented distributed trace, PurePath, brings together data from multiple sources across hybrid and multi-cloud environments to analyze transactions end to end, across every tier of your application technology stack. As applications and microservices become more complex, understanding the relationship between calling and receiving services is key to finding the source of a problem. By using OpenTelemetry as an additional data source, Dynatrace extends your coverage and turns telemetry data into actionable answers faster than ever.
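Because PurePath can ingest OpenTelemetry signals, a service you instrument yourself can feed the same analysis. The sketch below shows a minimal OpenTelemetry span export from Python over OTLP/HTTP; the Dynatrace ingest endpoint and token are placeholders taken as assumptions, and the span and attribute names are purely illustrative.

```python
# Minimal OpenTelemetry span export over OTLP/HTTP. Requires the packages
# opentelemetry-sdk and opentelemetry-exporter-otlp-proto-http. The endpoint
# and token are placeholders; check your environment for the exact values.
from opentelemetry import trace
from opentelemetry.sdk.resources import Resource
from opentelemetry.sdk.trace import TracerProvider
from opentelemetry.sdk.trace.export import BatchSpanProcessor
from opentelemetry.exporter.otlp.proto.http.trace_exporter import OTLPSpanExporter

exporter = OTLPSpanExporter(
    endpoint="https://YOUR-ENVIRONMENT-ID.live.dynatrace.com/api/v2/otlp/v1/traces",  # assumed ingest URL
    headers={"Authorization": "Api-Token YOUR-INGEST-TOKEN"},
)

provider = TracerProvider(resource=Resource.create({"service.name": "checkout"}))
provider.add_span_processor(BatchSpanProcessor(exporter))
trace.set_tracer_provider(provider)

tracer = trace.get_tracer(__name__)
with tracer.start_as_current_span("check-destination") as span:
    span.set_attribute("order.id", "12345")  # hypothetical business attribute
    # ... call downstream services here ...

provider.shutdown()  # flush remaining spans before exit
```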
Automatic and intelligent observability contextualizes your Kubernetes workloads
The problem card automatically delivers the precise root cause and business impact to end users and applications. We can see that 190 users have been impacted by the increased ‘frontend’ failure rate. Davis automatically identified the issue and provided all the context you need to quickly remediate. This automatic and intelligent observability provides you the context of how Kubernetes workloads fit into the broader application and end user environment.
The automatically generated business impact analysis shows you the number of users, services and applications affected. The severity of the impact guides your remediation actions.
Close problem
Deployment
Activating Dynatrace on your Kubernetes cluster is lightning fast and easy. Select a few options, copy the deployment script, and watch Kubernetes monitoring roll out and connect to all your applications in Dynatrace. After the deployment is finished, our out-of-the-box dashboards show you everything you need to see on a single pane of glass.
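Once the deployment script has run, one quick way to watch the rollout from the cluster side is to check the pods that were created. The sketch below assumes the components were installed into a namespace called dynatrace (adjust if yours differs) and uses the official kubernetes Python client with a local kubeconfig.

```python
# Print the status of the pods in the (assumed) "dynatrace" namespace so you
# can watch the monitoring components come up after deployment.
from kubernetes import client, config

config.load_kube_config()
v1 = client.CoreV1Api()

for pod in v1.list_namespaced_pod("dynatrace").items:
    ready = sum(1 for c in (pod.status.container_statuses or []) if c.ready)
    total = len(pod.spec.containers)
    phase = pod.status.phase or "Unknown"
    print(f"{pod.metadata.name:<50} {phase:<10} {ready}/{total} ready")
```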
Welcome to Dynatrace!
This interactive product tour explores a backend service incident. Since Dynatrace automates business impact and root cause analysis, you can quickly triage, respond, and understand the impact.