This reviewer's identity has been verified by our review moderation team. They have asked not to show their name, job title, or picture.
ScaleOps was easy to set up in our 6 main GKE clusters, each with about 1k unique workloads, using 350 nodes and ~3k cores in total. We worked closely with ScaleOps' team during the integration, and they provided support quickly. We've saved about 30k USD per month in compute costs with ScaleOps' optimized rightsizing since installing it. In addition to cost savings, we've seen a huge reduction in the frequency of OOM kills in the platform. The automated rightsizing is much easier to scale in our organization, with hundreds of developers, than previous solutions for workload rightsizing. The custom policies of ScaleOps make it easy to control rightsizing for different workloads' latency, reliability, and technical needs, giving us the flexibility to drive aggressive cost optimization where we can, while leaving a wide berth for more critical and demanding workloads. Using their GitOps-style configuration of policies and attaching workloads to those policies allows us to control this at scale despite having 6k distinct workloads to configure. Review collected by and hosted on G2.com.
While running ScaleOps in-cluster offers certain benefits for reliability and security, the required Prometheus deployment consumes a significant amount of CPU and memory, which reduces our overall cost savings. Additionally, managing Prometheus ourselves means we need to develop expertise in its operation to keep ScaleOps functioning properly. As with any automated rightsizing, relying on ScaleOps to maintain optimal workload sizing introduces a dependency, making ScaleOps itself a potential single point of failure in our infrastructure. There are also some bugs in the UI. For example, CronJobs do not apply their GitOps-configured policies until they actually run. The user interface is also extremely sluggish, likely because of the high number of workloads we have. It appears the front end loads all workloads into memory and manipulates them during interaction, which causes slow JavaScript responses to actions like clicking and mouse-over.
Many of the features besides workload rightsizing seem immature and/or don't provide anything beyond what Kubernetes or GKE already provides at baseline. For example, cluster headroom can easily be accomplished by following the official Kubernetes node over-provisioning documentation (https://kubernetes.io/docs/tasks/administer-cluster/node-overprovisioning/), with significantly more ability to customize for your use case. Their "spot optimization" feature doesn't provide any value-add for us that we don't already get from GKE's built-in compute-class feature. I'm not sure of the real-world use case for their replicas optimization feature, but it seems completely useless to us. I think the other features besides workload rightsizing have been added as fluff to market the product to management without providing real technical value to companies using ScaleOps. This isn't a real problem for us, as we weren't interested in anything but workload rightsizing when we purchased the solution.
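For readers unfamiliar with the node over-provisioning approach mentioned above, here is a minimal sketch of the idea: a negative-priority PriorityClass plus a Deployment of placeholder "pause" pods that reserve capacity and get preempted as soon as real workloads need the room. The object names (`placeholder`, `capacity-reservation`), the priority value, the image tag, and the replica and resource figures are illustrative assumptions, not values taken from our clusters, and the helper functions are hypothetical. The script only prints a JSON manifest list, which could then be reviewed and applied with `kubectl apply -f`.

```python
import json


def placeholder_priority_class(name="placeholder", value=-1000):
    """PriorityClass with a negative value so placeholder pods are preempted first.

    The name and value are assumptions chosen for illustration.
    """
    return {
        "apiVersion": "scheduling.k8s.io/v1",
        "kind": "PriorityClass",
        "metadata": {"name": name},
        "value": value,
        "globalDefault": False,
        "description": "Negative priority for placeholder pods that reserve headroom.",
    }


def headroom_deployment(replicas, cpu, memory,
                        priority_class="placeholder",
                        name="capacity-reservation"):
    """Deployment of pause containers whose requests define the reserved headroom.

    Total headroom is roughly replicas * (cpu, memory); tune both per cluster.
    """
    labels = {"app.kubernetes.io/name": name}
    return {
        "apiVersion": "apps/v1",
        "kind": "Deployment",
        "metadata": {"name": name},
        "spec": {
            "replicas": replicas,
            "selector": {"matchLabels": labels},
            "template": {
                "metadata": {"labels": labels},
                "spec": {
                    "priorityClassName": priority_class,
                    "containers": [
                        {
                            "name": "pause",
                            # The pause image does nothing; it only holds the reservation.
                            "image": "registry.k8s.io/pause:3.6",
                            "resources": {
                                "requests": {"cpu": cpu, "memory": memory},
                                "limits": {"memory": memory},
                            },
                        }
                    ],
                },
            },
        },
    }


def render_manifests(replicas, cpu, memory):
    """Bundle both objects into a v1 List serialized as JSON."""
    items = [
        placeholder_priority_class(),
        headroom_deployment(replicas, cpu, memory),
    ]
    return json.dumps({"apiVersion": "v1", "kind": "List", "items": items}, indent=2)


if __name__ == "__main__":
    # Example figures only: 5 placeholder pods of 500m CPU / 1Gi memory each.
    print(render_manifests(replicas=5, cpu="500m", memory="1Gi"))
```

The customization we mean is in the replica count, the per-pod requests, and any node selectors or affinity rules you add to the pod spec, all of which stay fully under your control with this approach.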
The reviewer uploaded a screenshot or submitted the review in-app verifying them as current user.
Validated through Google using a business email account
Invitation from a seller or affiliate. This reviewer was offered a nominal incentive as thanks for completing this review.




