Estimating the Total Costs of Your Cloud Analytics Platform

Organizations today need a broad set of enterprise data cloud services with key data functionality to modernize projects and utilize machine learning. They need a platform that offers multi-function data management and analytics, so the enterprise's most pressing data and analytic challenges can be solved in a streamlined fashion. They need a selection that allows a worry-free experience with the architecture and its components.

Addressing real-world use cases requires the application of multiple functions working together on the same data. Building the data ecosystem to support this converged data use case can be a daunting task. There are many solutions and alternatives, and too many vendor claims. Building the entire cloud technology platform to address enterprise-wide data challenges and needs can be achieved in one of three ways: build the stack within the same cloud vendor's umbrella of products; stitch together various vendor product offerings; or utilize a single-vendor, multi-purpose stack.

Some architectures look integrated but may be more complex and more expensive. When almost every additional demand for performance, scale, or analytics can only be met by adding new resources, it gets expensive. Stacks are innumerable, but a few are popular.

Popular Stacks

Highlights of the Azure stack include Synapse, Synapse SQL Pool, Azure Data Factory, Azure Stream Analytics, Azure Databricks Premium Tier, HDInsight, Power BI Professional, Azure Machine Learning, Azure Active Directory P1, and Azure Purview.

The AWS stack includes Amazon Redshift, Glue, Kinesis, EMR, Redshift Spectrum, QuickSight, SageMaker, IAM, and AWS Glue Data Catalog.

The Google stack includes BigQuery, Dataflow, Dataproc, Cloud IAM, and Google Data Catalog.

Another stack could be called the Snowflake Stack since Snowflake is the featured vendor for dedicated compute, storage, and data exploration, but it is really a multi-vendor heterogeneous stack. This includes a data integration tool like Informatica or Talend, Kafka Confluent Cloud, Azure Databricks Premium Tier, Cloudera Data Hub + S3, Tableau, SageMaker, Amazon IAM, and a Data Catalog like Alation or Collibra.

The cost numbers below focus on the stack costs of projects, including development costs. A full ROI analysis for these projects would also need to consider the cost of money, a probability distribution of outcomes, and the n-ordered benefits, while counting only the benefits that are tangible.
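
To make the ROI ingredients above concrete, here is a minimal Python sketch of one way to combine the cost of money (a discount rate) with a probability distribution over outcomes. It is an illustration only, not the method behind the figures in this article. The function names, the discrete-scenario model, and every number in the usage note are assumptions.

```python
def npv(initial_cost, yearly_benefits, discount_rate):
    """Net present value: discounted tangible benefits minus the up-front cost.

    yearly_benefits[0] is assumed to arrive at the end of year 1.
    """
    discounted = sum(
        benefit / (1 + discount_rate) ** year
        for year, benefit in enumerate(yearly_benefits, start=1)
    )
    return discounted - initial_cost


def expected_npv(initial_cost, scenarios, discount_rate):
    """Probability-weighted NPV over discrete scenarios.

    scenarios is a list of (probability, yearly_benefits) pairs; the
    probabilities are an assumed, simplified stand-in for a full
    probability distribution and must sum to 1.
    """
    total_probability = sum(p for p, _ in scenarios)
    if abs(total_probability - 1.0) > 1e-9:
        raise ValueError("scenario probabilities must sum to 1")
    return sum(
        p * npv(initial_cost, benefits, discount_rate)
        for p, benefits in scenarios
    )
```

As a usage example with purely hypothetical inputs, `expected_npv(1_000_000, [(0.6, [700_000, 700_000]), (0.4, [300_000, 300_000])], 0.08)` weights an optimistic and a pessimistic benefit stream at an assumed 8% cost of money. Only benefits that can be tied to a tangible dollar amount should go into the benefit lists.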

Also, when projects are done in an agile fashion with functionality metered out, it can be difficult to say where initial project costs end and maintenance costs begin. I use the usual enterprise standard and draw the line between initial costs and maintenance around the point where most of the functionality is delivered. In this context, it is very important to consider both the accumulated costs to that point and the ongoing "maintenance" costs for bug fixes, enhancements, and updates afterwards.

Breaking Down Costs

A single (multi-quarter) project on these stacks, including people costs, will cost between $2.7M and $8M in a medium enterprise and between $7M and $23M in a large enterprise. Using the modern stack the first time will pave the way for future uses.

For all enterprise uses of the modern platform, including production costs, the 2-year total cost of ownership for medium enterprises ranges from $6M to $15M. For large enterprises, i.e., those with over $1B in revenue, it ranges from $17M to $42M.

Perils of TCO measurement aside, enterprise projects should be attaining high returns. However, if the application is not being implemented to a modern standard, using a machine learning stack, there are huge inefficiencies and competitive gaps in the functionality. Therefore, many enterprises are considering leveling up or migrating these use cases now and reaping the benefits.

A full analytics platform in the cloud is more than just a data warehouse, cloud storage, and a business intelligence solution. There are at least 11 categories needed to establish both equivalence among analytics stacks’ offerings and a fair estimate of costing. All these components are essential to having a full enterprise-ready analytics stack.

The categories, or components in a modern enterprise analytics stack, that I included in the TCO calculations are as follows:

  • Dedicated Compute
  • Storage
  • Data Integration
  • Streaming
  • Spark Analytics
  • Data Exploration
  • Data Lake
  • Business Intelligence
  • Machine Learning
  • Identity Management
  • Data Catalog

These stacks can be used for a variety of machine learning projects, including customer analytics, fraud detection, supply chain optimization, and IoT analytics. Of course, each project could use a slightly different set of components, or a different quantity of each component.
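
The category list above can be turned into a simple roll-up. The following Python sketch shows one way to total a stack's cost across all 11 categories so that different stacks are compared on an equivalent basis. It is not the model used to produce the figures in this article. The function name `stack_tco`, the flat monthly cost per category, and the separate `people_cost` term are assumptions for illustration.

```python
# The 11 categories from the list above.
CATEGORIES = (
    "Dedicated Compute",
    "Storage",
    "Data Integration",
    "Streaming",
    "Spark Analytics",
    "Data Exploration",
    "Data Lake",
    "Business Intelligence",
    "Machine Learning",
    "Identity Management",
    "Data Catalog",
)


def stack_tco(monthly_cost_by_category, months, people_cost=0.0):
    """Total cost of ownership for one stack over a number of months.

    Every category must be priced explicitly (use 0 for a component a
    project does not use) so that no stack looks cheaper simply because
    a category was left out of the estimate.
    """
    missing = [c for c in CATEGORIES if c not in monthly_cost_by_category]
    if missing:
        raise ValueError(f"missing categories: {missing}")
    platform_cost = months * sum(monthly_cost_by_category[c] for c in CATEGORIES)
    return platform_cost + people_cost
```

For a 2-year estimate, `months` would be 24. A project that uses more or less of a component changes only that category's monthly figure, which keeps the comparison between stacks like for like.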
