As part of digital transformation, an enterprise needs to align both business and technology investments in Azure cloud services. It’s not just about saving money but about making strategic investments that drive the best return on investment. In the last 4 years, Tanuja has worked with many large organizations and observed a common theme of concerns, such as “Cloud is expensive”, “I am always over the budget”, and “I don’t have an insight into the cloud spend”. The concerns are mostly based on:
- Lack of visibility into their Azure spending patterns and resource optimization.
- Budgets are moving targets due to the elasticity of cloud and ease of adding resources.
- Test environments aren’t shut down and continue to incur cost.
- Limited insight into actions for optimizing Azure resources and spending.
- Lack of prioritization by decision makers as they don’t have deep insight into overall spend.
Because there is pressure to cut costs, it’s easy to fall for quick wins. Even though quick wins reduce spend tactically, enterprises often find themselves in the same situation within a short time.
A better approach is to adopt an Optimization Mindset. It’s a process where you intentionally follow best practices before and after you migrate or onboard your workload to Azure. Any cloud investment decision should drive towards the best performance, quality, and cost. In other words, prioritization and optimization of your investments in cloud computing should strive to maximize business value. Optimization is an evolving process and culture. To avoid reactive optimization efforts, perform a root cause analysis when cloud cost becomes a mainstream concern. Keep in mind that the lowest-cost solution is not necessarily the most prudent. For example, onboarding a workload or service with a cost savings goal often involves trade-offs in the areas of security, performance, reliability, or sustainability.
The companies that reap the greatest benefits of cloud computing are focused and intentional about their financial investments.
Optimization refers to managing your cloud spend with transparency and accuracy, from designing your architecture to optimizing costs, reducing waste, and monitoring. A culture of optimization is based on three areas that should be implemented as a repeatable and actionable process.
In this area of optimization, the focus is on “quick wins” such as reducing waste by right sizing the resources and decommissioning unused or underutilized resources that incur cost.
Let’s take the compute layer as an example:
The compute layer consists of virtual machines (VMs). Choosing the right VM type (CPU, memory, storage) is important, but continuously monitoring each VM’s usage pattern and right-sizing as needed can further reduce the waste of underutilized VMs. The size and capacity of VMs in Dev/Test environments don’t necessarily have to match those in production. Smaller VMs in non-production environments can reduce cost.
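The right-sizing review described above can be sketched in a few lines. This is a minimal illustration, not an Azure API: the `VmUsage` record, the VM names, and the 20% and 50% CPU thresholds are all assumptions chosen for the example, and in practice the usage figures would come from your monitoring data.

```python
from dataclasses import dataclass


@dataclass
class VmUsage:
    """Hypothetical usage summary for one VM over a review period."""
    name: str
    environment: str          # e.g. "prod", "dev", "test"
    avg_cpu_percent: float
    peak_cpu_percent: float


def rightsizing_candidates(vms, avg_threshold=20.0, peak_threshold=50.0):
    """Return names of VMs whose average AND peak CPU stay under the thresholds.

    The thresholds are illustrative assumptions, not recommended values.
    """
    return [
        vm.name
        for vm in vms
        if vm.avg_cpu_percent < avg_threshold
        and vm.peak_cpu_percent < peak_threshold
    ]


fleet = [
    VmUsage("vm-web-prod-01", "prod", 62.0, 91.0),
    VmUsage("vm-api-dev-01", "dev", 6.5, 18.0),
    VmUsage("vm-batch-test-01", "test", 15.0, 72.0),  # low average, high peak
]
print(rightsizing_candidates(fleet))  # ['vm-api-dev-01']
```

Checking the peak as well as the average keeps bursty workloads, such as the batch VM here, off the candidate list.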
Another example is Azure Spot VMs, an offer that can significantly reduce cost compared to standard VMs. If you have stateless applications, or Dev/Test environments with workloads that can handle interruptions, consider this savings offer.
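As a rough sketch of that decision, the helper below flags workloads that might suit Spot VMs and computes the percentage saved for a given pair of prices. The function names are hypothetical, the prices are made-up numbers rather than Azure rates, and the rule assumes that every Spot workload must tolerate interruption.

```python
def spot_eligible(is_stateless: bool, environment: str,
                  tolerates_interruption: bool) -> bool:
    """Assumed rule: the workload must tolerate interruption, and be either
    stateless or running in a Dev/Test environment."""
    non_production = environment.lower() in {"dev", "test"}
    return tolerates_interruption and (is_stateless or non_production)


def savings_percent(standard_hourly: float, spot_hourly: float) -> float:
    """Percentage saved by paying spot_hourly instead of standard_hourly."""
    if standard_hourly <= 0:
        raise ValueError("standard_hourly must be positive")
    return (standard_hourly - spot_hourly) / standard_hourly * 100.0


# Hypothetical prices, for illustration only.
print(spot_eligible(is_stateless=False, environment="dev",
                    tolerates_interruption=True))
print(round(savings_percent(standard_hourly=0.20, spot_hourly=0.05), 1))
```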
You can enforce governance by building a Financial Operations (FinOps) team or Cloud Consumption Office (CCO), gathering requirements, estimating cost, building a subscription and cost model, and designing solutions with those aspects in mind.
1. Build FinOps Model (Financial Operations Team or Cloud Consumption Office)
FinOps is a cultural practice and a balanced approach for making tough decisions for your cloud services investments. The FinOps team or CCO can help align business, finance, and technology needs with the goal of optimizing investments.
The FinOps team is essential for an organization’s cloud consumption and operational structure. They drive a strategic approach, weighing trade-offs between technical needs, business needs, and financial investments from an overall portfolio management perspective.
2. Estimate Cost (Azure Migrate Tool, TCO Calculator)
As you plan the onboarding of a workload to a platform, guardrails help facilitate a consistent onboarding experience. The platform should be developed according to best practices aligned with Azure landing zones.
Estimate the cost savings by using one of these tools.
All pre-migration steps such as discovery, assessments, and right-sizing of on-premises resources are included for infrastructure, data, and applications. Azure Migrate’s extensible framework allows for integration of third-party tools. You can identify, prioritize, and select the workloads that will be migrated to the cloud, and execute those migrations. Think of onboarding as a migration process. It’s the final major step to realizing the investment in cloud infrastructure. This key step must not be underestimated in the rush to release to the cloud.
3. Create a workload intake process
Any department or cost center in an organization should create a workflow that is initiated by a request to onboard a new workload. Integrate this process into your rhythm of business to identify the resources and services required, cost estimates for the workload, and budget allocations. This process should be automated where possible.
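One way to picture an automated intake step is a small record plus a review function that reports what is still missing. This is a sketch under assumptions: the `IntakeRequest` fields and the checks are hypothetical and would need to match your own workflow.

```python
from dataclasses import dataclass, field


@dataclass
class IntakeRequest:
    """Hypothetical onboarding request for a new workload."""
    workload: str
    cost_center: str
    required_services: list = field(default_factory=list)
    estimated_monthly_cost: float = 0.0
    allocated_monthly_budget: float = 0.0


def review_intake(request: IntakeRequest) -> list:
    """Return a list of issues; an empty list means the request can proceed."""
    issues = []
    if not request.cost_center:
        issues.append("missing cost center")
    if not request.required_services:
        issues.append("no resources or services listed")
    if request.estimated_monthly_cost <= 0:
        issues.append("missing cost estimate")
    elif request.estimated_monthly_cost > request.allocated_monthly_budget:
        issues.append("estimate exceeds allocated budget")
    return issues


request = IntakeRequest(
    workload="payroll",
    cost_center="cc-1001",
    required_services=["virtual machines", "sql database"],
    estimated_monthly_cost=4200.0,
    allocated_monthly_budget=4000.0,
)
print(review_intake(request))  # ['estimate exceeds allocated budget']
```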
4. Resource organization – subscriptions and management groups
- Naming (standard naming convention)
- Tagging (cost center, department, environment)
- Subscription design (subscription organization based on management groups)
- Management group (MG) design
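The naming and tagging items in this list lend themselves to simple automated checks. The sketch below assumes a hypothetical `type-workload-environment-instance` naming pattern and three required tag keys; neither is an Azure requirement, and real naming rules differ by resource type (storage account names, for example, cannot contain hyphens).

```python
import re

# Assumed conventions for illustration only.
REQUIRED_TAGS = ("cost-center", "department", "environment")
NAME_PATTERN = re.compile(r"[a-z]+-[a-z0-9]+-(dev|test|prod)-\d{2}")


def build_name(resource_type: str, workload: str,
               environment: str, instance: int) -> str:
    """Compose a lowercase name such as 'vm-payroll-dev-01'."""
    return f"{resource_type}-{workload}-{environment}-{instance:02d}".lower()


def is_valid_name(name: str) -> bool:
    """Check a name against the assumed pattern."""
    return NAME_PATTERN.fullmatch(name) is not None


def missing_tags(tags: dict) -> list:
    """Return required tag keys that are absent or empty."""
    return [key for key in REQUIRED_TAGS if not tags.get(key)]


name = build_name("VM", "Payroll", "Dev", 1)
print(name, is_valid_name(name))
print(missing_tags({"cost-center": "cc-1001", "environment": "dev"}))
```

Checks like these could run in a deployment pipeline so that untagged or misnamed resources are caught before they start accruing cost.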
As you begin to design the architecture of the workload, consider the areas that have a direct impact on cost and usage. Consider:
- Azure region, because prices vary by region.
- Subscriptions and offer types. A Dev/Test subscription, where available, is cheaper.
- Right-sizing Azure resources.
- PaaS or SaaS services for existing workloads and setting up alerts and monitoring the usage.
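To illustrate the first consideration, the sketch below picks the lowest-priced region from a price table, optionally restricted to regions that meet other requirements such as data residency. The prices are invented for the example and are not Azure rates; real figures would come from a pricing estimate.

```python
def cheapest_region(hourly_prices: dict, allowed_regions=None):
    """Return (region, price) for the lowest hourly price.

    If allowed_regions is given, only those regions are considered.
    """
    candidates = {
        region: price
        for region, price in hourly_prices.items()
        if allowed_regions is None or region in allowed_regions
    }
    if not candidates:
        raise ValueError("no candidate regions to compare")
    return min(candidates.items(), key=lambda item: item[1])


# Hypothetical hourly prices for the same VM size in three regions.
prices = {"eastus": 0.096, "westeurope": 0.115, "southeastasia": 0.120}

print(cheapest_region(prices))
print(cheapest_region(prices, allowed_regions={"westeurope", "southeastasia"}))
```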
For additional best practices for the workload, see Azure Well-Architected.
Organizational accountability and visibility
Create a rhythm of business with an optimization mindset and with visibility for all stakeholders.
For a newly onboarded workload, set up a rigorous rhythm of business by ensuring that there is top-down and bottom-up visibility and accountability for all cloud usage.
Accountability sets expectations about who is responsible for Azure resource spend at a granular level. Every workload, whether already in production or newly onboarded, must have accountable roles or teams.
Start with people who are accountable for the overall management of cloud governance. Identify the organizational structure and the operating model so that you have clarity around ownership, roles, and responsibilities. When thinking about the operating model, there are three key tiers. This approach provides a clear delineation of responsibilities shared between IT teams that support the large-scale enterprise.
- Global: All Azure subscriptions are owned by the top-level enterprise. Governance at this level is meant to ensure general policies are in place to protect against “cloud sprawl” or “resource sprawl”.
- Departments: All Azure subscriptions are owned by each department or cost center.
  - Controls for the resources and services used to provide internal applications.
  - Templates and automation to improve standardization and developer productivity.
  - Security toolkits that provide standard policy and reporting across teams.
- DevOps: Individual teams that are responsible for workloads and services.
  - Policies that are applied to service level requirements.
  - Data consolidation.
  - Development pipeline and automation tools.
Azure DevOps can be used to support all teams (dev, infra, finance, and others) in creating a culture of optimization. Your organization should follow a hierarchical workflow with Epics, Scenarios, and Features to support the goals for optimization, and then review the goals every sprint. Service outages can often be prevented by replacing error-prone human input with an automated DevOps pipeline that has an approval process, starting with lower environments and leading up to production.
- Ensure you have a Cost Optimization Epic to provide accountability for the cost management goal.
- Make sure every group and every team in the enterprise has its own Scenario and Features to map its completed optimization work to.
- Share the learnings during your Sprint Planning sessions and celebrate the teams that create optimization features (new ways of optimizing their IaaS or PaaS layers).
Deep insight into Azure spend and key metrics is essential for optimized consumption of cloud services. Here are the key metrics and indicators that will guide the resolution or mitigation of business risks and provide visibility across the organization.
- Annual Spend: The total annual cost for services provided by a cloud provider.
- Monthly Spend: The total monthly cost for services provided by a cloud provider.
- Budgets and Forecast vs Actuals Ratio: The ratio comparing forecasted and actual spend (Monthly or Annual).
- Pace of adoption (MoM) Ratio: The percentage change in cloud costs from month to month.
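The metrics in this list reduce to simple arithmetic. The sketch below is one possible reading: it assumes the forecast ratio is expressed as actual divided by forecast (the text does not fix a direction), and the monthly figures are made up for illustration.

```python
def forecast_vs_actual_ratio(forecast: float, actual: float) -> float:
    """Actual spend divided by forecast; above 1.0 means over forecast."""
    if forecast <= 0:
        raise ValueError("forecast must be positive")
    return actual / forecast


def month_over_month_percent(previous: float, current: float) -> float:
    """Percentage change in spend from the previous month to the current one."""
    if previous <= 0:
        raise ValueError("previous month spend must be positive")
    return (current - previous) / previous * 100.0


# Hypothetical monthly spend figures.
monthly_spend = [10_000.0, 11_000.0, 12_100.0]

total_spend = sum(monthly_spend)  # would be annual spend given 12 months
ratio = forecast_vs_actual_ratio(forecast=11_000.0, actual=monthly_spend[-1])
pace = [
    month_over_month_percent(prev, cur)
    for prev, cur in zip(monthly_spend, monthly_spend[1:])
]
print(total_spend, round(ratio, 2), [round(p, 1) for p in pace])
```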
Additionally, cost and optimization/modernization efforts should be reviewed weekly at the Group level in a “Cloud Consumption Review” process.
With those answers, you’ve already started a process of building a strategy towards your future Azure investments. If you are well into your cloud journey, it’s time to review your Azure portfolio and see how you can further optimize. This portfolio should be reviewed periodically to understand your cloud investments and ROI. This image is just an example of setting guidelines on managing your cloud portfolio:
Analyze the distribution of your current Azure investments to evaluate which of these paths fits each investment (workload or service):
- Retire: This is a great opportunity to retire applications and streamline the portfolio. Evaluate whether to retire or consolidate functionality into one app or service line.
- Replace: Evaluate whether any of your current or future workloads can adopt a SaaS solution, such as Office 365, SharePoint Online, or a third-party solution, as they become available. Replacing a workload with a SaaS offering also reduces your responsibilities for SLAs.
- Rebuild or Refactor: Are there workloads or applications that can be replaced with SaaS solutions? If your early migration was lift and shift on IaaS, are there components that can be moved to PaaS? Modernization is synonymous with optimization in this case.
- Remain on premises: It is understandable that some on-premises workloads or services cannot be moved to the cloud, but the goal should be to adopt cloud services and reduce that percentage of your portfolio to a bare minimum.
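A periodic portfolio review can start from a simple tally of how workloads are distributed across these paths. The sketch below is illustrative: the path labels and workload names are hypothetical, and it counts workloads rather than weighting them by spend.

```python
from collections import Counter

# Labels assumed for the four paths described above.
PATHS = ("retire", "replace", "rebuild-or-refactor", "remain-on-premises")


def portfolio_distribution(decisions: dict) -> dict:
    """Map each path to the percentage of workloads assigned to it."""
    unknown = set(decisions.values()) - set(PATHS)
    if unknown:
        raise ValueError(f"unknown paths: {sorted(unknown)}")
    counts = Counter(decisions.values())
    total = len(decisions)
    return {
        path: (100.0 * counts[path] / total if total else 0.0)
        for path in PATHS
    }


# Hypothetical review outcome for four workloads.
decisions = {
    "legacy-reporting": "retire",
    "intranet-portal": "replace",
    "order-api": "rebuild-or-refactor",
    "billing-batch": "rebuild-or-refactor",
}
print(portfolio_distribution(decisions))
```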
Monitoring and Automation
Continuous monitoring of usage and setting up alerts are key aspects of optimization.
Where possible, automate processes that are repeated in everyday business operations. According to the FinOps organization, all cloud computing customers desire automation in the areas shown in this image.
Monitoring cloud operations delivers a comprehensive solution for collecting, analyzing, and acting on telemetry from your cloud and on-premises environments. Continuous monitoring and alerting are key to healthy cloud operations. Setting budgets and monitoring amortization per subscription or resource provides a deeper breakdown of the costs.
In the context of Azure spend, there are multiple ways to monitor your cloud spend. Start by looking at Azure Cost Management in the Azure portal.
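The idea behind budget alerts can be sketched as a threshold check on spend against a budget. This is a conceptual illustration, not the Azure Cost Management API; the function name and the 50%, 80%, and 100% thresholds are assumptions.

```python
def budget_alerts(budget: float, actual_spend: float,
                  thresholds=(50, 80, 100)) -> list:
    """Return the percentage thresholds that current spend has reached."""
    if budget <= 0:
        raise ValueError("budget must be positive")
    used_percent = actual_spend / budget * 100.0
    return [t for t in thresholds if used_percent >= t]


# Hypothetical subscription budget and month-to-date spend.
print(budget_alerts(budget=10_000.0, actual_spend=8_500.0))  # [50, 80]
```

Each threshold that is reached would typically trigger a notification to the team accountable for that subscription or resource group.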
Automation: Process automation in Azure Automation allows you to automate frequent, time-consuming, and error-prone management tasks. This service helps you focus on work that adds business value. Guidance and tools are available for some of the key desired automations, such as “managing anomalies”, “resource utilization and efficiency”, and “automate shutdown/start”. This guidance includes process automation, configuration management, update management, shared capabilities, and heterogeneous features.
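The scheduling logic behind an automated shutdown/start task can be sketched as follows. This shows only the decision, not the Azure Automation runbook that would act on it; the weekday 08:00 to 18:00 window and the environment labels are assumptions.

```python
from datetime import datetime, time


def should_be_running(environment: str, now: datetime,
                      start: time = time(8, 0),
                      stop: time = time(18, 0)) -> bool:
    """Decide whether a VM should be running at the given moment.

    Assumed policy: production always runs; other environments run only
    on weekdays between the start and stop times.
    """
    if environment.lower() == "prod":
        return True
    is_weekday = now.weekday() < 5  # Monday is 0, Sunday is 6
    return is_weekday and start <= now.time() < stop


# 3 January 2024 was a Wednesday; 6 January 2024 was a Saturday.
print(should_be_running("dev", datetime(2024, 1, 3, 10, 0)))   # True
print(should_be_running("dev", datetime(2024, 1, 3, 22, 0)))   # False
print(should_be_running("test", datetime(2024, 1, 6, 10, 0)))  # False
```

A scheduled job could evaluate this for each tagged Dev/Test VM and start or deallocate it accordingly, which addresses the earlier concern about test environments that are never shut down.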
An Azure optimization mindset means developing a rhythm of business with the awareness and focus to continuously improve and get the most out of your investments in cloud computing. It takes thought leadership to drive the optimization process as a culture and as a repeatable practice.