Scaling a Backstage Developer Portal for a FinServ Dev Team



Technical teams at financial services organizations find themselves in a fiercely competitive industry where anything less than a top-quality digital experience could impact customer loyalty.

Part of that competition stems from the operational costs behind the digital strategy. With competitively priced alternatives easily accessible for customers — a particularly relevant threat at a time of rising interest rates — anything that inflates the cost of delivering digital products makes it harder to compete on price.

This alone would be difficult enough without the complex regulatory restrictions and security concerns that FinServ organizations face. Compliance lapses bring substantial fines that further limit pricing flexibility, while reputational damage from security vulnerabilities could send customers out the door.

Earlier this year, we met with a European FinServ organization to discuss ways to navigate these challenges to better support its development teams.

The Team and Technology

One of its top priorities was finding a way to help development teams move faster as they adopt advanced cloud native architectures.

The team had embraced AWS cloud services with velocity in mind but had found that the complexity of the organization’s standards and configuration requirements was standing in the way.

Development teams relied on DevOps engineers to orchestrate application environments supporting the development life cycle, many of which required multiple SaaS and PaaS services (databases, message queues, caches, etc.). This resulted in lengthy provisioning processes that further slowed developers and extended release timelines.

Even with most configurations defined as code via Terraform, leaders saw more room to improve developer productivity. They had been eyeing microservices to help accelerate provisioning but had become concerned about the risks of unmonitored cloud infrastructure deployments.

This led to a tradeoff: Governance over configurations took precedence over developer access, diminishing the velocity they had sought through the public cloud.

To help strike a balance, the organization decided on a technology ecosystem that consisted of:

  • Backstage as the internal developer platform (IDP) to provide developers self-service access to application resources;
  • Amazon Web Services (AWS) cloud infrastructure defined via Terraform, stored in Bitbucket and automated via Quali Torque;
  • And integrations with ServiceNow to trigger approval requests to their DevOps teams for new releases, changes to infrastructure and other activity.

The objective was to deliver a platform that satisfied both the development team’s request for cloud access and the DevOps team’s standardization priorities. Developers could access environments directly within their Backstage IDP, while DevOps maintained visibility and control over configurations via ServiceNow.

The Orchestration Layer

To reduce the amount of manual provisioning carried out by the DevOps teams, we started by connecting the organization’s Bitbucket repositories to Quali’s Torque platform.

This allowed the team to discover and import the infrastructure defined in their Terraform modules, then generate new YAMLs defining all the SaaS and PaaS services, dependencies and outputs needed to support each specific developer use case.
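To illustrate why declaring dependencies in a template matters, here is a minimal Python sketch of how an orchestrator could derive a deployment order from them. The template structure and service names are hypothetical and do not reflect Quali Torque’s actual YAML schema; the only library used is Python’s standard `graphlib`.

```python
from graphlib import TopologicalSorter

# Hypothetical environment template: each service lists the services it depends on.
# Service names are illustrative and not taken from the organization's templates.
template = {
    "api": {"depends_on": ["database", "queue", "cache"]},
    "database": {"depends_on": []},
    "queue": {"depends_on": []},
    "cache": {"depends_on": []},
}

def deployment_order(services: dict) -> list[str]:
    """Order services so each one is deployed after everything it depends on."""
    graph = {name: set(spec["depends_on"]) for name, spec in services.items()}
    # static_order() raises graphlib.CycleError if the dependencies are circular.
    return list(TopologicalSorter(graph).static_order())

order = deployment_order(template)
print(order)  # the three backing services appear first, in some order, then "api"
```

With the dependencies stated up front, a circular or missing dependency surfaces when the template is tested rather than halfway through a developer’s deployment.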

Once tested and approved by DevOps, these YAML files are stored in git as the template for the environment’s configuration and made available to the development teams via Backstage. This eliminated the need for development teams to request environments from DevOps every time they needed them.

Meanwhile, DevOps set up notifications for anomalies in these environments — such as unexpected changes to the configurations, noncompliant cloud service configurations or long environment runtimes — and could reconcile them without interrupting the developer’s workflow.
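The kinds of checks described here can be sketched in a few lines of Python. This is an illustration only: the `EnvironmentStatus` record, its fields and the runtime limit are assumptions made for the example, not part of Quali Torque’s API.

```python
from dataclasses import dataclass
from datetime import timedelta

# Hypothetical record of a running environment; field names are illustrative.
@dataclass
class EnvironmentStatus:
    name: str
    approved_config: dict  # configuration from the approved template
    live_config: dict      # configuration observed in the cloud account
    runtime: timedelta     # how long the environment has been running

def find_anomalies(env: EnvironmentStatus, max_runtime: timedelta) -> list[str]:
    """Return a description of each anomaly found in one environment."""
    anomalies = []
    # Drift: any setting whose live value differs from the approved template.
    all_keys = env.approved_config.keys() | env.live_config.keys()
    drifted = sorted(
        key for key in all_keys
        if env.approved_config.get(key) != env.live_config.get(key)
    )
    if drifted:
        anomalies.append(f"config drift in: {', '.join(drifted)}")
    # Long-running environments are flagged rather than stopped.
    if env.runtime > max_runtime:
        anomalies.append(f"runtime {env.runtime} exceeds limit {max_runtime}")
    return anomalies
```

The function only reports; it never stops or changes the environment, which mirrors the goal of letting DevOps reconcile issues without interrupting the developer.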

The Platform

To optimize the experience, we integrated Backstage with Quali Torque so developers could access all approved environment templates independently. Role-based access controls and encryption for account credentials are managed via Quali Torque, satisfying DevOps concerns over unapproved modifications to configurations or security risks from exposed secrets.

When a developer initiates the creation of a new software component, cloud resource or a development environment via Backstage, Quali Torque orchestrates and deploys those components based on the configurations defined in the YAML.

This allows the DevOps team to set rules that automatically deny deployments that violate their configuration or operational standards.

Defined in Terraform modules managed in git, these policies tell Quali Torque which environments can be deployed and which cannot. Creating a policy to prohibit a specific service configuration, for example, would deny the deployment of any environment containing that configuration.
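The deny logic itself is simple to picture. The Python sketch below is not the policy format the organization used; it only models the decision, and the `DenyRule` class, the resource dictionaries and the example rule (a publicly accessible database) are assumptions chosen for illustration.

```python
from dataclasses import dataclass
from typing import Any

# Hypothetical rule: one setting on one resource type must not have a given value.
@dataclass(frozen=True)
class DenyRule:
    resource_type: str
    setting: str
    forbidden_value: Any

def violations(resources: list[dict], rules: list[DenyRule]) -> list[str]:
    """List every resource setting that matches a deny rule."""
    found = []
    for resource in resources:
        settings = resource.get("settings", {})
        for rule in rules:
            if (resource["type"] == rule.resource_type
                    and rule.setting in settings
                    and settings[rule.setting] == rule.forbidden_value):
                found.append(
                    f"{resource['name']}: {rule.setting}="
                    f"{settings[rule.setting]!r} is prohibited"
                )
    return found

def can_deploy(resources: list[dict], rules: list[DenyRule]) -> bool:
    """An environment is deployable only if no resource violates a rule."""
    return not violations(resources, rules)

# Example: a single prohibited configuration blocks the whole environment.
rules = [DenyRule("aws_db_instance", "publicly_accessible", True)]
environment = [
    {"name": "orders-db", "type": "aws_db_instance",
     "settings": {"publicly_accessible": True}},
    {"name": "orders-cache", "type": "aws_elasticache_cluster", "settings": {}},
]
print(can_deploy(environment, rules))
```

Evaluating the whole environment against the rules before anything is provisioned is what turns a standard into a guardrail: the developer gets an immediate refusal with a reason instead of a cleanup request after the fact.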

Applying policies to individual development teams, managed via the Quali Torque workspaces that dictate user access, allows DevOps to align configuration and operational standards with each development team’s use cases.

This approach also automates the operation of cloud resources supporting each environment. For a team that relies on ephemeral environments that only run during normal working hours, policies can automate the deployment of the AWS resources supporting that team’s environments at the beginning of the workday, then automatically terminate those resources at the end of the workday.
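A working-hours policy of this kind boils down to a small decision function. The sketch below is a simplified model in Python; the 08:00 to 18:00 window and the weekday-only assumption are placeholders, not values from the organization’s policies.

```python
from datetime import datetime, time

def desired_state(now: datetime,
                  start: time = time(8, 0),
                  end: time = time(18, 0)) -> str:
    """Return 'running' during working hours on weekdays, else 'terminated'.

    The default window and the Monday-to-Friday rule are illustrative assumptions.
    """
    is_weekday = now.weekday() < 5  # Monday is 0, Sunday is 6
    if is_weekday and start <= now.time() < end:
        return "running"
    return "terminated"
```

A scheduler that compares this desired state with the actual state of each environment can deploy resources in the morning and terminate them in the evening without anyone filing a request.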

The Developer Experience

Developers are typically software component-aware. They work in the scope of a feature that starts in specific microservices under their team ownership. Backstage provides the developers with a single view of the team’s responsibilities and information about the specific software components to support them.

For this organization, Backstage lets development teams view, understand and access the environments where their software components run, and perform Day 2 operations like connecting, terminating and restarting resources within those environments on demand. The policies remove the cognitive load of day-to-day operational implications from both the development and DevOps teams. Additional integrations allow development teams to access these environments via their command line interface (CLI), integrated development environment (IDE) or CI/CD platform.

If a developer or development team needs to run an environment beyond the pre-scheduled termination, they can request an extension up to a duration limit that was pre-set by DevOps. Any other requests related to the environment submitted via the Backstage UI trigger a notification to the DevOps team via ServiceNow.
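The extension rule can be modeled as a simple bounded request. In this Python sketch the function name, the parameters and the choice to reject over-limit requests outright (rather than shorten them) are all assumptions for illustration.

```python
from datetime import datetime, timedelta

def request_extension(scheduled_end: datetime,
                      requested: timedelta,
                      max_extension: timedelta) -> datetime:
    """Return the new termination time, or raise if the request is out of bounds.

    max_extension stands in for the limit pre-set by DevOps.
    """
    if requested <= timedelta(0):
        raise ValueError("extension must be a positive duration")
    if requested > max_extension:
        raise ValueError(
            f"requested {requested} exceeds the allowed {max_extension}"
        )
    return scheduled_end + requested
```

Because the limit is checked in code, requests inside it need no human approval, and only the exceptions have to reach the DevOps team.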

The integrated approach also gives the DevOps team visibility into resource utilization. With developers using their IDP to run AWS cloud resources via templates defined in Quali Torque, DevOps can see how frequently that infrastructure is run, for how long and by whom. This data allows the DevOps team to drill down into the root causes of any operational anomalies, adjust their standards and gather feedback to improve the developer experience of the platform.
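As a rough picture of what this utilization data makes possible, the Python sketch below aggregates launch records per template. The `Run` record and its fields are hypothetical; real data would come from the orchestration platform’s own reporting.

```python
from collections import defaultdict
from dataclasses import dataclass
from datetime import timedelta

# Hypothetical launch record: which template ran, who ran it and for how long.
@dataclass
class Run:
    template: str
    user: str
    duration: timedelta

def summarize(runs: list[Run]) -> dict:
    """Per template: number of launches, total runtime and distinct users."""
    summary = defaultdict(
        lambda: {"launches": 0, "total": timedelta(0), "users": set()}
    )
    for run in runs:
        entry = summary[run.template]
        entry["launches"] += 1
        entry["total"] += run.duration
        entry["users"].add(run.user)
    return dict(summary)
```

A template with many launches but very short runtimes, for example, might point to failed deployments worth investigating.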

IDPs have caught on for their impact on developer experience. In our work with development teams, it’s become clear that an effective developer experience needs to work in concert with DevOps principles.
