Announcing Amazon EC2 Capacity Blocks for ML to reserve GPU capacity for your machine learning workloads

Amazon has announced EC2 Capacity Blocks for Machine Learning (ML), addressing the scarcity of GPUs caused by high demand from ML applications. Users can reserve GPU instances for high-performance ML workloads on a pay-as-you-go model. Initially available for EC2 P5 instances in the AWS US East (Ohio) Region, this approach also enables planned ML development with predictable access to high-performance instances.

Continue reading
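Reserving a block starts with searching for an offering that matches your instance count, dates, and duration. The sketch below builds such a search request in Python; the parameter names follow the EC2 DescribeCapacityBlockOfferings API as described at launch, so verify them against the current boto3 reference before use.

```python
from datetime import datetime, timedelta, timezone

def capacity_block_search_params(instance_count, duration_hours, days_out=7):
    """Build request parameters for searching Capacity Block offerings.

    A sketch only: parameter names follow the EC2
    DescribeCapacityBlockOfferings API; confirm against the current
    boto3 documentation before relying on them.
    """
    start = datetime.now(timezone.utc) + timedelta(days=1)
    end = start + timedelta(days=days_out)
    return {
        "InstanceType": "p5.48xlarge",   # the P5 size offered at launch
        "InstanceCount": instance_count,
        "StartDateRange": start,
        "EndDateRange": end,
        "CapacityDurationHours": duration_hours,
    }

params = capacity_block_search_params(instance_count=2, duration_hours=48)
# An actual search would then be:
#   import boto3
#   ec2 = boto3.client("ec2", region_name="us-east-2")
#   offerings = ec2.describe_capacity_block_offerings(**params)
```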

Orchestrating dependent file uploads with AWS Step Functions

This blog post details how the Aggregator pattern can address coupling issues in event-driven architectures (EDA). Using AWS Step Functions, Amazon S3 data files uploaded by different teams can be processed in a particular order. The process relies on asynchronous communication between system components, enabling those components to function autonomously while still allowing accurate data correlation and workflow orchestration.

Continue reading
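The correlation logic at the heart of the Aggregator pattern can be sketched in a few lines. In the post this state lives in a Step Functions execution triggered by S3 events; here a plain in-memory dict stands in so the idea is visible, and all names are illustrative rather than taken from the post.

```python
class UploadAggregator:
    """Minimal in-memory sketch of the Aggregator pattern.

    Collects files uploaded independently by different teams and releases
    them for processing only once every expected file for a batch has
    arrived, in a deterministic order.
    """
    def __init__(self, expected_files):
        self.expected = set(expected_files)
        self.received = {}          # batch_id -> {filename: payload}

    def on_upload(self, batch_id, filename, payload):
        batch = self.received.setdefault(batch_id, {})
        batch[filename] = payload
        if self.expected.issubset(batch):
            # All dependent files arrived: process in the required order.
            return [batch[name] for name in sorted(self.expected)]
        return None  # still waiting on other teams' files

agg = UploadAggregator(expected_files={"orders.csv", "customers.csv"})
first = agg.on_upload("b1", "orders.csv", "o-data")       # incomplete batch
result = agg.on_upload("b1", "customers.csv", "c-data")   # batch complete
```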

Let’s Architect! Designing systems for stream data processing

The ability to process streaming data can significantly differentiate successful organizations from their competitors. AWS provides reliable data pipelines for processing real-time streaming data, enabling agile, data-informed decisions and insights into customer behavior. Its modern data architecture breaks down data silos while ensuring security. The blog highlights how Samsung optimized their streaming data analytics by migrating to Amazon Managed Service for Apache Flink, shifting focus from infrastructure to delivering business value.

Continue reading
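A typical building block of such pipelines is windowed aggregation. The toy sketch below groups a stream of events into fixed tumbling windows in plain Python; a real Flink job additionally handles out-of-order events, state, and checkpointing, which the managed service runs for you.

```python
from collections import defaultdict

def tumbling_window_counts(events, window_seconds=60):
    """Count (timestamp, key) events per fixed, non-overlapping window.

    A toy stand-in for the windowed aggregation a Flink job performs;
    timestamps are in seconds and events are assumed in order.
    """
    counts = defaultdict(int)
    for ts, key in events:
        window_start = ts - (ts % window_seconds)   # align to window boundary
        counts[(window_start, key)] += 1
    return dict(counts)

events = [(5, "click"), (42, "click"), (61, "click"), (70, "view")]
counts = tumbling_window_counts(events, window_seconds=60)
# window [0, 60) holds two clicks; window [60, 120) one click and one view
```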

Journey to Cloud-Native Architecture Series #7: Using Containers and Cell-based design for higher resiliency and efficiency

The post discusses steps to improve resource efficiency and resiliency in hyperscale environments using containerized applications and cell-based design. The team uses Amazon EKS for container orchestration, AWS App2Container for Java and .NET applications, CloudWatch and Prometheus for logging, and Kubecost for resource utilization evaluation. Deployment strategies were revised and a cell-based architecture using shuffle sharding was introduced to manage black swan events, improve resiliency, and scale.

Continue reading
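Shuffle sharding, the technique behind the cell-based design above, assigns each customer a deterministic pseudo-random subset of nodes so that a poison-pill customer can only affect its own small shard, and few other customers share that exact combination. A minimal illustration (names and shard sizes are made up for the example):

```python
import hashlib
from itertools import combinations

def shuffle_shard(customer_id, nodes, shard_size=2):
    """Deterministically assign a customer a small combination of nodes.

    Toy illustration of shuffle sharding: hashing the customer ID picks
    one of the C(n, k) possible node combinations, so blast radius is
    limited to that shard.
    """
    all_shards = list(combinations(sorted(nodes), shard_size))
    digest = int(hashlib.sha256(customer_id.encode()).hexdigest(), 16)
    return all_shards[digest % len(all_shards)]

nodes = ["n1", "n2", "n3", "n4"]
shard_a = shuffle_shard("customer-a", nodes)
shard_b = shuffle_shard("customer-b", nodes)
```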

Observability using native Amazon CloudWatch and AWS X-Ray for serverless modern applications

This blog post provides a detailed guide on using AWS-native observability tools, primarily Amazon CloudWatch and AWS X-Ray, to monitor modern serverless applications. These tools aid in measuring and analyzing application logs, metrics, and traces. The post further explores features like custom dashboards, alarms, anomaly detection, and their compatibility with related AWS services. Also discussed are methods for training, cost optimization, security implementations, compliance, and data protection.

Continue reading
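One CloudWatch feature worth knowing in this context is the Embedded Metric Format (EMF): a serverless function can emit metrics simply by printing a structured log line, with no PutMetricData call. The sketch below builds such a line following the published EMF structure; the namespace, metric, and dimension values are illustrative.

```python
import json
import time

def emf_record(namespace, metric_name, value, dimensions):
    """Build a CloudWatch Embedded Metric Format (EMF) log line.

    When a Lambda function prints a line like this, CloudWatch extracts
    the metric automatically. Structure follows the EMF specification;
    the field values here are illustrative.
    """
    return json.dumps({
        "_aws": {
            "Timestamp": int(time.time() * 1000),
            "CloudWatchMetrics": [{
                "Namespace": namespace,
                "Dimensions": [list(dimensions)],
                "Metrics": [{"Name": metric_name, "Unit": "Milliseconds"}],
            }],
        },
        **dimensions,          # dimension values as top-level fields
        metric_name: value,    # the metric value itself
    })

line = emf_record("MyApp", "CheckoutLatency", 123.4, {"Service": "checkout"})
```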

Hybrid Cross-Cluster Scaling with Azure Arc for the workloads deployed on Azure Stack HCI

The content explains a solution that enables cross-cluster scaling of workloads on a hybrid infrastructure using Azure Arc-enabled services on Azure Stack HCI clusters. The architecture involves deployment of Azure Stack HCI in two locations, connectivity via SD-WAN, and management through Azure Arc. It discusses the use of AKS on Azure Stack HCI, application deployment via Azure Pipelines, global load balancing, node autoscaling, and data synchronization with Azure Failover Group. It also explores alternatives, security considerations, cost optimization, performance efficiency, and operational excellence.

Continue reading

Extend mainframes to digital channels by using standards-based REST APIs

IBM Z and Cloud Modernization Stack, together with IBM z/OS Connect, offer a low-code solution for accessing mainframe subsystem data through REST APIs. This allows extension of mainframe applications to Azure without disruption. The architecture integrates efficiently with Azure API Management, enabling effective API governance. The setup utilizes Red Hat OpenShift, Azure services, and Microsoft Power Platform for optimal performance, secure access, a DevOps approach, and low-code application development. IBM z/OS Connect additionally supports parallel processing for high performance efficiency.

Continue reading

Windows 365 Azure network connection

Windows 365 is a cloud-based service that offers personalized Windows computing instances known as Cloud PCs, readily accessible from any location or device. Utilizing a combination of services such as Intune, Entra ID, and Azure Virtual Desktop, it offers a rich Windows desktop experience. Responsibilities for managing Windows 365 are divided into deployment, lifecycle management, and configuration. Microsoft recommends employing specific architecture patterns for optimal benefit, such as incorporating Microsoft-hosted network and Intune-based mobile device management. The service runs periodic automated health checks to evaluate readiness for deployment.

Continue reading


Estimating Total Cost of Ownership (TCO) for modernizing workloads on AWS using Containerization – Part 2

The text outlines how to estimate the Total Cost of Ownership (TCO) for modernizing applications through containerization on Amazon Web Services (AWS). Describing a step-by-step approach, the content focuses on calculating costs using only application-level information. A containerization model developed by AWS Professional Services categorizes applications by complexity to determine cost. Costs for compute, storage, network, and effort are assessed individually before establishing the total TCO. The authors emphasize that their method provides ballpark figures for preliminary investment decisions.

Continue reading
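The arithmetic behind such an estimate can be made concrete with a back-of-the-envelope sketch: a one-time modernization effort priced per application complexity bucket, plus the platform run cost over the evaluation horizon. The bucket prices and figures below are invented for illustration, not the post's actual model.

```python
# Illustrative complexity buckets and per-application effort costs;
# the real figures come from the AWS Professional Services model
# described in the post.
EFFORT_COST = {"low": 5_000, "medium": 15_000, "high": 40_000}

def containerization_tco(apps, monthly_platform_cost, months=36):
    """Rough TCO: one-time modernization effort + platform run cost.

    A back-of-the-envelope sketch in the spirit of the post's
    "ballpark figures for preliminary decisions".
    """
    effort = sum(EFFORT_COST[complexity] for _, complexity in apps)
    run = monthly_platform_cost * months
    return {"effort": effort, "run": run, "total": effort + run}

apps = [("billing", "high"), ("catalog", "medium"), ("cron-jobs", "low")]
tco = containerization_tco(apps, monthly_platform_cost=2_000)
```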


Estimating Total Cost of Ownership (TCO) for modernizing workloads on AWS using Containerization – Part 1

AWS outlines seven common strategies to migrate applications to the cloud, known as "the 7 Rs". AWS also supports modernization through containerization, which offers increased deployment flexibility and efficiency. AWS provides three options for managing containers, and calculating the total cost requires assessing the current environment, choosing AWS container services, defining the target architecture, and calculating both platform costs and the cost of the modernization effort. Calculating the TCO is complex and depends on many factors, including server configuration and resource utilization.

Continue reading

Monitor IoT device health at scale with Amazon Managed Grafana­­

This post discusses building an IoT health dashboard with Amazon Managed Grafana to monitor the health of business IoT equipment, scaling to thousands of devices. The solution involves IoT devices sending data to AWS IoT Core to be processed and visualized, and employs AWS services such as AWS IoT Core, AWS Lambda, Amazon Timestream, and Amazon Managed Grafana. It offers the ability to track device health without constantly verifying individual statuses, solving the common challenge of deriving useful insights from vast IoT device fleets.

Continue reading
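The core health rule such a pipeline applies is often simple: a device is unhealthy if it has gone quiet. The sketch below shows the kind of staleness check a Lambda function could run over per-device last-seen timestamps before Grafana renders the result; the threshold and names are illustrative, not from the post.

```python
def stale_devices(last_seen, now, max_silence_seconds=300):
    """Flag devices that have not reported within the allowed window.

    `last_seen` maps device IDs to the epoch second of their latest
    message; anything silent longer than the threshold is returned.
    """
    return sorted(
        device for device, ts in last_seen.items()
        if now - ts > max_silence_seconds
    )

last_seen = {"sensor-1": 1_000, "sensor-2": 1_290, "sensor-3": 700}
offline = stale_devices(last_seen, now=1_300)
```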

Let’s Architect! Designing systems for batch data processing

The content encourages effective data engineering, emphasizing the importance of robust data pipelines for AI-enabled products. It highlights incorporating software engineering best practices into data engineering to improve system quality and reliability, and underscores the importance of data quality given its critical role in outcomes such as machine learning. It also introduces tools like Deequ for data testing, Amazon EMR for big data processing, and Apache Airflow for workflow management.

Continue reading
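To make the data-quality idea concrete, here is a pure-Python stand-in for the kind of completeness and uniqueness constraints a Deequ suite expresses. In practice these checks run on Spark via Deequ rather than on plain dicts; the function and column names are illustrative.

```python
def check_quality(rows, required, unique_key):
    """Run Deequ-style completeness and uniqueness checks on rows.

    Returns a list of failed constraint descriptions; an empty list
    means the batch passed.
    """
    failures = []
    for col in required:
        # Completeness: no missing values in required columns.
        if any(row.get(col) in (None, "") for row in rows):
            failures.append(f"completeness({col})")
    # Uniqueness: the key column must not repeat.
    keys = [row.get(unique_key) for row in rows]
    if len(keys) != len(set(keys)):
        failures.append(f"uniqueness({unique_key})")
    return failures

rows = [
    {"id": 1, "amount": 10.0},
    {"id": 2, "amount": None},
    {"id": 2, "amount": 7.5},
]
failures = check_quality(rows, required=["id", "amount"], unique_key="id")
```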

Monitoring Generative AI applications using Amazon Bedrock and Amazon CloudWatch integration

Amazon Bedrock, a fully managed service, facilitates building and scaling generative AI applications with foundation models from leading providers. Integrated with Amazon CloudWatch, it provides near real-time monitoring and usage analytics. Bedrock also introduces model invocation logging in preview, enabling collection of metadata, requests, and responses for all model invocations. Bedrock’s real-time metrics can be used to set alarms, detect anomalies, compare latency between models, measure token counts, alert on throttling, and support various other use cases.

Continue reading
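As one example of the alarm use case, the sketch below builds PutMetricAlarm parameters for Bedrock invocation throttles. The "AWS/Bedrock" namespace and "InvocationThrottles" metric name follow the post's description of Bedrock's CloudWatch metrics; confirm both against the current documentation before relying on them.

```python
def throttle_alarm_params(model_id, threshold=5):
    """Build CloudWatch PutMetricAlarm parameters for Bedrock throttles.

    A sketch: alarms when throttled invocations for the given model
    reach the threshold within a 5-minute period. Metric and dimension
    names should be verified against current CloudWatch documentation.
    """
    return {
        "AlarmName": f"bedrock-throttles-{model_id}",
        "Namespace": "AWS/Bedrock",
        "MetricName": "InvocationThrottles",
        "Dimensions": [{"Name": "ModelId", "Value": model_id}],
        "Statistic": "Sum",
        "Period": 300,
        "EvaluationPeriods": 1,
        "Threshold": threshold,
        "ComparisonOperator": "GreaterThanOrEqualToThreshold",
    }

params = throttle_alarm_params("anthropic.claude-v2")
# Creating the alarm would then be:
#   import boto3
#   boto3.client("cloudwatch").put_metric_alarm(**params)
```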
