Data ingestion with the Kafka connector is an efficient and scalable serverless process on the Snowflake side, but you still need to manage your Kafka cluster, the connector installation, and various configurations
When MongoDB launched Atlas, its managed database-as-a-service offering, six years ago, managed cloud databases were still a hard sell to about half the executives at enterprises with legacy systems, said
Amazon QuickSight users can now add bookmarks in dashboards to save customized dashboard preferences into a list of bookmarks for easy one-click access to specific views of the dashboard without having to
As an example, we demonstrate how to handle incremental data change in a data lake by implementing a Slowly Changing Dimension Type 2 solution (SCD2) with Hudi, Iceberg, and Delta Lake, then deploy the
As shown in the following diagram, we use AWS Glue Studio as the middleware to pull data from the source database (in this case an Azure SQL Managed Instance), then create and automate the ETL job using
In this post, we talk about how the AWS Data Lab is helping BMW Financial Services build a regulatory reporting application for one of the European BMW markets using the Cloud Data Hub on AWS. In the case of
With AWS Glue Studio, analysts and fund managers can analyze the IMM trades in near-real time and compare them with market observations from Morningstar. This post demonstrates how AWS Glue Studio reduces
By comparing the oldest record in the staged stream on view to the current active record in the target satellite table for that hashed business key. Having a single point where all these metadata columns have been
We use Olympic games public datasets to configure a Q topic and discuss tips and tricks on how to make further configurations on the topic that enable Q to provide prompt answers using ML-powered, natural
Sensitive data detection in AWS Glue identifies a variety of sensitive data like phone and credit card numbers, and also offers the option to create custom identification patterns or entities to cover your specific use cases.
We provide the Terraform infrastructure definition and the source code for an AWS Lambda function using sample customer user clicks for online website inputs, which are ingested into an Amazon Kinesis Data Firehose
This post walks through a mixed workload scenario to illustrate the use of Amazon EMR managed scaling, node labels, and capacity scheduler configuration to create an elastic EMR cluster that provides elasticity and
Snowflake continues to set the standard for Data in the Cloud by taking away the need to perform maintenance tasks on your data platform and giving you the freedom to choose your data model methodology for the cloud. Snowflake is a massively parallel processing (MPP) platform through its proprietary technologies. In this blog post, we explain how to use a technique unique to Snowflake to efficiently query large satellite tables for the current record by a parent key (hub or link).
Amazon MWAA is a managed orchestration service for Apache Airflow that makes it easier to set up and operate end-to-end data pipelines in the cloud at scale. Each process in Data Hub corresponds to a DAG in