Announcing Amazon EMR Serverless (Preview): Run big data applications without managing servers

With EMR Serverless, you can run applications built using open-source frameworks such as Apache Spark, Hive, and Presto without having to configure, manage, optimize, or secure clusters. Tens of thousands of customers use Amazon EMR, a managed service for running open-source analytics frameworks such as Apache Spark, Hive, and Presto for large-scale data analytics applications.

Amazon QuickSight: 2021 in review

With AWS re:Invent just around the corner, we at the Amazon QuickSight team have put together this post to provide you with a handy list of all the key updates this year. We’ve broken this post into three key sections: insights for every user, embedded analytics with QuickSight, and scaling and governance.

Migrating Our Events Warehouse from Athena to Snowflake

At Singular, we have a pipeline that ingests data about ad views, ad clicks, and app installs from millions of mobile devices worldwide. This huge mass of data is aggregated on an hourly and daily basis. We enrich it with various marketing metrics and offer it to our customers to analyze their campaigns’ performance and see their ROI. 

Provide data reliability in Amazon Redshift at scale using Great Expectations library

This post discusses a solution for running data reliability checks before loading the data into a target table in Amazon Redshift using the open-source library Great Expectations. You can automate the data checks with PySpark using the extensive built-in Great Expectations glossary of rules, and you can add or create customized rules for your use case.
