IT data analytics Archives - Page 24 of 27 - Global Intelligence and Insight Platform: IT Innovation, ETF Investment, plus Health Wellbeing

Enabling Hadoop Migration to Azure

March 15, 2022 GeneAka

There are many reasons why customers consider migrating their existing on-premises big data workloads to Azure.

Build a serverless pipeline to analyze streaming data using AWS Glue, Apache Hudi, and Amazon S3

March 15, 2022 GeneAka

To deliver on these requirements, organizations have to build custom frameworks to handle in-place updates (also referred to as upserts), handle small files created by the continuous ingestion of changes from upstream systems (such as databases), handle schema evolution, and compromise on providing ACID guarantees on their data lake.

To consume this streaming data, we set up an AWS Glue streaming ETL job that uses the Apache Hudi Connector for AWS Glue to write ingested and transformed data to Amazon S3, and also creates a table in the AWS Glue Data Catalog.
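
To make the idea of an upsert concrete, here is a minimal, framework-free sketch in Python. It is not how Apache Hudi or AWS Glue implement upserts; the record fields (`id`, `updated_at`, `value`) and the function name are assumptions made purely for illustration.

```python
from typing import Dict, Iterable


def upsert(table: Dict[str, dict], changes: Iterable[dict],
           key: str = "id", ts: str = "updated_at") -> Dict[str, dict]:
    """Apply change records to a keyed table, keeping the newest version per key."""
    for record in changes:
        k = record[key]
        current = table.get(k)
        # Insert unseen keys; overwrite existing keys only if the change is not older.
        if current is None or record[ts] >= current[ts]:
            table[k] = record
    return table


# Hypothetical data: one existing row, then an update, an insert, and a stale change.
table = {"a": {"id": "a", "updated_at": 1, "value": 10}}
changes = [
    {"id": "a", "updated_at": 2, "value": 11},  # update to an existing key
    {"id": "b", "updated_at": 1, "value": 20},  # insert of a new key
    {"id": "a", "updated_at": 0, "value": 9},   # late, stale change: ignored
]
result = upsert(table, changes)
```

A table format such as Hudi applies the same idea to files in object storage, which is why small-file handling and transactional guarantees become part of the problem.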

Make data available for analysis in seconds with Upsolver low-code data pipelines, Amazon Redshift Streaming Ingestion, and Amazon Redshift Serverless

March 15, 2022 GeneAka

Upsolver is an AWS Advanced Technology Partner that enables you to ingest data from a wide range of sources, transform it, and load the results into your target of choice, such as Kinesis Data Streams and Amazon Redshift.

Build a real-time recommendation API on Azure

March 15, 2022 GeneAka

This reference architecture shows how to train a recommendation model using Azure Databricks and deploy it as an API by using Azure Cosmos DB, Azure Machine Learning, and Azure Kubernetes Service (AKS).

Automate Amazon Redshift load testing with the AWS Analytics Automation Toolkit

March 15, 2022 GeneAka

In this post, we demonstrate the use of the AWS Analytics Automation Toolkit for JMeter load tests on cloud benchmark data, using Amazon Redshift as a target environment.

To use the AWS Analytics Automation Toolkit to run a JMeter load test, deploy the toolkit with the JMeter option, load data into your Amazon Redshift cluster, and customize the default test plan as you see fit.

Audit AWS service events with Amazon EventBridge and Amazon Kinesis Data Firehose

March 8, 2022 GeneAka

In this post, we provide a working example of AWS service-generated events ingested to Amazon S3. To make sure some service events are available in the default event bus, we use Parameter Store, a capability of AWS Systems Manager, to store new parameters manually. This action generates a new event, which is ingested by the following pipeline.
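
As a rough illustration of what such an event looks like to a rule, the sketch below defines an EventBridge-style event pattern for Parameter Store changes and a simplified matcher. The pattern is an assumption (the post's own rule may capture all service events), the matcher handles only exact matches on top-level fields, and the sample event's `detail` contents are hypothetical.

```python
import json

# Assumed pattern: match only Parameter Store change events emitted by Systems Manager.
pattern = {
    "source": ["aws.ssm"],
    "detail-type": ["Parameter Store Change"],
}


def matches(event: dict, pattern: dict) -> bool:
    """Simplified matcher: every pattern field must equal one of its allowed values."""
    return all(event.get(field) in allowed for field, allowed in pattern.items())


# Hypothetical event, shaped like the one produced by creating a parameter.
sample_event = {
    "source": "aws.ssm",
    "detail-type": "Parameter Store Change",
    "detail": {"operation": "Create", "name": "/demo/param"},
}

unrelated_event = {"source": "aws.ec2", "detail-type": "EC2 Instance State-change Notification"}

# The pattern serialized as JSON, the form in which event patterns are written.
pattern_json = json.dumps(pattern)
```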

How the Georgia Data Analytics Center built a cloud analytics solution from scratch with the AWS Data Lab

March 8, 2022 GeneAka

In this post, we share how GDAC created an analytics platform from scratch using AWS services and how GDAC collaborated with the AWS Data Lab to accelerate this project from design to build in record time.

Create a low-latency source-to-data lake pipeline using Amazon MSK Connect, Apache Flink, and Apache Hudi

March 8, 2022 GeneAka

This lab walks you through the steps to set up the stack for replicating an Aurora database, salesdb, to an Amazon Managed Streaming for Apache Kafka (Amazon MSK) cluster, using Amazon MSK Connect with a Debezium MySQL source connector for Kafka.

An Introduction and Tutorial for Azure Cosmos DB

March 8, 2022 GeneAka

If your organisation has to manage, process and query a great deal of important but short-lived information that is created sporadically and must be reported speedily, then a service like Cosmos DB is ideal.

Automate your Data Extraction for Oil Well Data with Amazon Textract

March 1, 2022 GeneAka

This data is indexed and populated into Amazon OpenSearch Service to search and visualize it in a Kibana dashboard.

Figure 1 illustrates a solution built with AWS, which extracts O&G well data information from PDF documents.

Enhance resiliency with admission control in Amazon OpenSearch Service (successor to Amazon Elasticsearch Service)

March 1, 2022 GeneAka

Admission control in Amazon OpenSearch Service enhances the overall resiliency of OpenSearch clusters by limiting new incoming requests early, at the REST layer, when a node is stressed.
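
The general idea of rejecting work at the front door can be sketched in a few lines of Python. This is a toy model, not OpenSearch's implementation: the `NodeStats` fields, the thresholds, and the use of HTTP status 429 for rejections are all assumptions chosen for illustration.

```python
from dataclasses import dataclass
from typing import Callable, Tuple


@dataclass
class NodeStats:
    cpu_percent: float   # assumed stress signal
    heap_percent: float  # assumed stress signal


def admit(stats: NodeStats, cpu_limit: float = 90.0, heap_limit: float = 85.0) -> bool:
    """Admit a request only while both stress signals are under their limits."""
    return stats.cpu_percent < cpu_limit and stats.heap_percent < heap_limit


def handle_request(stats: NodeStats, work: Callable[[], str]) -> Tuple[int, str]:
    # Check admission before doing any work, so a stressed node sheds load cheaply.
    if not admit(stats):
        return 429, "rejected: node under stress"
    return 200, work()


healthy = handle_request(NodeStats(cpu_percent=40.0, heap_percent=50.0), lambda: "ok")
stressed = handle_request(NodeStats(cpu_percent=95.0, heap_percent=50.0), lambda: "ok")
```

Rejecting before the request is parsed and dispatched is what keeps the cost of a rejection low compared with failing later in the request path.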

How Gemini Built a Cryptocurrency Analytics Platform Using Lakehouse for Financial Services

February 23, 2022 GeneAka

With the sheer volume of historical and live data feeds being ingested, and the need for a scalable compute platform for backtesting and spread calculations, our team needed a performant single source of truth to build the application dashboards.

As the data sets would be leveraged by machine learning and analyst teams, the Delta Lake format provided unique capabilities for managing high volume market/tick data — these features were key in developing the Gemini Lakehouse platform:

Export JSON data to Amazon S3 using Amazon Redshift UNLOAD

February 22, 2022 GeneAka

Example 1 – Unload customer data in JSON format into Amazon S3, partitioning output files into partition folders, following the Apache Hive convention, with customer birth month as the partition key.

Example 3 – Unload line item data (with a SUPER column) in JSON format into Amazon S3, partitioning output files into partition folders, following the Apache Hive convention, with customer key as the partition key.
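
A partitioned JSON unload of this kind can be sketched by assembling the `UNLOAD` statement in Python. The table, column, bucket, and IAM role below are hypothetical placeholders, not values from the original post; only the `UNLOAD ... TO ... IAM_ROLE ... FORMAT JSON PARTITION BY (...)` clause structure is taken from Redshift's documented syntax.

```python
def build_unload(query: str, s3_prefix: str, iam_role: str, partition_column: str) -> str:
    """Build a Redshift UNLOAD statement that writes JSON files partitioned by one column."""
    # Single quotes inside the quoted query must be doubled.
    escaped = query.replace("'", "''")
    return (
        f"UNLOAD ('{escaped}')\n"
        f"TO '{s3_prefix}'\n"
        f"IAM_ROLE '{iam_role}'\n"
        f"FORMAT JSON\n"
        f"PARTITION BY ({partition_column});"
    )


# Hypothetical names throughout: table, columns, bucket, and role ARN are placeholders.
sql = build_unload(
    query="SELECT c_custkey, c_name, c_birth_month FROM customer WHERE c_country = 'US'",
    s3_prefix="s3://example-bucket/customer/",
    iam_role="arn:aws:iam::123456789012:role/ExampleRedshiftRole",
    partition_column="c_birth_month",
)
print(sql)
```

With `PARTITION BY`, the output lands under Hive-style folders named `column=value` beneath the target prefix, for example `c_birth_month=1/`.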

Enable users to ask questions about data using natural language within your applications by embedding Amazon QuickSight Q

February 22, 2022 GeneAka

Each user who accesses the Q search bar assumes a role that gives them QuickSight permissions to retrieve a Q-embedded URL.
