Doing more with less: Moving from transactional to stateful batch processing
Our new architecture needed to address these deficiencies while preserving the core goal of our service: updating stateful artifacts based on incoming financial events.
We used an Apache Spark application on a long-running Amazon EMR cluster to simultaneously ingest the input batch data and perform reduce operations, producing the stateless artifacts and a corresponding index file for the stateful processing step to consume.
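To make the shape of this batch step concrete, here is a minimal PySpark sketch, not the actual production job. The bucket paths, column names (account_id, amount, event_time), and aggregation logic are illustrative assumptions, and the index format shown here is only one possible way to point the stateful processing step at the reduced output.

```python
# Minimal sketch of a Spark batch step that reduces incoming events into
# artifacts plus an index file. All paths and columns are hypothetical.
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("artifact-batch-reducer").getOrCreate()

# Ingest the incoming batch of financial events (path is illustrative).
events = spark.read.parquet("s3://example-bucket/incoming-events/")

# Reduce the events per account into a single artifact record.
artifacts = (
    events
    .groupBy("account_id")
    .agg(
        F.sum("amount").alias("balance"),
        F.max("event_time").alias("last_event_time"),
    )
)

# Write the reduced artifacts, partitioned so downstream stateful
# processing can locate each account's data directly.
artifacts.write.mode("overwrite").partitionBy("account_id").parquet(
    "s3://example-bucket/artifacts/"
)

# Emit a lightweight index mapping each account to its artifact partition,
# for the stateful processing step to use as a lookup.
index = artifacts.select("account_id").withColumn(
    "partition_path",
    F.concat(
        F.lit("s3://example-bucket/artifacts/account_id="),
        F.col("account_id"),
    ),
)
index.coalesce(1).write.mode("overwrite").json("s3://example-bucket/artifact-index/")
```

Running this as a step on a long-running EMR cluster lets the same Spark application handle both ingestion and reduction in one pass, which is the property the paragraph above relies on.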