genesis-aka.net
Monitor data quality in your data lake using PyDeequ and AWS Glue
In our previous post, we introduced PyDeequ, an open-source Python wrapper over Deequ, which enables you to write unit tests on your data to ensure data quality. The use case we ran through was on static, historical data, but most datasets are dynamic, so how can you quantify how your data is changing and detect anomalous changes over time?
GeneAka