Great Expectations is an open-source Python framework for evaluating and monitoring data quality. It integrates quickly with most tools on the market because it is compatible with a wide range of data sources and orchestration tools. The community also actively contributes packages that provide predefined quality checks.
Great Expectations can connect to major database technologies (such as PostgreSQL and BigQuery) and storage solutions (such as AWS S3 and Google Cloud Storage). It also offers a logging and alerting system for quality check results, which gives better visibility into the state of data quality.
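To make the idea of a logged quality check concrete, here is a minimal sketch in plain Python. It does not use the Great Expectations API: `CheckResult`, `run_check`, and the sample `orders` rows are hypothetical names invented for illustration, and the sketch only assumes that a check boils down to a predicate applied to a column, with the outcome logged.

```python
import logging
from dataclasses import dataclass
from typing import Any, Callable, Iterable, Mapping

logging.basicConfig(level=logging.INFO)
logger = logging.getLogger("quality")


@dataclass
class CheckResult:
    """Outcome of one quality check (hypothetical structure, not a GX class)."""
    name: str
    success: bool
    unexpected_count: int


def run_check(
    name: str,
    rows: Iterable[Mapping[str, Any]],
    column: str,
    predicate: Callable[[Any], bool],
) -> CheckResult:
    """Apply `predicate` to `column` in every row and log the outcome."""
    unexpected = [row for row in rows if not predicate(row.get(column))]
    result = CheckResult(name, not unexpected, len(unexpected))
    if result.success:
        logger.info("check %s passed", name)
    else:
        # A real pipeline would route this to an alerting channel.
        logger.warning("check %s failed on %d row(s)", name, result.unexpected_count)
    return result


# Illustrative data only.
orders = [
    {"order_id": 1, "amount": 25.0},
    {"order_id": 2, "amount": None},
    {"order_id": 3, "amount": -4.0},
]

not_null = run_check("amount_not_null", orders, "amount", lambda v: v is not None)
non_negative = run_check(
    "amount_non_negative", orders, "amount", lambda v: v is None or v >= 0
)
```

With this sample data both checks fail, each on one row, and each failure is logged as a warning.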
However, Great Expectations has several significant limitations. Its learning curve is steep because its core concepts are complex. Creating custom quality checks is often difficult and unintuitive, and the documentation is not thorough enough to ease adoption. The tool also lacks robust features for cross-table validation, which limits its applicability. Finally, its quality reports offer a poor user experience, which makes it difficult to track quality trends over time.
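For readers unfamiliar with the term, a cross-table validation is a check whose result depends on more than one table, such as referential integrity. The sketch below shows one such check written as plain SQL against an in-memory SQLite database, using only the Python standard library. The `customers` and `orders` tables and the `find_orphan_orders` helper are hypothetical, and this is not how Great Expectations or Elementary implement such checks.

```python
import sqlite3

# In-memory database with two illustrative tables.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE customers (customer_id INTEGER PRIMARY KEY);
    CREATE TABLE orders (order_id INTEGER PRIMARY KEY, customer_id INTEGER);
    INSERT INTO customers VALUES (1), (2);
    INSERT INTO orders VALUES (10, 1), (11, 2), (12, 3);
    """
)


def find_orphan_orders(connection: sqlite3.Connection) -> list[int]:
    """Return the ids of orders whose customer_id has no match in customers."""
    query = """
        SELECT o.order_id
        FROM orders AS o
        LEFT JOIN customers AS c ON c.customer_id = o.customer_id
        WHERE c.customer_id IS NULL
        ORDER BY o.order_id
    """
    return [row[0] for row in connection.execute(query)]


orphans = find_orphan_orders(conn)
if orphans:
    print(f"Referential integrity check failed for orders: {orphans}")
```

Here order 12 points to a customer that does not exist, so the check reports it. A single-table check on `orders` alone could not detect this.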
Theodo’s point of view
We do not recommend using Great Expectations due to its limitations, which make it difficult to adopt and scale effectively. For data quality monitoring, we suggest considering alternatives like Elementary, particularly if your pipeline relies on dbt.