Tired of running your roadmap from a spreadsheet?Book a demo
Back to glossaryDefinition

What is Data Lake?

A centralized repository that stores raw data in its native format at any scale.

A data lake is a centralized storage repository that holds vast amounts of raw data in its native format (structured, semi-structured, and unstructured) until needed for analysis. Unlike data warehouses that require schema-on-write, data lakes use schema-on-read, offering flexibility in how data is processed. They support diverse workloads including batch processing, real-time analytics, and machine learning. However, without proper governance, data lakes can become 'data swamps' where data is disorganized and unusable.

Put this into practice

Assess your maturity, discover initiatives, and build your transformation roadmap.

Book a demo