Delta Lake — An Overview
This article covers the fundamentals of Delta Lake: definition, benefits, core components, limitations and future roadmap
What is Delta Lake?
Delta Lake is one of the key products offered by Databricks, the “Data + AI” company founded in 2013 by the original creators of Apache Spark™.
By definition, Delta Lake is an open-format storage layer that delivers reliability, security and performance on a data lake, for both streaming and batch operations. By replacing data silos with a single home for structured, semi-structured and unstructured data, Delta Lake is the foundation of a cost-effective, highly scalable lakehouse. (If you are new to the lakehouse architecture, here is a good primer I captured a few days back.)
Another important aspect of Delta Lake is that it is an open-source technology, available under the Apache License 2.0.
Current Challenges
In the current state, both data lakes and data warehouses play significant roles in meeting an organization’s analytics needs.