Data lake

In my view –

A data lake is a conceptual term for the enterprise technology capability and toolset used to ingest data of any variety and volume, store it, and derive business insights from it.

If a data lake is left unmanaged, it is called a data swamp.
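To make the lake versus swamp distinction concrete, here is a minimal sketch in Python. It is only an illustration under my own assumptions, not how any vendor implements a data lake: a local folder stands in for blob storage, and the `ingest` and `find` helpers, the `raw/<source>/` layout and the `catalog.jsonl` file are all hypothetical names I made up. The idea is that data is stored as-is in any format, while a small catalog records what was stored. Without that catalog you would just have a pile of files, which is the swamp.

```python
import hashlib
import json
import tempfile
from datetime import datetime, timezone
from pathlib import Path


def ingest(lake_root: Path, source: str, name: str, payload: bytes, fmt: str) -> dict:
    """Store raw bytes unchanged and record a catalog entry describing them."""
    raw_dir = lake_root / "raw" / source  # hypothetical layout: raw/<source>/<file>
    raw_dir.mkdir(parents=True, exist_ok=True)
    path = raw_dir / name
    path.write_bytes(payload)  # data is kept in its original form

    entry = {
        "path": path.relative_to(lake_root).as_posix(),
        "source": source,
        "format": fmt,
        "bytes": len(payload),
        "sha256": hashlib.sha256(payload).hexdigest(),
        "ingested_at": datetime.now(timezone.utc).isoformat(),
    }
    # Append-only catalog, one JSON object per line (hypothetical file name).
    with (lake_root / "catalog.jsonl").open("a", encoding="utf-8") as f:
        f.write(json.dumps(entry) + "\n")
    return entry


def find(lake_root: Path, **filters) -> list[dict]:
    """Return catalog entries whose fields match all of the given filters."""
    catalog = lake_root / "catalog.jsonl"
    if not catalog.exists():
        return []  # no catalog means nothing is discoverable: a swamp
    lines = catalog.read_text(encoding="utf-8").splitlines()
    entries = [json.loads(line) for line in lines if line]
    return [e for e in entries if all(e.get(k) == v for k, v in filters.items())]


def demo() -> list[dict]:
    """Ingest two files of different formats, then look one up via the catalog."""
    with tempfile.TemporaryDirectory() as tmp:
        lake = Path(tmp)
        ingest(lake, "crm", "customers.csv", b"id,name\n1,Asha\n", "csv")
        ingest(lake, "web", "clicks.json", b'{"page": "/home"}', "json")
        return find(lake, format="csv")


if __name__ == "__main__":
    print(demo())
```

Running `demo()` returns only the catalog entry for the CSV file. In a real setup the folder would be replaced by object storage and the JSON lines file by a proper metadata catalog, but the principle of pairing stored data with metadata stays the same.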

It usually comprises big data capabilities such as blob storage, distributed storage, batch and stream processing, data analytics, machine learning and visualisation.

AWS's take on data lake – https://aws.amazon.com/big-data/datalakes-and-analytics/what-is-a-data-lake/

Azure on data lake – https://azure.microsoft.com/en-in/solutions/data-lake/

GCP on data lake – https://cloud.google.com/solutions/build-a-data-lake-on-gcp
