A data lake is an environment where a vast amount of data, of various types and structures, can be ingested, stored, assessed, and analyzed. Data lake technologies can scale to massive volumes of data, and combining datasets is easy with data stored in a relatively raw form.
A data lake architecture can centralize data over distributed storage, providing a scalable, fast, secure, and economical solution.
Data lakes serve many purposes, including:
An environment for data scientists to mine and analyze vast amounts of raw, structured, and unstructured data
A central storage area for raw data, with minimal (if any) transformation
Store different types of data in their original formats until they need to be structured and analyzed
Image used under license from Shutterstock.com
We use technologies such as cookies to understand how you use our site and to provide a better user experience.
This includes personalizing content, using analytics and improving site operations.
You can change your cookie settings as described here at any time, but parts of our site may not function correctly without them.
By continuing to use our site, you agree that we can save cookies on your device, unless you have disabled cookies.