Data Foundation – A GCP (Google Cloud Platform) Approach
Legacy data warehouses were designed mainly to capture structured data for descriptive BI reporting.
The trend then shifted toward building data lakes that capture every aspect of business operations.
As data volumes grew from terabytes to petabytes and even exabytes, it became clear that storage and compute need to be separated, and that a modernized data warehouse is needed to extract insights from all of the data.
Cloud Storage – Data lake that stores data from all aspects of your business.
BigQuery – Modernized data warehouse that collects structured, semi-structured, and unstructured data for data mining. It provides capabilities such as scalability, performance, and security.
https://cloud.google.com/solutions/build-a-data-lake-on-gcp
https://cloud.google.com/solutions/bigquery-data-warehouse
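One way to picture the split between storage (Cloud Storage) and compute (BigQuery) is an external table: the files stay in the data lake and the warehouse queries them in place. The sketch below only builds the BigQuery DDL string and does not call any Google Cloud API. The dataset, table, and bucket names are hypothetical and chosen purely for illustration.

```python
def external_table_ddl(dataset: str, table: str, uri_pattern: str, fmt: str = "PARQUET") -> str:
    """Build a BigQuery CREATE EXTERNAL TABLE statement over Cloud Storage files.

    The data itself is not copied: the table definition only points at the
    objects matching uri_pattern in the data lake.
    """
    return (
        f"CREATE EXTERNAL TABLE `{dataset}.{table}`\n"
        f"OPTIONS (format = '{fmt}', uris = ['{uri_pattern}']);"
    )


# Hypothetical dataset, table, and bucket names (assumptions, not from the article).
ddl = external_table_ddl(
    dataset="sales_dw",
    table="orders_raw",
    uri_pattern="gs://example-data-lake/orders/*.parquet",
)
print(ddl)
```

The generated statement could then be run in the BigQuery console or through a client library, assuming the dataset and the bucket already exist.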
Data Foundation Framework
Data Foundation: Metadata – What It Is and Why It Matters
Digital transformation requires us to understand data and use it in innovative and efficient ways. A metadata foundation is the key component for improving the efficiency of such data-driven initiatives.
Data Catalog is a tool (and an approach) for assessing and governing metadata.
For such tools to be effective, a data foundation and management strategy needs to be in place, along with data management processes and a governance framework.
Use Cases
Key Features
- Serverless
- Metadata as a service
- Central Catalog
- Simplifies data discovery at any scale
- Offers a unified view of all datasets
- Provides a foundation for data governance
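
To make the "central catalog" and "data discovery" ideas concrete, here is a minimal in-memory sketch of what a metadata catalog does: it registers entries from different systems, attaches tags, and lets users search across all of them. This is not the Data Catalog API. The class names, fields, and tag keys (`Entry`, `Catalog`, `pii`, `owner`) are assumptions made up for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Entry:
    """Metadata about one data asset (hypothetical structure)."""
    name: str
    system: str                      # e.g. "bigquery" or "cloud_storage"
    description: str = ""
    tags: dict = field(default_factory=dict)


class Catalog:
    """A toy central catalog: one place to register and search metadata."""

    def __init__(self):
        self._entries = {}

    def register(self, entry: Entry) -> None:
        # Key by (system, name) so the same name can exist in two systems.
        self._entries[(entry.system, entry.name)] = entry

    def search(self, text: str = "", **tags) -> list:
        """Return entries whose name or description contains text
        and whose tags match every given key/value pair."""
        text = text.lower()
        hits = []
        for entry in self._entries.values():
            haystack = f"{entry.name} {entry.description}".lower()
            if text in haystack and all(entry.tags.get(k) == v for k, v in tags.items()):
                hits.append(entry)
        return sorted(hits, key=lambda e: (e.system, e.name))


catalog = Catalog()
catalog.register(Entry("orders", "bigquery", "Customer orders", {"pii": "no", "owner": "sales"}))
catalog.register(Entry("customers", "bigquery", "Customer master records", {"pii": "yes", "owner": "crm"}))
catalog.register(Entry("clickstream/", "cloud_storage", "Raw web click logs", {"pii": "no", "owner": "web"}))

# Discovery by keyword, and a governance-style query by tag.
customer_assets = [e.name for e in catalog.search("customer")]
pii_assets = [e.name for e in catalog.search(pii="yes")]
print(customer_assets, pii_assets)
```

The tag-based query hints at why a catalog is a foundation for governance: once assets are tagged consistently, questions such as "which datasets hold personal data?" become a simple lookup.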
https://cloud.google.com/data-catalog
https://cloud.google.com/data-catalog#section-4
Data Foundation: Master Data Management
Click here for the video session of this blog.
GCP Reference Architecture – Data Warehouse and Master Data Management
A data foundation is key, and a great strategy to add to your digital transformation. Please let us know your thoughts in the blog comments.