Data Lake versus Data Warehouse

11/11/20, 9:10 am

A data warehouse doesn’t need any introduction to those belonging to the world of big data. But ever since the recent addition of a fresh aspect into the big data storage segment, data lakes, many organisations have been left puzzled and confused.

Whilst a data warehouse and data lake might sound similar, there are major differences to each approach. In this post, we highlight the key distinctions between the two as they both serve different purposes and require different sets of acumen to be effectively utilised.

Let’s begin with defining the two!

Data Warehouse

A data warehouse is a high-capacity repository system that typically stores only structured data from disparate sources. The data is processed and formatted for the purpose of comparison and analysis to gather business intelligence.

Data Lake

A data lake is a next-generation hybrid data repository solution. It is optimised for quick ingestion of structured, semi-structured and unstructured data from a wide variety of data sources, plus on-the-fly processing of such data for exploration, analytics and operations.

Key Differences between a Data Lake and Data Warehouse

Data Lake

The whole spectrum from raw (unprocessed, native format), unstructured to structured

Data Types

Data Warehouse

Structured, cleaned, curated and processed

Data Lake

Any source including text and images from social networks, measurements from IoT devices (e.g. light, heat, motion) and more structured data from ERP and CRM systems

Data Sources

Data Warehouse

Transactional systems (e.g. point-of-sale), operational databases (e.g. ERP systems) and applications with historical and relational structured data

Data Lake

Structure is developed when data is used

Data Processing

Data Warehouse

Data must have structure prior to use

Data Lake

Retained for an unlimited amount of time

Data Retention

Data Warehouse

Data is purged periodically; only stores data relevant to analysis

Differences in data use and application

insights-blog-data-lakes-vs-data-warehouse-content-image-nec-005.png

Data Lake or Data Warehouse: What does your Business Need?

A data warehouse consolidates data from multiple databases in order to gain insights, archive structured data as well as for carrying out repeatable processes such as extracting month-on-month sales and to create website reports. It typically allows organisations to meet their day-to-day reporting needs as well as a good level of analysis.

According to projections from IDC, 80% of worldwide data will be unstructured by 2025. Unstructured data typically makes up around 70-80% of an organisation’s total data volume, including email, social media, blogs, videos content and others.

A data lake consolidates both structured and unstructured data from multiple sources and provides the ability to extend the analysis across a much larger data set. It allows organisations access to a larger set of data in order to do a more comprehensive analysis and gather deeper insights in order to react quickly when new business issues are identified, to discover patterns and trends and gain meaningful business insights.

Whether a business needs to utilise a data warehouse or a data lake depends on their goals. However, it is worth nothing that a significant chunk of an organisation's data is unstructured and without proper applications to utilise this data it limits the insights that an organisation can generate.

Download whitepaper

Raja Balakrishnan
National Solutions Manager
Raja.Balakrishnan@nec.com.au

AVIATION

Touchless touchpoints

DEFENCE

Securing our borders

EDUCATION

Customer success

ENTERTAINMENT VENUES AND STADIUMS

Partnership Announcement

GOVERNMENT AND PUBLIC ADMINISTRATION

AnyCase: Case Management built for Government

HEALTH

Customer success

JUSTICE AND CORRECTIONS

Customer success

PUBLIC SAFETY & TRIPLE ZERO (000)

Assistive Document Redaction

TRANSPORTATION

Bus Solutions

TRANSPORTATION

The transport future

GOVERNMENT AND PUBLIC ADMINISTRATION

AnyCase: Case Management built for Government

Biometrics and Digital Identity

BIOMETRICS AND DIGITAL IDENTITY

Touchless touchpoints

Contact Centre and CX

CONTACT CENTRE AND CX

Knowledge Management

Data and Applications

DATA AND APPLICATIONS

AnyCase: Case Management built for Government

Hybrid Cloud

HYBRID CLOUD

NEC Cloud Transformation

Professional Services

PROFESSIONAL SERVICES

Access NEC’s technical SMEs

Secure Networks

SECURE NETWORKS

Secure Access Service Edge (SASE)

Service Management

SERVICE MANAGEMENT

Service Integration and Management (SIAM)

Technology Advisory Services

QUANTUM COMPUTING

Quantum Consultancy

Unified Communications

Cloud Calling & UCaaS

Enterprise Telephony (On-Prem or Hybrid)

Microsoft-Centric Communications

UNIFIED COMMUNICATIONS

UNIVERGE BLUE

CYBER SECURITY

Managed SIEM + XDR

EBOOK

Beyond Policy & Platforms

EBOOK

Beyond Policy & Platforms

NEC Australia

Responsible Business

ABOUT

NEC Australia

Channel Partners

CHANNEL PARTNERS

Unified Communications

Corporate Partners

CORPORATE PARTNERS

Sydney Living Museums

INDUSTRY PARTNERS

Established and emerging ICT brands

CORPORATE PARTNERS

Sydney Living Museums

CAREERS

Become part of the team!