Thoughts on the potential of Hadoop to replace the Architected Data Warehouse

Post  MilesWis on Mon Mar 30, 2015 8:36 am

We all know that a well-designed and well-architected data warehouse is easy for business users to navigate, contains integrated data across all aspects of the enterprise, holds clean and logically consistent data, and performs well for the exploratory data analysis that business executives need in order to base their forward-looking strategies on a sound data foundation.

But creating this well-designed and well-architected data warehouse takes talent. There are a lot of charlatans in this space who claim to have data warehouse and dimensional design expertise but who really don't have a clue. As a result, we still see many failed, or less than successful, DW/BI efforts. Has the emergence of Hadoop signaled a pathway to a less technically challenging, lower-operating-cost solution for the integration of enterprise data, one that can ultimately replace the Architected Data Warehouse?

Will the data warehouse of the future have a totally different structure? Will we see data sources simply dumped into Hadoop clusters, with cross-reference bridge tables then populated to achieve cross-system integration? The massive parallelism of the Hadoop environment would then provide acceptable performance for exploratory data analysis.
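To make the bridge-table idea concrete, here is a minimal sketch of what that integration step might look like. It uses an in-memory SQLite database purely as a stand-in for whatever SQL layer sits over the cluster, and every table and column name (`crm_customer`, `billing_invoice`, `customer_xref`) is hypothetical, invented for illustration only.

```python
import sqlite3

def integrate(crm_rows, billing_rows, xref_rows):
    """Join two raw source extracts through a cross-reference (bridge) table.

    All table and column names are hypothetical; SQLite stands in for the
    SQL engine that would sit over the raw data.
    """
    con = sqlite3.connect(":memory:")
    con.executescript("""
        CREATE TABLE crm_customer (crm_id TEXT, name TEXT);
        CREATE TABLE billing_invoice (acct_no TEXT, amount REAL);
        CREATE TABLE customer_xref (crm_id TEXT, acct_no TEXT);
    """)
    con.executemany("INSERT INTO crm_customer VALUES (?, ?)", crm_rows)
    con.executemany("INSERT INTO billing_invoice VALUES (?, ?)", billing_rows)
    con.executemany("INSERT INTO customer_xref VALUES (?, ?)", xref_rows)
    # The bridge table maps each system's own key onto the other system's key.
    result = con.execute("""
        SELECT c.name, SUM(b.amount)
        FROM crm_customer c
        JOIN customer_xref x ON x.crm_id = c.crm_id
        JOIN billing_invoice b ON b.acct_no = x.acct_no
        GROUP BY c.name
        ORDER BY c.name
    """).fetchall()
    con.close()
    return result
```

Note that the join itself is the easy part. Someone still has to populate and maintain the cross-reference rows, and that is where the design expertise comes in.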

This seems to be too simplistic to me. But this is the message that I am getting from the Hadoop crowd.

Thoughts?

Re: Thoughts on the potential of Hadoop to replace the Architected Data Warehouse

Post  TheNJDevil on Mon Mar 30, 2015 9:44 am

Hadoop replace a traditional, dimensionally modeled DW? No way.

What Hadoop can offer is a place to fail cheaply and to prove cheaply what does and doesn't work with each data set or business process. Using Hadoop as a proving ground, and then bringing only proven solutions into the data warehouse, will significantly decrease DW failure rates.

But why not?

Post  MilesWis on Mon Mar 30, 2015 10:03 am

Is there something about the Hadoop environment or architecture that makes it inherently unsuitable for the type of exploratory data analysis that typically informs business strategy? Yes, it comes from an unstructured data background, but the advocates of Hadoop are asserting that it performs effectively for structured and metric data as well. If that is the case, can it lower the expertise level demanded of the technical design and development staff?

Re: Thoughts on the potential of Hadoop to replace the Architected Data Warehouse

Post  ngalemmo on Mon Mar 30, 2015 10:47 am

Hadoop is not a replacement for a traditional data warehouse.

1. Hadoop is very inefficient for processing structured data. It needs to repeatedly pack and unpack structured data before it can process it. It handles fault tolerance through redundant processing, effectively utilizing 25-33% of the available capacity to do actual work. It is not as cost efficient as proponents would want you to believe.

2. Hadoop is still pretty immature.  While it has been around for a long time, there isn't much out there that frees you from having to write Java to do much of anything.  The Hadoop environment is anything but 'ad-hoc'.

3. 90+% of all data warehouses simply don't have the kind of data volumes Hadoop was intended to handle.  And, as storage and processor technologies continue to advance, there isn't any reason to expect that this ratio will change very much.  There simply isn't any valid reason or advantage to switch.

4. There are many superior alternatives to Hadoop for storing and querying huge volumes of structured data. And the long-term cost advantage of Hadoop is very small, if it exists at all.
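One way to read the 25-33% figure in point 1 is as simple redundancy arithmetic: if every unit of work or data is kept in N copies, only 1/N of the raw capacity does distinct work. That reading is an assumption, not something spelled out above, and the function name below is hypothetical. A rough sketch:

```python
def usable_fraction(copies: int) -> float:
    """Fraction of raw capacity doing distinct work when everything is kept
    in `copies` redundant copies (assumed model, for illustration only)."""
    if copies < 1:
        raise ValueError("copies must be at least 1")
    return 1.0 / copies

# Three copies leaves about a third of capacity; four copies leaves a quarter.
for n in (3, 4):
    print(f"{n} copies -> {usable_fraction(n):.0%} usable")
```

Under this model, three or four copies bracket the quoted range, which is worth keeping in mind when comparing per-terabyte hardware costs.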
Re: Thoughts on the potential of Hadoop to replace the Architected Data Warehouse

Post  ron.dunn on Mon Mar 30, 2015 10:40 pm

I'm not as pessimistic as previous commenters.

(1) Every major vendor has a SQL + Hadoop play.

(2) Initiatives like Impala (Cloudera) and Stinger (Hortonworks) create a true SQL engine layer over the top of a Hadoop data store. Coming releases of these products should make SQL-based ETL processing possible.

(3) SQL on Hadoop is an interesting play for cloud-based data warehouses.

(4) Hadoop makes a great place for persisting load/stage data that doesn't make it through to the Star Schema.

(5) Like it or not, the coming wave of IoT data is likely to hit Hadoop before it gets anywhere near a data warehouse. Hadoop as a data source is already feasible, and will become commonplace in a few short years.

Ralph Kimball did an interesting video for Cloudera last year:

http://www.cloudera.com/content/cloudera/en/resources/library/recordedwebinar/building-a-hadoop-data-warehouse-video.html

Great video by Ralph on Hadoop for the Data Warehouse

Post  MilesWis on Tue Mar 31, 2015 8:53 am

Thanks for the link. This is fabulous. It is exactly the sort of thing, from a true industry icon, that I was hoping to find.

Re: Thoughts on the potential of Hadoop to replace the Architected Data Warehouse

Post  TheNJDevil on Tue Mar 31, 2015 9:04 am

I was in a class last month taught by Ralph Kimball, and he was very clear when he said that anyone who has not included, or is not strongly considering including, Hadoop in their overall BI architecture is being foolish. His slide showed Hadoop as a pre-warehouse work area that only specialized users could access. He also said that Hadoop will not replace the EDW, only provide a better sandbox that gives analysts (the fabled data scientists) the access they need to create value quickly. That value is created first in the data scientist's project space and then moved into the EDW, where the rest of the company can use the results.


Re: Thoughts on the potential of Hadoop to replace the Architected Data Warehouse

Post  ngalemmo on Tue Mar 31, 2015 10:50 am

I would not say I was being pessimistic.

There is a tendency in our business that whenever anything new comes along, the hype cycle begins, and the technology of the day is supposedly going to replace just about everything and anything. It simply doesn't happen.

Hadoop is and always will be a processing technology. Data storage was an afterthought, and none of it is particularly well suited for a data warehouse. The basic storage methods focus on the 'F' word… files. It is about as old school as you can get.

Hadoop has a place, but it is a complementary technology that can supplement a data warehouse, not replace it.
Re: Thoughts on the potential of Hadoop to replace the Architected Data Warehouse

Post  BoxesAndLines on Tue Mar 31, 2015 1:58 pm

When you start seeing Wall Street financial reports coming out of Hadoop clusters, you might start to worry about your EDW based on relational databases. Until then, the threat of jail time for CXOs due to SOX compliance violations in financial reporting will keep the traditional warehouse in charge. Close just isn't good enough for publicly traded companies.