Modeling tool with support for hierarchies

View previous topic View next topic Go down

Modeling tool with support for hierarchies

Post  Lumbago on Mon Sep 05, 2011 3:45 am

Hi,

this is my first post to this forum so please be nice! hehe

I'm at the beginning of a fairly large DWH project (by Norwegian standards) and I have been reviewing data modeling tools recently. However, there seems to be a general lack of support for creating dimensional hierarchies in these tools and I was wondering if you guys know of any that support this? I have been reviewing CA ERwin and ER/Studio from Embarcadero which both seems to lack the functionality, while PowerDesigner from Sybase seems to have it. Oracle SQL Data Modeler also has it but this tools doesn't really seem "enterprise friendly"...

Do any of you guys have any recomandations?

Lumbago

Posts : 3
Join date : 2011-09-05

View user profile

Back to top Go down

Re: Modeling tool with support for hierarchies

Post  BoxesAndLines on Tue Sep 06, 2011 9:29 am

What specific functionality is lacking? One solution is to specify the normalized hierarchy in the logical model and then flatten the structure in the physical model.
avatar
BoxesAndLines

Posts : 1212
Join date : 2009-02-03
Location : USA

View user profile

Back to top Go down

Re: Modeling tool with support for hierarchies

Post  KS_EDW on Thu Sep 08, 2011 12:32 am

I’m so glad to hear I’m not the only person frustrated by this lacking functionality. I simply can’t build a project-wide ETL developer friendly STT (source to target) Lineage using standard modeling tools! The detail required is simply too cumbersome to document using these tools!

Anyway, here is what I’ve ended up doing…

I build the “pretty” illustration in a modeling tool without worries of the details. Then I use a custom application to build out the STT and Lineage for the ETL develpers. This allows me to build out the levels within the separate hierarchies of the facts and dimensions (yes… multilevel facts). I can build out an entire project with dozens of facts and dimensions without overlap or confusion and hand the ETL developers a nice and organized (easy to understand) STT Lineage.
I’ve worked projects with some of the largest delivered EDWs available for PeopleSoft, tax systems, GIS, etc… they’ve ALL used Excel to document the lineage.

If a modeling tool salesperson says their tool can do it… make them illustrate it!

Hope this helps.
avatar
KS_EDW

Posts : 20
Join date : 2011-09-07
Age : 42
Location : Kansas

View user profile

Back to top Go down

Re: Modeling tool with support for hierarchies

Post  Lumbago on Fri Sep 09, 2011 2:06 am

I appreciate your answers fellas. I'm not overly worried about the data lineage actually...I have never heard of a modeling tool that has such capabilities although it would be *really* cool and extremely helpful to have it. The ETL-tool we are using (Business Objects Data Services) is supposedly able to do it, but I have some doubts...our solution will with 95% certainty involve Excel as well

The issue I'm more worried about however is the ability to model hierarchies (i.e. Year-Month-Day or Year-Week) and keep track of them directly in the modeling tool. PowerDesigner from Sybase is up to the task but I would really like some alternatives to choose from. Data Modeler from Oracle is suboptimal in the sense that their platform support is really bad (it only supports oracle9+, db2 and sql server 2000/2005)

@KS_EDW: This custom application you are talking about...can you please elaborate a bit about it?

Lumbago

Posts : 3
Join date : 2011-09-05

View user profile

Back to top Go down

Re: Modeling tool with support for hierarchies

Post  KS_EDW on Fri Sep 09, 2011 11:08 am

Spreadsheets are clumsy at best but are just about our only cost-effective tool. Mapping source to target and transformations is not a problem in Excel. Building a lineage, ETL procedures, hierarchies, and levels for an entire enterprise warehouse is! Keeping all that straight in 10s of thousands of rows is a headache! Creating reports for the ODS, ETL, and semantic modelers from Excel models is also very time consuming.
In my experience, the developers were always fumbling through the spreadsheets, hiding columns, sorting rows… causing mistakes…. It was just humanly impossible to keep it all straight for me or the modelers…
Essentially, four distinct build sections need modeled – operational data stage, ETL, DW, and the semantic layer. This application captures all four with one effort. I’ve tried to attach an illustration but "new member" status is blocking it (changed jobs/email addr since last membership). I'll update in a vew days.
The output from the application is very “report” oriented models, mappings, lineages, etc… in that, the developers from all sides can get what they need without sorting through huge spreadsheets. Reports can include DW Table builds, DW table/structure modifications, ETL build, Conforming procedures, Semantic build, level and hierarchy architecture, etc…. PLUS… when there is a change ALL of the reports are updated… PLUS agile can be implemented by separating an enterprise project into sub projects.
One word though... the application does not make it any easier to conceptualize a model while building it. It only makes the organization of documentation easier. It’s very much a mad-scientist approach, but it works beautifully.
I was hoping to send the idea off to a vendor… I don’t have the capacity to create a solid application or distribute it. Any suggestions?

avatar
KS_EDW

Posts : 20
Join date : 2011-09-07
Age : 42
Location : Kansas

View user profile

Back to top Go down

metadata driven DW development

Post  robber on Sat Sep 10, 2011 10:13 am

I've been using a DW development application for a number of years which automatically generates documentation including data lineage and all Source to Target mappings including any transformations. I've got some sample documentation on my website: data warehouse documentation

Have a look and let me know what you think. For example select technical documentation, General Ledger, Fact Tables, fact_gl_transactions
there you can see procedures, dimensions, columns, star schema, source diagram (aka data lineage). This data lineage is at the table level, you can view column level lineage elsewhere.

The metadata is the core, what you design is stored in the metadata repository, what you generate (facts, dims, staging, procedures, etc.) are all generated from the metadata. It generates native db objects, no black box or any extra steps involved.

Let me know what you think

robber

Posts : 41
Join date : 2009-02-28
Location : Canada

View user profile

Back to top Go down

Re: Modeling tool with support for hierarchies

Post  robber on Sat Sep 10, 2011 10:16 am

just re-read the post, i was working bottom up and replied to the data lineage request...

from a hierarchy point of view, same thing, stored in the metadata so look at the cube group in same structure and you will see a hierarchy link...

robber

Posts : 41
Join date : 2009-02-28
Location : Canada

View user profile

Back to top Go down

Re: Modeling tool with support for hierarchies

Post  Lumbago on Tue Sep 13, 2011 6:13 am

@robber: I looked at the documentation you provided and I've also looked a bit on the WhereScape website and from what I've seen so far it looks intriguing. I was surprised however that it wasn't possible to download a trial-version if the software and I was also not able to find any licensing information (pricing) which kind of puts me off. Is it also the case that all ETL is implemented as stored procedures in the/a database?

Lumbago

Posts : 3
Join date : 2011-09-05

View user profile

Back to top Go down

Re: Modeling tool with support for hierarchies

Post  robber on Tue Sep 13, 2011 8:12 am

Lumbago wrote:@robber: I looked at the documentation you provided and I've also looked a bit on the WhereScape website and from what I've seen so far it looks intriguing. I was surprised however that it wasn't possible to download a trial-version if the software and I was also not able to find any licensing information (pricing) which kind of puts me off. Is it also the case that all ETL is implemented as stored procedures in the/a database?

You can download an eval version, pricing is reasonable (100k - 200k USD), ETL is ELT which for many of us is a good thing, work is done in the database, open and transparent.

robber

Posts : 41
Join date : 2009-02-28
Location : Canada

View user profile

Back to top Go down

Re: Modeling tool with support for hierarchies

Post  KS_EDW on Tue Sep 13, 2011 9:05 am

That kind of pricing was exactly our fear. Not to say anything about the product… it looks excellent, but the output is not so unlike our home-grown solution. Usually, for the three projects/organizations I’ve been working for, they have 800K – 2 million for the entire hardware/operating system/database/software budget. We usually can stretch 20K for a “wanted software” (software for which there are practical work arounds for).
I really [REALLY] like the tool and have tagged it for future reference.
avatar
KS_EDW

Posts : 20
Join date : 2011-09-07
Age : 42
Location : Kansas

View user profile

Back to top Go down

Re: Modeling tool with support for hierarchies

Post  robber on Tue Sep 13, 2011 9:29 am

KS_EDW wrote:That kind of pricing was exactly our fear. Not to say anything about the product… it looks excellent, but the output is not so unlike our home-grown solution. Usually, for the three projects/organizations I’ve been working for, they have 800K – 2 million for the entire hardware/operating system/database/software budget. We usually can stretch 20K for a “wanted software” (software for which there are practical work arounds for).
I really [REALLY] like the tool and have tagged it for future reference.

I hear ya but consulting fee's are not cheap either and the productivity increases not to mention ongoing support, maintenance, and enhancements more than pay for the upfront cost. The metadata is a side benefit, you've also built your data warehouse along the way.

But for those smaller gigs where they just cannot pony up have a look at Talend Open Source...

robber

Posts : 41
Join date : 2009-02-28
Location : Canada

View user profile

Back to top Go down

Re: Modeling tool with support for hierarchies

Post  KS_EDW on Tue Sep 13, 2011 9:54 am

Absolutely! Talend looked like a great tool… unfortunately, I no more than installed it than got told to uninstall it. (Warning… I’m going to vent here…) I work in Government…I often find myself between a rock, hard place, and politics. I can deal with “rock” and “hard place” but the “politics” of a public organization are sometimes ludicrous. “We don’t use open source”… someone in functional administration had a bad run-in with an open source tool in the 90s… “really…, the 90s! Come ON!” So, whatever we use must come through a vendor. I'll tell you... if Talend had been bundled with the BI, it would have been okay to use. Arg!

Hate to be so negative… despite my personal take on things…, all of our projects (though modeled by spreadsheet) have been wildly successful. One of the data marts I modeled found $64 million in new revenues in the first 5 days of production… we collected almost $25 million of that within year one with letters - produced from the EDW. That alone paid for the entire 6-month project twenty five times over! We’ve also had a data mart that caused a complete redesign of individual income tax policy, reducing April 15th annual debt by almost 10%. It’s amazing what these things can do when properly designed!
avatar
KS_EDW

Posts : 20
Join date : 2011-09-07
Age : 42
Location : Kansas

View user profile

Back to top Go down

Re: Modeling tool with support for hierarchies

Post  robber on Tue Sep 13, 2011 10:32 am

I'm currently working in Government as well and yes there are some challenges.

There are BI players who will bundle a deal with Talend so you could try that approach - or considering the ROI that you are demonstrating why don't you buy the ETL personally and then charge them a royalty, say 10%, on the new revenue streams. That would be a win/win in my books.

robber

Posts : 41
Join date : 2009-02-28
Location : Canada

View user profile

Back to top Go down

Re: Modeling tool with support for hierarchies

Post  KS_EDW on Tue Sep 13, 2011 10:36 am

Very good suggestions! I'll keep that suggestion in my pocket for the next project! Thanks for your insight and suggestions!
avatar
KS_EDW

Posts : 20
Join date : 2011-09-07
Age : 42
Location : Kansas

View user profile

Back to top Go down

SQL Server Master Data Services

Post  Mike Honey on Thu Sep 15, 2011 12:23 am

This has been a very interesting thread.

I'm wondering if anyone has tried or considered the SQL Server Master Data Services module:
http://msdn.microsoft.com/en-us/sqlserver/ff943581
http://www.microsoft.com/sqlserver/en/us/solutions-technologies/data-warehousing/master-data-services.aspx#

It cites hierarchy management as one of the key features. Its effectively free (with SQL Server 2008 R2 Enterprise+ Editions).

I'm about to embark on an eval of it for a client - they are similar or smaller scale than the projects KS_EDW described above. So I'd be interested to hear of other people's experience and/or opinions on this.

Thanks - Mike
avatar
Mike Honey

Posts : 185
Join date : 2010-08-04
Location : Melbourne, Australia

View user profile http://www.mangasolutions.com

Back to top Go down

Re: Modeling tool with support for hierarchies

Post  KS_EDW on Thu Sep 15, 2011 9:33 am

Hey folks… I thought I would post that diagram.

This is how the modeling tool I discussed earlier works (illustrated). It’s designed around current employer’s operation – using a 7-stage ETL process using OWB as the ETL tool. We operate with a LOT of geographical, financial, and event data. There are standardized object naming conventions also (not the “Naming Standard Validations) node. Projects are handled via “agile” development, so projects are often segmented into distinct sections. Currently we’re running about 4,000 distinct columns through ETL, 100 or so subject areas, and a little over that in dimensions.

http://i44.servimg.com/u/f44/16/84/97/03/presen11.jpg

[img][/img]
avatar
KS_EDW

Posts : 20
Join date : 2011-09-07
Age : 42
Location : Kansas

View user profile

Back to top Go down

Re: Modeling tool with support for hierarchies

Post  BoxesAndLines on Thu Sep 15, 2011 12:26 pm

How do you define "subject area"? With 4000 columns and 100 subject areas, that leave ~40 columns per subject area. That seems a little light to implement any sort of business functionality.
avatar
BoxesAndLines

Posts : 1212
Join date : 2009-02-03
Location : USA

View user profile

Back to top Go down

Re: Modeling tool with support for hierarchies

Post  KS_EDW on Thu Sep 15, 2011 3:14 pm

Ah, good call, some clarity would probably help… we have a HUGE volume of conformed dimensions so the dimensions often end up being 10 or fewer columns. (I think I confused things when I desribed columns as "distinct ETL columns"... should have said "distinct EDW columns".)

Looking at it mathematically, say dividing total columns by subject areas, kind of skews the appears of EDW’s actual distribution of columns throught the surrogates.

Distinct contributing columns is much higher than 4,000 (many more...). But, EDW only maintains about 4,000 distinct columns mapped from or derived in the ETLs. Many subject areas use the same dimension thus, EDW only maintains… for example… say 10 distinct columns but has 100 contributing columns, and, when flattened by the fact’s surrogates, they appear as 100+ to the users ("+" = derived data), but are actually only 10 in EDW. I hope this makes some sense of the situation. Most subject areas maintain roughly 150+ columns (date and location usually take up a lot of those columns).

To expand…. Many (most) of our subject areas use linear referencing methods (GIS) location, date, person, and notes dimensions. These alone are 100s of contributing columns funneled into 10s of EDW columns. Often we have a fact table with 10 or more surrogates for location and another 10 or more for date and another 10 for persons (bridge), and yet another 10 or more for notes (bridge). All together, there are hundreds of distinct contributing columns, but EDW only holds 10s of distinct columns.

To explain this particular fact… This type of fact enables users to interact with EDW through mapping software… plotting the points of progress on a project (1 project = one row) for example, then drill down occurs through bridge tables, but the bridge tables are a distinct fact, having reporting requirements outside of mapping – person for example, having aggregates of salary based on project time. Because location (“GIS”) is the center point and purpose of much of this EDW, it makes the model a little different than the typical EDW model (financial, customer, sales, or HR for example). When illustrated it turns into a huge web with the center points being date, GIS, and resources.

Overall, our typical fact table has 2 – 30 (or so) surrogates. Our facts have between 4 and 20 measures. The dimensions range from 3 to 50 (most having 10 or less) columns. There are of course a couple exceptions – several dimensions with 80 or so columns and a couple facts with 100+ surrogates.

Hope this clarifies… oh, and let me know if something seems wrong… I’m always up for some constructive criticism.
avatar
KS_EDW

Posts : 20
Join date : 2011-09-07
Age : 42
Location : Kansas

View user profile

Back to top Go down

Re: Modeling tool with support for hierarchies

Post  Sponsored content


Sponsored content


Back to top Go down

View previous topic View next topic Back to top

- Similar topics

 
Permissions in this forum:
You cannot reply to topics in this forum