Table and Column Naming Conventions

View previous topic View next topic Go down

Table and Column Naming Conventions

Post  pcs on Tue Mar 10, 2009 7:50 am


Hello,
We are working on implementing the first production ‘Kimball’ data warehouse on a Teradata database for our company.

Teradata database column names are limited to 30 characters. Additionally, our company has naming convention policies in place for tables and columns. Historically our Teradata environment has been used as an operational data store and this naming convention was probably created and has been enforced with this in mind.

My understanding of Kimball best practices (The toolkit books and Design tip 71) conflicts with adhering to our existing naming convention. I understand The Kimball Data Warehouse Toolkit Books and other supporting articles advocate using meaningful, clear, and concise table and column names.

A customer dimension table following Kimball best practices might bear the name Customer_Dimension and might contain column names like Customer_Id, Customer_First_Name, Customer_Middle_Name, Customer_Last_Name etc.

The current naming conventions for our company would present the table name as Cst_Dim and field names as Cst_Id, Cst_Nm_F, Cst_Nm_M, Cst_Nm_L etc.

The conflict here is simple but fundamental to user friendliness and user experience. I think we should push for an exception to the standard naming conventions for our data warehouse and have been asked to present this position to our governance board.

I would argue that our number one goal is to create an accurate, easy to use, and easy to navigate environment. Expanded table and column names provide unambiguous and clear meaning to the content of a given database entity, whereas our existing naming convention, initially conceived for a backend ODS, leaves a lot to be desired from both a usability and understandability point of view.

I anticipate pushback on the following points:
1. If we alter the naming convention for the data warehouse it will be difficult to join back to the ODS for ad hoc follow up analysis.
2. Some meaningful names will exceed 30 characters and will have to be abbreviated somehow. We would have to establish new rules for this abbreviation.

My thoughts, as I am drafting this post, are that even thought it might be easier to follow the existing conventions, especially from a backend perspective, it would be much better from an end user point of view to present the database entities with easier to use and understand names.

I would appreciate any thoughts and experience the community might have on this issue.
Thank you,
Patrick

pcs

Posts: 20
Join date: 2009-02-03

View user profile

Back to top Go down

Re: Table and Column Naming Conventions

Post  BoxesAndLines on Tue Mar 10, 2009 10:42 am

I recommend staying with the current naming standards. I don't feel any of the arguments presented here outweigh the value of consistently named objects. For Teradata, you can store the logical column name in the Title. The Title can be stored in the database available for end users. If I recall correctly you can even use the Title names instead of the column names.

I'm also interested in your performance of true dimensional models on Teradata. Teradata's standard architecture is not ideal for dimensional models. I'm interested in how well this works in your environment. Please keep us posted!

BoxesAndLines

Posts: 1042
Join date: 2009-02-03
Location: USA

View user profile

Back to top Go down

Re: Table and Column Naming Conventions

Post  robber on Tue Mar 10, 2009 10:59 am

Two approaches I've used previously withTeradata is:

1) create datamarts in Oracle or Sql Server and follow Kimball best practices
2) create teradata views to resemble star schema's (ie. convert typical Teradata 3NF tables into Kimball like dimensional structures). You may try to appease the Governanace board here by using meaningful business names in the views while leaving the underlying tables using the corporate naming conventions.

If I had my choice I would go with option 1 but I suspect you'll have more success pushing for option 2.

robber

Posts: 41
Join date: 2009-02-28
Location: Canada

View user profile

Back to top Go down

Re: Table and Column Naming Conventions

Post  pcs on Tue Mar 10, 2009 11:38 am


Thank you both for replying to this query...I appreciate your insight.

Boxes and Lines,
I am curious why teradata's standard architecture is not ideal for dimensional models. I am used to working in the relational world and have a background using Relational and Dimensional data in Oracle and SQLServer but am relatively new to teradata. I was thinking that teradata is a supercharged relational database and would benefit from querying a dimensional model just as the other platforms do. I will dig into this idea a bit further - perhaps we are going down the wrong path with teradata.

We had considered using the title field as you suggest, but did not realize you might be able to query off this field. I'll also dig into specifically how to do this. Thanks again for the thoughts - I will let you know how this goes...

Robber,
One of the reasons we are trying to implement the new warehouse on teradata is the shear volume of data we are dealing with (100 Million + distinct customers, billions of web events to potentially track) - offloading the data into SQL Server was considered, but discarded basically for two reasons; Data Volume and Ad Hoc analysis against the existing ODS already on Teradata. We are planning to use MS Analysis services and MS Reporting services for our presentation layer.

I like the idea of using the teradata views but we need quite a bit of ETL logic. Do you have any thoughts about placing the user friendly views over a dimensional model housed in teradata?
Thanks again to you as well.


pcs

Posts: 20
Join date: 2009-02-03

View user profile

Back to top Go down

Re: Table and Column Naming Conventions

Post  robber on Tue Mar 10, 2009 12:23 pm

The Teradata dba's that I know, that also understand a dimensional model and it's value, do not think a physical dimensional model works very well on the Teradata platform, hence the other options. What sort of transformations do you need that cannot be done as part of the relational model?

robber

Posts: 41
Join date: 2009-02-28
Location: Canada

View user profile

Back to top Go down

Re: Table and Column Naming Conventions

Post  robber on Tue Mar 10, 2009 12:30 pm

Just re-read your last post and noticed you plan to use the Microsoft BI presentation tools which would strengthen the case for using Sql-Server as your datamart platform...

If you stick with Teradata to house your physical datamarts have you investigated the ability/viability of using the Microsoft BI tools directly against Teradata?

robber

Posts: 41
Join date: 2009-02-28
Location: Canada

View user profile

Back to top Go down

Re: Table and Column Naming Conventions

Post  BoxesAndLines on Tue Mar 10, 2009 12:49 pm

Teradata gets its power from parallelizing a query across all of the AMPS. Data is distributed across all of the amps based on a table's defined primary index (not to be confused with the primary key). And if you forgot to define a PI, Teradata will use the first column in the table! A join in Teradata works best when the primary index is shared between tables. This is not possible when building a dimensional model. All of the dimension tables will have a different primary index than the fact table. To get around this, Teradata encourages the use of AJI's (aggregate join indexes). AJI's can add considerable disk space requirements though. I have heard of some successes using dimensional models on Teradata. And if you ask Teradata, they will absolutely say they do dimensional models well (not just views on top of 3NF which they also recommend). I also know companies that have a DW on Teradata that also do BI reporting out of Oracle. I've never personally done benchmarking on dimensional models on Teradata although I have read reports at previous clients with mixed results. That's why I'm interested in your results.

BoxesAndLines

Posts: 1042
Join date: 2009-02-03
Location: USA

View user profile

Back to top Go down

Re: Table and Column Naming Conventions

Post  pcs on Tue Mar 10, 2009 1:13 pm



Robber,
I think I understand the teradata question and will try to address it here... I hear your question as ‘why would we need additional transformation logic on top of the existing relational tables’ correct?
What jumps to mind immediately is the conforming logic we would have to build into the views. Our Operational Data Store (ODS) is our system of record and is stored in the teradata tables. This relational data is very process specific and is not very well integrated. It is also full of process specific codes and lookup values. In many cases, the source data is simply a daily copy of the OLTP datasets from the front end applications.
We could create views over the top of this relational data but I am not sure about the impact of having the necessary conforming logic in views. Wouldn't the views have to reprocess this logic each time they are accessed?
Because it is the official system of record data, we do not have the ability to modify the ODS tables.
We have grown quite a bit through acquisition and internal application development and have historically placed a great deal of emphasis on getting data into the ODS, but not enough priority on making sense of and aggregating the data across the enterprise.
My thought and plan was to combine this data, piece by piece into a conformed enterprise data warehouse using the processes described in the Kimball canon. Basically, we are sourcing from our ODS to try and make sense of what is going on across the enterprise.
Does this make sense? Am I missing something?


pcs

Posts: 20
Join date: 2009-02-03

View user profile

Back to top Go down

Re: Table and Column Naming Conventions

Post  pcs on Tue Mar 10, 2009 1:21 pm


We are having some success using the MS BI Tools with this project - our initial proof of concept data mart uses Integration Services, reporting Services, and Analysis Services with varying degrees of success.

In SSIS we are using the scheduling and auditing in SSIS and basically hand coding the ETL logic...
AS and RS use ODBC connectivity fairly well against Teradata.

pcs

Posts: 20
Join date: 2009-02-03

View user profile

Back to top Go down

Re: Table and Column Naming Conventions

Post  robber on Tue Mar 10, 2009 2:56 pm

That makes sense now, I thought/assumed you had an EDW in place. Typically on Teradata platforms an EDW will be created, as a normalized data warehouse (Inmon approach), including any transformations and then create datamarts from the EDW.

robber

Posts: 41
Join date: 2009-02-28
Location: Canada

View user profile

Back to top Go down

View previous topic View next topic Back to top

- Similar topics

Permissions in this forum:
You cannot reply to topics in this forum