Customer Dimension built from two ERP tables

View previous topic View next topic Go down

Customer Dimension built from two ERP tables

Post  Tadec on Mon Aug 22, 2011 4:17 pm

I am in charge of developing conformed dimensions. One of those dimensions is Customer. I have two tables in SAP for customer. One has attributes just based on customer as the primary key. The second table contains additional attributes but they are by customer and company. I need attributes from both tables. The customer table is about 170,000 records and the customer by company is 119,000. I have all customers in the customer table but I do not have all customers in the customer by company. If I make a conformed customer dimension based on these two tables I will have some customers, about 20,000, that have no company. I thought my surragote key would be based on customer and company but my DBA Architect is indicating this is not wise and I should have two customer tables.
This may be a dumb question but when creating a conformed dimension if I have more than one source data table and they are at different levels of detail should I keep them separate.

Tadec

Posts : 7
Join date : 2011-08-22

View user profile

Back to top Go down

Re: Customer Dimension built from two ERP tables

Post  thomaszhwang on Mon Aug 22, 2011 4:25 pm

I think you should have two separate dimensions: DimCustomer and DimCompany.

thomaszhwang

Posts : 32
Join date : 2011-08-11

View user profile

Back to top Go down

Customer Dimension Built from Two ERP Tables

Post  Tadec on Mon Aug 22, 2011 4:35 pm

Is the reason for the two separate dimensions the fact that my two customer source tables are at different levels of detail? Should dimensions be treated like fact tables where all records in the dimension table should be at the same level of detail?

Tadec

Posts : 7
Join date : 2011-08-22

View user profile

Back to top Go down

Re: Customer Dimension built from two ERP tables

Post  thomaszhwang on Mon Aug 22, 2011 4:55 pm

Because in those two source tables you cannot find one record in each of the tables to physically represent a same customer, can you? If you cannot, what's the point to merge them anyway? Correct me if I'm wrong. Thanks.

thomaszhwang

Posts : 32
Join date : 2011-08-11

View user profile

Back to top Go down

Re: Customer Dimension built from two ERP tables

Post  ngalemmo on Mon Aug 22, 2011 5:05 pm

What does 'company' represent? I've seen in SAP where it represents an internal company, some form of subdivision within the enterprise.

If that is the case, do you have different internal organizations (companies) that maintain different information for the same customer? In other words, can the same customer appear under different companies?

If company is an internal organization and you have only one company, then a single customer dimension should be fine. Just put all the attributes on one row. If there isn't a company row, leave those attributes blank or some default value.

If a customer exists in multiple companies, then you may need a separate dimension (with customer/company as a natural key) or some kind of business rule to determine what to do with the data if you populate a single customer dimension.
avatar
ngalemmo

Posts : 3000
Join date : 2009-05-15
Location : Los Angeles

View user profile http://aginity.com

Back to top Go down

Re: Customer Dimension built from two ERP tables

Post  Tadec on Mon Aug 22, 2011 5:13 pm

If I have a customer who is a sold to or payer then I will find the natural ID for this customer in both tables. If the customer is a ship to then it will only be in the customer table. So my merge was an attempt to get the attributes stored in the customer table combined with the attributes stored in the customer by company table. My problem becomes the shipto customers are not in the customer by company so they will never have a company.
In answer to the second response, company does represent an internal company and the same customer can appear under different companies.
Thanks for the quick replies.

Tadec

Posts : 7
Join date : 2011-08-22

View user profile

Back to top Go down

Re: Customer Dimension built from two ERP tables

Post  VHF on Tue Aug 23, 2011 1:03 pm

I’ve also struggled with SAP master data… seems like SAP makes everything 10 times more complex!

If I understand correctly, you have some customer data that is company-specific. For those records the natural business key could be company id + customer number. Then you have other customer data that is not company-specific… presumably that customer data applies enterprise-wide.

One option would be to repeat the non-company-specific rows for each company. In other words spit out a row for each combination of company id and customer number. This would allow you to merge your two sources into a single conformed Customer dimension using a consistent natural key of company id + customer number. This approach does assume a reasonable number of companies… if you have 50 companies then this might lead to a data explosion!

Presumably you would always get company id and customer number in your transaction data so as to be able to look up the appropriate Customer dimension row using the compound natural key (company id + customer number).


Last edited by VHF on Tue Aug 23, 2011 1:04 pm; edited 1 time in total (Reason for editing : fix typo)

VHF

Posts : 236
Join date : 2009-04-28
Location : Wisconsin, US

View user profile

Back to top Go down

Re: Customer Dimension built from two ERP tables

Post  Tadec on Tue Aug 23, 2011 1:13 pm

We have like 4 instances of SAP and I am not sure how many companies in each instance. I would have to check that but the feeling of the DBA is that it could lead to a master data explosion if we did what you suggest. I am not sure I agree but with that said I have another thought.
My thought was why on customer where we have no company could they not place NA in the company column and go ahead and build the surrogate key off of the natural key customer ID + source system + company ID.
Do I need to populate the company ID if it does not have one?

Tadec

Posts : 7
Join date : 2011-08-22

View user profile

Back to top Go down

Re: Customer Dimension built from two ERP tables

Post  ngalemmo on Tue Aug 23, 2011 1:20 pm

... except that ship-to customers are not in the company keyed table. So, as a sales order comes in, you cannot include company when looking up a ship-to. Also ship-to rows cannot include a company in thier natural key. It may be worth asking the business how they actually use company. You may be pleasantly surprised that you get puzzelled looks from the users. This usually means going to the ABAP group who will probably tell you that they only use one company (probably 100) and the other stuff is test data that they haven't figured out how to get rid of.

SAP is soooo much fun.
avatar
ngalemmo

Posts : 3000
Join date : 2009-05-15
Location : Los Angeles

View user profile http://aginity.com

Back to top Go down

Re: Customer Dimension built from two ERP tables

Post  Tadec on Tue Aug 23, 2011 1:58 pm

Thank you all very much. I think I finally follow what my DBA was trying to tell me.
That when he loads the fact table with a ship to customer that does not have a company ID, we would have to give it one or, in my thought, put NA in the column Company ID. That same record would then have to exist in the Customer Dimension so everything would relate.

So he feels we should have two customer dimensions with one by company. My direction is to Conform Customer and part of that means try to keep it to one dimension.
Part of the reason for that is to keep the number of dimensions down, and the other reason is we are trying to also have a self serve BI, so we are trying to keep it intutive for the power users or users who will be self serve.
Yes SAP is a whole lot of fun.


Tadec

Posts : 7
Join date : 2011-08-22

View user profile

Back to top Go down

Re: Customer Dimension built from two ERP tables

Post  VHF on Tue Aug 23, 2011 2:23 pm

A separate ShipTo dimension might be useful for self-serve BI as well as solving how to handle non-company-specific master data.

VHF

Posts : 236
Join date : 2009-04-28
Location : Wisconsin, US

View user profile

Back to top Go down

Re: Customer Dimension built from two ERP tables

Post  Tadec on Thu Aug 25, 2011 2:36 pm

I have a basic question to ask in regards to Conformed Dimensions. If you are building a Conformed Customer Dimension would you say that by the very definition of Conformed that you can have only one Customer dimension. If this Customer Dimension has the same natural key and it is just a matter of level of detail would you still say based on the definition of a Conformed dimension that it must be one dimension.

Tadec

Posts : 7
Join date : 2011-08-22

View user profile

Back to top Go down

Re: Customer Dimension built from two ERP tables

Post  ngalemmo on Thu Aug 25, 2011 3:43 pm

I look at conformance as an attribute issue, not a table or key issue. When a user queries the database they group things by attributes. So if you have something like Customer Name, it is less an issue where the column is as it is that the same customer would have the same name. You try to keep everything in one place more to reduce complexity than anything else.
avatar
ngalemmo

Posts : 3000
Join date : 2009-05-15
Location : Los Angeles

View user profile http://aginity.com

Back to top Go down

Re: Customer Dimension built from two ERP tables

Post  hang on Thu Aug 25, 2011 4:39 pm

Tadec wrote:In answer to the second response, company does represent an internal company and the same customer can appear under different companies.
It's m-m relationship, so you can't have a single de-normalised customer dimension with company attributes in it.

Having two separate customer and company dimensions makes sense to me. Combining them into one dimension is not dimension conformance, while sharing a dimension by different fact tables is.

If the relationship can be pre-determined independent of fact, then create a fact less fact table or a bridge. Otherwise leave the relationship in the fact table if it's driven by the fact.

hang

Posts : 528
Join date : 2010-05-07
Location : Brisbane, Australia

View user profile

Back to top Go down

Multi-source Dimensions When Data can be in one or Both Sources and May Change Over Time

Post  delish on Wed Sep 14, 2011 1:18 pm

I have a similar issue where I have a company dimension that has attributes coming from two source systems.
1) A company might exist in source system 1 but not source system 2
2) A company might exist in source system 2 but not source system 1
3) Source System 1 has the key for source system 2 in it.
4) The sourcey system 2 key is not always populated in source system 1
5) Source system 2 has companies that might never end up in source system 1 and therefore never have a reference between the two
6) The users require that a company be available in the dimension as soon as it is available in either source system 1 or 2
7) It's possible for a company to be created in source system 1 and then created in source system 2 any number of days later
It's possible for a company to be created in source system 2 and then created in source system 1 any number of days later
9) It's possible for a company to be created in either source system 1 or 2 and never in the other

The problem is, the business key is the combination of source system 1 and 2 which means that the key can change for any company that exists in only one of the systems upon insertion into the dimension table.

For example, on 1/1/2000 Dell is added to source system 2. The ETL loads the company into the DimCompany table with a SourceSystem2ID = DELL and a SourceSsytem1ID = 0, empty string - whatever.
Now on 1/2/2000 Dell is added to source system 1 with the appropriate reference to the Source System 2 Key of "DELL." The ETL picks up the company attributes from both systems because it is able to join them together. Since the business key is now 125 (int PK in source system 1) + DELL (char PK in source system 2), it appears to be a new dimension member even though it's not.

We could add this new row and then have a cleanup process that finds the old row with the partial key and attributes but that gets especially messy when looking for existing fact rows referencing the old surrogate key.

The question is, how do we have a multi-source dimension table that is capable of processing updates when we don't know which source system's keys will be available at any given time and when the true business key (all source system PKs available) can change?

delish

Posts : 5
Join date : 2011-09-14

View user profile

Back to top Go down

Re: Customer Dimension built from two ERP tables

Post  VHF on Wed Sep 14, 2011 5:24 pm

I use a PersistentKeyMap table to resolve master data coming from multiple systems with different keys. The ETL process use the PersistentKeyMap to determine if an entity already exists in the DW and the match up identifiers between systems. Let's try it with your example:

delish wrote:For example, on 1/1/2000 Dell is added to source system 2.
After looking in the PersistentKeyMap table and not finding an entry, we add a record, assigning a PersisentKey to be used as the SK of DimCompany:

PersistentKeyKeyTypeSys1KeySys2Key
789CompanyNULLDELL
We can now create a DimCompany record with a SK of 789. The PersistentKeyMap table can be used to resolve to the SK when processing data from source system 2.

delish wrote:Dell is added to source system 1 with the appropriate reference to the Source System 2 Key of "DELL." ... the business key is now 125 (int PK in source system 1)
First we look for an existing value WHERE KeyType = 'Company' AND Sys1Key = '125'. When we fail to find that, we then search for a row WHERE KeyType = 'Company' AND Sys2Key = 'DELL' (since system 1 has the Source System 2 Key.) Bingo! Because we had to fall through and match on the alternate business key, we now update the PeristentKeyMap record to include the business key from System 1:

PersistentKeyKeyTypeSys1KeySys2Key
789Company125DELL
By using the PersistentKeyMap table, we can now load data from either system and resolve to the correct dimension record.

My implementation is in SQL Server 2008. In addition to creating a primary key on PersistentKey, I create a unique index that filters out nulls on KeyType + Sys1Key. Same thing for Key Type + Sys2Key. I now use the PersistentKeyMap table for processing all my dimensions and for resolving SKs when loading fact data.

VHF

Posts : 236
Join date : 2009-04-28
Location : Wisconsin, US

View user profile

Back to top Go down

Re: Customer Dimension built from two ERP tables

Post  ngalemmo on Wed Sep 14, 2011 6:18 pm

This will work as long as there are no errors with the system 2 key held in system 1. If it ever changes, earlier assignments will reference the wrong dimension row.

It is better to maintain individual rows for each row from each source and carry the two keys in the dimension (one natural key from the native source and multiple alternate keys from each source). Facts from system 1 will reference dimension rows from system 1 and facts from system 2 will reference dimension rows from system 2. Use the alternate keys in the dimension table to maintain the attributes. So if there are specific attributes that are only available from system 2, you would use the alternate key to update those attributes on all rows with matching system 2 keys (independent of the dimension's natural key). Should the cross reference change, it becomes a matter of refreshing attributes, not reassigning foreign key references.
avatar
ngalemmo

Posts : 3000
Join date : 2009-05-15
Location : Los Angeles

View user profile http://aginity.com

Back to top Go down

Re: Customer Dimension built from two ERP tables

Post  Jeff Smith on Wed Sep 14, 2011 7:30 pm

You can build a single member dimension from the 2 sources. For the members not in the Member-Company, the Company will be blank or, better yet, a default value of something like "Company Unknown".

But, if the Customer's Company is used for other stuff, then maybe it should it's own Dimension. It just depends. Maybe it's really a Company Dimension and the Customer is the representative of the Company. If you have a lot of customers from the same company and you tend to aggregate data or report data at the Company level, then make Company it's own dimension. If Customer is really a representative of the Company, then one dimension. But even if you have 2 dimensions, there still may be customer level attributes on the member/Company source. You can add those attributes to the Customer Dimension, using an "NA" or "Unknown" as values for those attributes on Costomers not in the Customer/Company Source.

Jeff Smith

Posts : 471
Join date : 2009-02-03

View user profile

Back to top Go down

Re: Customer Dimension built from two ERP tables

Post  delish on Thu Sep 15, 2011 2:33 pm

ngalemmo wrote:This will work as long as there are no errors with the system 2 key held in system 1. If it ever changes, earlier assignments will reference the wrong dimension row.

It is better to maintain individual rows for each row from each source and carry the two keys in the dimension (one natural key from the native source and multiple alternate keys from each source). Facts from system 1 will reference dimension rows from system 1 and facts from system 2 will reference dimension rows from system 2. Use the alternate keys in the dimension table to maintain the attributes. So if there are specific attributes that are only available from system 2, you would use the alternate key to update those attributes on all rows with matching system 2 keys (independent of the dimension's natural key). Should the cross reference change, it becomes a matter of refreshing attributes, not reassigning foreign key references.

No changes whatsoever. I could easily have the company in both systems with no reference initially. They would be created as two different companies in the dimension and both surrogate keys could be used by different facts. If the reference was later made I'd have to merge the dimension members and change fact table foreign keys. No good.

I did something similar to what ngalemmo suggested however I split the data across two tables. Fact table 1 has a FK to companyDim 1. Fact table 2 has a FK to companyDim2. I have a view that stiches companyDim1 and CompanyDim2 together so that FactTable1 can be analyzed by both dim's attributes (Fact table 1 data mart only cares about companies in Dim1). Fact table 2 can also be analyzed by Dim1's attributes by way of the view. The key to this solution is that each fact table only contains references to it's corresponding company dim and doesn't need to be able to lookup keys for the other dim.

I should mention that these two fact tables contain different data. I didn't split up the same types of facts across two tables to solve the dimension issue. However, I will want to look at measures from both (when available) across the same company. I don't think this will be a problem. I'll just need to be careful when I set up the relationship metadata in my BI layer.

delish

Posts : 5
Join date : 2011-09-14

View user profile

Back to top Go down

Re: Customer Dimension built from two ERP tables

Post  VHF on Thu Sep 15, 2011 2:58 pm

delish wrote:If the reference was later made I'd have to merge the dimension members and change fact table foreign keys. No good.
I believe ngalemmo is recommending that you maintain two dimension records (one from each source system) and don't try to combine them. Facts from System 1 would point to one dimension record; facts from System 2 would point to to the other. It is OK for more than one dimension record to represent the same business entity. You would need an attribute with a common value in both dimension records (such as Company Name) on which to group by when reporting.

VHF

Posts : 236
Join date : 2009-04-28
Location : Wisconsin, US

View user profile

Back to top Go down

Re: Customer Dimension built from two ERP tables

Post  delish on Thu Sep 15, 2011 3:05 pm

VHF wrote:
delish wrote:If the reference was later made I'd have to merge the dimension members and change fact table foreign keys. No good.
I believe ngalemmo is recommending that you maintain two dimension records (one from each source system) and don't try to combine them. Facts from System 1 would point to one dimension record; facts from System 2 would point to to the other. It is OK for more than one dimension record to represent the same business entity. You would need an attribute with a common value in both dimension records (such as Company Name) on which to group by when reporting.

That part of my response was directed towards the persistent key approach. ngalemmo's approach makes perfect sense for what he is describing and would work for me. However, there are very few, if any, overlapping attributes between the two sources so it makes more sense for me to split them across tables.

delish

Posts : 5
Join date : 2011-09-14

View user profile

Back to top Go down

Re: Customer Dimension built from two ERP tables

Post  Sponsored content


Sponsored content


Back to top Go down

View previous topic View next topic Back to top

- Similar topics

 
Permissions in this forum:
You cannot reply to topics in this forum