Dimensional Modeling - What Goes Into Fact/Dimension?

View previous topic View next topic Go down

Dimensional Modeling - What Goes Into Fact/Dimension?

Post  akjason on Mon Oct 08, 2012 6:34 pm

Hello, let me start by saying that I am new to dimensional modeling and data warehousing but I have been reading the Kimball books to brush up on the concepts. In the data warehouse I am attempting to model, I am not sure whether to model some items as facts or dimensions and have received contradictory advice. I have no control over the source databases meaning I cannot change their design. The primary source database (DS_Pri) tracks applications received and their status (eligible, denied, in appeal etc.) as well as payment and deduction amounts and cases when they are being worked to determine eligibility or in appeal. The second source database (DS_Web) is for applicants to apply online. This Web data is moved into the primary source database periodically, although some Web applications may not move over right away if they are missing info in the Web source. Applicants apply every year and a decision is valid for only that year, they must apply again next year as well if they are interested. Each year about 1 million people apply currently and DS_Pri has 30 years of history! The business users are primarily reporting on aggregated application counts and dollar amounts for payments/deductions but they also do a lot of reports that list the details of individual applications. Below are some key entities in DS_Pri:

  1. APP_NO - This table has a unique number assigned to each application. It includes attributes for SSN, DOB, FNAME, LNAME, BIRTH_FNAME, BIRTH_LNAME
  2. APP - This table holds applications from the DS_Web as well as applications manually entered from paper submissions. It also includes attributes for SSN, DOB, FNAME, LNAME, BIRTH_FNAME, BIRTH_LNAME, ADDRESS. This is a holding table for the original applications before they undergo processing. Just before processing, an App_No is assigned to the application so a record can exist in APP even if there is not yet a record in APP_NO.
  3. APP_Q - This table contains responses to the application questions prior to processing
  4. ELIG - This table contains the application data after processing. It also contains attributes for SSN, DOB, FNAME, LNAME, BIRTH_FNAME, BIRTH_LNAME, ADDRESS and many other attributes found in APP.
  5. ELIG_Q - This table contains responses to the application questions after processing
  6. PERSON - This table contains a record for each unique person, regardless of how many appications are on file. It includes fields for BIRTH_FNAME, BIRTH_LNAME. DS_Web does not have a PERSON link and the PERSON record in DS_Pri is not created until processing when it can be determined if the person is new or not.

Sorry for the detail but I wanted to error on the side of too much info rather than too little. Because payments and deductions are separate business processes from the application itself, I want to model the payments/deductions in a separate fact table. For the applications fact table, I have heard that I should only model the counts in the fact table but I have also heard that I should model the application detail in the fact table as well (SSN, DOB, FNAME, LNAME, BIRTH_FNAME, BIRTH_LNAME, ADDRESS) since some of it might change from application to application and the users will report on it.

Which choice is the best way to go? If only the counts, then what attributes go into the fact table so I can generate the count?

How can I best construct a Person Dimension given the fact that the attributes are duplicated in several source entities and the Person entity is not populated at the time the application is submitted?

Does separating the application fact table from deductions and payments make sense?


Posts : 11
Join date : 2012-10-08

View user profile

Back to top Go down

View previous topic View next topic Back to top

- Similar topics

Permissions in this forum:
You cannot reply to topics in this forum