How to model IP Dimension

View previous topic View next topic Go down

How to model IP Dimension

Post  andriy.zabavskyy on Mon Sep 12, 2011 7:13 am

Hi,

We have a data warehousing system which is dedicated to the network attacks analysis.
The crucial dimensional information is a Source(Attackers) IPs and Destination (Target) IPs.
My question is: how would you recomend to model the Source and Destination IP dimensions.

Note:
The number of unique values in both of them can potentially grow up to billions of records.
Both IPv4 and IPv6 should be supported (IPv6 is stored as 2 bigint columns in fact table).

Thanks in advance,
Andriy

andriy.zabavskyy

Posts : 18
Join date : 2011-09-12

View user profile

Back to top Go down

Re: How to model IP Dimension

Post  BoxesAndLines on Mon Sep 12, 2011 9:51 am

I'd go with degenerate dimension(s).
avatar
BoxesAndLines

Posts : 1212
Join date : 2009-02-03
Location : USA

View user profile

Back to top Go down

Re: How to model IP Dimension

Post  cjrinpdx on Mon Sep 12, 2011 11:50 am

Did you recommend a degenerate dimension due to the large number of records?

cjrinpdx

Posts : 51
Join date : 2011-07-14
Location : Portland, OR

View user profile

Back to top Go down

Re: How to model IP Dimension

Post  BoxesAndLines on Mon Sep 12, 2011 12:02 pm

I recommended since they are unique.
avatar
BoxesAndLines

Posts : 1212
Join date : 2009-02-03
Location : USA

View user profile

Back to top Go down

Re: How to model IP Dimension

Post  cjrinpdx on Mon Sep 12, 2011 3:58 pm

Got ya, and maybe the lack of additional attributes would be a good reason to go with the degenerate dimension. If this dimension was going to be used with multiple fact tables that might be an argument to use a separate dimension table and create a conformed dimension.

cjrinpdx

Posts : 51
Join date : 2011-07-14
Location : Portland, OR

View user profile

Back to top Go down

Re: How to model IP Dimension

Post  BoxesAndLines on Mon Sep 12, 2011 4:23 pm

Yes, but there are often times that there are additional attributes that need to be modeled into separate tables to reduce the cardinality down from a 1-1. Invoice # or Order # are common examples. There are plenty of Order and Invoice attributes that I could put into an Order or invoice dimension, but you will build a better performing model if you break those columns into smaller separate dimensions and store the order#/invoice# on the fact table. In this case, IP looks like it is a single column that is unique, so a DD looks like an optimal fit.
avatar
BoxesAndLines

Posts : 1212
Join date : 2009-02-03
Location : USA

View user profile

Back to top Go down

Re: How to model IP Dimension

Post  andriy.zabavskyy on Tue Sep 13, 2011 5:23 am

Please take into account that IP attribute is based on 2 bigint columns(mainly needed for IPv6). There could be added a string column for more friendly representation but I am not sure it is a good idea to create a DD based on it

andriy.zabavskyy

Posts : 18
Join date : 2011-09-12

View user profile

Back to top Go down

Re: How to model IP Dimension

Post  Sponsored content


Sponsored content


Back to top Go down

View previous topic View next topic Back to top

- Similar topics

 
Permissions in this forum:
You cannot reply to topics in this forum