Sunday, February 19, 2012

How good is your logic?

In relation to another post of mine:

http://forums.microsoft.com/msdn/ShowPost.aspx?postid=1658798&isthread=false&siteid=1

Seeing as I am well and truely stumped - I'm prepared to try look at it (with the help of you) in an entirely different light!

OK - scenario: Company with many 'cost centres', we want to know some details of staff turnover. The fiscal year starts 1 May and ends 31 April. Every fiscal year there is a count of staff 'on the books' to start that year. During that year people are employed and people leave employment. For every person they have an 'Employee Number', there various details, the cost centre they were assigned to, their starting date (date of employment) and their finishing date (date of employment termination).

Say we want to calculate staff turnover % as number of terminations divided by opening headcount. So this seems easy enough. The problem is finding the 'opening headcount' - well that's not true - it's a seemingly easy calculation - opening headcount = sum(starts) - sum(terminations) for what ever year. This can be done in excel quite quickly and easily. The problem lies in the fact that people join and people leave every year with a percentage of people remaining. All I have to work with for every record is the start date and finish date (for sakes of usability i've set - for the purposes of this cube - the finish date to 11/11/2222 if in the actual database the finish date is NULL - that is if an employment hasn't been terminated yet).

The way I have attacked it so far is according to the other thread (link above). I was hoping to simply use a sum(periodstodate)) for both starting dates and finishing dates and then subtract them. But I can't even get the darn sum(periodstodate)) to work properly!

So here I leave it to your wisdom! if you can shed some light - please do. If you want to know more - please ask!

Many thanks

Karl

You'll need to use the descendants function to calculate at the employee level. We had a similar problem calculating customer churn. I'll post the way we did it in the next few weeks (as I have to do this for employee churn as well).|||

Hmmm... Been reading up about descendants - still not sure how it would work - but I believe you would probably know better than I!

Have got the running sum to work now, so it now stands - how would I subtract a cube of column(cumulative start count), rows(FiscalStartDate) from cube column(cumulative finish count), rows(FiscalFinishDate)?

So I now have two calculated measures(members)

1: [Start Cumulative Count]

SUM(PERIODSTODATE([Start Date].[Fiscal].[(All)],

[Start Date].[Fiscal].CurrentMember),

[Measures].[Fact Employee Count])

And 2: [Finish Cumulative Count]

SUM(PERIODSTODATE([Finish Date].[Fiscal].[(All)],

[Finish Date].[Fiscal].CurrentMember),

[Measures].[Fact Employee Count])

Now you can't simply plonk a calculation of [Start Cumulative Count] - [Finish Cumulative Count], because what would you use on rows? The Start Date Hierarcy? Nope - because you would then have 'starting cumulative count' (correct) - 'Cumulative count of people who started and ended in that year' (incorrect) which = not the right answer!

Any thoughts?

|||

Another approach is to explode your data out when you build your fact table.

There are a could of variations to this approach, but based on what I understand of your issue at the moment you could insert a record for the start even and another for the end, storing a count in separate columns.

EmployeeID StartDate EndDate

========== ========= ========

1 1 Jan 07 31 Mar 07

2 1 Apr 05 NULL

Could be inserted into a fact table as

EmployeeID TimeID StartCount EndCount

========== ======== ========== ========

1 20070101 1 NULL

1 20070331 NULL 1

2 20050401 1 NULL

Then for any time period you could take the sum of the starts and subtract the sum of the ends to get to a head count. You can also get the number of new people or terminations between any arbitrary date range.

|||

Now that is some great logic - I kept trying to think of a way to explode out the fact table! Thanks! I'll give that a shot! Will keep you posted.

Karl

|||Another variation that is probably more flexible is to have a single HeadCount measure and something like an EventTypeID where you could track starts, finishes and other events like promotions, transfers etc.|||

I had already done that... Although I'm not usining it at the moment. I still use both startcount and endcount and a TerminationID (which doesn't neccessarily mean terminated - it consists of current employee, left, dismissed and change of contract).

However, I've hit yet another wall........

I now want to work ou t the average length of service for those on hand at teh beginning of each year. This is easy to do for those that have departed - I simply added another column onto the fact table called LengthService which simply used the DATEDIFF(day,start_date, finish_date) when populating it, I then summed the column, and divided the measure by the end count measure and divided again by 365 (to give average length of service in years).

The on hand poses a more complex problem. I have to take into account teh datediff between a start date and the currentmember date, but then I also need to take into account people that have left prior to the year in question... I'm sitting here stewing on a way to do this with nothing of substance yet...

Any thoughts?

Again - many thanks for all the great ideas and help so far - it is mostly appreciated!

Karl

|||

Example with descendants:

CREATE MEMBER CURRENTCUBE.[MEASURES].[Cancel Sub USD]

AS

sum(filter(

(Descendants( [Customer].[SEGMENT - PARENT CO].currentmember,

[PARENT CO] ,SELF )),

([Fiscal Calendar].[YEAR - QUARTER - PERIOD - DATE].currentmember, [Measures].[Parent Company Distinct Count]) <

([Fiscal Calendar].[YEAR - QUARTER - PERIOD - DATE].prevmember, [Measures].[Parent Company Distinct Count])) *

[Measures].[USD AMT] *[Fiscal Calendar].[YEAR - QUARTER - PERIOD - DATE].prevmember )

,

FORMAT_STRING = "Currency",

VISIBLE = 1;

CREATE MEMBER CURRENTCUBE.[MEASURES].[New Sub Cnt]

AS

count(filter(

Descendants( [Customer].[SEGMENT - PARENT CO].currentmember,

[PARENT CO] ,SELF ),

([Fiscal Calendar].[YEAR - QUARTER - PERIOD - DATE].currentmember, [Measures].[Parent Company Distinct Count]) >

([Fiscal Calendar].[YEAR - QUARTER - PERIOD - DATE].prevmember, [Measures].[Parent Company Distinct Count])))

,

FORMAT_STRING = "#,#",

VISIBLE = 1;

|||

Karl,

I think what I would do is to add the Start and End dates as attributes to the employee dimension. This way for a given employee you would be able to extract the corresponding dates. You would need to use the descendants function to make sure you always drill down to the individual employee so that you can extract the corresponding dates.

I think logic like the following should work for you. I have typed it in off the top of my head, so I have not tested it, but I have put comments through it so hopefully you can follow my logic.

Code Snippet

AVG(

-- Get a set of all the employees below the currently selected member

-- which will be ALL Employees if no other member in the hierarchy is selected

DESCENDANTS([Employee].[Employees].Currentmember,,LEAVES)

,DATEDIFF(

[Employee].[Employees].Currentmember.Properties("StartDate").Value

-- we need to check the date diff between the earliest of the selected date

-- or the end date, if a member higher in the date hierarchy is selected

-- (like year, month or quarter) is selected I am using descendants to get the

-- set of dates underneath that member and then using tail to get the last one.

-- so if a month is selected it will do a datediff to the last day of the month

, IIF([Employee].[Employees].Currentmember.Properties("EndDate").Value

< TAIL(DESCENDANTS([Time].[Calendar].CurrentMember,,LEAVES),1).Item(0).MemberValue

,[Employee].[Employees].Currentmember.Properties("EndDate").Value

,TAIL(DESCENDANTS([Time].[Calendar].CurrentMember,,LEAVES),1).Item(0).MemberValue

)

)

)

No comments:

Post a Comment