SQL

My Presentation @ Dev4Devs

Submitted by Robert MacLean on Fri, 05/08/2009 - 14:17

If you are attending Dev4Devs on Saturday (or are here after the event) and you are looking for a copy of the slides and code you can get them below! If are looking for the ADO.NET Data Services cheat sheet I mentioned then you need to go here.

Code

The code here is also different from what I presented in the following ways:

  • There is a timer control in it - so if you add items to the DB while on the site, it updates and shows those changes within 5 seconds.
  • The layout is slightly bigger (bigger header) and has buttons (to make it look like an email client) - these were removed because it doesn’t work at 1024x768 (aka the projector resolution). So they back in their graphical beauty.
  • There is a feed button which links to a ATOM feed for the last 10 emails - something I mentioned you could do, well now you can see it.
  • There is a database creation script, but no data. You need to create your own data.
Slide Show

ADO.NET Data Services - Weird Error

Submitted by Robert MacLean on Mon, 03/16/2009 - 08:09

Working on a project which uses ADO.NET data services for a small part of it. I put the ADO.NET Data Services portion together (add the files, put the EF data model together, link it up etc…) on my machine (Windows 2008 Server, SQL 2008, VS 2008) and it all worked great! Checked into source control and then went about doing something else, ignorant of the coming problem.

Then the developer who would actually consume it (we are consuming it in C# using LINQ) came to me saying that nothing worked :( All we get is the generic ADO.NET Data Services error: “An error occurred while processing this request”. We removed the where clauses and everything worked perfectly, which hinted to me the LINQ/EF expression tree was borking out somewhere. So we went to the EF data model, told it to update and it did a change under the surface and everything worked again?!

image

I love TFS because it allowed me to compare my version and the updated one easily and their was one tiny change (literally 1 byte!) in the edmx file (not the .cs backing file, the edmx file itself). Turns out that in dev they are using SQL Server 2005 and I built the model with SQL Server 2008 - now note there is ZERO 2008 special features in use, but it seems that the EF expression tree doesn’t care and try’s to use some SQL 2008 magic cause it thinks it is SQL 2008. This is controlled by the ProviderManifestToken attribute! Changing that to 2005 (which is what the update did automatically for me) solved the problem!

image

SQL BI Boot Camp - Day 2

Submitted by Robert MacLean on Mon, 11/24/2008 - 10:12

Ah, day 2 has arrived so I got up at 6 to get ready and watch the email trickle down (Emerald Casino appears to only have GPRS for Vodacom) which didn’t finish in the end. Unfortunately this post will not be as informative as the previous two, because what I was doing was taking notes on the machine I was working on and about an hour before the end of the day there was a power failure and I was not able to retrieve my notes :( So not only did we leave early, it was meant to be a short day to begin with… nice for a Friday to avoid traffic and get home early.

What stood out for me for that day was working with PerformancePoint which is a business user tool to build dashboards, KPI’s and scorecards. It was impressive at how easy it was to work with it, especially considering that in the background it’s creating cubes, writing MDX etc… This is the way this is meant to be. However it is a BUSINESS USER tool, so much so that Kevin (the presenter from IS Partners) was telling how they hired four CA’s in the last month and trained them to use PerformancePoint rather than training “BI people” to understand business. From speaking to some of the Microsoft people who have used it, the installation of PerformancePoint though is exceptionally tough and I was disappointed that it was not even touched on :(

Anyway without notes it’s hard to write long posts without possibly making huge mistakes, so I will leave it there for what was covered.

As an event it delivered on the initial point of being a level 200 course and helping understand the concepts and helping business users. The problems with labs and manual are understandable (these things happen) but considering you are paying for this and this is a MICROSOFT event it definitely should’ve been more professional than it felt. I would recommend this course to analysts and business users but not to development or serious BI users.

Layout data horizontally in SRS

Submitted by Robert MacLean on Mon, 11/24/2008 - 09:38

I recently had a question about layout in reporting services. Basically if you use a table you get a vertical layout, like this:

First Name

Last Name

Brad

Pitt

Bill

Gates

Linus

Trovalds

 

But what if you want to go horizontally? Sure there could be UI issues because the report will get wider and wider with more data getting put in and it could influence the quality of export or printing but in a special scenario it would be very useful. Unfortunately the tablix control (Table + Matrix + List in 2008. See my post on what’s new for more info) does offer this natively :( But using a trick you can get around it. First you need a tablix in matrix format (i.e. drag a matrix onto the form), and then add a new column header for each column, but rather than putting a header text in there you put the data in there. You ignore the rows and data cells (in fact you can hide them by setting the hidden property to true).

image

And the output is a horizontal layout of data:

image

SQL BI Boot Camp - Day 1: Afternoon Session

Submitted by Robert MacLean on Mon, 11/24/2008 - 09:34

Reporting Services

After lunch we dived into the next topic I know (don’t worry I learn stuff later in the day), namely Reporting Services :) So we started with the PowerPoint again :| I am not repeating everything he said, since a lot is covered in my earlier post on reporting services.

First where are reports used:

  • Internal reporting – operational & management
  • External – to customers
  • Embedded – Portals, Windows and Web

The lifecycle of report development is:

image

Reporting lifecycle – Once again no clever names (I should work on that).

Some notes on RS which I never knew: Data driven subscriptions are an enterprise edition only feature :(  and drill through only enabled for reports rendered into html (wonder why PDF doesn’t support it).

Reporting Services Labs

Right after the slides were put away we dived into the first lab… actually we didn’t because the report builder 2.0 software was NOT installed on the machines :( Skip over lab 1 and onto lab 2! Lab 2 involved creating a simple report with a chart control on it. It enjoyed this lab even though it was fairly simple.

image

The report from exercise 1. Note two charts but it is a single chart control. Thanks Dundas :)

For exercise 2 we created a report with gauges:

image

The report from exercise 2.

The third exercise involved publishing to a RS server from Visual Studio and then exporting to Word.

image

The report about to be exported in exercise 3.

All in all I found the RS labs very enjoyable if not even at the advertised level 200 for the course.

Introduction to UDM

With that out of the way we looked at UDM, which is the SQL 2005/2008 name for a cube! UDM stands for Unified Dimensional Model.

For UDM you use OLAP which stands for OnLine Analytical Processing and is really just retrieving data from 1 or many databases and presenting it in a way which is easy for retrieving/analysis of the data.

Uses of OLAP

  • Sales analysis
  • Forecasting and budgeting
  • Financial reporting
  • Web Statistics
  • Survey Result
  • ETL Process Analysis

SQL  Server Analysis Services (SSAS) is made up of an OLAP component and a data mining component.

Basic data mining is also available for free in Excel (separate download).

Building a cube provides more functionality to users, enables high performance queries, abstracts multiple sources and encapsulates business rules.

Now that the introduction is done it was on to the labs. The first lab was to build a cube! There seem to be WAY too many steps involved in doing this (maybe I am just a lazy developer) but it quickly became an exercise on following a manual and not really any understanding of what and why each step was being done. One of the comments I over heard was that it was a lot of work to build a pivot table, because that’s all we ended up with. I think that actually highlighted the lack of information about why we did things :S

image

Using the data browser in the cube we built.

We then fired up Visual Basic :shock: to build an installer for for the cube so it could be installed on multiple servers easily. This was very interesting, mostly because I spent a lot of time in the past with MSI and never really got to grips with the advanced features (like embedding VB code).

image

My VB code! Actually it’s copy and paste from the snippets file.

 image

Deploying my cube using the installer.

To do that we were given 75min and the manual said it expects us to complete lab 1 in 75 minutes (I did it in 71), but no extra time was allocated for lab 2 which was estimated 45 min, lab 3 or lab 4 which were estimated at 15min each. That meant unless you were super fast with this (i.e. you knew it) you couldn’t get to 75% of the labs :(

Introduction to MDX

After the cubes we moved into MDX which is a language similar to SQL but optimized for handling analysis of data in a cube. It’s specifications are either from OLEDB for OLAP (ODBO) or XML/A.

Since it is a language it is not a end user, or even power user, tool. It is a development tool and the skills needed to help learn it are:

  • Good to have: SQL Skills, Excel Skills (Formulas)
  • Not helpful: C# or ASP.NET

So what does it look like: SELECT <Dicers> FROM <Cube or sub query> WHERE <Slicer>

SQL vs. MDX

  • In SQL the result is defined by columns which populate rows while in MDX it is defined by rows and columns and populating cells.
  • In SQL the row headings are data while in MDX they are a schema.
  • SQL aggregates dynamically into groups while in MDX data is referenced by cell sets.

What is a MDX Set: Grouping of items together. Uses standard set notation: { … }

You can add formatting (like currency information or thousand separators) to the result in the cells using the CELL FORMATING statement.

Tuple – A list of members from different hierarchies separated by commas. Pronounced: Too-ple like quadruple. These are what goes in the <Dicers> and <Slicers> section of the example above.

Imagine the following 3 tables

Towns:

  • CT
  • JHB

Values:

  • SALES
  • COS

Sales People:

  • JIM
  • JACK

You could built the following tuples:

  • All sales for CT: (ct,sales) or (sales,ct) <- Order does not matter.
  • All sales for JHB and CT: (ct,sales)+(jhb,sales)
  • All JIM’s sales in CT and all JACK’s sales in JHB: (ct,sales,jim)+(jhb,sales,jack)
  • Note that would exclude all JIM’s sales in JHB and all JACK’s sales in CT.

Types of tuples:

  • Complete tuple: references every dimension
  • Partial tuple: omits some dimension. Will always use current member if dimension is omitted.

In MDX tuples have functions like: Percentage of parent, prevmember, parallelperiod and children.

Now that we have looked at MDX we hit the lab which was very refreshing as they provided about 10 broken MDX queries and you had to apply your knowledge (and skills with the help file) to fix each one. This was a really good lab since it actually enabled you to learn and understand what you were doing. It was a little frustrating at first because it’s different but more of the labs should’ve been this way. In the end I got some right and some I just got stuck on and had to peek at the answers to see how to fix them.

clip_image002

One of the MDX queries I fixed.

By now it was after 6pm and time to check-in and head for dinner and drinks.

SQL BI Boot Camp - Day 1: Morning Session

Submitted by Robert MacLean on Mon, 11/24/2008 - 09:27

Introduction

So after the drive to Emerald Casino (which took half the time I expected) and registration we started. Frikkie Bosch, marketing manager for app products (BizTalk, SQL, VS), from Microsoft is acting as facilitator for the boot camp! The actual presenter for the days is Kevin Coestzee from IS Partners. The room (pictured below) is big with space for approx. 80 people and Frikkie introduced all the companies attending at the start (lots of big names, including Microsoft staff!) and then Kevin took over and stated that the aim of the course and a brief overview of the SQL Server 2008 BI stack. It would be a level 200 course and some sections catering for business users (about 40% of the attendees) and some sections for technical (about 60% of the attendees).

Photo042 

The training room. This would fill up when everyone eventually arrived.

The PowerPoint section

Now that the stage is set lets look at what was covered in the mornings PowerPoint session:

First up what is BI: Taking knowledge, aggregating data and providing it to business.

BI’s use can be to understand the health of the organisation, collaborate the shared view of business drivers and reduce decision time and these are facilitated through four scenarios:

  • Operational reporting
  • Activity management
  • Data Mart – analytics & decision making
  • Data warehouse

What is the BI evolution:

2

BI Evolution – I couldn’t come up with names for each stage.

What is a star schema design:

  • It is a fact table with a number of linked dimensional tables. Fact tables do not link to other fact tables

3

A star schema.

  • A dimensional table is list of unique dimensions (people, time periods, stores) with a primary key and meta data on the items. They link and support numeric data.
  • A fact table combines all dimensional table primary keys with the collection of measurements associated with a business final result of process. For instance the final amounts of a sale, or salary payments. There are two ways to design it:

    1) Columns layout – An individual column for each type (salary, sale etc…). Very wide layout with many columns which could be empty.

    2) Single value column with a foreign key to a dimension table to designate type (salary, sale, etc…). Very narrow layout but very long and no empties.

    The fact table will be the biggest table in the DB.

  • This design is good for BI because
    • It allows for high performance queries.
    • It is designed for models.
    • It is low maintenance

To me it seems similar to normalized database model.

What is a snowflake schema design:

It is a normalised version of the star schema where the dimensional table may be related to one or more other dimensional tables. Useful for hierarchy designs, like company organisation charts.

Date Dimensional Table

  • Most common dimensional table created as it is used in almost every system.
  • Date information is consistent across fact tables.
  • Useful common attributes for the dimensional table include: year, quarter, month, day

Parent Child Dimensional Tables

There are tables which are self referring. So one column contains the primary key, while another a foreign key to the first column. Good for organizational charts, where all employees in a single table and the foreign key points to the employees manager.

Slowly Changing Dimensions

The version control system of BI ;) So they support data warehouses by providing information on the past accurately. It is a concept introduced by Ralph Kimball and is used to track changes over changes time (employee sales when they move from store 1 to store 2 etc…).

There are three types:

  • Type 1: No tracking.
  • Type 2: Full versioning
  • Type 3: Partial versioning (current + 1 or 2 back) – Exceptionally seldom used

Type 2 is done by having a start and end date columns for the record which define when it was created and when it was superseded by a new version. If the end date is null then it is current. When a new row is inserted the old data is copied across from the old row, the changes applied and the end date set on the old row.

Type 3 is done by adding a column for the fields to be tracking to the same row and coping the old data to that column before updating the column.

What is Data Integration?

  • Transform corporate data into information.
  • Enterprises spend 60-80% of their BI resources in the data integration stage.

The Labs

And that ended the marathon PowerPoint slide deck! :) Now on to the first two labs focusing on SSIS, which I already know well, but I am here so I dived into them because you never know what you will find. Each lab is broken into a number of exercises and the first lab (Introduction to ETL techniques using SSIS) is made up of four exercises. Exercises 1 through 3 focused on creating a SSIS package which retrieves csv files from an FTP server and stores them locally. Those are filtered by date using a variable and then unpivot the CSV data and lookup data from dimensions and load all of it into a fact table. I really enjoyed this part, even though it is the “simple” lab is is very cool, especially since you get to read from FTP and do some parsing. The section on unpivots and derived columns (including the horrid expression that is needed) could’ve had more information and the lab should have had a way to see the data before and after.

image

The control flow view of the first lab.

Exercise 4 was about investigating the slowly changing dimension transform while the entire lab 2 was to manage changing data using change data capture or the MERGE statement. Unfortunately both exercise 4  and lab 2 were to me useless. The involved opening a prebuilt solution and going through each of the parts, opening the components and reviewing the settings. Normally no information on why each setting was set the way it was. There were sporadic helpings in the manual but they seemed to wane the closer to the end of the lab you got. This was especially bad for MERGE (which is new to SQL 2008, and I only know that because I read it this week in prep for a session on what’s new in SQL 2008 I am giving) which really needs more information on the use because it is so powerful. These parts really were the “nice to feature x is there” but they were more of a demo without a presenter than a lab. 

image

One of the data flow views from the second lab.

SQL Server BI Boot Camp

Submitted by Robert MacLean on Mon, 11/17/2008 - 09:55

This week Thursday and Friday you will be able to find me at Emerald Casino! No, I haven’t given up my cycling addiction for a gambling one (although in yesterdays 94.7 race at the 70km mark I did think about it). Reality is more boring than that I am attending a two day boot camp on SQL BI! The camp covers areas I know well like Integration and Reporting services and areas I don’t know well like analysis services and performance point. I am looking forward to it and presenting a condensed, focused version on my return to the rest of BB&D (and possible S.A. Architect).

If you are attending also let me know via the comments so we can hook up for drinks on the Thursday night! Otherwise keep an eye on the blog for regular postings during the event!

DTS to SSIS

Submitted by Robert MacLean on Fri, 11/07/2008 - 13:03

I presented a high level session on SQL Server Integration Services (SSIS) this week with the focus on what’s new in 2008, and when and where to use SSIS. Less focus was given to the how because how is easy with SSIS but I did do 30min+ of live demo’s so that showed the high level how as well. SSIS is in it’s second version (2005 being v1) because before that in SQL 2000 and before DTS existed and I was asked during the session on how to migrate from DTS to SSIS. I stupidly mentioned that there was a third party tool but no Microsoft one, because in 2005 there wasn’t and the what’s new in 2008 doesn’t mention anything (1 + 1 = assumption… and you know where that gets you).

Skip forward two days and guess what Zayd the “Server Dude” finds by accident… a TechNet article on how to migrate. Not only showing the third part option but the NEW built in option in SQL Server 2008 which I said didn’t exist! Anyway if you are trying to move from DTS to SSIS then you have to check it out at http://technet.microsoft.com/en-us/library/cc768544.aspx

SQL 2008 - Saving Changes Is Not Permitted

Submitted by Robert MacLean on Fri, 11/07/2008 - 08:22

I was working on a database for some sample data a few weeks back using SQL Server 2008 for the first time. I initially configured the table in one way and decided to change it later, because I wanted to do something else with the data. However when I tried to save it I got the following message:

a1

As there was no data in yet, I wasn’t worried about tables being dropped and re-created so I needed to enable the option mentioned in the message. The option can be found in the options menu item (duh!) under Designers –> Table and Database Designers. You need to uncheck the option Prevent saving changes that require table re-creation and click OK!

a2

SQL Reporting Services TR - Think you've seen it?

Submitted by Robert MacLean on Mon, 10/27/2008 - 15:55

I did a 20min whirl wind tour of SRS 2008 at Dev4Devs (see here for the event info and here for the content), and I mentioned I will be running this again in November (see here). For those who did attend my session at Dev4Devs you may want to come and see this one, because I have 60min to fill. Even if I spoke like Kirk I couldn’t stretch 23 min (which is what I used) to 60 so I’ve added some really cool new content. The new content covers the designer and the WinForm and ASP.NET chart controls (yeah I know it’s not really SRS but it’s borderline and seriously how cool are they?!) and also actually get into where that web server went!

Here is the new introduction slide…

Clipboard01

Hope to see you there!