Calendar
<<  March 2010  >>
SuMoTuWeThFrSa
28123456
78910111213
14151617181920
21222324252627
28293031123
45678910
Blogroll
    Posted by Tim Mitchell on 30 September 2009, 08:28

    So you’ve got some packages that regularly extract data to one or more text files, but you know that from time to time some of the queries will not return any data.  However, you find in SSIS that, in a flat file export package, the output file is created regardless of whether any rows are written to the file, and in cases where there are no rows returned from the source, you’ll end up with empty data files in your output directory.  Although not a critical failure, having empty output files can be a nuisance, and unfortunately, there aren’t any native settings in the Flat File Destination or the Flat File Connection Manager objects that will clean up empty files.  Fortunately, SSIS does provide a combination of other tools that you can use to emulate this behavior.

    In our example, I’m going to create a package to extract data from a table using a query for which I know that no rows would be returned.  When I connect this data source to a flat file destination and execute the package, I’ll see an empty data file in my export directory.  Next, to demonstrate the intended behavior, I’ll add a Row Count transformation to store the number of affected rows in a variable, and create a File System Task object to delete the output file.  Finally, I’ll use an expression constraint to only delete the file if the variable attached to the row count is zero.

    First, let’s create a test table to query:

    USE testdb
    GO

    CREATE TABLE invoices (
      invoiceid     INT    IDENTITY ( 1 , 1 ),
      vendorid      INT,
      invoicedate   DATETIME,
      invoiceamount DECIMAL(10,2))

    INSERT invoices
    VALUES(12, '8/3/2009', 4125.50),
          (53, '8/13/2009', 1095.25),
          (46, '8/15/2009', 729.50),
          (33, '8/23/2009', 3421.50)

    Now, I’ll create a basic package that will export to text the invoices for the past 30 days, a reasonable business requirement.  Since we don’t currently have any invoices matching that criteria, we’ll end up with an empty output file. The original package is shown below:

    df_f1

     

    Now for the new-and-improved version, I’m going to drop a Row Count transformation into the data flow to save the number of affected rows to a variable:

     

    df_f2

     

    Finally, I’ll create a File System Task and configure it to delete the output file.  To insure that a valid data file is not deleted, I’ll create a precedence constraint using an expression to only execute the delete if the row count variable is equal to 0:

     

    df_f3


    When you execute this package, you’ll see that the File System Task object is executed because there are no rows matching our query.  You can test the package by inserting another row into the database that will be returned by the query, and you’ll see that the data file is exported but not deleted.

    I’ve attached the before and after SSIS packages if you’d like to take it for a test drive.  Enjoy!

    Posted by Tim Mitchell on 29 September 2009, 07:15

    It's been well-documented through myriad blogs and forum posts about the case sensitivity of the comparisons in the SSIS lookup transformation (a good review can be found here). In a nutshell, a comparison using the lookup transformation is case sensitive when using the default setting of Full Cache, even if the values in the database are stored in a case insensitive collation. This happens because, in Full Cache mode, the comparison is done by the SSIS engine rather than the database engine, the former of which differentiates between "Value", "VALUE", and "VaLuE" regardless of the collation setting.

    But did you know that this transformation is space sensitive as well? Consider the following T-SQL code:

    SELECT Cast('char value' AS CHAR(20)) [val1]
    INTO   test1

    SELECT Cast('char value' AS CHAR(40)) [val2]
    INTO   test2

    SELECT t1.val1,
           t2.val2
    FROM   test1 t1
           INNER JOIN test2 t2
             ON t1.val1 = t2.val2


    As you would expect, executing this code results in a successful match (INNER JOIN), even though we're comparing CHAR values of differing lengths (for more information, see this article for more information about spaces and padding in SQL Server).

    However, when the same comparison is run through an SSIS lookup transformation in Full Cache mode, the lookup on our sample data will fail. Similar to the case sensitive lookup, you'll find that the SSIS engine would treat the strings 'Hello World' and 'Hello World  ' (note the trailing spaces) as dissimilar values.  Unlike in SQL Server, trailing whitespace is significant in SSIS value comparisons.

    As a workaround, you can use the TRIM() function in SSIS and the RTRIM() T-SQL function to insure that your comparisons are ignorant of trailing whitespace.  Alternatively, you could use a cache mode other than Full Cache, but you should be aware of the other implications before making such a change. 

    Note that this behavior is limited to fixed-length character fields, but could lead to some unexpected and hard-to-detect problems if you aren’t aware of the behavior.

    Posted by Tim Mitchell on 19 September 2009, 17:25

    I’m more than 2 months into the netbook experience, and I’m happy to report that it’s still a good investment.  Last weekend, I made the leap to Windows 7 RTM.  So far so good!

    I was more than a little surprised to find out that there was no upgrade path from Windows XP to Windows 7.  Although there are a few free utilities that can help ease the pain by migrating user settings and documents, a move from XP to 7 will involve a completely new install.  That said, I’m usually not a fan of an OS upgrade anyway, since these tend to be less stable than a clean install in the long run.  Fortunately, I’ve not had this machine long, so it didn’t take long to back up all of my data for the reinstall.

    The installation of the OS and reinstall of my apps took a good part of the afternoon.  Moving to an operating system 8 years newer, I expected a performance drag due to the additional overhead, but was pleasantly surprised at the snappiness of Windows 7 on the netbook.  I’ve noticed that I’m under a bit more memory pressure than I was with XP, but I’ve still got the standard 1gb of RAM installed (I can go up to 2gb on this model).  The battery life seemed to have diminished a bit, but after reinstalling the Toshiba power management software, it’s still in the neighborhood of 7 to 8 hours.

    I’ve noticed that if I have SQL Server, SQL Server Management Studio and Visual Studio running at the same time, the CPU tends to run higher than normal.  I have a little more disk paging than before, most likely due to the memory pressure.  Overall, though, the jump to Windows 7 has been a positive move for me.

    Posted by Tim Mitchell on 19 September 2009, 16:41

    A recent blog by Jack Corbett (No Training Budget Still No Excuse, followed up by SSC’s Steve Jones here) hit the nail on the head with respect to taking charge of one’s own career.  I won’t belabor the point by repeating their words here, but the message is that everyone has access to quality SQL Server training even if your training budget is a goose egg.

    Jack’s article pointed out a number of organizations and websites that offer free SQL Server educational resources, and I was pleased to find one asset on his list that I had never used.  I’ll add on a few Microsoft BI resources that I’ve discovered in my travels:

    http://www.learnmicrosoftbi.com – A website with dozens of videos on Microsoft BI, mostly on SQL Server Analysis Services

    http://www.sqldts.com/ and http://www.sqlis.com/ – Sister websites with resources for SQL Server DTS and SSIS, respectively

    Pragmatic Works webinars – Brian Knight and company have published several of on-demand webinars for business intelligence professionals

    SQL Lunch – This event is just emerging, but will start in September and offer semi-weekly lunchtime webcasts on a variety of topics (follow Patrick LeBlanc’s blog for more details on this one).

    MSSQLTips.com – A SQL Server resource site, which, among other offerings, sends out a SQL Server Tip Of The Day each weekday via e-mail.

    Microsoft Podcasts – Not just for SQL Server, there are Microsoft podcasts and other resources for almost every Microsoft technology.  If you spend any time commuting, burn some of these to CD and learn on the road!

    The Midnight DBA – According to their own description, husband and wife DBA team Sean and Jen McCown put the kids to bed and stay up late making videos.  [OK, they’re SQL Server videos – stop snickering.]  There’s a wide variety of content here, and is well worth a visit.

    Twitter – OK, this one is reaching just a bit.  I add this because there is a fair number of SQL Server pros on Twitter, many of whom are more than willing to offer advice, share upcoming events (webcasts and such) and “tweet” interesting information.  SQLServerPedia has a good index page listing many of the SQL folks who use Twitter.

    Posted by Tim Mitchell on 18 September 2009, 13:59

    In the next few months, I’ll be giving a couple of talks on SQL Server business intelligence.  For October, I have the pleasure of presenting for SQL Lunch, a new online learning series pioneered by Patrick LeBlanc.  On October 12th, I’ll be discussing ways to leverage SQL Server Report Builder 2.0 against your existing SSRS infrastructure to allow users to create their own ad-hoc and published reports.

    Also, I’ll be visiting the Ft. Worth SQL Server User Group on November 18th.  I’ll be discussing Intermediate SSIS, demonstrating some of the more complex tasks and transformations and demonstrating how to leverage these tools to assist with challenging ETL scenarios.  Hope to see you there!