A quick ETL PowerShell script to copy data between databases

I’ll start this post by announce that we Just moved our SQLHelper powershell module from CodePlex into GitHub, you can grab it here:


We also made a new sample to demonstrate the simplicity of use of this module, the objective is to build a very simple ETL script that make these steps:

  1. Query a collection of tables from a source database (can be any type: SQL/Oracle/…)
  2. Insert into a SQLServer database using bulk loading
    • If the table does not exist  in the destination, it must be created automatically

The powershell code:

Import-Module ".\SQLHelper.psm1" -Force

$sourceConnStr = "sourceconnstr"

$destinationConnStr = "destinationconnstr"

$tables = @("[dbo].[DimProduct]", "[dbo].[FactInternetSales]")

$steps = $tables.Count
$i = 1;

$tables |% {

    $sourceTableName = $_
    $destinationTableName = $sourceTableName

    Write-Progress -activity "Tables Copy" -CurrentOperation "Executing source query over '$sourceTableName'" -PercentComplete (($i / $steps)  * 100) -Verbose

    $sourceTable = (Invoke-DBCommand -connectionString $sourceConnStr -commandText "select * from $sourceTableName").Tables[0]

    Write-Progress -activity "Tables Copy" -CurrentOperation "Creating destination table '$destinationTableName'" -PercentComplete (($i / $steps)  * 100) -Verbose

    Invoke-SQLCreateTable -connectionString $destinationConnStr -table $sourceTable -tableName $destinationTableName -force

    Write-Progress -activity "Tables Copy" -CurrentOperation "Loading destination table '$destinationTableName'" -PercentComplete (($i / $steps)  * 100) -Verbose

    Invoke-SQLBulkCopy -connectionString $destinationConnStr -data $sourceTable -tableName $destinationTableName                


Write-Progress -activity "Tables Copy" -Completed


Concatenate a row set into a sql column

Other day I made a post on how to concatenate a rowset into a TSQL variable: here

In this post I’ll show how to make a concat into a column without using any variable:

select (

select Col + ', '

from ( select 'Value 1' Col union select 'Value 2' union select 'Value 3' ) aux FOR XML PATH('')

) ConcatColumn

The trick is on "FOR XML PATH("")" that transform the rows into a xml string where the node name is passed by parameter, in this case I pass an empty string, so the result is:


Drop a SQL Server Schema and all related Objects

I love to organize my SQL Server database objects using schemas and in my current project I need to create a lot of them dynamically so I also need to manage a way to delete them :).

So I implemented a stored procedure that allows me to delete the schema and all it related objects in a database (also specified by parameter).


exec dbo.DropSchema @schemaName = ‘schema’, @database = ‘[Database]’, @whatif = 1

You can download the procedure script here.

If you find any bugs or have any sugestion please fell free to comment.


Slowly Changing Dimensions (SCD) with MERGE Statement

I was reading the “The Kimball Group Reader Relentlessly Practical Tools for Data Warehousing and Business Intelligence” book, witch I highly recommend, and found a very interesting technique for dealing with Slowly Changing Dimensions of type 2 using the Sql Server 2008 Merge statement:

INSERT INTO Customer_Master
SELECT Source_Cust_ID, First_Name, Last_Name, Eff_Date, End_Date, Current_Flag
MERGE Customer_Master CM
USING Customer_Source CS
ON (CM.Source_Cust_ID = CS.Source_Cust_ID)
INSERT VALUES (CS.Source_Cust_ID, CS.First_Name, CS.Last_Name, convert(char(10), getdate()-1, 101), ‘12/31/2199’, ‘y’)
WHEN MATCHED AND CM.Current_Flag = ‘y’
AND (CM.Last_Name <> CS.Last_Name ) THEN
UPDATE SET CM.Current_Flag = ‘n’, CM.End_date = convert(char(10), getdate()-2, 101)
OUTPUT $Action Action_Out, CS.Source_Cust_ID,
CS.First_Name, CS.Last_Name,convert(char(10),getdate()-1, 101) Eff_Date,‘12/31/2199’ End_Date, ‘y’Current_Flag) AS MERGE_OUT


Lines 1–3 set up a typical  INSERT statement. What we will end up inserting are the new values of the type 2 rows that have changed.

Line 4 is the beginning of the MERGE statement, which ends at line 13. The MERGE statement has an OUTPUT clause that will stream the results of the MERGE out to the calling function. This syntax defines a common table expression, essentially a temporary table in the FROM clause, called MERGE_OUT.

Lines 4–6 instruct the MERGE to load customer source data into the customer master dimension table.

Line 7 says when there is no match on the business key, we must have a new customer, so line 8 does the INSERT. You could parameterize the effective date instead of assuming yesterday’s date.

Line 10 identify a subset of the rows with matching business keys, specifically, where it’s the current row in the customer master and any one of the type 2 columns is different.

Line 12 expires the old current row in the customer master by setting the end date and current row flag to ‘n’.

Line 14 is the OUTPUT clause, which identifies what attributes will be output from the MERGE, if any. This is what will feed into the outer INSERT statement. The $Action is a MERGE function that tells us what part of the merge each row came from. Note that the output can draw from both the source and the master. In this case, we are outputting source attributes because they contain the new type 2 values.

Line 17 limits the output row set to only the rows that were updated in customer master. These correspond to the expired rows in Line 11, but we output the current values from customer source in Line 12.

Concatenate a row set into a string with one statement in TSQL

Imagine a SQL statement like this:

select 'a' Col


select 'b'


select 'c'


select 'd'

If you want to concatenate all the rows into a string one possible solution is to declare a cursor and make the concatenation.

But you can also do it like this:

declare @concatStr nvarchar(max)

SELECT @concatStr = isnull(@concatStr + ', ','') + aux.Col

select 'a' Col


select 'b'


select 'c'


select 'd'

) aux

print @concatStr


“a, b, c, d”

Delete older repeating lines using ROW_NUMBER() + Partition By + CTE’s

The Row_Number() function with Partition By and CTE’s it’s a very powerful feature avaliable to Sql Server developers, I’ll make a demonstration of a delete operation that I needed to apply in a recent project that I’am involved.

Imagine that you have an Products table like:


As you can see there’s two repeating products: “Bike” and “Pen”.

Now if you want to delete the repeating lines of the table above but only the older ones as far as I can tell it’s not possible to do with only one TSQL statement (for example you need a cursor), but if you use Row_Number() plus Common Table Expressions (CTE’s) that is indeed possible.

With Row_Number() we can obtain the number of the row in the result but we can also group by (Partition By) a field in this case “ProductId” and order the groupings by DateModified:

FROM Products p


Now with a CTE we can perform a Delete operation over that result like this:

WITH ProductsOrdered AS
    FROM dbo.Products p
DELETE FROM ProductsOrdered
WHERE Row > 1

And that’s it, with only this TSQL statement tou can delete the repeating lines of Products table but only the older ones.

For more information check this excelent article: http://www.sqlmag.com/Article/ArticleID/49240/sql_server_49240.html