Using #PowerBIPS to Upload CSV Files to PowerBI

In June I made a guest post on the PowerBI Developer Blog about #PowerBIPS that showed how to upload a collection of local csv file into PowerBI, check it out here:

http://blogs.msdn.com/b/powerbidev/archive/2015/06/08/using-a-power-bi-app-to-upload-csv-files-to-a-dataset.aspx

PS – Thanks Josh 😉

1447.dashboard before4682.explore.png-550x0


cls

$ErrorActionPreference = "Stop"

$currentPath = (Split-Path $MyInvocation.MyCommand.Definition -Parent)

 
#Create Archive Folder 

new-item -Name "Archive" -Force -ItemType directory -Path "$currentPath\CSVData"  | Out-Null

Import-Module "$currentPath\Modules\PowerBIPS" -Force

while($true)

{

	# Iterate each CSV file and send to PowerBI

	Get-ChildItem "$currentPath\CSVData" -Filter "*.csv" |% { 

		$file=$_               

		#Import csv and add column with filename

		$data = Import-Csv $file.FullName | select @{Label="File";Expression={$file.Name}}, *

		# Send data to PowerBI

		$data |  Out-PowerBI -dataSetName "CSVSales" -tableName "Sales" -types @{"Sales.OrderDate"="datetime"; "Sales.SalesAmount"="double"; "Sales.Freight"="double"} -batchSize 300 -verbose

		# Archive the file

		Move-Item $file.FullName "$currentPath\CSVData\Archive\" -Force

	}   

	Write-Output "Sleeping..."

	Sleep -Seconds 5

}

Create a Real-Time IT Dashboard with PowerBIPS

Last week we published on GitHub a powershell module for the new PowerBI developer REST API’s named “PowerBIPS”, download it here:

https://github.com/DevScope/powerbi-powershell-modules/

In this post I will demonstrate the ease of use of this module and show you step-by-step how to build a powershell script that upload server stats to PowerBI allowing the users to build a real-time IT dashboard like this:

image

The script is divided in the following steps (full script):

  1. Import the PowerBIPS module
  2. Get the Authentication Token
  3. Create a PowerBI DataSet
  4. Upload data to PowerBI

Import the PowerBIPS module

The script starts by importing the powershell module:

# Module need to be installed on %USERPROFILE\WindowsPowershell\Modules\
Import-Module PowerBIPS -Force

Or by pointing directly to the .psm1 file:

Import-Module <path to module>\PowerBIPS.psm1 –Force

After this all the PowerBIPS cmdlets are available for use in the PowerShell console.

Get the Authentication Token

Next we need to get the authentication token that is needed to communicate with the API, this token identifies your PowerBI tenant and grants access to it:

$authToken = Get-PBIAuthToken -clientId "<your client id>"

This cmdlet require you to provide the Client Id of the Native Application that you need to create in the windows azure active directory of yout tenant:

PS -Follow this guide to create an Azure Client App (thanks to mattmcnabb)

image

When you execute this cmdlet the following popup is shown asking for the credentials to connect to the PowerBI account:

image

It’s also possible to set in the cmdlet the username and password which is the ideal mode for automation jobs.

Create a PowerBI DataSet

Before creating the DataSet we will test if a DataSet with name “IT Server Monitor” exists in PowerBI, if not create a new one:


$dataSetMetadata = Get-PBIDataSet -authToken $authToken -dataSetName "IT Server Monitor"

if (-not $dataSetMetadata)
{
  # If cannot find the DataSet create a new one with this schema	
	
  $dataSetSchema = @{
	name = "IT Server Monitor"
	; tables = @(
		@{name = "Processes"
		; columns = @( 
			@{ name = "ComputerName"; dataType = "String"  }					
			, @{ name = "Date"; dataType = "DateTime"  }
			, @{ name = "Hour"; dataType = "String"  }
			, @{ name = "Id"; dataType = "String"  }
			, @{ name = "ProcessName"; dataType = "String"  }
			, @{ name = "CPU"; dataType = "Double"  }
			, @{ name = "Memory"; dataType = "Double"  }
			, @{ name = "Threads"; dataType = "Int64"  }					
			) 
		})
  }	

  $dataSetMetadata = New-PBIDataSet -authToken $authToken -dataSet $dataSetSchema -Verbose
}

The $dataSetSchema variable is a hierarchical hashtable that define the schema of the dataset (a System.Data.DataSet is also supported)

The “New-PBIDataSet” cmdlet in case of success returns the PowerBI dataset metadata with the internal id of the dataset.

Upload data to PowerBI

In this script we will start by deleting all the data in the “Processes” table (optional)

Clear-PBITableRows -authToken $authToken -dataSetId $dataSetMetadata.Id -tableName "Processes"

And each 10 seconds cycle the servers and for each one:

  1. Get all the running processes and related stats: CPU, Memory,…
  2. Upload the collected data to PowerBI
$computers = @("DSRomano")

while($true)
{	
    $computers |% {
		
	$computerName = $_
		
	$timeStamp = [datetime]::Now
		
	# Dates sent as string because of an issue with ConvertTo-Json (alternative is to convert each object to a hashtable)
		
	Get-Process -ComputerName $computerName | Select  @{Name = "ComputerName"; Expression = {$computerName}} `
		, @{Name="Date"; Expression = {$timeStamp.Date.ToString("yyyy-MM-dd")}} `
		, @{Name="Hour"; Expression = {$timeStamp.ToString("HH:mm:ss")}} `
		, Id, ProcessName, CPU, @{Name='Memory';Expression={($_.WorkingSet/1MB)}}, @{Name='Threads';Expression={($_.Threads.Count)}} `
	| Add-PBITableRows -authToken $authToken -dataSetId $dataSetMetadata.Id -tableName "Processes" -batchSize -1 -Verbose
		
     }
	
     sleep -Seconds 10
}

After you run the script the following DataSet should appear on your PowerBI account:

image  image

And you can start searching, pining, slicing & dicing with this real time data:

image

image

image

The full PowerShell script can be downloaded here.

A quick ETL PowerShell script to copy data between databases

I’ll start this post by announce that we Just moved our SQLHelper powershell module from CodePlex into GitHub, you can grab it here:

https://github.com/DevScope/sql-powershell-modules

We also made a new sample to demonstrate the simplicity of use of this module, the objective is to build a very simple ETL script that make these steps:

  1. Query a collection of tables from a source database (can be any type: SQL/Oracle/…)
  2. Insert into a SQLServer database using bulk loading
    • If the table does not exist  in the destination, it must be created automatically

The powershell code:

Import-Module ".\SQLHelper.psm1" -Force

$sourceConnStr = "sourceconnstr"

$destinationConnStr = "destinationconnstr"

$tables = @("[dbo].[DimProduct]", "[dbo].[FactInternetSales]")

$steps = $tables.Count
$i = 1;

$tables |% {

    $sourceTableName = $_
    $destinationTableName = $sourceTableName

    Write-Progress -activity "Tables Copy" -CurrentOperation "Executing source query over '$sourceTableName'" -PercentComplete (($i / $steps)  * 100) -Verbose

    $sourceTable = (Invoke-DBCommand -connectionString $sourceConnStr -commandText "select * from $sourceTableName").Tables[0]

    Write-Progress -activity "Tables Copy" -CurrentOperation "Creating destination table '$destinationTableName'" -PercentComplete (($i / $steps)  * 100) -Verbose

    Invoke-SQLCreateTable -connectionString $destinationConnStr -table $sourceTable -tableName $destinationTableName -force

    Write-Progress -activity "Tables Copy" -CurrentOperation "Loading destination table '$destinationTableName'" -PercentComplete (($i / $steps)  * 100) -Verbose

    Invoke-SQLBulkCopy -connectionString $destinationConnStr -data $sourceTable -tableName $destinationTableName                

    $i++;
}

Write-Progress -activity "Tables Copy" -Completed

ETLPowerShellScript

Converting a PowerPivot workbook w/ PowerQuery connections into SSAS Tabular

No, SSAS Tabular mode don’t support PowerQuery connections… (yet… But I’am confident that will be in the future)

But I was happy to verify that it’s very easy to change the connection and point into another table in a supported provider (ex: a SQL Server DW).

It’s true that you will loose all the transformations made in powerquery, but in the scenario that we are converting a self-service BI model made in Excel into a more enterprise level BI application where the transformations need to be made by a more “professional” ETL tool like SSIS tool, it’s good to know that it’s possible to convert the connection very easily.

PS – This is not the same history with Linked Tables in PowerPivot, when converting into a tabular database those tables will end up as Pasted Tables which are very hard to update, so stay away from Linked Tables… As an alternative I find preferable to create a powerquery into the Excel table and then load into the model.

Start converting the powerpivot workbook into a SSAS Tabular project:

image

Then go to Model –> Existing Connections, and you’ll find all your PowerQuery connections:

image

Try to process one of them, and you’ll get this error:

image

Now you need for each PowerQuery connection to change the connection provider into a supported one and point into a object that has the same name and schema (if it doesn’t have the same schema/name you will have to change the table partition query).

In this case I have a table named “Employee” with these columns:

image

Create a table in your datawarehouse with the same name and schema:

image

Edit the “PowerQuery – Employees” connection and change the provider:

image

Connect into my SQLServer DW:

image

image

The simply process the table and it will process sucessfuly, and the new data will be loaded:

image

image

My favorite default Query Load option in PowerQuery

By default PowerQuery loads the built/search query in a blank worksheet, but this can be a little bit time consuming when you have many queries or the query has a lot of rows.

So my recommendation is to by default disable the load of the query in the Options:

image

This way when you create/reference a query in PowerQuery it’s saved in the workbook but it’s not loaded into the worksheet nor the data model:

image

When you are finished with the queries you can choose where to load them, right-clicking the query and choosing the  “Load To…” option:

image

Another PowerQuery DateDimension query

This is nothing new and as already been blogued by Chris Webb and Matt Masson:

https://cwebbbi.wordpress.com/2013/11/19/generating-a-date-dimension-table-in-power-query/

http://www.mattmasson.com/2014/02/creating-a-date-dimension-with-a-power-query-script/

But based on their work a made a new one with some new quickwins:

  • Suport for the day of the week start (in Portugal tipicaly starts on monday not sunday)
  • Only StartDate is mandatory, in this case a Date Dimension is generated til the end of the year
  • Some changes in the columns values

image

image


let CreateDateTable = (StartDate as date, optional EndDate as nullable date, optional Culture as nullable text, optional FirstDayOfWeek as nullable number) as table =>  
 let    
    Culture = if (Culture = null) then "pt-PT" else Culture,
    FirstDayOfWeek = if (FirstDayOfWeek = null) then 1 else FirstDayOfWeek, 
    EndDate  = if (EndDate = null) then Date.EndOfYear(StartDate) else EndDate,
    DayCount = Duration.Days(Duration.From(EndDate - StartDate)) + 1,
    Source = List.Dates(StartDate,DayCount,#duration(1,0,0,0)),
    TableFromList = Table.FromList(Source, Splitter.SplitByNothing()),    
    ChangedType = Table.TransformColumnTypes(TableFromList,{{"Column1", type date}}),
    RenamedColumns = Table.RenameColumns(ChangedType,{{"Column1", "Date"}}),
    InsertYear = Table.AddColumn(RenamedColumns, "Year", each Date.Year([Date])),
    InsertQuarter = Table.AddColumn(InsertYear, "Quarter", each Date.QuarterOfYear([Date])),
    InsertMonth = Table.AddColumn(InsertQuarter, "Month", each Date.Month([Date])),
    InsertDay = Table.AddColumn(InsertMonth, "Day", each Date.Day([Date])),    
    InsertMonthName = Table.AddColumn(InsertDay, "MonthName", each Date.ToText([Date], "MMMM", Culture), type text),
    InsertCalendarMonth = Table.AddColumn(InsertMonthName, "MonthOfYear", each (try(Text.Range([MonthName],0,3)) otherwise [MonthName]) & " " & Number.ToText([Year])),
    InsertCalendarQtr = Table.AddColumn(InsertCalendarMonth, "QuarterOfYear", each "Q" & Number.ToText([Quarter]) & " " & Number.ToText([Year])),    
    InsertWeek = Table.AddColumn(InsertCalendarQtr, "Week", each Date.WeekOfYear([Date])),
    InsertCalendarWeek = Table.AddColumn(InsertWeek, "WeekOfYear", each "W" & Number.ToText([Week]) & " " & Number.ToText([Year])),
    InsertDayWeek = Table.AddColumn(InsertCalendarWeek , "WeekDay", each Date.DayOfWeek([Date], FirstDayOfWeek ) + 1),
    InsertDayName = Table.AddColumn(InsertDayWeek, "WeekDayName", each Date.ToText([Date], "dddd", Culture), type text),
    InsertDayInt = Table.AddColumn(InsertDayName, "DateId", each [Year] * 10000 + [Month] * 100 + [Day]),
    #"Changed Type" = Table.TransformColumnTypes(InsertDayInt ,{{"Year", Int64.Type}, {"Quarter", Int64.Type}, {"Month", Int64.Type}, {"Day", Int64.Type}, {"DateId", Int64.Type}, {"Week", Int64.Type}, {"WeekDay", Int64.Type}}),
    #"Capitalized Each Word" = Table.TransformColumns(#"Changed Type",{{"MonthName", Text.Proper}, {"MonthOfYear", Text.Proper}, {"WeekDayName", Text.Proper}})

  in
     #"Capitalized Each Word"
in
    CreateDateTable

Session Contents Excel & PowerBI @ IT Pro Airlift (2015-01-22)

Thanks to all who attended, hope you had a good time attending as I had presenting.

As promissed here are the link to download the session contents here: http://1drv.ms/1Cm0BQ9

And see the Demos Video here: http://1drv.ms/18azgoV

image

image

image

image