2017-08-12

Windows 10 on ThinkPad T440s

The other day I wanted to install Windows 10 on a Lenovo ThinkPad T440s.
It seemed like a simple task:
  1. Download Windows 10 image from Visual Studio subscription.
  2. Extract installation files to USB stick.
  3. Boot ThinkPad on USB stick.
  4. Install Windows 10.
Not quite.
The installation set is not especially large and fits easily on a medium (8 GiB) USB stick.
But the file "install.wim" is larger than 4 GiB in the latest Windows 10 image, and since a USB stick is usually formatted with FAT32, it can't hold a file larger than 4 GiB.
My first thought was to format the USB stick with exFAT, but the T440s couldn't boot from an exFAT USB stick at the installed BIOS level.

My solution was:

  1. Format the USB stick with FAT32.
  2. Download the RTM image of Windows 10 where the file "install.wim" is smaller than 4 GiB.
  3. Extract installation files to the USB stick.
  4. Boot the ThinkPad on the USB stick.
  5. Install Windows 10 RTM.
  6. Update Windows 10 to current release - with several reboots.
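
For steps 1 and 3 the formatting and the copy can be done from an elevated PowerShell prompt (Windows 8 or later). A minimal sketch, where the drive letter F: for the USB stick and the folder D:\Win10RTM with the extracted installation files are assumptions - adjust to your own setup:

# Format the USB stick with FAT32 and copy the extracted installation files to it.
Format-Volume -DriveLetter F -FileSystem FAT32 -NewFileSystemLabel 'WIN10' -Force
Copy-Item -Path 'D:\Win10RTM\*' -Destination 'F:\' -Recurse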
How to install Windows 10 later, when the RTM image is no longer available on the Visual Studio subscription, might be a larger challenge in the future.
If I meet that challenge then I will update this post.

2017-08-09

Getting and using PowerShell AzureRM

PowerShell


Check if PowerShell is available by starting
powershell.exe
and check what version is installed:
$Host
or
$Host.Version

If PowerShell is not installed or only an old version is available, then the installation is done by installing the latest version of Windows Management Framework (WMF). Take a look at the PowerShell site (link) on MSDN or use your favorite search engine to find the latest version of WMF.

PowerShellGet

PowerShellGet is a PowerShell module that is used to get modules from a repository like the PowerShell Gallery.

Check if the module PowerShellGet is installed and, if so, in what version:
Get-Module -Name PowerShellGet -ListAvailable
The output is a ModuleInfoGrouping object, where the Version property holds a Version object.

Directory: C:\Program Files\WindowsPowerShell\Modules

ModuleType Version    Name             ExportedCommands
---------- -------    ----             ----------------
Script     1.0.0.1    PowerShellGet    {Install-Module, Find-Module, Save-Module, Update-Module...}


Check what the latest version in the repository is:
Find-Module -Name PowerShellGet -Repository PSGallery

Version Name           Repository Description
------- ----           ---------- -----------
2.0.1   PowerShellGet  PSGallery  PowerShell module with commands for discovering,...


Compare the Version elements in the two outputs.
The output from Find-Module also has a Version property holding a Version object, like the output from the cmdlet Get-Module.
You might have several versions of the same module installed; in that case, compare against the latest installed version.
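
The comparison can also be scripted. This is a minimal sketch, assuming the module name in $ModuleName and that the newest installed version is the interesting one:

# Compare the newest installed version with the newest version in the PowerShell Gallery.
$ModuleName = 'PowerShellGet'
$Installed = Get-Module -Name $ModuleName -ListAvailable |
  Sort-Object -Property Version -Descending |
  Select-Object -First 1
$Published = Find-Module -Name $ModuleName -Repository PSGallery
if ($Installed.Version -lt [System.Version]$Published.Version) {
  "An update is available: $($Installed.Version) -> $($Published.Version)" | Write-Output
}
else {
  "The installed version $($Installed.Version) is current." | Write-Output
}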

Install the latest version of the PowerShellGet module:
Install-Module -Name PowerShellGet -AllowClobber -Force
I have sometimes seen this warning (error?):
WARNING: The version '1.1.4.0' of module 'PackageManagement' is currently in use. Retry the operation after closing the applications.
A PowerShell restart does not help. I have not looked deeper into this issue.

A new compare shows two versions of PowerShellGet installed side by side.

Directory: C:\Program Files\WindowsPowerShell\Modules

ModuleType Version    Name             ExportedCommands
---------- -------    ----             ----------------
Script     2.0.1      PowerShellGet    {Find-Command, Find-DSCResource, Find-Module, Find-RoleCap...
Script     1.0.0.1    PowerShellGet    {Install-Module, Find-Module, Save-Module, Update-Module...}

In general PowerShell will use the highest version number available when called without a specific version number.
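
If you need a specific version in a session - for example to test against the old version - it can be requested explicitly. A small sketch; the version number is just the one from the listing above:

# Load a specific version of the module in the current session.
Import-Module -Name PowerShellGet -RequiredVersion 1.0.0.1
# Confirm which version actually got loaded.
Get-Module -Name PowerShellGet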

Update the PowerShellGet module to the latest version:
Update-Module -Name PowerShellGet
I have seen that version 1.0.0.1 of PowerShellGet can't be updated, but the update doesn't fail.

Uninstall the module in all versions installed:
Uninstall-Module -Name PowerShellGet -AllVersions -Force
Again PowerShellGet version 1.0.0.1 stands out, as it is not uninstalled, but the uninstall doesn't fail.
If I try a version specific uninstall
Uninstall-Module -Name PowerShellGet -RequiredVersion '1.0.0.1'
then there is an error
PackageManagement\Uninstall-Package : No match was found for the specified search criteria and module names 'PowerShellGet'.
At C:\Program Files\WindowsPowerShell\Modules\PowerShellGet\1.1.3.2\PSModule.psm1:2252 char:21
+ ... $null = PackageManagement\Uninstall-Package @PSBoundParameters
+ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+ CategoryInfo : ObjectNotFound: (Microsoft.Power...ninstallPackage:UninstallPackage) [Uninstall-Package], Exception
+ FullyQualifiedErrorId : NoMatchFound,Microsoft.PowerShell.PackageManagement.Cmdlets.UninstallPackage

where it looks like a newer version is installed, but that version does not show through Get-Module -ListAvailable. It might be that the installation on my computer has been broken during several Install-Update-Uninstall sessions. I have not looked deeper into this issue.

Import the PowerShellGet module in the current PowerShell session with this statement:
Import-Module -Name PowerShellGet
This is usually not needed, as the module is loaded automatically when one of its commands is used.

List functions available in module:
Get-Command -Module PowerShellGet

Remove the PowerShellGet module from the current PowerShell session with
Remove-Module -Name PowerShellGet
but why???

AzureRM

These are the same cmdlets as used with PowerShellGet above, but I go through the phases anyway to see the differences.

Check the installed and available version:
Get-Module -Name AzureRm -ListAvailable
Find-Module -Name AzureRM -Repository PSGallery


Install the latest version:
Install-Module -Name AzureRM -AllowClobber -Force
It will usually take several seconds before the installation begins. Until then no activity is shown in the PowerShell console - you just have to be a little patient.
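
After the installation you can check that the module actually works by logging in to Azure and looking at the context. A small sketch with cmdlets from the AzureRM module; Login-AzureRmAccount opens an interactive login dialog:

# Log in interactively and list the subscriptions the account can see.
Login-AzureRmAccount
Get-AzureRmSubscription
# Show which subscription and tenant the session is currently pointed at.
Get-AzureRmContext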

Update to the latest version:
Update-Module -Name AzureRM

Uninstall the module:
Uninstall-Module -Name AzureRM -AllVersions -Force

Import the module in the current PowerShell session:
Import-Module -Name AzureRM

List functions available in module:
Get-Command -Module AzureRM

Remove the module from the current PowerShell session:
Remove-Module -Name AzureRM

Context

Windows 7 (Enterprise) and 10 (Pro), WMF 5

2017-07-30

SqlBulkCopy with PowerShell

The challenge

A rather common task is to copy many (or all) rows from one SQL Server database table to another as smoothly and fast as possible. There is some tuning on the platform that is important, but the copy itself can be done in several very different ways.

A copy direct from table to table can be done with the .NET SqlBulkCopy class using PowerShell.

Alternatively, T-SQL statements can be used, with reduced possibilities and (maybe) better performance:
INSERT INTO ... SELECT FROM, where the target table must be created beforehand but can be placed in any filegroup.
Or SELECT INTO, where the target table is created automatically in the default (usually the primary) filegroup and must not exist beforehand.

Copying data between a table and a file can also be done with the SQL Server utility bcp.exe.
Copying data from a file to a database table can be done with the T-SQL statement BULK INSERT.

But for now I will focus on SqlBulkCopy with PowerShell.
Yan Pan wrote the great post „Use PowerShell to Copy a Table Between Two SQL Server Instances“ at the Hey, Scripting Guy! blog, but I will go through details on my own to get a deeper understanding.

Using SqlBulkCopy

There is a special case with identity insert, as key constraints and NOT NULL are not checked by default. This can be changed using the SqlBulkCopyOptions enumeration value CheckConstraints.

With large amounts of data it is important to use streaming with a SqlDataReader object instead of a static DataSet object, as the DataSet object will hold all data in memory. This can really stress a server and might bring it down.

There are some interesting articles in the MSDN Library and a good discussion on Stack Overflow about SqlBulkCopy specifically and SQL Server bulk operations in general.

I made the PowerShell function Copy-SqlTable to handle the copy of the data:
function Copy-SqlTable {
<#
.DESCRIPTION
  Copy single table from source database to target database on same SQL Server Database Engine instance.
#>
[CmdletBinding()]
[OutputType([void])]
Param()

Begin {
  $mywatch = [System.Diagnostics.Stopwatch]::StartNew()
  "{0:s}Z :: Copy-SqlTable()" -f [System.DateTime]::UtcNow | Write-Verbose

  [string]$ApplicationName = 'SqlBulkCopy.ps1'

  #Candidates for function parameters:
  [string]$SourceInstanceName = '(local)\SQL2016A'
  [string]$SourceDatabaseName = 'source'
  [string]$SourceTableName = '[test].[business]'

  [string]$TargetInstanceName = $SourceInstanceName
  [string]$TargetDatabaseName = 'target'
  [string]$TargetTableName = $SourceTableName
}

Process {
  'Connect to source...' | Write-Verbose
  [string]$CnnStrSource = "Data Source=$SourceInstanceName;Integrated Security=SSPI;Initial Catalog=$SourceDatabaseName;Application Name=$ApplicationName"
  "Source connection string: '$CnnStrSource'" | Write-Debug
  $SqlCnnSource = New-Object -TypeName System.Data.SqlClient.SqlConnection $CnnStrSource
  $SqlCommand = New-Object -TypeName System.Data.SqlClient.SqlCommand("SELECT * FROM $SourceTableName;", $SqlCnnSource)
  $SqlCnnSource.Open()
  [System.Data.SqlClient.SqlDataReader]$SqlReader = $SqlCommand.ExecuteReader()

  'Copy to target...' | Write-Verbose
  [string]$CnnStrTarget = "Data Source=$TargetInstanceName;Integrated Security=SSPI;Initial Catalog=$TargetDatabaseName;Application Name=$ApplicationName"
  "Target connection string: '$CnnStrTarget'" | Write-Debug
  try {
    $SqlBulkCopy = New-Object -TypeName System.Data.SqlClient.SqlBulkCopy($CnnStrTarget, [System.Data.SqlClient.SqlBulkCopyOptions]::KeepIdentity)
    $SqlBulkCopy.EnableStreaming = $true
    $SqlBulkCopy.DestinationTableName = $TargetTableName
    $SqlBulkCopy.BatchSize = 1000000 # Another candidate for function parameter
    $SqlBulkCopy.BulkCopyTimeout = 0 # seconds, 0 (zero) = no timeout limit
    $SqlBulkCopy.WriteToServer($SqlReader)
  }
  catch [System.Exception] {
    $_.Exception | Write-Output
  }
  finally {
    'Copy complete. Closing...' | Write-Verbose
    $SqlReader.Close()
    $SqlCnnSource.Close()
    $SqlCnnSource.Dispose()
    $SqlBulkCopy.Close()
  }
}

End {
  $mywatch.Stop()
  [string]$Message = "Copy-SqlTable finished with success. Duration = $($mywatch.Elapsed.ToString()). [hh:mm:ss.ddd]"
  "{0:s}Z $Message" -f [System.DateTime]::UtcNow | Write-Output
}
} # Copy-SqlTable
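
As the candidates for parameters are still hard-coded in the Begin block, the function is called without parameters. A usage example, assuming the function is saved in a script file named like the ApplicationName above:

. .\SqlBulkCopy.ps1   # dot-source the script to get the function in the session
Copy-SqlTable -Verbose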


Measure

Execution time is measured on each run. For the PowerShell function I used a .NET Stopwatch object. The T-SQL statements are timed by default by Management Studio.

Also I kept an eye on memory and CPU usage in Windows Performance Monitor on the Process object with all counters (*) for the processes sqlservr.exe and powershell.exe/powershell_ise.exe during each run.

Finally I captured the actual execution plan of the T-SQL statements and kept an eye on the SQL Server Activity Monitor, e.g. the wait statistics. I also enabled the SQL Server Query Store on both databases at creation to have some extra figures to look at.
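
The process counters can also be sampled from PowerShell instead of the Performance Monitor GUI. A rough sketch, assuming a default SQL Server instance and the English counter names; the instance names can be powershell_ise, sqlservr#1 and so on in other setups:

# Counter paths are locale dependent; these are the English names.
$Counters = @(
  '\Process(sqlservr)\% Processor Time'
  '\Process(sqlservr)\Working Set'
  '\Process(powershell)\% Processor Time'
  '\Process(powershell)\Working Set'
)
# Sample every 5 seconds until the run is stopped with Ctrl+C.
Get-Counter -Counter $Counters -SampleInterval 5 -Continuous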

Create test data

Use the batch delimiter "GO" in Management Studio or SQLCMD with the number of rows as the count parameter value to generate the source data:
USE [source];
GO
SET NOCOUNT ON;
GO
INSERT INTO [test].[business] ([test_str],[test_nr])
VALUES (NEWID(), CONVERT(int, RAND()*2147483647));

GO 1000000000


Define SQL Server objects

The source database and table are created simply, but still with parallel I/O in mind:
CREATE DATABASE [source]
ON PRIMARY
  (NAME = N'source_data', FILENAME = N'C:\MSSQL_data\source.primary.mdf',
  SIZE = 8MB, FILEGROWTH = 10MB ),
FILEGROUP [user_data]
  ( NAME = N'user_data00',
   FILENAME = N'C:\MSSQL_data\source.user_data00.ndf',
   SIZE = 128MB, FILEGROWTH = 10MB),
  ( NAME = N'user_data01',
   FILENAME = N'C:\MSSQL_data\source.user_data01.ndf',
   SIZE = 128MB, FILEGROWTH = 10MB),
  ( NAME = N'user_data02',
   FILENAME = N'C:\MSSQL_data\source.user_data02.ndf',
   SIZE = 128MB, FILEGROWTH = 10MB),
  ( NAME = N'user_data03',
   FILENAME = N'C:\MSSQL_data\source.user_data03.ndf',
   SIZE = 128MB, FILEGROWTH = 10MB),
  ( NAME = N'user_data04',
   FILENAME = N'C:\MSSQL_data\source.user_data04.ndf',
   SIZE = 128MB, FILEGROWTH = 10MB),
  ( NAME = N'user_data05',
   FILENAME = N'C:\MSSQL_data\source.user_data05.ndf',
   SIZE = 128MB, FILEGROWTH = 10MB),
  ( NAME = N'user_data06',
   FILENAME = N'C:\MSSQL_data\source.user_data06.ndf',
   SIZE = 128MB, FILEGROWTH = 10MB),
  ( NAME = N'user_data07',
   FILENAME = N'C:\MSSQL_data\source.user_data07.ndf',
   SIZE = 128MB, FILEGROWTH = 10MB)
LOG ON
  ( NAME = N'source_log',
   FILENAME = N'C:\MSSQL_translog\source_log.ldf',
   SIZE = 56MB, FILEGROWTH = 10MB);
GO

ALTER DATABASE [source] SET QUERY_STORE = ON;
ALTER DATABASE [source] SET QUERY_STORE (OPERATION_MODE = READ_WRITE);
GO

USE [source];
GO
CREATE SCHEMA [test];
GO
CREATE TABLE [test].[business] (
  [test_id] bigint NOT NULL IDENTITY (1, 1),
  [test_str] nvarchar(256) NOT NULL,
  [test_nr] int NOT NULL
  ) ON [user_data];
GO

USE [master];
GO


The target database and table are created in a similar way, but on another drive to further optimize I/O:
CREATE DATABASE [target]
ON PRIMARY
  (NAME = N'target_data', FILENAME = N'M:\MSSQL_data\target.primary.mdf',
  SIZE = 8MB, FILEGROWTH = 8MB ) ,
FILEGROUP [user_data]
  ( NAME = N'user_data00',
   FILENAME = N'M:\MSSQL_data\target.user_data00.ndf',
   SIZE = 1792MB, FILEGROWTH = 32MB),
  ( NAME = N'user_data01',
   FILENAME = N'M:\MSSQL_data\target.user_data01.ndf',
   SIZE = 1792MB, FILEGROWTH = 32MB),
  ( NAME = N'user_data02',
   FILENAME = N'M:\MSSQL_data\target.user_data02.ndf',
   SIZE = 1792MB, FILEGROWTH = 32MB),
  ( NAME = N'user_data03',
   FILENAME = N'M:\MSSQL_data\target.user_data03.ndf',
   SIZE = 1792MB, FILEGROWTH = 32MB),
  ( NAME = N'user_data04',
   FILENAME = N'M:\MSSQL_data\target.user_data04.ndf',
   SIZE = 1792MB, FILEGROWTH = 32MB),
  ( NAME = N'user_data05',
   FILENAME = N'M:\MSSQL_data\target.user_data05.ndf',
   SIZE = 1792MB, FILEGROWTH = 32MB),
  ( NAME = N'user_data06',
   FILENAME = N'M:\MSSQL_data\target.user_data06.ndf',
   SIZE = 1792MB, FILEGROWTH = 32MB),
  ( NAME = N'user_data07',
   FILENAME = N'M:\MSSQL_data\target.user_data07.ndf',
   SIZE = 1792MB, FILEGROWTH = 32MB)
LOG ON
  ( NAME = N'target_log',
   FILENAME = N'C:\MSSQL_translog\target_log.ldf',
   SIZE = 56MB, FILEGROWTH = 16MB);
GO

ALTER DATABASE [target] SET QUERY_STORE = ON
ALTER DATABASE [target] SET QUERY_STORE (OPERATION_MODE = READ_WRITE)
GO

USE [target];
GO
CREATE SCHEMA [test];
GO
CREATE TABLE [test].[business] (
  [test_id] bigint NOT NULL IDENTITY (1, 1),
  [test_str] nvarchar(256) NOT NULL,
  [test_nr] int NOT NULL
) ON [user_data];
GO

USE [master];
GO


Evaluation

The first measure is a basic run with default batch size.

Copy-SqlTable finished with success. Duration = 00:13:28.0940294. [hh:mm:ss.ddd]; 134 308 637 rows; 12.3 GB data
Copy-SqlTable finished with success. Duration = 00:16:12.9162091. [hh:mm:ss.ddd]; BatchSize = 1 000
Copy-SqlTable finished with success. Duration = 00:11:34.3647701. [hh:mm:ss.ddd]; BatchSize = 10 000
Copy-SqlTable finished with success. Duration = 00:10:15.7085043. [hh:mm:ss.ddd]; BatchSize = 100 000
Copy-SqlTable finished with success. Duration = 00:10:00.1098163. [hh:mm:ss.ddd]; BatchSize = 1 000 000

2017-07-06

COM objects with PowerShell

COM (Component Object Model) is a rather old technology from Microsoft. But it is still relevant in practice and recognized by Microsoft. Actually there is a new COM implementation in Windows Management Framework (WMF) 5.0 with significant performance improvements, as described in „What's New in Windows PowerShell“. Also some issues are fixed with WMF 5.1 (Bug Fixes).

To get a list of the COM components available on the computer you can use this PowerShell statement:
Get-ChildItem HKLM:\Software\Classes -ErrorAction SilentlyContinue |
  Where-Object {
    $_.PSChildName -match '^\w+\.\w+$' -and (Test-Path -Path "$($_.PSPath)\CLSID")
  } |
  Select-Object -ExpandProperty PSChildName | Out-GridView

On the computer I am writing this on there are 1219 COM components…
The statement above is part of the short and fine article „Get a list of all Com objects available“ by Jaap Brasser.

On MSDN there are some nice introductions:
MSDN Library: „Creating .NET and COM Objects (New-Object)“.

A COM object is created with the PowerShell cmdlet New-Object, using the parameter -Strict to stabilize the script when the COM component is using an Interop Assembly.

The integration between COM and .NET is based on COM Interop.

Microsoft Office

Each product in Microsoft Office has at least one COM component. Usually they are used inside an Office product with Visual Basic for Applications (VBA), but VBA code can quite easily be refactored to VBScript.

Excel

The central COM component when working with Excel is the „Application Object“. This is a very short and incomplete example:
$Excel = New-Object -ComObject Excel.Application -Property @{Visible = $true} -Strict -ErrorAction SilentlyContinue

$Workbook = $Excel.Workbooks.Add()
$Workbook.Author = 'Niels Grove-Rasmussen'
$Workbook.Title = 'PowerShell COM sandbox'
$Workbook.Subject = 'Sandbox on using COM objects in PowerShell'

$Workbook.Sheets.Item(3).Delete()
$Workbook.Sheets.Item(2).Delete()

$Sheet1 = $Workbook.Sheets.Item(1)
$Sheet1.Name = 'COM'

$Sheet1.Range('A1:A1').Cells = 'Cell A1'
$Sheet1.Range('B1:B1').Cells = 2
$Sheet1.Range('B2:B2').Cells = 3.1
$Sheet1.Range('B3:B3').Cells.Formula = '=SUM(B1:B2)'

$Excel.Quit()

Using -Strict on the cmdlet New-Object also generates 5+ lines of output about IDispatch. This is suppressed with -ErrorAction SilentlyContinue.

Working with Excel through COM will often - if not always - end up using Application.Sheets. In the example above it is used to give the sheet a custom name and to add values or a formula to cells.

The Excel object does not close nicely. So some COM cleanup is required to reset the COM environment after an Excel object is closed.
while( [System.Runtime.Interopservices.Marshal]::ReleaseComObject($Excel) ) { 'Released one Excel COM count.' | Write-Verbose }
Remove-Variable Excel
[System.GC]::Collect()

As this is an old challenge there is a Microsoft article on the more general subject: „Getting Rid of a COM Object (Once and For All)“. One interesting thing about Excel in the article is that when Excel is closed manually, an EXCEL.EXE process is still active.

Word

This is another example of a COM component in Microsoft Office. The naming is similar to Excel, so the central class is the Application Class.

$Word = New-Object -ComObject Word.Application -Property @{Visible = $true} -Strict -ErrorAction SilentlyContinue

$Word.Quit()


Internet Explorer

The old Microsoft web browser has a COM component that used to be popular for GUI automation. Some naming is aligned with the Microsoft Office COM components – or the other way around. The central class is the Application Class.

$IE = New-Object -ComObject InternetExplorer.Application -Strict
$IE.Visible = $true

$IE = $null


Windows Scripting Host

Windows Scripting Host (WSH) is the „old“ Windows automation from before PowerShell. By default there are script engines for Visual Basic Scripting Edition (VBScript) and JScript. Other languages have been implemented to integrate with WSH, e.g. Perl and REXX. There are many old but good books, articles and examples around. Personally I prefer JScript, as it has more robust error handling than VBScript - and other subjective benefits.

WScript

In this part of WSH the central COM class is WshShell.

$vbInformation = 64
$vbOKCancel = 1

$Timeout = 0
$Wsh = New-Object -ComObject WScript.Shell -Strict
[int]$Button = $Wsh.Popup('Hello from PowerShell!', $Timeout, 'PowerShell', $vbInformation + $vbOKCancel)
$Wsh = $null
''
switch($Button) {
  -1 { '???' }
  1 { ':-)' }
  2 { ':-(' }
}

The Popup method used above is a less nice way to put a popup window in a PowerShell script. I have elsewhere written a nicer solution for a PowerShell MessageBox.
I think that if you need GUI elements then you should look at .NET Windows Forms before WSH.
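
For comparison, a .NET Windows Forms message box only takes a few lines in PowerShell. A minimal sketch:

Add-Type -AssemblyName System.Windows.Forms
$Answer = [System.Windows.Forms.MessageBox]::Show(
  'Hello from PowerShell!',                           # text
  'PowerShell',                                       # caption
  [System.Windows.Forms.MessageBoxButtons]::OKCancel,
  [System.Windows.Forms.MessageBoxIcon]::Information)
$Answer   # OK or Cancel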

Controller

WshController

Network

WshNetwork
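
I have not used WshNetwork much myself, but a small sketch of what the class exposes:

$WshNetwork = New-Object -ComObject WScript.Network -Strict
# Basic information about the current user and computer.
$WshNetwork.UserDomain
$WshNetwork.UserName
$WshNetwork.ComputerName
# Currently mapped network drives as a collection of drive letter / path pairs.
$WshNetwork.EnumNetworkDrives()
$WshNetwork = $null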

Windows Shell

The central COM component to Windows Shell is the Shell object.

Actually the name „Windows Shell“ is somewhat misleading, as "Windows Shell" is usually used for the Windows command-line shell cmd.exe. So if you are talking with others about the COM class Shell.Application you should make sure to distinguish it from the cmd.exe Windows Shell.

$Shell = New-Object -ComObject Shell.Application -Strict
$Shell.Open('C:\')

$Shell = $null


ScriptControl

The ScriptControl COM component makes it possible to host and run script code, e.g. VBScript, from PowerShell.

$Script = New-Object -ComObject MSScriptControl.ScriptControl -Strict
$Script.Language = 'vbscript'
$Script.AddCode('function getInput() getInput = inputbox(...')
$UserInput = $Script.Eval('getInput')   # avoid assigning to the automatic variable $Input

This is a rather poisonous way to call a piece of VBScript code, as the Eval method is used. The method opens the script up to injection attacks.

Disclaimer: The example above should not be used in production!

When I have a more suitable example on using ScriptControl this example will be replaced.

History

2017-06-06 Blog entry started as placeholder for ideas and old notes. Not released.
2017-07-06 Notes rewritten, restructured and released.

IT infrastructure principles

These are some thoughts about good general principles for IT infrastructure. The principles are my personal thoughts, but in content and structure they are inspired by TOGAF Architecture Principles
(pubs.opengroup.org/architecture/togaf9-doc/arch/chap23.html) from TOGAF 9.1.

Principle: Comply or Explain

Statement; Each solution or architectural element should comply with these principles. If it is not possible the deviation must be documented with motivations.

Rationale; This principle enforces the other principles, and makes sure they are continuously evaluated.

Implications; All architectural elements are documented thoroughly in a way that really can be used in future development and governance of the system.


Principle: Business Support

Statement; Each architectural element must support business. Not only the current business but also the future business.

Rationale; The architecture is sponsored by business and should therefore support business.

Implications; The architecture, the processes or the technology must have business as primary subject.
Data available and systems running (continuity). Easy and direct access to information.


Principle: Compliance

Statement; Each component, each element and the combined solution must comply with law (national and international), regulations and audit (internal and external).

Rationale; This is to protect the organisation, senior management and board from legal, economic or reputational consequences.

Implications; Business, audit, management, development and operation must cooperate to ensure compliance in all details. This will usually involve several different areas like organisation, procedures and technology. Each person must have ownership.

Principle: Secure by Design

(collection point of security principles?)
Attack Surface Reduction
Threat Modelling
"Microsoft Security Development Lifecycle" (https://www.microsoft.com/en-us/sdl/)

The Microsoft presentation „Basics of Secure Design Development Test“ has many interesting points. E.g. a rather well formulated description of an asymmetric problem:
We must be 100 % correct, 100 % of the time, on a schedule, with limited resources, only knowing what we know today.
- And the product has to be reliable, supportable, compatible, manageable, affordable, accessible, usable, global, doable, deployable...
They can spend as long as they like to find one bug, with the benefit of future research.

Cost for attacker to build attack is very low
Cost to your users is very high
Cost of reacting is higher than cost of defending

Principle: Protect data

Statement; Protect both data at rest and data in transport.

Rationale; Data are vital to the organisation and its business, and protection of data must have very high priority.

Implications; classify data, classify risk, encryption

Principle: Reduce Complexity

Statement; Assign and enable only required components to the system. Enable only on the required feature level.

Rationale; Complexity overhead generates extra administration and governance, and makes operations and problem analysis more complex.
Acquiring or assigning unused resources adds unnecessary cost.

Implications; Allocate only required resources to each service, e.g. processor, storage or memory. Install and enable only required services, e.g. Full Text features or Integration Services only when directly required. This is also an economic issue, e.g. on licenses.
This principle is often described through other similar but partial principles like "Least Service" or "Least Resource":
  • Least Service; install only services that are actually needed on each component, e.g. only the RDBMS on a database server and management tools on other computers. There might also be an economic issue, e.g. on licenses.
  • Least Resource; allocate only required resources to each service, e.g. processor, storage or memory. This is also an economic issue, e.g. on licenses.


Principle: Least Privilege

Statement; Security principle to minimize risk to data exposure.

Rationale; Protect the data and reduce risk.

Implications; Do not use default security groups. Only grant minimum access and rights. Use a fine-grained security model for all roles.


Principle: Segregation of duties

Statement; Security, Audit

Rationale;

Implications; Role Based Access Control (RBAC) on the system - not the resource.


Principle: Defense in Depth

Statement; Security, Compliance

Rationale;

Implications; segmentation, precise defined layers, each component is hardened.


Principle: Automation

Statement; Automate as much as possible. That is about everything but the decision itself.

Rationale; With automation, the infrastructure and the architectural components above it, like platform or application, will be handled both with stability and with quick response. If there is an enterprise principle on agility, then automation is a requirement.

Implications; Standards, Root Cause Analysis (RCA) and real fixes.
Consider striving for a production environment without human admins. Later this can be extended to other environments.

Principle: Scalability

Statement; Architect for scale. That is in all aspects, such as capacity, response time, geography and integration.

Rationale;

Implications;

Principle: Easy Exit

Statement; Choose the solution that is the easiest to leave.

Rationale; When a solution is tightly coupled to a single technology or product, it will be very difficult and costly to migrate the system to another solution.

Implications; You might look more at open formats if you rate this principle high in a given challenge.

2017-05-30

PowerShell accelerators

PowerShell is object oriented and based on the .NET Framework. This implies that there are no types, only classes, but to make daily life easier for the PowerShell user Microsoft has added some accelerators to PowerShell. So instead of being forced to go the long way and write [System.Management.Automation.PSObject] you can settle for [psobject].

To get a list of accelerators through the Assembly class you can use the statement
[psobject].Assembly.GetType('System.Management.Automation.TypeAccelerators',$true,$true)::Get
and on my current workstation with PowerShell 4.0 there are 80 accelerators.
The method GetType() has three overloads, and the one with three parameters is used to get the most stable and predictable invocation.
The class TypeAccelerators is private, but accessed through a PSObject object.

Some accelerators are named in lower-case and others in mixed casing (CamelCase). Generally I prefer to use the casing that the accelerator has in the definition.

I have not seen any indications of a performance hit when using accelerators. And as the code is so much more readable with accelerators, I go with them where possible.
When it is not possible I use the full class name, e.g. "System.Data.SqlClient.SqlConnection".
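
A quick and unscientific way to look for a performance difference yourself is Measure-Command; a small sketch:

# Compare casting with the accelerator against the full class name.
Measure-Command { 1..100000 | ForEach-Object { [psobject]$_ } }
Measure-Command { 1..100000 | ForEach-Object { [System.Management.Automation.PSObject]$_ } }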

It is possible to create custom accelerators like an accelerator for accelerators:
[psobject].Assembly.GetType('System.Management.Automation.TypeAccelerators')::Add('accelerators',[psobject].Assembly.GetType('System.Management.Automation.TypeAccelerators',$true,$true))
The accelerator is then available like any other accelerator:
[accelerators]::Get
This accelerator is part of some PowerShell extensions.
Personally I prefer to name custom accelerators in lower-case.
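
A custom accelerator can be removed again. As far as I can see the TypeAccelerators class also has a Remove method, so this should work - but regard it as a sketch:

# Remove the custom accelerator added above.
[accelerators]::Remove('accelerators')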

Reference

RAVN Systems: PowerShell Type Accelerators – shortcuts to .NET classes

2017-05-29

Thoughts on IT documentation

Documentation of IT is often discussed, but unfortunately often left behind. Deciding to "do it later" is, in my opinion, also leaving it behind.

One thing that is quite often discussed is how much to document. On one side there is a demand for "everything", while the response in a good discussion could be "Read the code".
Personally I think that you should document enough to make yourself obsolete. With that I mean that you should document with the audience in mind. Not only the contents but also the structure and taxonomy - this again could start a new discussion about a common ontology...

In an early stage of the documentation it will most likely be helpful to define the difference between:
  • Documentation / Reporting
  • Documentation / Logging
  • Documentation / Monitoring
  • Documentation / Versioning

Principles

I have a few principles for an IT documentation platform, and I can be rather insistent on them...
  • Automation; as much as possible!
  • Wide Access; all functions, all teams, all departments in the organisation.
  • Open Access; common client application - web.
  • Broad Rights; everybody can edit. If you are worried about the quality over time, then assign a librarian in the team.
  • Versioning; version every change to the documentation. This should be automated in the documentation platform. Put every piece of code, also script, under versioning.
  • Continuous Documentation; documentation is "Done" SCRUM-wise with each task.

Automation

Today there is a lot of talk about automation, which is one of my preferred weapons in any aspect of IT.
Initially I would start by automating documentation at the lowest abstraction level. That is the physical level, where the components can be discovered with a script through APIs. Dependencies on this level can often be resolved using tools like SQL Dependency Tracker from Red-Gate.
There are tools for generating documentation, but usually I find the result too general to be useful in the long run.

Work with the technology you are using by using documentation features in development languages, e.g. PowerShell Comment Based Help.
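
A small sketch of comment-based help in a PowerShell function; the function and parameter names are just examples:

function Get-Example {
<#
.SYNOPSIS
  One-line description of what the function does.
.DESCRIPTION
  Longer description with assumptions and references to related documentation.
.EXAMPLE
  Get-Example -Name 'test'
#>
  [CmdletBinding()]
  Param([string]$Name)

  "Hello $Name" | Write-Output
}

# The documentation is then available directly from the code.
Get-Help -Name Get-Example -Full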


A comprehensive automated test suite even provides the most complete and up-to-date form of application documentation, in the form of an executable specification not just of how the system should work, but also of how it actually does work.
(Jez Humble & David Farley: „Continuous Delivery: Reliable Software Releases through Build, Test, and Deployment Automation“)

Inventory

To build automated documentation that is useful in the long run I usually end up building an Inventory, like a CMDB. Sometimes I am not allowed to call it that because some manager has a high-profile project running. But then the solution is named "Inventory", "Catalogue" or whatever is politically acceptable.
This Inventory will quickly become the heart of your automation and associated solutions like documentation. Actually this asset will easily be the most valuable item for your daily work and your career.

When building an Inventory you will most likely find that a lot of the data you want to generate the documentation from is already in other systems, like monitoring systems, e.g. Microsoft System Center.
Please regard the Inventory as an integrated part of the IT infrastructure, where a lot of data flows to the Inventory and the sweet things like documentation flow from the Inventory.
This pattern is very much like the data flow around a data warehouse. Actually there is complete similarity if you start calling the Inventory the Unified Data Model (UDM). The good news is that if you have worked with data warehousing before, then you can re-use your ETL development skills to populate the Inventory from the sources.

By adding logging or monitoring data you can document the quality of the data.
In some cases I have seen test results and other very detailed history added to the Inventory. That I think is overdoing it, but if there is a good case then it is naturally fine. These test data must include metadata, e.g. system, version and task.

Metadata

Quite early in the automation you will find that something is missing to generate useful documentation, but that these data cannot be collected from the physical level. Still the nature of the data looks quite simple, such as environment information.
A solution is to add these metadata to the Inventory. But beware! The data is inserted and maintained manually, so you will have to build some tools to help you manage these metadata.
Examples of useful metadata are system (define!), product and version, together with more vendor-specific data such as license and support agreement.
Relate the metadata on all levels where it adds value, e.g. version details on all components.
It is my hard-earned experience that you should spend some time on this task. Keep a constant eye on the data quality. Build checks and be very paranoid about the effect of bad data.

Abstract documentation

When your Inventory is up and running, the tools are sharpened and everybody is using the documentation, it is time to talk about the missing parts.
As mentioned earlier, the Inventory can help with the documentation of the lowest abstraction level. This means that documentation of requirements, architecture, design etcetera, which is documented at more abstract levels, does not fit in an Inventory. Also documentation of important issues like motivation, discussion of decisions and failures does not fit in an Inventory.
For this kind of more abstract documentation there are several solutions. Usually a solution already used for other documents is chosen, like SharePoint or a document archive. Just make sure that you can put everything in the solution, also Visio diagrams, huge PDFs and whatever your favorite tools work with.
Also you should plan how to get and store documentation from outsiders. That is consultants, vendors and customers.

Document process

I would like to repeat that it is very important to document the process, including things that were discussed but deliberately left out. Not only the decision but also the motivation.
An obvious item might have been left out because of a decisive business issue. This item will most likely be discussed again, and to save time and other resources it is important to be able to continue the discussion - not be forced to restart it from the beginning.
Also it is important to document trials, sometimes called "sandbox experiments" or "spike solutions". They are valuable background knowledge when the system must be developed further at some time in the future. Especially failures are important to document with as many details as possible.

Enterprise templates

In many organisations there are required documents - sometimes with templates - for requirements, architecture or design. These documents can be called "Business Requirements", "Architectural Document" (AD) or "Design Document".
Use the templates. Do not delete predefined sections. But feel free to add sections or to broaden the use of an existing section. If you add a section then make sure to describe what you see as new in the section, and discuss other "standard" sections in relation to the new section. Make sure to answer questions up front.

History

2017-05-29 Post created.
2017-07-17 Quote from „Continuous Delivery“ added. „CMDB“ changed to „Inventory“.

2017-04-29

T-SQL formatted duration

I have a database backup that takes more than two days, and I want to look at the duration. But looking at the result of a DATEDIFF, the figures are difficult to compare directly.
That led me to format the duration as something readable. At first I wanted to format to ISO 8601, but that turned out to be too cumbersome in Transact-SQL. Especially getting the leading zeros was a challenge that I usually can handle in PowerShell or JavaScript, but T-SQL does not offer similarly user-friendly formatting…
Jeff Smith wrote back in 2007 a great article "Working with Time Spans and Durations in SQL Server", but I wanted to avoid the strange date-like part and the multiple columns he worked with.

To combine the different parts into one single-column answer I use part indicators like 'd' for days, 'h' for hours, 'm' for minutes and 's' for seconds. I think those are quite common indicators that should be immediately recognizable to most people. Also to non-IT end users.

Most log solutions give a duration as a large number of seconds, like a SQL Server database backup. But that can be difficult to comprehend and compare directly.

DECLARE @duration INT = 221231;  -- seconds
SELECT CAST(@duration / 86400 AS varchar(3)) + 'd' + CAST((@duration % 86400) / 3600 AS varchar(2)) + 'h' + CAST((@duration % 3600) / 60 AS varchar(2)) + 'm' + CAST(@duration % 60 AS varchar(2)) + 's'

This gives the output
2d13h27m11s
That is 2 days, 13 hours, 27 minutes and 11 seconds.
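
For comparison, the same formatting is almost free in PowerShell with a TimeSpan object; a small sketch with the same number of seconds:

$Duration = [System.TimeSpan]::FromSeconds(221231)
'{0}d{1}h{2}m{3}s' -f $Duration.Days, $Duration.Hours, $Duration.Minutes, $Duration.Seconds
# Gives the same result: 2d13h27m11s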

The motivation for this blog entry is database backup history, and the formatting above can be used against that history:
SELECT
  [database_name]
 ,backup_set_id, media_set_id, name
 ,backup_start_date, backup_finish_date
 ,CAST(DATEDIFF(SECOND, [backup_start_date], [backup_finish_date]) / 86400 AS varchar(3)) + 'd '
   + CAST((DATEDIFF(SECOND, [backup_start_date], [backup_finish_date]) % 86400) / 3600 AS varchar(2)) + 'h '
   + CAST((DATEDIFF(SECOND, [backup_start_date], [backup_finish_date]) % 3600) / 60 AS varchar(2)) + 'm '
   + CAST(DATEDIFF(SECOND, [backup_start_date], [backup_finish_date]) % 60 AS varchar(2)) + 's' AS [duration_formatted]
 ,(backup_size / 1024 / 1024 / 1024) as [backup_size_gb]
 ,(compressed_backup_size / 1024 / 1024 / 1024) as [compressed_backup_size_gb]
FROM msdb.dbo.backupset
--WHERE [database_name] = 'my_large_database'
ORDER BY [backup_finish_date] DESC

You can comment in or out lines in the statement above to get what you need in the given situation.

Maybe one could use STUFF to insert values into pre-formatted output, but I could not get that to work on the first try. And it did not really add significant value to the output...

2017-02-04

DBA Backlog

As a Database Administrator, both in projects and in daily routines like administration or operations, all kinds of things show up. Normally they are associations that don't really fit in the plans or the context, but are still relevant or even important in the long run.
Usually these things and ideas end up on post-it notes, whiteboards or another volatile medium. If you are lucky to be in a large project with experienced developers and project managers, some items might be saved.

Another way is to create your own backlog of ideas, points of interest, nice to haves and other things that might make your works better. In ITIL these things will be considered part of Continual Service Improvement (CSI) as input to the Plan part of the Deming Circle.

I have collected a set of attributes I have used to establish a backlog in a project or a team. Not all attributes are relevant in each situation. Sometimes it is also necessary to twist an attribute to make the backlog usable.

And now to my proposal for a general backlog, with each attribute name followed by the attribute description:

Title: Short and descriptive title of the backlog item.
Theme: Predefined themes to sort the backlog on a "dimension" like technology. This is very useful when the backlog grows.
  { SQL Server | SSDB | SSIS | ... }
Description: Prose description of the backlog item with as many details and thoughts as possible.
  The description can easily change and grow over time, even before the backlog item is activated and processed.
Priority: The priority of the backlog item. Predefined values that make sense, even to senior management.
  { Critical | High | Medium | Low }
Abstraction: This item can be added to the backlog if you are working with defined abstraction layers from idea to solution. This is sometimes seen in agile development methods like SAFe.
  { Epic | Story | Task }
Status: Status of the backlog item. The values are inspired by Kanban to support an effective execution of the backlog.
  "Keep the Ready queue short".
  { Backlog | Ready | Doing | Done }
Estimate: A quick estimate of the time needed to process and finish the backlog item.
  The item is finished when the issue is solved and documented – completely. As in agile development.
  { Day | Week | Month | Quarter | Year }
Deadline: Date of deadline, if one is required. If a deadline exists it is usually given by business or management.
Changed: Timestamp of the last change to the backlog item.
  Consider enabling automatic versioning on the backlog.
Created: Timestamp of when the backlog item was created. Usually when the item was entered into the backlog.
Changed by: The person who last changed the backlog item.
Created by: The person who created the backlog item. Usually the person who entered the item into the backlog.
Tangible: This is more a business attribute that tells whether business can relate to the backlog item.
Owner: Business owner of the backlog item.
Responsible: The person who is responsible for the backlog item. Usually the person who is to get the job done.

I personally have good experience with building a backlog as a SharePoint list; it is rather quick and effective. This also gives all the SharePoint features for filtering, printing and other trivialities.
Another possibility I have tried a few times is using the backlog features in Microsoft Team Foundation Server (TFS). It works really nicely, also for Database Administrators.

History

2017-04-02 Blog post created.
2017-05-08 Abstraction element added to backlog schema.