PowerShell: Refining the rows added with SqlBulkCopy

This will be an addition to https://phillipwolf91.wordpress.com/2014/09/15/powershell-import-csv-to-sql-via-sqlbulkcopy/

This is the script that was created in that post:

$SqlConnection = New-Object System.Data.SqlClient.SqlConnection
$SqlConnection.ConnectionString = "Server={yourServerPath};Database={yourDatabase};User ID={yourUsername};Password={yourPassword}"

$CSVDataTable = Import-Csv 'C:\Path\To\Your\CSV.csv' | Out-DataTable

$sqlBulkCopy = New-Object System.Data.SqlClient.SqlBulkCopy -ArgumentList $SqlConnection
$sqlBulkCopy.DestinationTableName = "dbo.yourTable"

$ColumnMap1 = New-Object System.Data.SqlClient.SqlBulkCopyColumnMapping(0, 1)
$ColumnMap2 = New-Object System.Data.SqlClient.SqlBulkCopyColumnMapping(1, 2)
$ColumnMap3 = New-Object System.Data.SqlClient.SqlBulkCopyColumnMapping(2, 3)
$sqlBulkCopy.ColumnMappings.Add($ColumnMap1)
$sqlBulkCopy.ColumnMappings.Add($ColumnMap2)
$sqlBulkCopy.ColumnMappings.Add($ColumnMap3)

$SqlConnection.Open()
$sqlBulkCopy.WriteToServer($CSVDataTable)
$SqlConnection.Close()

In this script, every single row from the CSV is added to the table in the DB. That's all hunky-dory if the CSV file contains new information each and every time. But what if it's the same CSV file that just gets records appended to it? Using the above script would create duplicates every single time it's run.

My current solution to this predicament is to create a DataTable based on what's currently in the DB and another DataTable that will contain the difference in records between the one from the CSV and the DB. Here's an example. I'm just going to hit you with the WHOLE THING at once and you'll be able to see the difference between the two scripts (aside from the comments).

#Create the DataTable from the CSV
$CSVDT = Import-Csv "C:\a\aFolder\DATA.csv" | Out-DataTable

#Creates the connection string
$SqlConnection = New-Object System.Data.SqlClient.SqlConnection
$SqlConnection.ConnectionString = "Server={your_server};Database={your_database};User ID={your_user};Password={your_password}"

#Creates the query to the DB
$SqlQuery = New-Object System.Data.SqlClient.SqlCommand
$SqlQuery.CommandText = "SELECT * FROM [dbo].[your_table]"
$SqlQuery.Connection = $SqlConnection
$SqlDA = New-Object System.Data.SqlClient.SqlDataAdapter
$SqlDA.SelectCommand = $SqlQuery

#Creates the DataTable for DB query results
$SqlDT = New-Object System.Data.DataTable

#Fills the DB DataTable
$SqlConnection.Open()
$SqlDA.Fill($SqlDT)
$SqlConnection.Close()
$CSVCount = $CSVDT.Rows.Count
$SqlCount = $SqlDT.Rows.Count

#Create a new DataTable that will contain the difference in records
$NewEntries = New-Object System.Data.DataTable
$NewEntries.Columns.Add("FirstColumn")
$NewEntries.Columns.Add("SecondColumn")
$NewEntries.Columns.Add("ThirdColumn")

#Add the difference in records to the new DataTable
WHILE($SqlCount -lt $CSVCount)
{
    $NewEntries.ImportRow($CSVDT.Rows[$SqlCount])
    $SqlCount += 1
}

#Create the BulkCopy object
$sqlBulkCopy = New-Object System.Data.SqlClient.SqlBulkCopy $SqlConnection
$sqlBulkCopy.DestinationTableName = "dbo.your_table"

#Create column mappings to account for the Id column
$FirstColumn = New-Object System.Data.SqlClient.SqlBulkCopyColumnMapping(0, 1)
$SecondColumn = New-Object System.Data.SqlClient.SqlBulkCopyColumnMapping(1, 2)
$ThirdColumn = New-Object System.Data.SqlClient.SqlBulkCopyColumnMapping(2, 3)
$sqlBulkCopy.ColumnMappings.Add($FirstColumn)
$sqlBulkCopy.ColumnMappings.Add($SecondColumn)
$sqlBulkCopy.ColumnMappings.Add($ThirdColumn)

#Copy new entries to DB
$SqlConnection.Open()
$sqlBulkCopy.WriteToServer($NewEntries)
$SqlConnection.Close()

What happens here is that a SqlCommand is created to query the database for the current information in the table. That information is then placed into a DataTable. A third DataTable is created to hold the difference in records between the SQL DataTable and the CSV DataTable. The script then checks the total number of rows in each DataTable. Then, while the SQL row count remains less than the CSV row count, it copies the missing rows from the CSV DataTable into the third DataTable until the counts match.
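If you want to see that row-count diff in isolation, here's a minimal, self-contained sketch. It swaps the CSV import and the SQL query for two in-memory DataTables, and the single column name and values are made up for the example:

```powershell
# Stand-in for the CSV DataTable: three rows, where "c" is the record
# appended since the last run.
$CSVDT = New-Object System.Data.DataTable
[void]$CSVDT.Columns.Add("FirstColumn")
[void]$CSVDT.Rows.Add("a")
[void]$CSVDT.Rows.Add("b")
[void]$CSVDT.Rows.Add("c")

# Stand-in for the DB DataTable: only the first two rows exist so far.
$SqlDT = New-Object System.Data.DataTable
[void]$SqlDT.Columns.Add("FirstColumn")
[void]$SqlDT.Rows.Add("a")
[void]$SqlDT.Rows.Add("b")

$CSVCount = $CSVDT.Rows.Count   # 3
$SqlCount = $SqlDT.Rows.Count   # 2

# Third DataTable that will hold only the difference in records.
$NewEntries = New-Object System.Data.DataTable
[void]$NewEntries.Columns.Add("FirstColumn")

# Rows past the DB's current count are the "new" ones.
while ($SqlCount -lt $CSVCount)
{
    $NewEntries.ImportRow($CSVDT.Rows[$SqlCount])
    $SqlCount += 1
}

$NewEntries.Rows.Count              # 1
$NewEntries.Rows[0].FirstColumn     # "c"
```

Only that one-row DataTable would then be handed to WriteToServer, so nothing already in the DB gets re-inserted.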

Obviously, this has its design flaws for the moment. But depending on your needs and usage this could prove useful, even if some tinkering is needed. If, say, a record were removed from the CSV, this script wouldn't pick that up. This will be something I will need to look into further.
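One possible direction for that tinkering: compare on a key column instead of raw row counts. This is only a sketch, under the assumption that some column (the hypothetical FirstColumn below) uniquely identifies each record; it still only handles inserts, but it no longer breaks when the two sides get out of step:

```powershell
# Stand-in DataTables again, so the logic can run without a real DB.
$CSVDT = New-Object System.Data.DataTable
[void]$CSVDT.Columns.Add("FirstColumn")
"a","b","c" | ForEach-Object { [void]$CSVDT.Rows.Add($_) }

$SqlDT = New-Object System.Data.DataTable
[void]$SqlDT.Columns.Add("FirstColumn")
"a","c" | ForEach-Object { [void]$SqlDT.Rows.Add($_) }   # "b" is missing from the DB

# Build a lookup of keys already in the DB.
$existing = @{}
foreach ($row in $SqlDT.Rows) { $existing[$row.FirstColumn] = $true }

# Import only the CSV rows whose key the DB doesn't have yet.
$NewEntries = New-Object System.Data.DataTable
[void]$NewEntries.Columns.Add("FirstColumn")
foreach ($row in $CSVDT.Rows)
{
    if (-not $existing.ContainsKey($row.FirstColumn))
    {
        $NewEntries.ImportRow($row)
    }
}

$NewEntries.Rows.Count   # 1: only "b" would be bulk-copied
```

A count-based diff would miss "b" entirely here (3 rows vs 2 would re-send "c" instead), which is exactly the kind of mismatch a key-based comparison avoids.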
