Introduction:
Microsoft.NET 2.0 Framework includes the SqlBulkCopy class which can be used to perform massive data copy operations between different sources. In this article we will demonstrate how to use the SqlBulkCopy class in various scenarios. We will also build a custom XmlDataReader class to speed up the XML file to database operations.
Database Schema:
The database schema for this article is very simple. Our scenario consists of two tables, Customers and CustomersArchive. Here is the schema definition for these tables:
CustomerID: Primary key of the Customer table
FirstName: Stores the first name of the customer
LastName: Stores the last name of the customer
Both tables have an identical schema, which is an important point to consider when performing a bulk copy operation. Our task is to copy the data from the Customers table into the CustomersArchive table.
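For reference, here is a rough sketch of how the two tables could be created. The article does not list the column types, so the types, lengths, and the IDENTITY primary key below are assumptions:
// Rough sketch only: column types, lengths, and the IDENTITY key are assumptions.
using System.Data.SqlClient;

class SchemaSetup
{
    public static void CreateTables(string connectionString)
    {
        string ddl = @"
            CREATE TABLE Customers (
                CustomerID INT IDENTITY(1,1) PRIMARY KEY,
                FirstName  VARCHAR(50),
                LastName   VARCHAR(50)
            );
            CREATE TABLE CustomersArchive (
                CustomerID INT IDENTITY(1,1) PRIMARY KEY,
                FirstName  VARCHAR(50),
                LastName   VARCHAR(50)
            );";

        using (SqlConnection connection = new SqlConnection(connectionString))
        using (SqlCommand command = new SqlCommand(ddl, connection))
        {
            connection.Open();
            command.ExecuteNonQuery();
        }
    }
}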
Populating the Customers Table:
Before we start the data transfer we need to populate our source table, Customers, with some dummy data. Take a look at the code below, which is used to populate the Customers table:
private static void PopulateDatabase()
{
    string connectionString = "Server=localhost;Database=EasyShopping;Trusted_Connection=true";
    SqlConnection myConnection = new SqlConnection(connectionString);
    string query = "insert into customers(firstname,lastname) values(@firstname,@lastname)";
    SqlCommand myCommand = new SqlCommand(query, myConnection);
    myConnection.Open();

    for (int i = 1; i <= 50000; i++)
    {
        myCommand.Parameters.AddWithValue("@firstname", "FirstName" + i);
        myCommand.Parameters.AddWithValue("@lastname", "LastName" + i);
        myCommand.ExecuteNonQuery();
        myCommand.Parameters.Clear();
    }

    myConnection.Close();
}
The above code simply inserts 50,000 records into the Customers table. You can insert more records by changing the upper bound of the "for" loop.
Performing BulkCopy on Identical Schema Tables:
In this scenario both tables, Customers and CustomersArchive, have the same schema. Here is the code used to perform the bulk copy operation:
string connectionString = "Server=localhost;Database=EasyShopping;Trusted_Connection=true";
SqlConnection myConnection = new SqlConnection(connectionString);
SqlCommand myCommand = new SqlCommand("select * from customers", myConnection);
myConnection.Open();
SqlDataReader reader = myCommand.ExecuteReader();

using (SqlBulkCopy bulk = new SqlBulkCopy(connectionString))
{
    bulk.BatchSize = 500;
    bulk.DestinationTableName = "CustomersArchive";
    bulk.NotifyAfter = 500;
    bulk.SqlRowsCopied += new SqlRowsCopiedEventHandler(bulk_SqlRowsCopied);
    bulk.WriteToServer(reader);
}

myConnection.Close();
The SqlBulkCopy class's WriteToServer method does the actual work of inserting the data from the source table into the destination table. WriteToServer has overloads that accept a DataTable, an array of DataRow objects, or an IDataReader. We are using the IDataReader overload because a data reader is a forward-only, read-only cursor that streams rows as they are read rather than holding on to them. This way we do not load all 50,000 rows into memory before performing the bulk copy.
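The code also sets NotifyAfter to 500 and subscribes to the SqlRowsCopied event so that progress is reported every 500 rows. The article does not show the body of the bulk_SqlRowsCopied handler (or the similar copy_SqlRowsCopied handler used later), so the implementation below is only a minimal sketch:
static void bulk_SqlRowsCopied(object sender, SqlRowsCopiedEventArgs e)
{
    // Hypothetical body: simply report how many rows have been copied so far.
    Console.WriteLine("{0} rows copied.", e.RowsCopied);
}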
Performing BulkCopy on Different Schema Tables:
In the above example we used SqlBulkCopy to copy data between two tables with identical schemas. Sometimes you will have to transfer data between tables with different schemas. Let's assume that the source table, Customers, contains an extra field called "FullName" and that the destination table stores the names in columns called "FirstNameArchive" and "LastNameArchive". If you run the previous code it will fail, because there is no matching column for "FullName" in the CustomersArchive table and the column names no longer line up. In order for this to work correctly you need to set up the column mappings manually. Check out the following code, which sets the correct column mappings between the source and the destination table:
string connectionString = "Server=localhost;Database=EasyShopping;Trusted_Connection=true";
SqlConnection myConnection = new SqlConnection(connectionString);
SqlCommand myCommand = new SqlCommand("select * from customers", myConnection);
myConnection.Open();
SqlDataReader reader = myCommand.ExecuteReader();

using (SqlBulkCopy bulk = new SqlBulkCopy(connectionString))
{
    bulk.ColumnMappings.Add("FirstName", "FirstNameArchive");
    bulk.ColumnMappings.Add("LastName", "LastNameArchive");
    bulk.BatchSize = 500;
    bulk.DestinationTableName = "CustomersArchive";
    bulk.NotifyAfter = 500;
    bulk.SqlRowsCopied += new SqlRowsCopiedEventHandler(bulk_SqlRowsCopied);
    bulk.WriteToServer(reader);
}

myConnection.Close();
As you can see in the above code, we are manually setting up the column mappings between the two tables. This lets the SqlBulkCopy class know which source column maps to which destination column.
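Note that column mappings can also be added by ordinal position instead of by name. The sketch below assumes the name columns sit at positions 1 and 2 in both tables; the actual positions depend on your schema:
// Ordinal-based mappings (0-based column positions); the positions shown are assumptions.
bulk.ColumnMappings.Add(1, 1);   // FirstName -> FirstNameArchive
bulk.ColumnMappings.Add(2, 2);   // LastName  -> LastNameArchive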
Performing Bulk Insert from XML File to Database:
There are scenarios where you need to insert data from a large XML file into the database. This can easily be achieved by loading the XML data into a DataTable and then passing the populated DataTable to the SqlBulkCopy.WriteToServer method. The problem with this approach is that you first have to copy all of the data from the XML file into a DataTable. This is a time-consuming process and, if the XML file is very large, it can also lead to out-of-memory exceptions. It would be ideal if we could simply stream over the XML file and insert the data into the database as we go. The only streaming source accepted by the WriteToServer method is the IDataReader interface, which means we must either use a DataTable or write our own custom reader that implements IDataReader. Let's take the second route and implement a custom IDataReader.
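For comparison, the DataTable-based approach described above would look roughly like the following sketch. It assumes the XML file loads into the first table of a DataSet via DataSet.ReadXml and that the resulting column names match the destination:
// For comparison only: this loads the entire XML file into memory first.
DataSet dataSet = new DataSet();
dataSet.ReadXml(@"C:\Customers.xml");

using (SqlBulkCopy copy = new SqlBulkCopy(connectionString))
{
    copy.DestinationTableName = "CustomersArchive";
    copy.ColumnMappings.Add("FirstName", "FirstName");
    copy.ColumnMappings.Add("LastName", "LastName");
    copy.WriteToServer(dataSet.Tables[0]);
}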
Implementing a Custom IDataReader:
We will call our data reader "XmlDataReader" and will implement only the members that are necessary to perform the bulk copy operation; the remaining IDataReader and IDataRecord members must still be declared for the class to compile, but they can simply throw NotImplementedException. Here is the implementation of the XmlDataReader:
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.Data;
using System.Xml;

namespace SqlBulkInsertDemo
{
    public class XmlDataReader : IDataReader
    {
        private string _path;
        private XmlTextReader _reader;
        private int _ordinalPosition = -1;
        private string[] _columns;

        public XmlDataReader(string path, string[] columns)
        {
            _path = path;
            _columns = columns;
            _reader = new XmlTextReader(path);
        }

        public void Close()
        {
            _reader.Close();
        }

        public bool Read()
        {
            return _reader.Read();
        }

        public int FieldCount
        {
            get { return _columns.Length; }
        }

        // SqlBulkCopy asks for each mapped column's ordinal once, so we simply
        // hand out ordinals in the order the columns are requested.
        public int GetOrdinal(string name)
        {
            _ordinalPosition++;
            return _ordinalPosition;
        }

        // Scans forward to the next element whose name appears in the
        // _columns array and returns its text content.
        public object GetValue(int i)
        {
            string value = String.Empty;

            try
            {
                while (_reader.Read())
                {
                    if (_reader.NodeType == XmlNodeType.Element)
                    {
                        if (!String.IsNullOrEmpty(_reader.Name) && _columns.Contains(_reader.Name))
                        {
                            value = _reader.ReadElementString(_reader.Name);
                            break;
                        }
                    }
                }
            }
            catch (Exception ex)
            {
                // log the exception!
            }

            return value;
        }

        public bool IsDBNull(int i)
        {
            throw new NotImplementedException();
        }

        public object this[string name]
        {
            get { return _reader[name]; }
        }

        public object this[int i]
        {
            get { return _reader[i]; }
        }

        // The remaining IDataReader/IDataRecord members (Depth, IsClosed,
        // RecordsAffected, NextResult, Dispose, GetName, GetFieldType, etc.)
        // are omitted here for brevity and can throw NotImplementedException.
    }
}
The _columns string array holds the column names we care about. You could also use a custom class for this if you like. The _path variable holds the path to the XML file, which is read by the XmlTextReader class. The heart of this reader is the GetValue method, which returns the next value from the XML file.
while (_reader.Read())
{
    if (_reader.NodeType == XmlNodeType.Element)
    {
        if (!String.IsNullOrEmpty(_reader.Name) && _columns.Contains(_reader.Name))
        {
            value = _reader.ReadElementString(_reader.Name);
            break;
        }
    }
}
The while loop searches for a matching element name. When a matching element is found, its value is extracted and returned to the caller.
Using the XmlDataReader boosts performance, since we no longer have to load all of the XML elements into memory before insertion.
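The reader simply scans forward for elements whose names appear in the _columns array, so it expects one element per field. The article does not show the actual Customers.xml file; a hypothetical layout that would work with this reader could look like this:
<Customers>
  <Customer>
    <FirstName>FirstName1</FirstName>
    <LastName>LastName1</LastName>
  </Customer>
  <Customer>
    <FirstName>FirstName2</FirstName>
    <LastName>LastName2</LastName>
  </Customer>
</Customers>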
Using the Custom XmlDataReader:
The code below shows how to use the XmlDataReader class.
private static void ImportDataFromXmlFile()
{
    string path = @"C:\Customers.xml";
    IDataReader reader = new XmlDataReader(path, new string[] { "FirstName", "LastName" });

    try
    {
        using (SqlBulkCopy copy = new SqlBulkCopy(_connectionString))
        {
            copy.ColumnMappings.Add("FirstName", "FirstName");
            copy.ColumnMappings.Add("LastName", "LastName");
            copy.DestinationTableName = "CustomersArchive";
            copy.BatchSize = 500;
            copy.SqlRowsCopied += new SqlRowsCopiedEventHandler(copy_SqlRowsCopied);
            copy.WriteToServer(reader);
        }
    }
    catch (Exception ex)
    {
        // log the exception!
    }
    finally
    {
        // Make sure the underlying XmlTextReader is closed.
        reader.Close();
    }
}
Conclusion:
In this article we learned how to use the SqlBulkCopy class to perform bulk insert operations. It is a very valuable little class in the Microsoft .NET 2.0 Framework that eliminates the need to write complex T-SQL logic or use DTS packages to perform bulk inserts.
I hope you liked the article, happy coding!
[Download Project]