Sunday, June 5, 2016

SSIS - Using Temp Tables in SSIS





Temporary
tables are created in the TempDB database, which persists for a particular session. The
objective is to maintain that session, until the temp table information
is used and dump the data into a physical table. To maintain a session in SSIS, there is a
‘Retain Same Connection’ property of the Connection Manager. If we need to
maintain a single session, we have to mark this property ‘True’.
Let us
start with a sample package. I will create the package step by step.
  1. Create two Connection manager’s – ‘Server 1′ and ‘Server 2′. Server 2 will be
    the server where we need to create the temp table using Server 1 and thereafter use that
    temp table in the queries executed on Server2.
    image
  2. Set
    the RetainSameConnection property of the Server 2 to True.
    image
  3. Drag an
    Execute SQL Task in the Control Flow.
    image
  4. Insert
    the SQL Statement create table #temptestusers (id int) inside the
    SQLStatement field in the Execute SQL Task Editor. This will create the temp table. Also set the Connection to
    Server2.
  5. Drag a
    DFT below the Execute Sql Task, which will insert the data in the temp table.
  6. Drag
    OLEDB source and OLEDB destination in the DFT
  7. In the
    OLEDB Source, enter the query
    SELECT 1 as id
    UNION ALL
    SELECT 2 as id
    
  8. The next
    step is to connect the ‘OLEDB Destination’.  Our objective is to load the data
    into a temporary table. However, you will not be directly able to select the temp
    table in Design mode. So, we will create a new temp table, if you do not have
    table creation rights, you can create a global temporary table and select
    that. Here, we will create a TestDestination table and map the Source and
    Destination columns :
  9. Now, as we mapped it in design mode, we can modify it
    later to use the Temporary table. To use Temporary table in the Destination, follow
    the below steps:
    a) Go to SSIS Menu > Select Work Offline
    o
    b) Select the OLEDB Destination > Properties >
    Set the OpenRowset property to use the temporary table ‘dbo.[#temptestusers]‘
  10. Now, we can verify the data in the temp table. First of
    all Deselect the ‘Work Offline’ mode, which was used in above step. To verify
    we will add other Data Flow Task which will query this temp table created above
    and will load the data from it to another server table. Add a  second DFT, ‘Use
    the temp table and insert in physical table’.
  11. Add OLEDB Source and Destination in the new DFT. Use
    SQL to select the rows from temp table select id from
    #Temptestusers
    .
  12. You will not be able to select it in design mode, as the
    temp table does not exist right now. So, first select from any other table and
    start the DFT working, then follow the same steps as shown above. ‘Go to
    the SSIS menu -> Select Work Offline -> Go to the OLEDB Source Properties
    -> Select SQLCommand and modify the query in the String Value editor to use
    the temp table.
  13. Next in the OLEDB Destination, select the server on which
    you have permission to create table, in this case I am selecting Server1.
  14. That’s it. The temporary table package is now completed.
    We can run the package and review the flow of data.
    2 Rows transferred into #temptestusers table.
    2 rows transferred to the physical table from temp table.
Hopefully this will help anyone to make use of
temporary tables in their design. The only thing is to make sure of is that we set
‘Work Offline’ flag on/off in the package.

SSIS - Temp table 3



Scenario:

We have create a SSIS Package for Upsert(Insert/Update). We get csv file with millions of records with (Id,Name,Address columns). If the record come with new Id , we need to insert that record in dbo.Customer table(id, name, address) and for existing IDs we need to update those records.

After doing some analysis, we got to know that the number of records those need to be updated on daily basis are minimum 100,000 per day. To perform above task we can use Lookup Transformation and find out existing and non existing records. Any non-existing IDs can be directly inserted into dbo.Customer table but for update we have to use OLE DB Command transformation. OLE DB Command transformation is slow, it will update one row at a time and for 100,000 records it will take long time.

How about inserting the records into some staging table and write TSQL Statement to Insert/update records? Good idea! It will be fast and easy to do. But my Architect do not want to create a new table :(

Solution:

Ok, How about we create Temp table and then use it in our package to perform the above task and once done, the Temp table will be gone!

Let's start with step by step approach

Step 1:

Prepare Source.csv file on desktop by using below data
Id,Name,Address
1,Aamir,ABC ADDRESS
2,Raza,Test Address
3,July, 123 River Side CA
4,Robert,540 Rio Rancho NM

Step 2:


Create dbo.Customer Table by using below script

USE TestDB
GOCREATE TABLE dbo.Customer
  (
     ID      INT,
     Name    VARCHAR(100),
     Address VARCHAR(100)
  )


Step 3: 

Create SSIS Package to load csv file into dbo.Customer Table.( Insert new records and update existing)
Create OLE DB Connection to the database where your dbo.Customer table exists. Right Click on Connection and then click properties or Click on Connection and press F4 to go to properties. 
Set RetainSameConnection=True. 
Fig 1: Set RetainSameConnection to True for OLE DB Connection


Step 4: 

Create ##Temp table by using Execute SQL Task as shown below by using 
Create Table ##Temp(ID INT, Name VARCHAR(100),ADDRESS VARCHAR(100))
Fig 2: Create ##Temp table by using Execute SQL Task


Step 5: 

Bring Data Flow Task to Control Flow Surface and then connect Execute SQL task to it. Inside Data Flow task bring Flat File Source and make connection to Source.csv file that you have created in Step 1.
Drag Lookup Transformation and configure as shown below. Our goal is to Insert any record which Id does not exist in dbo.Customer table and if ID exists we want to update that records. Instead of using OLE DB Command Transformation, we will insert records which needs to be update in ##Temp table inside Data Flow Task.
Fig 3: Configure Lookup Transformation ( Redirect rows to no match output)

Fig 4: Choose Id from dbo.Customer for lookup

Fig 5: Map the Source Id to dbo.Customer.ID for lookup

Step 6:

Bring OLE DB Destination Transformation from Data Flow Items as shown. Join No Match Output ( new records) of Lookup to OLE DB Destination and choose destination Table (dbo.Customer).
Fig 6: Insert new records by using No Match Output of Lookup Transformation

As we do not want to use OLE DB Command transformation for update inside Data Flow Task. Let's write all records those need to be update into ##Temp table by using OLE DB Destination. We will not be able to see ##Temp table in drop down in OLE DB Destination. Here are two steps we need to take
i) Create a variable with name ##Temp as shown below
Fig 7: TableName variable holding Temp Table Name

ii) Go to SSMS and create ##Temp table ( if you would not create this table, you will not be able to map the columns in OLE DB Destination)
Create Table ##Temp(ID INT, Name VARCHAR(100),ADDRESS VARCHAR(100))

Bring the OLE DB Destination and map to TableName Variable as shown below.
Fig 8: Configure OLE DB Destination to use TableName variable for Destination Table Name.


Fig 9: Map the Source Columns to ##Temp Table Columns

After all the configuration our Data Flow will look like below figure. I renames the transformation to provide better picture about what we are doing in this Data Flow Task.
Fig 10: Data Flow Task with ##Temp Table Destination.

Step 7:

Go to Control Flow Surface and Drag Execute SQL Task to write update statement.
UPDATE DST 
SET DST.Name=SRC.Name
,DST.ADDRESS=SRC.ADDRESS
FROM  dbo.Customer DST
INNER JOIN ##Temp SRC
ON DST.ID=SRC.ID

Fig 11: Execute SQL Task to Update Dbo.Customer from ##Temp 

Our final SSIS Package will look like below
Fig 12: Insert/Update Package by using Temp Table for Updates

If we try to run the SSIS Package, It might complain that ##Temp does not exists. Go to package properties by right clicking in Control Flow Pane and Set DelayValidation=True. By setting DelayValidation we are asking the package not to validate any objects as ##Temp table does not exist at this point and it will be created later in Package. 
Fig 13: Set Delay Validation=True

Run the Package couple of times and check the data in dbo.Customer table. Data should be loaded. Now let's go to Source.csv file and change some values for Name and Address columns and run the package one more time to make sure, Update logic is working fine.

Here is the data after update.
Id,Name,Address
1,Aamir1,Test  ADDRESS
2,Raza1,Test Address
3,July, 123 River Side CA USA
4,Robert,540 Rio Rancho NM

Fig 14: Package Execution After Updating Records in Source.csv file

As we can see that the records are updated, where ever we made changes in Name and Address values.

Fig 16: dbo.Customer data after Upsert




SSIS - temp table 2



SSIS - Temp table use in SSIS - 1



Problem
I'm trying to use a temp table in an SSIS package. It seems like everything is working correctly until I try to query the temp table. What am I doing wrong?
Solution
Creating temp tables in SSIS seems like a straight-forward process using the Execute SQL Task, however there are a couple of properties that must be changed. In this tip, we'll walk through creating a simple temp table in SSIS.

Creating Sample SSIS Package

First, I will drag an Execute SQL Task into my Design view and rename it Create Temp Table:
Creating temp tables in SSIS
Next, I will right click and edit and choose my connection and SQLStatement:
choose my connection and SQLStatement
(The SQL statement used in this example is below)
IF OBJECT_ID('tempdb..##tmpTeams') IS NOT NULL
    DROP TABLE ##tmpTeams
    CREATE TABLE ##tmpTeams
    (
        Team VARCHAR(255),
        Mascot VARCHAR(255),
  State VARCHAR (2)
    )
    INSERT INTO ##tmpTeams VALUES
       ('Auburn', 'Tigers', 'AL'),
       ('Alabama', 'Crimson Tide', 'AL'),
       ('Mississippi', 'Rebels', 'MS'),
       ('Louisiana State', 'Tigers', 'LA'),
       ('Mississippi State', 'Bulldogs', 'MS'),
 ('Arkansas', 'Razorbacks', 'AR'),
 ('Texas A&M', 'Aggies', 'TX'),
 ('Georgia', 'Bulldogs', 'GA'),
 ('Tennessee', 'Volunteers', 'TN'),
 ('Vanderbilt', 'Commodores', 'TN'),
 ('Florida', 'Gators', 'FL'),
 ('South Carolina', 'Gamecocks', 'SC'),
 ('Missouri', 'Tigers', 'MO')
Next, I will drag a Data Flow task into my Design view, rename it Query and drag the precedence constraint between the two:
rename it Query
For the Data Flow task we are going to query the temp table and export the results to a database table. Right click the Data Flow task and choose Edit. Drag a OLE DB Source and a OLE DB Destination task into the Design view.
Right click the Data Flow task
To avoid errors when configuring the OLE DB Source we need to create the temp table first using SSMS. In SSMS run the following statement to create a global temp table:
CREATE TABLE ##tmpTeams
    (
        Team VARCHAR(255),
        Mascot VARCHAR(255),
  State VARCHAR (2)
    )
Once the table has been created, let's go back into our SSIS package. Right click OLE DB Source and choose Edit. Choose your data source and choose SQL command for the Data access mode dropdown. In the SQL command text we will need to create our SQL statement:
SQL command text
Hit OK on the OLE DB Source window and right click OLE DB Destination and choose Edit. Choose your datasource and choose "Table or View - fast load" from the Data access mode dropdown. I already created a table called AlabamaTeams that I will use to display the data and will use this name in the dropdown for the Name of the table or the view.
Choose your datasource
Once finished configuring the OLE DB Destination hit OK. We are now ready to execute the package. Hit F5 to begin or choose Debug, Start Debugging from the Menu bar:
Start Debugging from the Menu bar
After executing the package, an error will occur on the Query task:
an error will occur on the Query task
If we look at the Progress tab we can see that the reason this error occurs is because after the temp table is created in the "Create Temp Table" task it is deleted. The "Query" task searches for the table but can't find it.
Create Temp Table

Fixing the Issue

To fix this issue, we will need to change a few properties on the tasks and connection manager. The first property is a connection manager property. If you right click the OLEDB Connection Manager and choose properties you will see a property called RetainSameConnection. This must be set to "True". RetainSameConnection means that the temp table will not be deleted when the task is completed.
The second property we must change is the Execute SQL Task property. If you right click on each task (Create Temp Table and Query) and choose properties you will see a property called DelayValidation. This must be set to "True". DelayValidation means that the task will not check if the table exists upon creation.
the Execute SQL Task property
Once we have set both properties to "True", execute the package again. The package should be successful this time:
The package should be successful
If we switch back over to SSMS and query the table, AlabamaTeams, we should return a result set:
switch back over to SSMS and query the table
**Note: I'm using global temp tables (##tmpTeams) instead of local temp tables (#tmpTeams) because when I create the table using SSMS the first time it uses a different session (SPID) therefore when I try to configure my OLE DB Source it cannot find the temp table and gives the following error:
OLE DB Source
With that said, only global temp tables should be used. Also, make sure when you are creating your global temp tables that the table name does not interfere with any other global temp tables that may be created using other processes.
Next Steps
  • I would assume that when creating a temp table inside a SSIS package that other tasks in the package will use the temp table, therefore I suggest adding another task that will drop the temp tables at the end of the package. For example, DROP TABLE ##tmpTeams.
  • Check out this tutorial to learn more about temp tables.





SSIS - Temp table use in SSIS


How to create a temporary table in SSIS control flow task and then use it in data flow task?

...use a variable instead, after all SSIS is meant to work in memory. 

solution:

Set the property RetainSameConnection on the Connection Manager to True so that temporary table created in one Control Flow task can be retained in another task.
Create a stored procedure that will create a temporary table named ##tmpStateProvince and populate with few records. 

The sample SSIS package will first call the stored procedure and then will fetch the temporary table data to populate the records into another database table. 

USE Sora;
GO

CREATE PROCEDURE dbo.PopulateTempTable
AS
BEGIN

    SET NOCOUNT ON;

    IF OBJECT_ID('TempDB..##tmpStateProvince') IS NOT NULL
        DROP TABLE ##tmpStateProvince;

    CREATE TABLE ##tmpStateProvince
    (
            CountryCode     nvarchar(3)         NOT NULL
        ,   StateCode       nvarchar(3)         NOT NULL
        ,   Name            nvarchar(30)        NOT NULL
    );

    INSERT INTO ##tmpStateProvince 
        (CountryCode, StateCode, Name)
    VALUES
        ('CA', 'AB', 'Alberta'),
        ('US', 'CA', 'California'),
        ('DE', 'HH', 'Hamburg'),
        ('FR', '86', 'Vienne'),
        ('AU', 'SA', 'South Australia'),
        ('VI', 'VI', 'Virgin Islands');
END
GO
Create a table named dbo.StateProvince that will be used as the destination table to populate the records from temporary table. Use the below create table script to create the destination table.
USE Sora;
GO

CREATE TABLE dbo.StateProvince
(
        StateProvinceID int IDENTITY(1,1)   NOT NULL
    ,   CountryCode     nvarchar(3)         NOT NULL
    ,   StateCode       nvarchar(3)         NOT NULL
    ,   Name            nvarchar(30)        NOT NULL
    CONSTRAINT [PK_StateProvinceID] PRIMARY KEY CLUSTERED
        ([StateProvinceID] ASC)
) ON [PRIMARY];
GO
Create an SSIS package using Business Intelligence Development Studio (BIDS). Right-click on the Connection Managers tab at the bottom of the package and click New OLE DB Connection... to create a new connection to access SQL Server 2008 R2 database.
Connection Managers - New OLE DB Connection
Click New... on Configure OLE DB Connection Manager.
Configure OLE DB Connection Manager - New
Perform the following actions on the Connection Manager dialog.
  • Select Native OLE DB\SQL Server Native Client 10.0 from Provider since the package will connect to SQL Server 2008 R2 database
  • Enter the Server name, like MACHINENAME\INSTANCE
  • Select Use Windows Authentication from Log on to the server section or whichever you prefer.
  • Select the database from Select or enter a database name, the sample uses the database name Sora.
  • Click Test Connection
  • Click OK on the Test connection succeeded message.
  • Click OK on Connection Manager
Connection Manager
The newly created data connection will appear on Configure OLE DB Connection Manager. Click OK.
Configure OLE DB Connection Manager - Created
OLE DB connection manager KIWI\SQLSERVER2008R2.Sora will appear under the Connection Manager tab at the bottom of the package. Right-click the connection manager and click Properties
Connection Manager Properties
Set the property RetainSameConnection on the connection KIWI\SQLSERVER2008R2.Sora to the value True.
RetainSameConnection Property on Connection Manager
Right-click anywhere inside the package and then click Variables to view the variables pane. Create the following variables.
  • A new variable named PopulateTempTable of data type String in the package scope SO_5631010 and set the variable with the value EXEC dbo.PopulateTempTable.
  • A new variable named FetchTempData of data type String in the package scope SO_5631010 and set the variable with the value SELECT CountryCode, StateCode, Name FROM ##tmpStateProvince
Variables
Drag and drop an Execute SQL Task on to the Control Flow tab. Double-click the Execute SQL Task to view the Execute SQL Task Editor.
On the General page of the Execute SQL Task Editor, perform the following actions.
  • Set the Name to Create and populate temp table
  • Set the Connection Type to OLE DB
  • Set the Connection to KIWI\SQLSERVER2008R2.Sora
  • Select Variable from SQLSourceType
  • Select User::PopulateTempTable from SourceVariable
  • Click OK
Execute SQL Task Editor
Drag and drop a Data Flow Task onto the Control Flow tab. Rename the Data Flow Task as Transfer temp data to database table. Connect the green arrow from the Execute SQL Taskto the Data Flow Task.
Control Flow Tab
Double-click the Data Flow Task to switch to Data Flow tab. Drag and drop an OLE DB Sourceonto the Data Flow tab. Double-click OLE DB Source to view the OLE DB Source Editor.
On the Connection Manager page of the OLE DB Source Editor, perform the following actions.
  • Select KIWI\SQLSERVER2008R2.Sora from OLE DB Connection Manager
  • Select SQL command from variable from Data access mode
  • Select User::FetchTempData from Variable name
  • Click Columns page
OLE DB Source Editor - Connection Manager
Clicking Columns page on OLE DB Source Editor will display the following error because the table##tmpStateProvince specified in the source command variable does not exist and SSIS is unable to read the column definition.
Error message
To fix the error, execute the statement EXEC dbo.PopulateTempTable using SQL Server Management Studio (SSMS) on the database Sora so that the stored procedure will create the temporary table. After executing the stored procedure, click Columns page on OLE DB Source Editor, you will see the column information. Click OK.
OLE DB Source Editor - Columns
Drag and drop OLE DB Destination onto the Data Flow tab. Connect the green arrow from OLE DB Source to OLE DB Destination. Double-click OLE DB Destination to open OLE DB Destination Editor.
On the Connection Manager page of the OLE DB Destination Editor, perform the following actions.
  • Select KIWI\SQLSERVER2008R2.Sora from OLE DB Connection Manager
  • Select Table or view - fast load from Data access mode
  • Select [dbo].[StateProvince] from Name of the table or the view
  • Click Mappings page
OLE DB Destination Editor - Connection Manager
Click Mappings page on the OLE DB Destination Editor would automatically map the columns if the input and output column names are same. Click OK. Column StateProvinceID does not have a matching input column and it is defined as an IDENTITY column in database. Hence, no mapping is required.
OLE DB Destination Editor - Mappings
Data Flow tab should look something like this after configuring all the components.
Data Flow tab
Click the OLE DB Source on Data Flow tab and press F4 to view Properties. Set the property ValidateExternalMetadata to False so that SSIS would not try to check for the existence of the temporary table during validation phase of the package execution.
Set ValidateExternalMetadata
Execute the query select * from dbo.StateProvince in the SQL Server Management Studio (SSMS) to find the number of rows in the table. It should be empty before executing the package.
Rows in table before package execution
Execute the package. Control Flow shows successful execution.
Package Execution  - Control Flow tab
In Data Flow tab, you will notice that the package successfully processed 6 rows. The stored procedure created early in this posted inserted 6 rows into the temporary table.
Package Execution  - Data Flow tab
Execute the query select * from dbo.StateProvince in the SQL Server Management Studio (SSMS) to find the 6 rows successfully inserted into the table. The data should match with rows founds in the stored procedure.
Rows in table after package execution
The above example illustrated how to create and use temporary table within a package.