|Previous Article||matchIT SQL Index||Next Article|
The SSIS Tasks are installed with matchIT SQL by default. During installation matchIT SQL detects which version(s) of SQL server you have installed, and, providing SQL Server Integration Services are installed, registers the appropriate assemblies to the Global Assembly Cache, as well as copying them to the DTS folder of the installation. Note that all of the following SSIS tasks use the same file (Template.xml located in the SSIS subfolder within the matchIT SQL install folder) as their base for creating XML configuration files for their processes. This template contains the paths to the matching matrices and master priorities files that are used in the processes, which can be manually amended if needed to point to different files. As well as the Tasks, a demo SSIS package is installed to the demo folder in matchIT SQL, under an SSIS subfolder. There are versions for SQL 2005, SQL 2008, SQL 2012 and SQL 2014, all of which are configured to work with the demo database provided with matchIT SQL. Once the demo database is installed and the setup step below for the tasks has been completed, the packages should be able to run out of the box – the only things that may need amending are references to connection strings, which can be done quite easily in BIDS/SSDT. If the connection strings do need to be amended, you may also need to check that the mappings in the Generate Keys components are persisted.
Before the tasks can be used in BIDS or Visual Studio, they need to be added to the toolbox in the control flow pane (Note this is not applicable to BIDS/VS 2010+). You can do this by right-clicking on the toolbox and selecting ‘Choose Items…’ The ‘Choose Toolbox Items’ dialog will appear after a while, in which you need to select the ‘SSIS Control Flow Items’ tab. In this tab you should see the matchIT SQL tasks currently unselected (they are identifiable by the fact that they are named using the prefix ‘MISQL.’). Simply select the tasks by checking the checkbox next to each one, and click ‘OK’.
Below is a brief description of each task and what it does. For a visual representation of some typical SSIS processes and some of the tasks themselves, please see The SSIS Tutorial for MatchIT SQL.
MISQL.GenerateCorrectedAddresses – This task should be the first task in a sequence if it is to be used at all. Its function is to generate corrected addresses from a given source and output them in a specified output format. This task basically wraps the same core functionality as the procedure msp_GenerateCorrectedAddresses. The specified output table in this task is persisted in a variable at the SSIS package level at design time so that it can be picked up and used in the next task.
MISQL.GenerateNCOAAddresses – If you have US addresing the this task will usually follow the GenerateCorrectedAddresses task, but can also be run standalone or even following the GenerateKeys task to allow for maximum flexibility. Its purpose is to identify individuals, families, and companies that have changed their address (i.e. moved), to help keep a database up to date. Please see this article for further information on the msp_GenerateNCOAAddresses stored procedure, which provides the same functionality as this task. Note that this service is currently only available for US addresses, to licensed users only.
MISQL.GenerateKeys – This task should be the first task in a sequence unless any of the previous tasks are also being used. This task generates the match keys required to be used in the matching tasks from the specified source data. Mappings are made from source columns to their relevant matchIT API field. This task encapsulates the same core functionality as msp_CreateKeysTable and msp_BulkGenerateKeys. By setting up a GenerateKeys task, you are effectively setting up a ‘Data Source’ with an ID, which is used as a reference in following matching and grouping tasks.
MISQL.FindMatches – This task should follow on from a GenerateKeys task. It is used to set up and execute fuzzy matching based on specified match keys and minimum score thresholds, and is pointed at a data source set up by a GenerateKeys task. The core functionality used by this task is the same as the procedure msp_FindMatches.
MISQL.GroupMatches – This task should follow on from a FindMatches task. It is used to group the results produced by a FindMatches task. Different types of tables can be selected to be output and the names of the tables can be specified. The core functionality of this task encapsulates the procedures msp_GroupMatches, msp_OutputMatchingPairs, msp_OutputMatchingGroups, msp_OutputDuplicates and msp_OutputDedupedTable.
MISQL.FindExactMatches – This task should again follow on from a GenerateKeys task. It's basically the same as the FindMatches task, only instead of fuzzy matching, exact matching is applied using the keys specified. The core functionality used is the same as msp_FindExactMatches.
MISQL.GroupExactMatches – This task should follow on from a FindExactMatches task. It is used to group the results produced by a FindExactMatches task. The core functionality is the same as the procedure msp_GroupExactMatches.
MISQL.FindOverlap – This task should follow on from two GenerateKeys tasks. It is used to set up and execute fuzzy overlapping between the two data sources set up in the preceding GenerateKeys tasks. The match keys and score thresholds to be used can be configured. The core functionality used by this task is the same as the procedure msp_FindOverlap.
MISQL.GroupOverlap – This task should follow on from a FindOverlap task. It is used to group the results produced by a FindOverlap task. Different types of tables can be selected to be output and the names of the tables can be specified. The core functionality of this task encapsulates the procedures msp_GroupOverlap, msp_OutputOverlapMatchingPairs, msp_OutputOverlapMatchingGroups, msp_OutputOverlapDuplicates and msp_OutputOverlapDedupedTable.
MISQL.FindExactOverlap – This task is basically the same as the FindOverlap task, only instead of fuzzy matching, exact matching is applied using the keys specified. The core functionality used is the same as msp_FindExactOverlap.
MISQL.GroupExactOverlap – This task should follow on from a FindExactOverlap task. It is used to group the results produced by a FindExactOverlap task. The core functionality is the same as the procedure msp_GroupExactOverlap.
|Previous Article||matchIT SQL Index||Next Article|