Previous Article | matchIT SQL Index | Next Article |
4.1.1.1 msp_GenerateCorrectedAddresses
Input Parameters:
- Configuration file – the file path of the configuration file to be used when this procedure is run.
- Datasource ID – specifies the data source to be used within the configuration file, which contains the table and column mapping specifications.
Generates corrected addresses for the source addresses specified in the supplied configuration. The corrected addresses will be written to the output table specified in the configuration, and can be subsequently used in matching processes.
The source and output mappings can be configured through the Web UI, or directly in the XML in the section called Addressing within the Datasource settings. The output table generated from the addressing step can then be used as the source table for your addresses within the matching step.
Setting |
Description |
dataSources |
Specifies the database connection, table and column mappings used to define the dataset being processed. To use the resulting corrected address table produced on completion of this process, you should add this table into the datasource. |
addressing->databaseSettings->SourceSettings->sourceTables |
Table name containing the source address data that is to be used during processing. |
addressing->databaseSettings->SourceSettings->sourceColumns |
This allows the mapping of the source table’s columns to specific address data types. |
addressing->databaseSettings->OutputSettings->outputTable |
Name of the table containing the corrected addresses that will be produced following completion of this process. |
addressing->databaseSettings->OutputSettings->outputColumns |
This allows the configuration of the output columns. The CorrectedMapping attribute is used to specify the address element that should go into a specific field should the source address be validated. The SourceMapping attribute specifies the source table’s data field that will populate this field, should the source address fail validation. |
addressing->databaseSettings->OutputSettings->outputColumns->TIGERData |
This section allows configuration of TIGER Data appending (US Only). The enabled attribute specifies whether or not appending is turned on, and the onlyCodeValidAddresses attribute specifies whether or not to append data for non-verified addresses also. Each TIGER field can be specified as a sub node of this node, with the name of the field as the name of the node, and an enabled attribute to specify whether or not to append the given field. This defaults to false. See this section for an explanation of each TIGER Data field. (Only applicable for US addressing and is licensed separately from standard US addressing.) |
addressing->apiSettings->threadCount |
Number of threads that may be used during the processing. Recommendation is to use 1 thread per machine core. |
addressing->apiSettings->defaultAPI |
When licenced for the UK and US addressing APIs, but not the international addressing API, this setting specifies which addressing API (UK or US) to use for international addresses. |
addressing->progressLoggingSettings->filePath |
Location of a progress log file which is output during Address processing. |
addressing->progressLoggingSettings->interval |
How frequently updates will be written to the progress log file (in milliseconds). |
4.1.1.2 UK Addressing Settings
Setting |
Description |
Addressing->apiSettings->enabledUK |
When set to “false” the UK addressing API is not used even when licensed. Default “true”. |
addressing->apiSettings->processType |
Value can be NameCorrection, AddressCorrection or Postcode correction. Postcode correction is an option for UK addressing only and indicates that only postcodes are to be updated and not address lines. NameCorrection requires a separately licensed consumer names database and verifies individual name as well as address.DPSOnly is an option for UK addressing that appends the DPS code without changing the address or postcode. |
addressing->apiSettings->scoreThresholds |
Any address level match falling below the “address” score threshold and any postcode level match falling below the “postcode” score threshold is downgraded to “tentative”. The default values of 50 for both address and postcode are already quite strict. If your input address data contains a lot of extraneous text you might want to consider lowering these thresholds. You can safely lower these thresholds to 0 – in which case you will be relying solely on the addressing engine’s internal checks. The “name” score threshold is only relevant when process type is NameCorrection. |
addressing->apiSettings->capitalisePostTown |
When set to True, postal towns will be capitalised in the output table. |
addressing->apiSettings->keepNonPafData |
When set to True, matchIT SQL will attempt to keep non PAF elements (i.e. extra data that may not be in the postal address file) when updating addresses. Sub options control which types of non-PAF data to keep: |
addressing->apiSettings->preventCompanyUpdates |
Stops organizations from being updated. |
addressing->apiSettings->advancedFuzzy |
When Advanced Fuzzy is disabled, the fuzzy matching algorithm employed only allows for one or two spelling errors in an entire name. Advanced Fuzzy acts at a word level, and allows names to be matched even if some words are missing (depending on how highly occurring the missing words are). Default “true” (Advanced Fuzzy enabled). |
addressing->apiSettings->resubmitFailures |
When enabled, failures are resubmitted with the first two populated address lines swapped. This is useful when subpremise details are on the line following the street name, instead of the line preceding the street name, e.g. “15 Grand Avenue, Flat 1”. Default “true”. |
addressing->apiSettings->removePostcodeFromInputAddress |
When a partial match is found only the postcode field is updated - any output address fields will be populated from the source address fields. With this option enabled a postcode or outward code found in the source address fields will be removed. |
addressing->apiSettings->referenceDatabase |
The name of a database configuration to use. This must be the name of a database configuration (or pool) defined in mcconfig.ini (see below). Normally this will be “PAF”. |
addressing->apiSettings->outputDefaultDPS |
With this option enabled a default “9Z” DPS code will be output when the address does not match to premise level but has a valid postcode. |
addressing->apiSettings->addressImprovement |
This option is only relevant with process type NameCorrection. When enabled the input person name is used to resolve address ambiguities. |
The UK addressing engine uses a service called “Capscan Pool Manager” – this manages a pool of server processes that show up in task manager as “mcserver”. The configuration of these server processes is via a file in the UK address data installation folder, called mcconfig.ini. The matchIT SQL setup process installs and configures this automatically, so you shouldn’t normally need to do any configuration manually, but you might want to increase the number of mcserver processes if running on a machine with multiple cores – to do this increase the values of InitialServers and MaxServers in parallel. Any changes to mcconfig.ini will take effect when the “Capscan Pool Manager” service is restarted. The following table lists the other settings in the mcconfig.ini file:
Section |
Setting |
Description |
Link |
Host |
Must be localhost. |
Link |
DefConnectionMode |
Must be 0. |
Link |
LogFile |
Location of error log file. |
Link |
LicPath |
Location of license files. |
Pool1 |
Name |
Must be “PAF”. |
Pool1 |
InitialServers |
The initial number of mcserver processes to launch. If the machine has multiple cores, set this to about half the number of cores. |
Pool1 |
NewServers |
The number of new mcserver processes to launch if none are available. This is only used in the unlikely event of an mcserver process crashing and needing to be replaced. Set to 1. |
Pool1 |
MaxServers |
The maximum number of mcserver processes to launch. Set this to the same value as InitialServers. |
Pool1 |
Module |
Path and name of the addressing engine dll: cpsvrmc5.dll |
Pool1 |
ServerPath |
Path and name of the addressing server: mcserver.exe |
Pool1 |
PAF |
The address database(s) to search. Normally just Capscan.paf. |
Pool1 |
RCDB |
The folder containing overlay data files. E.g. NSPDO.RCD (National Statistics Postcode Directory). |
Pool1 |
MCDParam |
Path and name of a file containing advanced configuration settings for addressing engine. |
Pool1 |
AddrFrmt |
Path and name of a file containing address formatting configuration. |
4.1.1.3 UK Addressing Input Fields
Name |
Description |
Organization or Company |
Specifies the company name. |
Address1 – Address8 |
Up to 8 address lines can be specified. |
Town or City |
Specifies the posttown. |
County or Region or State |
Specifies the county. |
Postcode or Zip |
Specifies the postcode. |
4.1.1.4 Output Fields
The following result code fields are always output:
Name |
Description |
AddrEngine |
This indicates the addressing API (UK, US, INTernational) that each record is verified with (this is only output if licensed for multiple APIs). |
PafFlag |
A numeric representation of the PafDesc column. |
PafDesc |
Indicates the level at which an address was validated (“Verified”, “Good”, “Partial”, “Tentative”, and “NoMatch”). See engine specific sections here for details. |
PafDescExtra |
|
ErrorCode |
See engine specific description here. |
AddrScore |
See engine specific description here. |
UpdateFlag |
See engine specific description here. |
User specified output fields have a source mapping and one or more corrected mappings.
Source Mapping - The default mapping for data to populate in the selected column when no address match can be found for the given input. Available values will be the Address Types assigned to the Source Columns. Select 'None' for no default data.
Corrected Mappings - The corrected elements to populate in the selected column when a match is found. See the available engine specific output fields in the following sections. Multiple elements are permitted for a single output column, and will be separated with a space when combined.
4.1.1.5 UK Addressing Output Fields
Corrected Mappings for UK Addressing may include:
Name |
Description |
Address1-8 |
A combination of the address elements listed below depending on their availablitity in the PAF file in the following order, SubBuildingName & BuildingName, BuildingNumber & Thoroughfare, DependantLocality. The Town and County details are output to specific Town and County fields. |
Organisation |
The organisation name listed on the PAF file. |
BuildingName |
The building name of the house or commercial premises. |
BuildingNumber |
The number of the building on a thoroughfare. |
SubBuilding |
When a building is split into a number of flats or business units, a sub building name will be returned. |
Thoroughfare |
The street that contains the delivery point. |
DependentThoroughfare |
If a thoroughfare exists more than once within a town, the PAF file may contain additional information to uniquely identify each one. |
DoubleDependentLocality |
Further subdivision within a DependentLocality. |
DependentLocality |
Locality (e.g. village or borough name within a town) used to differentiate between streets with the same name. |
Town |
The postal town for the address |
County |
The country the address relates to. |
Postcode |
The postcode the address relates to. |
POBox |
The POBox listed on the PAF for the address. |
AddressKey |
A combination of the 8 digit address key and 8 digit organisation key that uniquely identifies each delivery point padded with a leading 0 (where there is no organisation, 0’s will be used). |
DeliveryPointSuffix |
The delivery point associated with the address to uniquely identify the postcode. |
Barcode |
CBC Barcode |
Easting |
5 digit code relating to the location of the postcode to the National Grid. |
Northing |
5 digit code relating to the location of the postcode to the National Grid. |
CountryCode |
ONS Country Code: L93000001 - Channel Islands E92000001 - England M83000003 - Isle of Man N92000002 - Northern Ireland S92000003 - Scotland W92000004 - Wales |
Latitude |
WGS84 (World Geodetic Standard 1984) datum Latitude in degrees decimal. E.g. 51.5267183130 |
Longitude |
WGS84 (World Geodetic Standard 1984) datum Longitude in degrees decimal. E.g. -0.1888023154 |
LEA |
Local Education Authority |
DHA |
Health Area Code |
PCG |
Primary Care Trust Code |
OSCTY |
Ward County Code |
OSLAUA |
Ward District Code |
OSWard |
Ward Code |
GridQuality |
Grid Quality |
GOR |
Government Office Region Code |
CONS |
Constituency |
EER |
European Electoral Region |
Previous Article | matchIT SQL Index | Next Article |
0 Comments