True facts about... Extended postcodes
Posted: Fri Mar 27, 2015 8:16 am
Hello everyone,
Recently a customer asked for more details about how we handle extended postcodes internally, so I decided to collect some basics and share them here in the forum. Here we go. I hope this gives you a better understanding of what you can expect from them (and what they can't resolve!):
Countries like the USA, Great Britain/UK and the Netherlands use postcodes at different levels of detail, which show up as different character lengths. If the extended postcode is available for an input address, this usually lets us produce a detailed geocode on the basis of the postcode alone:
- UK has 7 characters (e.g. “AB101AA”) in extended mode, 5 characters in standard mode (“AB101”)
- USA has 9 digits (ZIP+4) in extended mode, 5 digits in standard mode
- Netherlands has 6 characters (4 digits plus 2 letters) in extended mode, 4 digits in standard mode
- PTV map data contains both:
  - standard data (based on standard codes + city names, street names, house numbers and coordinates)
  - extended data (based on the extended postcodes and their coordinates)
- If you enter a standard postcode we will search for it in the standard data (and only in the standard data) together with city name and street and so on. There will be no search on the extended data.
- If you enter an extended postcode, we look for a perfect match of the postcode in the extended data. City/district, street and house number are ignored in this step. If there is no perfect match, we fall back to the standard data. For the fallback we cut off the trailing characters of the postcode.
The basic search is always based on the standard postcode length plus city name and street.
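To make the lookup order described above easier to follow, here is a minimal Python sketch. It is not the geocoder's actual code: the function name `geocode_by_postcode`, the `STANDARD_LENGTH` table, the dictionary used as the extended index and the `standard_search` callback are all assumptions made up for illustration. The standard lengths come from the list above, and the input postcode is assumed to be written without spaces or dashes.

```python
from typing import Callable, Dict, Tuple

Coordinate = Tuple[float, float]

# Standard postcode lengths as listed above (illustrative table, not a product setting).
STANDARD_LENGTH = {"GB": 5, "US": 5, "NL": 4}


def geocode_by_postcode(
    country: str,
    postcode: str,
    city: str,
    street: str,
    extended_index: Dict[str, Coordinate],
    standard_search: Callable[[str, str, str], object],
) -> Tuple[str, object]:
    """Return ("extended", coordinate) or ("standard", result of the standard search)."""
    standard_len = STANDARD_LENGTH[country]
    if len(postcode) > standard_len:
        # Extended input: only the postcode counts, city and street are ignored here.
        hit = extended_index.get(postcode)
        if hit is not None:
            return ("extended", hit)
        # No perfect match: cut off the trailing characters and fall back.
        postcode = postcode[:standard_len]
    # Standard search: postcode together with city name and street.
    return ("standard", standard_search(postcode, city, street))
```

With a hypothetical index such as `{"AB101AA": (1.0, 2.0)}` (dummy coordinates), the input "AB101AA" is answered from the extended data, while "AB101ZZ" is shortened to "AB101" and handed to the standard search together with city and street.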
And here are some examples.
And some more statements straight from DEV DEPARTMENT (thanks to Jochen!):
What are ExtendedPostcodes?
The standard map providers (HERE, TOMTOM, AND) provide postcodes for each town, mostly at the 4- or 5-digit level, depending on the country. However, in some countries there are also postcodes with additional letters or numbers at the end that identify an address down to the street level or even further. These are called 'Extended Postcodes' in this document.
The data for the extended postcodes is mostly defined and provided by the national postal service. For each extended postcode it contains at least a coordinate, and sometimes additional fields such as Town, Townpart, Street and HouseNumbers.
At the time of writing, extended postcode data was available for the Netherlands and Great Britain. Only the data for the Netherlands also had the additional fields Town, Townpart, Street and HouseNumbers filled.
How are ExtendedPostcodes used when searching for addresses?
Extended postcodes are used only in the town and singlefield searches. The input is scanned for possible long postcodes, which are then used to search for exact matches in the extended postcode database. Note that only the postcode is used to find the results; all other input words are ignored.
A record is only returned if the input postcode matches a database entry exactly. However, minor differences in how the postcode is written are tolerated:
- the comparison is case insensitive
- the comparison does not care about spaces and dashes
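The two rules above boil down to comparing a normalized form of the postcode. Here is a minimal sketch in Python. The helper names `normalize_postcode` and `same_postcode` are made up for illustration and are not part of the geocoder API.

```python
def normalize_postcode(raw: str) -> str:
    """Drop spaces and dashes and upper-case the remaining characters."""
    return raw.replace(" ", "").replace("-", "").upper()


def same_postcode(a: str, b: str) -> bool:
    """Exact match on the normalized form; any other difference is a mismatch."""
    return normalize_postcode(a) == normalize_postcode(b)
```

So "ab10 1aa" and "AB101AA" would count as the same postcode, while a wrong letter or digit still fails the exact match.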
When an extended postcode record is returned, the normal classification algorithm is executed, also taking the other input fields into account. If the record contains no data for the town or street return fields, these fields are treated as exact matches.
If no suitable extended postcode match is found, the normal search is executed to find a result.
Where are ExtendedPostcodes stored?
As mentioned above, extended postcode data comes from a different provider than the standard map and is therefore not included in the normal map distribution. Instead, you have to obtain a separate postcode index file and copy it into the map directory.
There is one postcode index file per integration unit ('country'). It is named <country>.gpi and must be copied into the directory of that integration unit. The integration unit directory is the directory that contains e.g. the files <country>.lv1, <country>.gcd and/or <country>.bdg. If your map contains only <country>.gp files instead, you have to create a new folder named <country> next to the file <country>.gp and copy the <country>.gpi file into it.
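If you have to do this for several countries, the placement rule can be scripted. The following Python sketch only illustrates the rule described above. The function names `find_unit_dir` and `install_gpi` are made up, and it assumes that the map files can be found by file name somewhere below the map directory, so please check the result against your own map layout.

```python
import shutil
from pathlib import Path

# Files that mark an integration unit directory, as named in the text above.
UNIT_FILE_SUFFIXES = (".lv1", ".gcd", ".bdg")


def find_unit_dir(map_dir: Path, country: str) -> Path:
    """Locate (or create) the integration unit directory for `country`."""
    # Case 1: a directory that already holds <country>.lv1, .gcd or .bdg.
    for suffix in UNIT_FILE_SUFFIXES:
        hits = sorted(map_dir.rglob(country + suffix))
        if hits:
            return hits[0].parent
    # Case 2: only <country>.gp exists, so create a folder <country> next to it.
    gp_files = sorted(map_dir.rglob(country + ".gp"))
    if gp_files:
        unit_dir = gp_files[0].parent / country
        unit_dir.mkdir(exist_ok=True)
        return unit_dir
    raise FileNotFoundError(f"no map files for {country!r} under {map_dir}")


def install_gpi(map_dir: Path, country: str, gpi_source: Path) -> Path:
    """Copy the index file to <unit dir>/<country>.gpi and return the target path."""
    target = find_unit_dir(map_dir, country) / (country + ".gpi")
    shutil.copyfile(gpi_source, target)
    return target
```

A call like `install_gpi(Path("maps"), "nld", Path("downloads/nld.gpi"))` uses placeholder paths and a placeholder country name. After copying, the log output of the geocoder is still the place to confirm that the file was really loaded.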
How do I know that my gpi files have been loaded successfully?
While loading the map, the geocoder writes one message per integration unit to the log output, stating whether the extended postcode data was found and loaded successfully.
How do gpi files relate to the mapserver ezi files?
It is true that the two solutions seem similar, but the gpi files are not compatible with the old ezi files. There was only one ezi file per map, while there is one gpi file per integration unit. The file format and the structure of the records have changed, so mapserver will not load gpi files and gpGeocoder will not load ezi files.
How can I create my own gpi files?
This distribution also contains a tool for creating gpi files from CSV data. See Extended Postcode Importer Tool for more information.
Best regards Bernd