T1.3 – Performance Benchmarking and Evaluation
A benchmarking strategy will be developed and the platform's performance will be evaluated against well-established benchmarks such as the Sequoia 2000 and Paradise benchmarks³. In particular, the benchmark will be based on OSM and Natural Earth datasets, complemented by synthetic social data from the LOD2 Social Intelligence Benchmark. Further, to support benchmarking tests on very large datasets, we will develop a suitable infrastructure by (1) evaluating existing high-performance, compact GIS storage formats in order to adopt best practices (e.g. Oracle Spatial, PostGIS), and (2) developing a suitable extract-transform-load process for geospatial data, including coordinate transforms and schema mapping (e.g. GDAL/OGR, GeoKettle). This infrastructure work is itself a subject of benchmarking, and it also enables benchmarks that target diverse geospatial workloads: localized, potentially high-frequency accesses for route planning and querying, as well as aggregation queries that span large portions of the map and join with thematic data.
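As an illustration of the coordinate-transform and schema-mapping step, the following is a minimal sketch in plain Python. It assumes a source record with hypothetical fields `id`, `lon`, `lat` and `name`, and a hypothetical target schema (`feature_id`, `x_m`, `y_m`, `label`); neither schema comes from the project itself. The projection shown is spherical Web Mercator, chosen only because it is compact to write out. In practice a library such as GDAL/OGR would perform this step.

```python
import math

# WGS84 semi-major axis in metres, as used by spherical Web Mercator.
EARTH_RADIUS_M = 6378137.0


def wgs84_to_web_mercator(lon_deg, lat_deg):
    """Project a WGS84 longitude/latitude pair (degrees) to Web Mercator metres."""
    x = EARTH_RADIUS_M * math.radians(lon_deg)
    y = EARTH_RADIUS_M * math.log(math.tan(math.pi / 4 + math.radians(lat_deg) / 2))
    return x, y


def transform_record(record):
    """Map one source record to the target schema (both schemas are hypothetical)."""
    x, y = wgs84_to_web_mercator(record["lon"], record["lat"])
    return {
        "feature_id": record["id"],
        "x_m": x,
        "y_m": y,
        "label": record.get("name", ""),  # a missing name becomes an empty label
    }
```

A load step would then apply `transform_record` to each extracted record before bulk insertion into the target store.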
The Sequoia workload is an initial checklist item for validating functionality and performance relative to existing GIS. GeoKnow will move beyond the state of the art by using and further refining the geospatial benchmark built in LOD2, emulating heavy drill-down style online access patterns that access large volumes of thematic data. An example is zooming in on a map shaded according to population density or real estate prices. Beyond this, benchmarking will cover more complex analytics, such as deciding where to locate services based on population distribution, competing services, road network capacity and the like. This last category cannot be expressed as a query and instead requires database-resident application logic.
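A drill-down access pattern of the kind described above could be emulated along the following lines. This is only a sketch under stated assumptions: the function name, the halving of the viewport at each zoom step, and the (min_lon, min_lat, max_lon, max_lat) box layout are illustrative choices, not part of the LOD2 benchmark definition.

```python
def drill_down_boxes(center_lon, center_lat, start_width_deg, start_height_deg, levels):
    """Return one bounding box per zoom step, halving the viewport each time.

    Each box is (min_lon, min_lat, max_lon, max_lat) in degrees, centred on
    the same point, so successive boxes emulate a user zooming in on a map.
    """
    boxes = []
    width, height = start_width_deg, start_height_deg
    for _ in range(levels):
        boxes.append((
            center_lon - width / 2,
            center_lat - height / 2,
            center_lon + width / 2,
            center_lat + height / 2,
        ))
        width /= 2
        height /= 2
    return boxes
```

Each box would parameterise one thematic query (for example, population density within the viewport), so a sequence of boxes forms one simulated user session.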
Further, to compare the GKS and its components with other state-of-the-art systems, we will use the benchmarking services of the SEALS project. This will provide an objective evaluation of the GKS and its components.
Benchmarks will be run early in the project so that performance can be tracked continuously.
Virtuoso's relational GIS functionality will be compared to PostGIS using the full-scale OSM data and typical OSM queries and updates. Further, an RDF version of this workload will be run on Virtuoso, and its performance will be compared with that attained with OSM in relational form on Virtuoso.
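A comparison of this kind needs the same workload to be timed identically on each system. The sketch below shows one possible shape for such a harness. It assumes each system is wrapped in a zero-argument callable that executes one query; the names `time_query` and `compare_systems` are hypothetical, and the choice of the median over repeated runs is an illustrative one rather than a prescribed methodology.

```python
import statistics
import time


def time_query(run_query, repetitions=5):
    """Run a zero-argument callable several times; return the median wall-clock seconds."""
    samples = []
    for _ in range(repetitions):
        start = time.perf_counter()
        run_query()
        samples.append(time.perf_counter() - start)
    return statistics.median(samples)


def compare_systems(runners, repetitions=5):
    """Time the same workload on several systems.

    `runners` maps a system label to a callable that executes the query on
    that system; the result maps each label to its median latency in seconds.
    """
    return {name: time_query(fn, repetitions) for name, fn in runners.items()}
```

In use, each callable would wrap a database connection for one system under test, so that the relational and RDF variants are measured by the same code path.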
Other Tasks in this Workpackage