Stream processing

Geosensor networks present a range of challenges to traditional GIS architectures, including dealing with large volumes of highly dynamic data and the inherent unreliability of such geosensor networks. Traditional GIS architectures are well-adapted for offline algorithms. In an offline environment, an algorithm is expected to have complete information about the input data to be processed. However, emerging data sources, such as the RISERnet networks, are much more suited to online processing.

While the positions of sensor nodes in the field may be irregular, bushfire spread simulation tools, such as PHOENIX RapidFire, require regular, gridded inputs. Hence in order to connect the wireless sensor networks to such tools, the measurements need to be interpolated spatially. We took spatial interpolation as a case study to explore how spatiotemporal data streams should be handled in an online environment.

First of all, we compared the efficiency and accuracy of many established spatial interpolation algorithms and different spatial partition schemes (below) for parallel stream computing. It was found that more accurate algorithms and partitions are often less efficient. Thus the trade-off between efficiency and accuracy must be considered when designing a spatial interpolation operator in a stream environment, according to specific applications. For example, in cases where the interpolated surface is required as an input to a bushfire simulation system, higher accuracy interpolation is highly desirable, due to the likely amplification of errors in the interpolated surface due to error propagation. On the other hand, where the surface is required purely for visualization or real-time applications, more efficient but less accurate methods are more appropriate, especially for larger geosensor networks.







Further, we investigated efficient algorithms to conduct Ordinary Kriging on spatiotemporal data streams. Ordinary Kriging, which is the best linear unbiased estimator, is widely used for geospatial interpolation and estimation. Due to the cubic time complexity of solving the system of linear equations, ordinary Kriging for a large set of source points is computationally intensive. Conducting real-time Kriging interpolation over continuously varying spatiotemporal data streams can therefore be especially challenging. We developed two new strategies (incremental computation and recursive computation) for improving the performance of an ordinary Kriging interpolator adapted to a stream-processing environment. These strategies rely on the expectation that, over time, source data points will frequently refer to the same spatial locations (for example, where static sensor nodes are generating repeated observations of a dynamic field). These two strategies are evaluated in terms of their computational efficiency in comparison to ordinary Kriging algorithm. The results show that these two strategies can reduce the time taken to perform the interpolation by up to 90%, and approach squared average-case time complexity when most but not all source points refer to the same locations over time. By combining the approaches developed in this paper with existing heuristic ordinary Kriging algorithms, the conclusions indicate how further efficiency gains could potentially be accrued. The work ultimately contributes to the development of online ordinary Kriging interpolation algorithms, capable of real-time spatial interpolation with large streaming data sets.

Apart from spatial interpolation, the RISER team is developing an incremental algorithm for χ-shapes, an widely used footprint (a graph which characterizes the distribution of a set of points). The highlight of χ-shapes is the guaranteed regularity of the constructed polygons. This is accomplished by iteratively removing the longest exterior edge of the triangulation of input points that does not cause irregularity. Because of this unique rule, χ-shapes are more difficult to be incrementally constructed than other footprints. We proposed a dynamic data structure to maintain the edges removed by the χ-shape algorithm. When a new set of points is inserted, or when a set of existing points is deleted, the data structure can be modified using our algorithm so that the updated χ-shape can be obtained much more efficiently than using the original χ-shape algorithm. This algorithm can be used to characterize the shape of spatial points in a high volume stream. For example, spatial points within a sliding temporal window involve frequent insertions (new points included by the window) and deletions (old points evicted by the window). The insertions and deletions can be handled more efficiently using our algorithm than treating the data as an entirely new data set.


Related links