Parallelizing InfoSphere Streams

To interpolate a large surface from massive scattered data more efficiently, tessellated spatial windows have been implemented in parallel on the data streams, following a divide-and-conquer approach. The scattered data in each tessellated region are used to interpolate the subsurface for that region only, and the subsurfaces are then recombined to generate the whole surface. Because each region is interpolated independently, the regions can be processed in parallel in InfoSphere Streams. Peter tested intra-PE load-split parallelization of the spatial interpolation operator on a dual-CPU machine (see the figure below, where the color scheme denotes processing elements). Processing the streams in parallel improves the efficiency of spatial interpolation significantly. Peter is now working on parallelization over multiple hosts, which is expected to improve performance further, since additional hosts make more CPUs available.

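[Figure 3: parallel processing of the spatial interpolation operator; colors denote processing elements]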
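
The divide-and-conquer scheme described above can be illustrated with a small Python sketch. This is not the InfoSphere Streams operator itself and not necessarily the interpolation method used in the project: inverse distance weighting, square tiles, and all function names and parameters (idw, split_into_tiles, interpolate_tile, interpolate_surface, tile_size, step, workers) are assumptions made for illustration only.

```python
from concurrent.futures import ThreadPoolExecutor


def idw(points, x, y, power=2.0):
    """Inverse-distance-weighted estimate at (x, y) from (px, py, value) samples.

    IDW is an assumed stand-in for whatever interpolation the operator uses.
    """
    num = 0.0
    den = 0.0
    for px, py, value in points:
        d2 = (px - x) ** 2 + (py - y) ** 2
        if d2 == 0.0:
            return value  # a sample sits exactly on the grid node
        weight = 1.0 / d2 ** (power / 2.0)
        num += weight * value
        den += weight
    return num / den if den else float("nan")


def split_into_tiles(points, tile_size):
    """Divide step: group scattered samples by the square tile they fall in."""
    tiles = {}
    for p in points:
        key = (int(p[0] // tile_size), int(p[1] // tile_size))
        tiles.setdefault(key, []).append(p)
    return tiles


def interpolate_tile(key, points, tile_size, step):
    """Conquer step: interpolate one tile's grid nodes from its own samples."""
    i, j = key
    n = int(round(tile_size / step))
    subsurface = {}
    for a in range(n):
        for b in range(n):
            x = i * tile_size + a * step
            y = j * tile_size + b * step
            subsurface[(x, y)] = idw(points, x, y)
    return subsurface


def interpolate_surface(points, tile_size=10.0, step=5.0, workers=2):
    """Interpolate every non-empty tile concurrently, then merge the results."""
    tiles = split_into_tiles(points, tile_size)
    surface = {}
    with ThreadPoolExecutor(max_workers=workers) as pool:
        futures = [
            pool.submit(interpolate_tile, key, pts, tile_size, step)
            for key, pts in tiles.items()
        ]
        for future in futures:
            surface.update(future.result())  # recombine the subsurfaces
    return surface


samples = [(1.0, 1.0, 10.0), (4.0, 4.0, 20.0), (12.0, 3.0, 30.0)]
surface = interpolate_surface(samples)
```

Each tile depends only on its own samples, so the work units share no state and can be handed to separate workers. In CPython, threads will not spread pure-Python arithmetic across CPUs; swapping in ProcessPoolExecutor (same interface) would be the closer analogue of spreading work over several processing elements. The sketch also ignores tile edges, where a real implementation would likely need overlapping windows to avoid seams between subsurfaces.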
