Abstract:
Compact data structures are storage structures that combine a compressed
representation of the data with access mechanisms that retrieve individual items
without decompressing from the beginning. The goal is to keep the data compressed
at all times, even in main memory, since it can be processed directly in that form.
This approach has several benefits: larger datasets fit in main memory, the memory
hierarchy is used more effectively, and, in distributed computing scenarios,
bandwidth is saved without spending time compressing and decompressing data
during exchanges.
In this work, we follow a compact data structure approach to design a storage
structure for raster data, which is commonly used to represent spatial attributes
(temperature, pressure, elevation, etc.) in geographical information systems. As is
common in compact data structures, our new technique not only stores and directly
accesses compressed data but also indexes its content, thereby accelerating
query execution.
Previous compact data structures designed to store raster data work well when the
raster dataset has few distinct values. However, as the number of distinct values
in the raster increases, their space consumption and search performance degrade. Our
experiments show that our storage structure improves on previous approaches in all
aspects, especially when the number of distinct values is large, which is critical
when working with real datasets. Compared with netCDF, a classical format for storing
rasters, our method is competitive in space and excels in access and query times.