An Evaluation of Multi-Resolution Storage for Sensor Networks
by Deepak Ganesan, Ben Greenstein, Denis Perelyubskiy, Deborah Estrin, and
John Heidemann
In Proceedings of the First International Conference on Embedded Networked
Sensor Systems, pages 89 - 102, 2003
Paper: (pdf)
Slides: (ppt)
CiteSeer
ACM
Presented by Georg Wittenburg on
February 23, 2004 as part of
538A (201): Topics in Computer
Systems.
Slides: (ppt)
(pdf)
Background
The paper "An Evaluation of Multi-Resolution Storage for Sensor Networks" was
written by Deepak Ganesan, Ben Greenstein, Denis Perelyubskiy, Deborah Estrin,
and John Heidemann and published in the Proceedings of the First International
Conference on Embedded Networked Sensor Systems in 2003. Deepak Ganesan and Ben
Greenstein are PhD students at UCLA with an ongoing research interest in sensor
networks. Deborah Estrin is professor of computer science at UCLA and also holds
the positions of Director at the Center for Embedded Networked Sensing (CENS)
and of Associate Editor of the ACM Transactions on Sensor Networks. John
Heidemann is assistant professor at USC and has collaborated with Deborah Estrin
in the past.
Summary
The paper addresses the limitation of storage available in a sensor network
by proposing a hierarchical storage tree in which compressed summaries of
sub-trees are aggregated, aged and distributed among the nodes of the network.
These three techniques are then evaluated based on the implementation of the
DIMENSIONS systems, a structural diagram of which is shown below.
The key elements of the system are:
- Storage Hierarchy: Sensor nodes are organized in a tree
structure with each node in a given level aggregating the compressed summaries
of the data gathered by its children. This data may be raw sensor data or
summaries of the lower level. Wavelets are used for compression to preserve
characteristic patterns in the data while still avoiding the implosion of data
stored at the root of the tree. For load-balancing, the tree is randomly rebuilt
every epoch, thereby distributing the high-level summaries equally over all
nodes. The summaries at each level allow the user to gradually refine a query
and drill-down the storage hierarchy until the desired data is retrieved.
- Aging Algorithm: The aging algorithm takes the limited
storage available to the sensor network into account by implementing a graceful
degradation of quality of query results. Low quality summaries (which are
relatively small and stored near the root of the hierarchy) are preserved over a
longer period of time than raw sensor data. Based on an user supplied function
representing the desired quality over time, the algorithm tries to allocate
memory on each node in a way that matches this function as closely as possible.
Two approaches have been implemented: A training algorithm uses previously
retrieved data and usage patterns, and a greedy algorithm performs memory
allocation based on a few simple parameters. For evaluation purposes, an
omniscient algorithm has also been implemented.
The experimental evaluation of the system shows that the overhead of
communicating summaries can be amortized over many queries. Furthermore, aging
after prior training performs only 1% worse than the optimal solution. Greedy
aging with a good choice of parameters performs 5% worse than the optimal
solution.
In their description of the DIMENSIONS system, the authors make several
assumptions such as uniform deployment of sensor nodes in the physical world,
the network being homogenous, available time-synchronization, and equal size of
summaries both over nodes and time, i.e. constant data rate across the network.
Some of these assumptions have been addressed in follow-up work.
Discussion
- Data Replication / Redundancy: The redundancy that comes
with replication of summarized data over the sensor network is an advantage,
especially considering the potentially exposed nature of sensor nodes. Another
option would be to have a centralized storage, but this would cause far greater
traffic on the network as only a fraction of the raw data is ever accessed with
drill-down queries.
- Wavelets: Wavelets are a good choice for this application
as they generally preserve interesting patterns and are well suited to the data
that is collected with a continuous sensor over time. The actual wavelet and
further parameters should however be specified by an expert in the application
domain. This scheme may not get all random events, but it is certainly better
than ordinary Discrete Cosine Transform (DCT), and after all the storage is
limited so the compression needs to be lossy.
- Replication Overhead vs. Query Frequency: This is quite
application specific as it assumes frequent queries to compensate for the
overhead caused by transmitting the summaries. However, the general model of
sampling continuous values over time is pretty general.
- Tree Structure and Load-Balancing: An adaptable hierarchy
could be used to compensate for variable data rate and still have good
load-balancing, similar to different resolution being used in different regions
of an image. Furthermore, low level node interaction would be nice for
load-balancing at the same level in the hierarchy, maybe with the respective
supernode acting as broker
- Power Consumption: The paper doesn't directly address the
problem of power consumption. However, the sensor nodes are online most of the
time anyway as they continuously sample the environment. Hence they would need
some source of external power.
Further Reading
- Deepak Ganesan, Deborah Estrin, and John Heidemann.
"DIMENSIONS:
Why Do We Need a New Data Handling Architecture for Sensor Networks?"
In First Workshop on Hot Topics in Networks (Hotnets-I), October 2002. In
Proceedings of ACM Computer Communication Review, Volume 33, Number 1 (January
2003).
- Deepak Ganesan, Deborah Estrin, and John Heidemann.
"DIMENSIONS:
Distributed Multi-Resolution Storage and Mining of Networked Sensors"
- Deepak Ganesan, Sylvia Ratnasamy, Hanbiao Wang, and Deborah Estrin.
"Coping
with Irregular Spatio-Temporal Sampling in Sensor Networks" In 2nd
Workshop on Hot Topics in Networks (HotNets-II), November 2003.
- B. Greenstein, E. Kohler, D. Culler, and D. Estrin. "Distributed
Techniques for Area Computation in Sensor Networks" under
submission
- Amara Graps. "An
Introduction to Wavelets" In IEEE Computational Science and
Engineering, Summer 1995, vol. 2, num. 2
Georg Wittenburg - March 4, 2004
