Master's Thesis, May 2006
by: Qiang Kong
supervisors: Tamara Munzner and
Raymond T. Ng
supervisory committee member:
Michiel van de Panne
We present the PowerSetViewer visualization system for the lattice-based
mining of powersets. Searching for items within the powerset of a universe
occurs in many large dataset knowledge discovery contexts. Using a spatial
layout based on a powerset provides a unified visual framework at three different
levels: data mining on the filtered dataset, browsing the entire dataset, and
comparing multiple datasets sharing the same alphabet. The features of our
system allow users to find appropriate parameter settings for data mining
algorithms through lightweight visual experimentation showing partial results.
We use dynamic constrained frequent set mining as a concrete case study to
showcase the utility of the system. The key challenge for spatial layouts based
on powerset structure is handling large alphabets, because the size of the
powerset grows exponentially with the size of the alphabet. We present scalable
algorithms for enumerating and displaying datasets containing between 1.5 and 7
million itemsets, with alphabet sizes of over 40,000.
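
To make the exponential-growth claim concrete, here is a minimal Python sketch. It is not the thesis's enumeration algorithm: the function names and the cardinality-then-lexicographic ordering are assumptions chosen only for illustration. It shows that an alphabet of n items has 2^n subsets, and lists them for a tiny alphabet.

```python
from itertools import combinations


def powerset_size(n: int) -> int:
    """Number of subsets of an alphabet with n items: 2**n."""
    return 2 ** n


def enumerate_by_cardinality(alphabet):
    """Yield every subset of the alphabet as a tuple.

    Subsets are ordered by size first, then lexicographically.
    This ordering is an illustrative assumption, not necessarily
    the one PowerSetViewer uses.
    """
    items = sorted(alphabet)
    for k in range(len(items) + 1):
        for subset in combinations(items, k):
            yield subset


if __name__ == "__main__":
    # A three-item alphabet already has 2**3 = 8 subsets.
    for subset in enumerate_by_cardinality("abc"):
        print(subset)

    # For an alphabet of 40,000 items the count needs 40,001 bits,
    # so explicit enumeration of the full powerset is infeasible.
    print(powerset_size(40_000).bit_length())
```

Because the full powerset cannot be materialized at this scale, any practical layout has to work from the itemsets actually present in the dataset rather than from all 2^n candidates.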