InfoVis 2003 Contest - TreeJuxtaposer Entry
James Slack, Tamara Munzner
{jslack,tmm}@cs.ubc.ca
University of British Columbia
Francois Guimbretiere
francois@cs.umd.edu
University of Maryland
See InfoVis 2003 Contest rules and task at http://www.cs.umd.edu/hcil/iv03contest/
Ratings used below:
(Strength, Possible, Difficult, Not Available)
Pairwise comparisons of trees: Topological changes
Pairwise comparisons of trees: Attribute value changes
- Task: Global impression: did things change a lot or not?
- Rating:
Not Available
- Process:
Not Applicable
- Image:
Not Applicable
- Answer:
Since TreeJuxtaposer is not designed to handle attributes at this time, the attribute section of the contest will not be considered.
- Task: What nodes or subtrees changed the most?
- Rating:
Not Available
- Process:
Not Applicable
- Image:
Not Applicable
- Answer:
Since TreeJuxtaposer is not designed to handle attributes at this time, the attribute section of the contest will not be considered.
- Task: Did the value of attribute XYZ for this node increase or
decrease?
- Rating:
Not Available
- Process:
Not Applicable
- Image:
Not Applicable
- Answer:
Since TreeJuxtaposer is not designed to handle attributes at this time, the attribute section of the contest will not be considered.
General visualization of trees: Topology
- Task: Overall characteristics (How large is the tree?)
- Rating:
Possible
- Process:
The size of the tree can be judged from the number of leaf nodes, since each leaf is allocated a share of vertical screen space relative to the number of nodes in the tree. The keyboard arrow keys are useful for determining how dense the nodes are at the leaf level.
- Image:
Shown elsewhere
- Answer:
- A rough estimate of the size of the tree can be obtained with the process described above. An exact number of named nodes is returned by the Find panel if all nodes are selected.
- The depth of the tree cannot be determined as easily, and the deepest branch cannot be identified visually. Since the trees are right-aligned, the depth of subtrees is also not visually obvious.
- Task: Path (What is the path of this node?)
- Rating:
Strength
- Process:
The path from a node to the root is found with the left arrow key, which visits the parent of the current node.
- Image:
Expanded path of homo sapiens in animalia tree
- Answer:
The homo sapiens example shows the path marked from leaf to root. A fully qualified naming system would show the path in the name of the nodes in the Find panel, which is also supported by TreeJuxtaposer.
- Task: Local relatives (What are the children, siblings, or cousins of this node?)
- Rating:
Strength
- Process:
The children of a node can be stepped through with the keyboard: the right arrow selects the first child, and the down arrow then cycles through the children. Growing the node with the keyboard or mouse is a much better alternative, since all children that can be displayed become visible, whereas the keyboard method highlights only one node at a time. Cousins, however, are most easily found with the keyboard method: when the last child is selected, the next down arrow (or up arrow, depending on the direction) selects the closest cousin.
- Image:
Shown elsewhere
- Answer:
See process for information
- Task: Filtering by level (Show only the first level, or show only 3 levels down, or remove all the leaves)
- Rating:
Not Available
- Process:
Not Applicable
- Image:
Not Applicable
- Answer:
TreeJuxtaposer cannot filter by rank or level.
- Task: Evaluating the number of nodes?
- Rating:
Strength
- Process:
The largest fan-out (the largest number of leaf nodes) is clearly visible either when selecting the root of an interesting subtree with a User Group or when hovering over it. User Groups are more visible since they color the interesting nodes, but hovering is much quicker, at the expense of relying on the gray box, which disappears when the mouse focuses on a different node. The number of nodes in a subtree is not shown quantitatively in TreeJuxtaposer, but a higher density of tree edges in a region corresponds to a larger number of nodes in a subtree and can be used to judge the relative size of a subtree.
- Image:
users subtree highlighted in file system tree
- Answer:
Internal nodes with large space above and below their horizontal edges (such as users in the image) have larger fan-out and therefore more leaf nodes.
General visualization of trees: Attribute based
- Task: Find nodes with high values of a numerical attribute X (relative query)
- Rating:
Not Available
- Process:
Not Applicable
- Image:
Not Applicable
- Answer:
Since TreeJuxtaposer is unable to assess attributes of nodes, these tasks are not applicable for our system.
- Task: Find nodes with given value of a numerical attribute X (absolute query)
- Rating:
Not Available
- Process:
Not Applicable
- Image:
Not Applicable
- Answer:
Since TreeJuxtaposer is unable to assess attributes of nodes, these tasks are not applicable for our system.
- Task: Find nodes with value Y of categorical attribute X
- Rating:
Not Available
- Process:
Not Applicable
- Image:
Not Applicable
- Answer:
Since TreeJuxtaposer is unable to assess attributes of nodes, these tasks are not applicable for our system.
- Task: What value of a categorical attribute occurs more often? (Are there more farm animals or pets)
- Rating:
Not Available
- Process:
Not Applicable
- Image:
Not Applicable
- Answer:
Since TreeJuxtaposer is unable to assess attributes of nodes, these tasks are not applicable for our system.
- Task: Find nodes with certain values of two or more attributes (What video file is used the most?)
- Rating:
Not Available
- Process:
Not Applicable
- Image:
Not Applicable
- Answer:
Since TreeJuxtaposer is unable to assess attributes of nodes, these tasks are not applicable for our system.
- Task: Number of nodes in a tree or subtree? (How many animals? How many mammals?)
- Rating:
Possible
- Process:
The number of animal species is equal to the number of leaf nodes aligned on the right side of TreeJuxtaposer. The actual number of leaf nodes is not displayed but the total number of named nodes is available in the Find panel with a fully qualified naming structure. The number of mammals can be determined similarly with fully qualified names (or the number of dolphins, etc).
- Image:
Shown elsewhere
- Answer:
See process for information
- Task: Comparison of branches of the tree (Subtrees with most nodes; are there more mammals or fish?)
- Rating:
Strength
- Process:
The number of mammals can be found by highlighting all nodes whose names start with "///animal/mammal". If you instead type "mammal" into the Find panel for classif_B to locate the mammal subtree, the results also include "mammal-nest beetles", which are not mammals. Since there are very few non-mammals with "mammal" in their name, it is easy to deselect them in the Find panel and find the mammal subtree.
- Image:
Mammals and bony fish subtrees highlighted in animal tree
- Answer:
There are more bony fish than mammals as can be seen in the image for this task.
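The prefix-based counting used in this process can be sketched outside the tool. The following is a minimal Python illustration, not TreeJuxtaposer code: the `count_subtree` helper and the node names are hypothetical, and it assumes fully qualified names use "/" as the separator.

```python
def count_subtree(names, prefix):
    """Count nodes whose fully qualified name is the prefix itself or lies below it."""
    root = prefix.rstrip("/")
    return sum(1 for n in names if n == root or n.startswith(root + "/"))

# Hypothetical fully qualified node names (not taken from the contest data).
names = [
    "///animal",
    "///animal/mammal",
    "///animal/mammal/dolphin",
    "///animal/mammal/horse",
    "///animal/insect/mammal-nest beetle",
    "///animal/fish",
    "///animal/fish/seahorse",
]

mammals = count_subtree(names, "///animal/mammal")
fish = count_subtree(names, "///animal/fish")

# A plain substring search for "mammal" also picks up the beetle,
# which is why the prefix form is the safer query.
substring_hits = sum(1 for n in names if "mammal" in n)
```

Matching on the prefix followed by a separator keeps "mammal-nest beetle" out of the mammal count, whereas the substring search includes it.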
- Task: Largest fanout (What is the largest group of animals with same lineage?)
- Rating:
Strength
- Process:
The largest fan-out (the largest number of leaf nodes) is clearly visible either when selecting the root of an interesting subtree with a User Group or when hovering over it. User Groups are more visible since they color the interesting nodes, but hovering is much quicker, at the expense of relying on the gray box, which disappears when the mouse focuses on a different node. The keyboard arrow keys (up and down) can be used to cycle through sibling or cousin nodes.
- Image:
Shown elsewhere
- Answer:
See process for information
General visualization of trees: Known items
- Task: Which nodes have a particular string in their label? (Find "giraffe" in a tree of animals)
- Rating:
Strength
- Process:
Use the Find panel and type in giraffe. All giraffes are now highlighted with the Found group.
- Image:
All giraffe nodes highlighted
- Answer:
Only one species with the name "giraffe" was found in classif_B.
- Task: Locate a node knowing its path
- Rating:
Strength
- Process:
Finding a node with a known path can be done in the Find panel or by browsing through the tree. The method used depends on exactly what is known about the path. If the full path is known, browsing from subtree to subtree may be faster, since you would not have to type the entire path into the Find panel. However, because of limited screen real estate and the limited number of visible node labels, finding a path element in a bushy tree is hard, so time may be saved by simply typing it into the Find panel as a hint to which area to grow.
- Image:
Shown elsewhere
- Answer:
See process for information
- Task: Go back to a node you have visited before
- Rating:
Possible
- Process:
There is no undo feature in TreeJuxtaposer, but if you know that you would probably like to return to this node after exploring other parts of the tree, then marking the node with a User Group would be a good idea, if you don't run out of User Groups.
- Image:
Shown elsewhere
- Answer:
See process for information
General visualization of trees: Labeling
- Task: Review all the labels in a subtree
- Rating:
Strength
- Process:
All the labels in a subtree can be extracted through the Find panel. If a name is entered into the Find panel, the results are limited to the nodes that match the entry. Further navigation techniques, such as the keyboard (for fine control over sibling relationships) and mouse-over (for coarse control over the entire tree), are also available.
- Image:
Shown elsewhere
- Answer:
See process for information
General visualization of trees: Browsing
- Task: Explore the tree by performing a series of up and downs in the tree
- Rating:
Strength
- Process:
These actions are easily performed with the mouse interface by resizing subtrees to find interesting paths to leaf nodes. Starting with the animals tree classif_B (with common names), grow vertebrates and then mammals, then find primates, and then find gorillas and chimpanzees in the great apes subtree. Finding cats is a little trickier: starting with the mouse on primates, press the up arrow until carnivores is highlighted. Grow the carnivores selection with the keyboard until it is large enough to see the cats subtree. Use the mouse to resize the cats subtree until it is large enough to see cheetah and tiger.
- Image:
Browsing example: cheetah and tiger highlighted
- Answer:
See process for information
General visualization of trees: Managing the analysis
- Task: Marking nodes of interest
- Rating:
Strength
- Process:
Up to 4 User Groups can be used to mark nodes of interest. The granularity of marking can either be Node or Subtree and multiple Node/Subtrees can be marked in the same group.
- Image:
Shown elsewhere
- Answer:
A node may belong to multiple groups simultaneously, and the groups are given drawing order relative to when they were last selected; the last group selected will draw over previous user groups if they both want to draw the same edge.
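The drawing-order rule described in this answer can be illustrated with a short Python sketch. It is not taken from the TreeJuxtaposer source: the `edge_color` helper, the group names, the colors, and the edge tuples are all hypothetical, and the only rule modeled is that the most recently selected group wins when several groups mark the same edge.

```python
def edge_color(edge, groups, selection_order):
    """Return the color of the most recently selected group that marks this edge."""
    # selection_order lists group names from oldest to most recently selected.
    for name in reversed(selection_order):
        group = groups[name]
        if edge in group["edges"]:
            return group["color"]
    return None  # edge is not marked by any group

# Hypothetical groups; an edge is a (parent, child) pair.
groups = {
    "A": {"color": "blue", "edges": {("root", "mammals"), ("mammals", "primates")}},
    "B": {"color": "green", "edges": {("mammals", "primates")}},
}

shared = ("mammals", "primates")
color_when_a_last = edge_color(shared, groups, ["B", "A"])
color_when_b_last = edge_color(shared, groups, ["A", "B"])
```

Reselecting a group moves it to the end of the order, so the same shared edge changes color without either group losing its membership.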
- Task: Removing special anomalies
- Rating:
Not Available
- Process:
Not Applicable
- Image:
Not Applicable
- Answer:
TreeJuxtaposer can't modify the tree, and doesn't support saving or history.
- Task: Saving visualization settings for future reference
- Rating:
Not Available
- Process:
Not Applicable
- Image:
Not Applicable
- Answer:
TreeJuxtaposer can't modify the tree, and doesn't support saving or history.
- Task: Keeping the history of your analysis, reviewing it and replaying it with different parameters
- Rating:
Not Available
- Process:
Not Applicable
- Image:
Not Applicable
- Answer:
TreeJuxtaposer can't modify the tree, and doesn't support saving or history.
Phylogenies: Application specific tasks
- Task: Co-evolution?
- Rating:
Strength
- Process:
Load phylogenetic trees. Differences will be shown and navigation can be used to compare trees.
- Image:
Differences in phylogenetic tree
3 subtrees highlighted (note relative positions)
- Answer:
- All of the leaves match. The leaves in phylo_A are all in phylo_B and vice versa.
- Some leaf nodes have identical names in the same tree. TreeJuxtaposer assumes each leaf has a 1-to-1 relationship with a similar leaf in the other tree, and it can only assign leaves automatically; a different leaf assignment between the trees might have produced a different comparison result.
- A subtree of 5 leaf nodes almost matches. The subtree has a structural difference in only one child subtree. The blue marked region in the picture shows this.
- A larger subtree of 7 leaf nodes matches. The green marked region in the picture shows this.
- An even larger subtree of 8 nodes nearly matches. The subtree has a larger difference in 3 internal nodes but may be useful. The cyan marked region in the picture shows this.
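The first two observations above (all leaves match, and some leaf names repeat within a tree) amount to simple checks on the leaf label lists. A minimal Python sketch follows; the `compare_leaves` helper and the leaf names are hypothetical stand-ins, and it says nothing about how TreeJuxtaposer itself matches nodes.

```python
from collections import Counter

def compare_leaves(leaves_a, leaves_b):
    """Report leaf names unique to one tree and names repeated within a tree."""
    count_a, count_b = Counter(leaves_a), Counter(leaves_b)
    return {
        "only_in_a": sorted(set(count_a) - set(count_b)),
        "only_in_b": sorted(set(count_b) - set(count_a)),
        "repeated_in_a": sorted(n for n, c in count_a.items() if c > 1),
        "repeated_in_b": sorted(n for n, c in count_b.items() if c > 1),
    }

# Hypothetical leaf labels standing in for phylo_A and phylo_B.
phylo_a = ["sp1", "sp2", "sp3", "sp3"]
phylo_b = ["sp3", "sp1", "sp3", "sp2"]

report = compare_leaves(phylo_a, phylo_b)
```

Empty `only_in_a` and `only_in_b` lists correspond to "all of the leaves match", while a non-empty `repeated_in_*` list flags names for which the automatic leaf assignment is ambiguous.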
- Task: Interacting with the tree matching process to solve inconsistencies
- Rating:
Not Available
- Process:
Not Applicable
- Image:
Not Applicable
- Answer:
TreeJuxtaposer does not have the functionality required to interact with the nodes at a low level since the matching process used is automatic.
- Task: Displaying the trees, with or without taking into account the branch length (the length of the links)
- Rating:
Strength
- Process:
Load trees into TreeJuxtaposer to view the trees.
- Image:
Shown elsewhere
- Answer:
This is the baseline use of TreeJuxtaposer.
- Task: Showing the relationships and differences from a computed or interactively constructed mapping
- Rating:
Strength
- Process:
Differences are automatically computed and displayed. Relationships may be reviewed using mouse-over highlighting or User Groups.
- Image:
Shown elsewhere
- Answer:
The difference marking is provided by the automatic best-corresponding node algorithm in TreeJuxtaposer. Navigating through with mouse-over highlighting and marking subtrees with User Groups allows the user to recognize further similarities in the tree.
- Task: Providing ways to permute links and nodes to verify hypotheses interactively
- Rating:
Not Available
- Process:
Not Applicable
- Image:
Not Applicable
- Answer:
TreeJuxtaposer is not designed to modify the structure of the given input.
Classifications: Application specific tasks
- Task: To what extent are the differences in the classifications due to differences in how animals are thought to be related? Other kinds of differences?
- Rating:
Strength
- Process:
Differences are automatically highlighted, and further exploration with linked highlighting shows differences in animal classifications.
- Image:
Node additions and deletions
Differences for entire mammalia tree
Rodentia classification check
Movement of subtree in mammalia tree
- Answer:
- The differences in the classifications are mostly due to additions to the tree (from A to B), deletions from the tree, or slight modifications such as splitting (a leaf node in A becomes a subtree with two children in B).
- Additions and deletions on the leaves can be quantified by examination of the "redness" of the leaf level as the leaves are equally spaced at that level and therefore the percentage of red (red marking a node that is different) indicates the percentage of added nodes relative to the other tree.
- If a large subtree, for example rodentia, is highlighted in A, the nodes in B that are highlighted are all in the rodentia subtree. Furthermore, if the rodentia subtree in B is now highlighted, there are no nodes highlighted in either classification tree that are outside of the rodentia group. This is far from being complete but investigation of the mammalia trees shows mostly differences in the leaf level nodes.
- Some differences, such as the movement of pitheciidae from primates in classif_A to cebidae (new world monkeys) in classif_B, can be found through exploration, but there is no easy way for TreeJuxtaposer to automatically highlight or count these types of differences. The subtree marking capability does speed up the exploration process, as explained in the movement characterization answer in the General Pairwise section.
- Task: Can you say in how many different subtrees a particular common name (such as "dolphin" or "horse") is used? How closely are these animals related? Are common names a good guide to understanding relationships?
- Rating:
Strength
- Process:
- Load tree classif_A_03-04-16.nh (common, fully qualified)
- Search for "dolphin": Find panel finds 53 leaf and non-leaf dolphins
- Search for "horse": Find panel finds 47 leaf and non-leaf horses
- Image:
Common names using animals tree
Highlighted all occurrences of dolphin in common names
Highlighted all occurrences of horse in common names
Marmot subtree common names comparison
Marmot subtree latin names comparison
Common names comparison
Latin names comparison
- Answer:
- "myzomela adolphinae": probably not named with respect to common dolphins
- many dolphins in "marine dolphins" hierarchy
- In addition to mammalian horses, "horse" appears in many different subtrees across different parts of the classification tree (arthropods, insects, seahorses, snails, etc)
- The animal species with "horse" in their names are not closely related at all
- Several "horse-groups" exist that include members which do not have "horse" in their species names but instead have a horse relationship at a higher rank.
- Common names are not a good guide to understanding relationships. Common names lack structure and do not have the same hierarchical classification as their latin equivalents.
- Common names may have historical or geographical influences, and one classification may even look different from an identical classification tree if a naming convention is not adhered to; the provided trees are poorly suited to finding differences when common names are used. For example, mammalia_A uses the label "vancouver island marmot" while mammalia_B uses "vancouver marmot", which is another name for "marmota vancouverensis".
- Some common names may be simple and included in other common names (e.g. "horse" occurs in "seahorse"). TreeJuxtaposer Find can be used to ignore or focus in on sections of species, but it requires some user input in the search window.
- For names such as "dolphin" that are not expected to occur across very different species, it was interesting to see non-mammals appear (and especially non-porpoises; fully qualified names show them clearly: a mollusk, 2 bony fishes, and some kind of perching bird), which may either have dolphin-like properties or have "dolphin" in their name by chance.
- Although common names are very useful for providing recognizable names when a layperson browses a single tree, they dramatically impede comparison.
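The matches on "myzomela adolphinae" and "seahorse" noted above are what a plain substring search produces. The Python sketch below contrasts substring matching with whole-word matching; it is an illustration only. Whole-word matching is an assumption introduced here, not a documented TreeJuxtaposer feature, and the label list is a small hypothetical sample.

```python
import re

def substring_matches(labels, term):
    """Labels that contain the term anywhere, ignoring case."""
    term = term.lower()
    return [label for label in labels if term in label.lower()]

def whole_word_matches(labels, term):
    """Labels that contain the term as a separate word, ignoring case."""
    pattern = re.compile(r"\b" + re.escape(term) + r"\b", re.IGNORECASE)
    return [label for label in labels if pattern.search(label)]

# Small hypothetical sample of node labels.
labels = [
    "bottlenose dolphin",
    "myzomela adolphinae",
    "seahorse",
    "horse",
    "horse mackerel",
]

dolphin_substring = substring_matches(labels, "dolphin")
dolphin_word = whole_word_matches(labels, "dolphin")
horse_word = whole_word_matches(labels, "horse")
```

Whole-word matching drops "myzomela adolphinae" and "seahorse", but it still returns unrelated animals such as "horse mackerel", so it narrows the results without making common names a reliable guide to relationships.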
- Task: How many species or subspecies are named after biologists named "Townsend"?
- Rating:
Strength
- Process:
- Load tree classif_A_03-04-16.nh (latin)
- Search for "townsend": Find panel finds 51 leaf and non-leaf townsend nodes
- Start new TreeJuxtaposer with classif_A (common)
- Search for "townsend": Find panel finds 45 leaf and non-leaf townsend nodes
Some latin names appear in common trees since if a node has no common name, the latin name is used as a label.
- Image:
Townsend name search with common tree
Townsend name search with latin tree
- Answer:
- The names returned in the search do not show a pattern that can be used to deduce where in the world Townsend (or the Townsends, if more than one Townsend named animals) might have done research. The common names give a range of geographic locations with chipmunks, shrimp, and bats.
- The animals returned by the search span quite a range of the classification tree: the search highlights are distributed throughout it.
- Task: What kind of feedback does your tool provide to alert the user quickly when a wrong name is entered?
- Rating:
Strength
- Process:
- Load tree classif_A_03-04-16.nh (latin)
- Assume you are searching for Spirurida, which is in the Nemata phylum (you are knowledgeable about worms and want to see the hierarchy under Spirurida)
- Enter "Spirulida" in the Find box
- Grow the Found nodes and notice that the wrong section grows and no worms appear
- Reread what you typed into the search box, realize the mistake, and correct it
- Image:
Incorrect name search (spirulida instead of spirurida)
- Answer:
- The feedback from TreeJuxtaposer is that growing the found nodes did not reveal the expected subtree.
- Since TreeJuxtaposer doesn't store the rank as an attribute, determining if both names have the same rank is not possible within the system.
- The typed name was not in the expected part of the subtree that we chose to highlight, which would be an excellent indication of user error or at least a warning to examine what was found by the Find panel.
- Task: For the top five subtrees with the most nodes-- are they likely to have a parent of a particular rank? Or does this happen in many ranks? Can you comment on how useful "rank" is?
- Rating:
Not Available
- Process:
Not Applicable
- Image:
Not Applicable
- Answer:
We are unable to comment on rank since rank is an attribute that the TreeJuxtaposer system does not handle.
File system and usage logs: Application specific tasks
- Task: Where are the big directories?
- Rating:
Strength
- Process:
Big directories are immediately visible from the layout since the vertical space consumed by directories indicates how many total leaves are in the subdirectory structure.
- Image:
File system tree
- Answer:
- It's obvious from the tree logs_A that "users" and "class" are the biggest directories linked to the root of the tree. Finding the biggest directory in any subtree can be done in this way, as long as no ancestor nodes of the subtree were previously grown or shrunk.
- Finding directories with the biggest number of immediate leaves (files) is more difficult with TreeJuxtaposer. Since the leaves are right-aligned and children are ordered alphabetically, the leaves for a particular node are interspersed between the non-leaf children of the node, making accurate estimations of the number of immediate files in a directory hard.
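The two quantities distinguished in this answer, total leaves below a directory (which drives the vertical space it receives) and immediate files in a directory (which is hard to read from the layout), can be computed from a list of file paths. The Python sketch below is a minimal illustration under stated assumptions: `leaf_counts` is a hypothetical helper, every path is assumed to name a file (a leaf), and the example paths are invented rather than taken from logs_A.

```python
from collections import defaultdict

def leaf_counts(file_paths):
    """Return (total, immediate) leaf counts keyed by directory path."""
    total = defaultdict(int)      # leaves anywhere below a directory
    immediate = defaultdict(int)  # leaves directly inside a directory
    for path in file_paths:
        parts = path.strip("/").split("/")
        dirs = parts[:-1]  # the last component is the file itself
        for depth in range(1, len(dirs) + 1):
            total["/".join(dirs[:depth])] += 1
        if dirs:
            immediate["/".join(dirs)] += 1
    return dict(total), dict(immediate)

# Hypothetical file paths.
files = [
    "users/hollings/index.html",
    "users/hollings/papers/a.pdf",
    "users/shankar/index.html",
    "class/fall2002/cmsc434-0101/index.html",
    "projects/hcil/index.html",
]

total, immediate = leaf_counts(files)
biggest_top_level = max(("users", "class", "projects"), key=lambda d: total[d])
```

Here "users" has the largest total even though it contains no immediate files, which mirrors why a directory can look big in the layout while its own file count remains hard to estimate.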
- Task: Can you see different patterns in the files?
- Rating:
Strength
- Process:
Navigating and marking allows a user to investigate patterns in the files.
- Image:
building and shankar directory observations
class directory expanded
Research project directory expanded
- Answer:
- Personal pages are found in 2 locations: in the users subdirectory such as "///users/hollings" and each user also has a users subdirectory directly attached to the root such as "///usershollings". The contents of these directories are different ("///usersshankar" has more leaves than "///users/shankar" but "///users/building" has more leaves than "///usersbuilding") but not much can be said about why the directory structure is set up this way without attributes.
- The personal pages make up more than 50% of the total number of leaf nodes in the system. Of the 76547 nodes, personal pages make up 42877 nodes: 20480 of which are in the "///users/hollings" type personal pages and 22397 in the "///usershollings" type personal pages. The totals are displayed by the Find panel but not displayed on the visualization as found nodes since there are too many nodes that would be highlighted to be useful.
- Class pages are found in the class subtree which breaks the years 1997-2003 into fall, spring and summer terms, each of which contains cmsc course pages.
- There are many fewer research pages ("///projects") than there are personal or class pages. The largest directory in "///projects" is hcil.
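The personal-page figures quoted above can be checked with a few lines of arithmetic. The Python snippet below only recomputes the numbers reported by the Find panel in this answer; the variable names are illustrative.

```python
total_nodes = 76547
users_subdirectory_pages = 20480   # "///users/hollings" type
users_root_level_pages = 22397     # "///usershollings" type

personal_pages = users_subdirectory_pages + users_root_level_pages
share = personal_pages / total_nodes

print(f"{personal_pages} of {total_nodes} nodes = {share:.1%}")
```

The two counts sum to 42877, which is about 56% of the 76547 nodes and consistent with the "more than 50%" statement.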
- Task: Were there a lot of pages created recently? If so, in which part of the file system?
- Rating:
Not Available
- Process:
Not Applicable
- Image:
Not Applicable
- Answer:
Since TreeJuxtaposer is unable to assess attributes of nodes, these tasks are not applicable for our system.
- Task: Are the newer directories bigger than the older projects?
- Rating:
Difficult
- Process:
Expand directories in sequential logs to determine if additions have been made recently. (TreeJuxtaposer isn't able to determine the age of a directory unless the directory was added between the times at which data was collected.)
- Image:
users directory expanded
cmsc434-0101 directory expanded
cmsc838p directory expanded
spring2003 directory expanded (new courses)
projects directory expanded
- Answer:
- The size, in total number of files, of the projects subtree is quite a bit smaller than the users directory. Furthermore, user "hollings" has about as many files as the entire projects directory combined. Using the Find panel, "///users/hollings" has 7194 nodes (leaves and internal nodes) and "///projects" has 8447 nodes.
- Personal pages show the most diverse and sporadic differences. There appear to be many people who either added/deleted/moved files in their personal directories or they didn't do anything in their directory that week.
- Class pages show a pattern of difference that is regular and expected. The only differences are between leaves in fall2002 and spring2003 subdirectories.
- Closer examination of the fall2002 differences shows that some files were deleted in the projects directory of cmsc434-0101.
- Examination of the changes in spring2003 shows that cmsc838p has changed, and the changes were one delete ("design/openimpl.pdf") and several additions in multiple subdirectories.
- Spring2003 has several additional subdirectories, possibly reflecting these courses beginning. These courses include: cmsc102, cmsc106, cmsc412-201, cmsc417, cmsc433, cmsc733, and the cmsc434 directory has been further populated.
- There are very few changes in the project pages in this time period. The only leaf modifications are in the "jazz-chat" directory, where some files have been added. These changes ripple up the tree to the root; the ripples do not reflect the entire structure changing.
- Task: When was the page giving directions to the department last updated?
- Rating:
Not Available
- Process:
Not Applicable
- Image:
Not Applicable
- Answer:
Finding the page giving directions to the department cannot be done with TreeJuxtaposer since this would require the attribute describing the file contents (extracted from the tag).
- Task: Which are the popular webpages?
- Rating:
Not Available
- Process:
Not Applicable
- Image:
Not Applicable
- Answer:
Can't comment on usage since usage attributes aren't handled by TreeJuxtaposer.
- Task: Are there some labs more popular than others?
- Rating:
Not Available
- Process:
Not Applicable
- Image:
Not Applicable
- Answer:
Can't comment on usage since usage attributes aren't handled by TreeJuxtaposer.
- Task: Which areas are getting more popular?
- Rating:
Not Available
- Process:
Not Applicable
- Image:
Not Applicable
- Answer:
Can't comment on usage since usage attributes aren't handled by TreeJuxtaposer.
- Task: Are new pages more popular than old pages?
- Rating:
Not Available
- Process:
Not Applicable
- Image:
Not Applicable
- Answer:
Can't comment on usage since usage attributes aren't handled by TreeJuxtaposer.
- Task: Which old pages are popular?
- Rating:
Not Available
- Process:
Not Applicable
- Image:
Not Applicable
- Answer:
Can't comment on usage since usage attributes aren't handled by TreeJuxtaposer.
- Task: What proportion of the pages are never used?
- Rating:
Not Available
- Process:
Not Applicable
- Image:
Not Applicable
- Answer:
Can't comment on usage since usage attributes aren't handled by TreeJuxtaposer.
- Task: What proportion of the pages are seldom used?
- Rating:
Not Available
- Process:
Not Applicable
- Image:
Not Applicable
- Answer:
Can't comment on usage since usage attributes aren't handled by TreeJuxtaposer.
File system and usage logs: HCIL subtree