Example apps:
A lot of great stuff, ranging from baseball stats to arsenic levels in drinking water, cloud seeding in Tasmania, cancer data, the 2000 Florida presidential election vote counts by county, and the water levels of the Fraser River. Includes all the examples from Visualizing Data (Cleveland).
Tcpdump is a powerful tool that shows all network traffic on a link,
but it can be quite hard to understand what's going on when confronted with
the raw tcpdump output. "Visual tcpdump" would ideally run either from a
log file of a past tcpdump session or in real time from a live tcpdump
connection. There are several tasks one might target from this dataset. First,
visually characterizing traffic patterns - for example, showing the
distribution of session lengths or packet types. Second, highlighting
dangerous packets that could occur in a stream - for example, passwords sent
in plain text. Third, characterizing protocols - for example, showing the TCP
window size changes over the course of a session. Some previous knowledge of
networking will be helpful for this project.
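As a starting point for the first and third tasks, here is a minimal Python sketch that counts TCP flag combinations and collects window sizes per direction of a connection. It assumes one common shape of tcpdump's text output for IPv4 TCP packets; the exact format varies with tcpdump version and command-line options, so the regular expression, the SAMPLE lines, and the summarize helper are illustrative assumptions rather than a general parser.

```python
import re
from collections import Counter, defaultdict

# Assumed line shape (it varies with tcpdump version and options):
# 12:00:00.000001 IP 10.0.0.1.5000 > 10.0.0.2.80: Flags [S], seq 1, win 65535, length 0
LINE_RE = re.compile(
    r"^(?P<ts>\d{2}:\d{2}:\d{2}\.\d+) IP "
    r"(?P<src>\S+) > (?P<dst>\S+): "
    r"Flags \[(?P<flags>[^\]]*)\].*?win (?P<win>\d+)"
)

def summarize(lines):
    """Return (flag_counts, windows) for the lines that match LINE_RE.

    flag_counts maps a flag string such as "S" or "S." to a packet count.
    windows maps (src, dst) to the list of advertised window sizes, in order.
    Lines that do not match the assumed shape are skipped.
    """
    flag_counts = Counter()
    windows = defaultdict(list)
    for line in lines:
        m = LINE_RE.match(line)
        if m is None:
            continue
        flag_counts[m["flags"]] += 1
        windows[(m["src"], m["dst"])].append(int(m["win"]))
    return flag_counts, dict(windows)

# Hypothetical sample lines, made up for illustration.
SAMPLE = [
    "12:00:00.000001 IP 10.0.0.1.5000 > 10.0.0.2.80: Flags [S], seq 1, win 65535, length 0",
    "12:00:00.000201 IP 10.0.0.2.80 > 10.0.0.1.5000: Flags [S.], seq 9, ack 2, win 64240, length 0",
    "12:00:00.000301 IP 10.0.0.1.5000 > 10.0.0.2.80: Flags [.], ack 10, win 65000, length 0",
    "not a packet line",
]

if __name__ == "__main__":
    flags, wins = summarize(SAMPLE)
    print(flags)
    print(wins)
```

The per-direction window lists could feed a line chart of window size over a session, and the flag counts a bar chart of packet types.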
Noticing that a network is under attack is difficult because of the sheer
volume of benign traffic and the number of attack methods. The two main tasks
are real-time detection that an attack is occurring, and forensic analysis of
a past attack. There is a publicly
available dataset of network traces with four different simulated attacks
plus a control baseline with no attacks. Previous knowledge of networking and
security issues will be helpful for this project.
One way to map "the Internet" is to consider the structure of the
backbone router interconnections. Bill Cheswick has been keeping archives
of the daily changes in the roughly 100,000 core reachable routers for over
three years. Even the static dataset from a single day is a difficult
challenge to show comprehensibly, and showing growth and changes over time is
an even more interesting problem. The H3
browser for large graphs is a potential resource. This project should be
feasible without previous knowledge of networking.
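To make the "changes over time" part concrete, here is a minimal Python sketch that compares two daily snapshots and reports which links were added, removed, or kept. It assumes each snapshot has already been reduced to a list of undirected router pairs; the actual archive format is not described here, so the input representation and the names normalize and diff_snapshots are hypothetical.

```python
def normalize(edges):
    """Turn (a, b) pairs into a set of undirected edges, dropping self-loops."""
    return {frozenset(e) for e in edges if e[0] != e[1]}

def diff_snapshots(old_edges, new_edges):
    """Compare two snapshots given as iterables of (router, router) pairs."""
    old, new = normalize(old_edges), normalize(new_edges)
    return {
        "added": new - old,    # links present only in the newer snapshot
        "removed": old - new,  # links present only in the older snapshot
        "kept": old & new,     # links present in both
    }

if __name__ == "__main__":
    # Hypothetical toy snapshots, not real router data.
    day1 = [("a", "b"), ("b", "c")]
    day2 = [("b", "a"), ("c", "d")]
    for kind, links in diff_snapshots(day1, day2).items():
        print(kind, sorted(sorted(link) for link in links))
```

A visualization could then color the added and removed sets differently on top of a layout computed from the kept links.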
A great set of databases; using Tableau should be a great way to
get started!
Data
InfoVis Contest 2004
The InfoVis 2004 contest data will be released in early February.
The InfoVis 2004 contest has selected the history of the field as its
theme. The competition will involve several data sets and seeks to
highlight and help you compare visualizations that support the
discovery and identification of major research topics, relationships
between members of the community, trends over time, and so on.
InfoVis Contest 2003: Tree Comparison
The InfoVis 2003 contest focused on tree comparison. This site has not
only data, but also an extensive list of tasks. The three datasets
are: many small binary phylogenetic trees, big taxonomies (200K
nodes), and 70K filesystems with usage logs.
Laboratory Test Results
Wes Schreiber of the UBC med school is interested in working with
people who would like to experiment with visualizing laboratory test
data. Talk to me for more details.
StatLib datasets
Many datasets include a pointer to papers describing them.
EPA air quality
The AirData Web site gives you access to air pollution data for the entire
United States. Want to know the highest ozone level measured in your
state last year? Ever wonder where air pollution monitoring sites are
located? Are there sources of air pollution in your town? You can find
out here! AirData produces reports and maps of air pollution data
based on criteria that you specify.
AirData presents annual summaries of air pollution data from three EPA databases:
Visual Tcpdump
Intrusion Detection
Internet Backbone
Jeff Klingner's List of Online Databases
http://graphics.stanford.edu/~klingner/online_databases.html
Books
The following books are on reserve in the CS reading room: