See also the InfoVis04 Workshop on InfoVis Software Infrastructures page
Tcpdump is a powerful tool that shows all network traffic on a link,
but it can be quite hard to understand what's going on when confronted with
the raw tcpdump output. "Visual tcpdump" would ideally run either from a
log file of a past tcpdump session or in real time from a live tcpdump
capture. There are several tasks one might target with this dataset. First,
visually characterizing traffic patterns - for example, showing the
distribution of session lengths or packet types. Second, highlighting
dangerous packets that could occur in a stream - for example, passwords sent
in plain text. Third, characterizing protocols - for example, showing the TCP
window size changes over the course of a session. Some previous knowledge of
networking will be helpful for this project.
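As a starting point for the first and third tasks, here is a minimal sketch that pulls packet-type counts and per-flow TCP window sizes out of tcpdump's default one-line text output. The regular expression covers only the common TCP summary line, and the names `TCP_LINE`, `summarize`, and `SAMPLE` are made up for illustration. Real captures contain many other line shapes, which this sketch simply skips.

```python
import re
from collections import Counter, defaultdict

# Matches the usual one-line TCP summary printed by tcpdump, for example:
# 12:00:00.000001 IP 10.0.0.1.1234 > 10.0.0.2.80: Flags [S], seq 0, win 65535, length 0
# Assumption: default text output; other line formats are ignored.
TCP_LINE = re.compile(
    r"^(?P<time>\S+) IP6? (?P<src>\S+) > (?P<dst>\S+): "
    r"Flags \[(?P<flags>[^\]]*)\].*?\bwin (?P<win>\d+)"
)

def summarize(lines):
    """Return (flag_counts, windows) for the TCP lines in a tcpdump log.

    flag_counts maps a flag string such as "S" or "S." to a packet count.
    windows maps (src, dst) to a list of (timestamp, window_size) pairs.
    """
    flag_counts = Counter()
    windows = defaultdict(list)
    for line in lines:
        m = TCP_LINE.match(line)
        if m is None:
            continue  # not a TCP summary line (ARP, UDP, continuation, ...)
        flag_counts[m["flags"]] += 1
        windows[(m["src"], m["dst"])].append((m["time"], int(m["win"])))
    return flag_counts, dict(windows)

# Hand-written sample lines, not taken from a real capture.
SAMPLE = """\
12:00:00.000001 IP 10.0.0.1.1234 > 10.0.0.2.80: Flags [S], seq 0, win 65535, length 0
12:00:00.000200 IP 10.0.0.2.80 > 10.0.0.1.1234: Flags [S.], seq 0, ack 1, win 29200, length 0
12:00:00.000300 IP 10.0.0.1.1234 > 10.0.0.2.80: Flags [.], ack 1, win 4096, length 0
12:00:00.000400 ARP, Request who-has 10.0.0.2 tell 10.0.0.1, length 28
""".splitlines()

flag_counts, windows = summarize(SAMPLE)
```

The per-flow window lists can be fed directly to a line chart, and the flag counts to a bar chart. For a log file, pass the open file object to `summarize` instead of `SAMPLE`.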
Noticing that a network is under attack is difficult because of both the sheer
volume of benign traffic and the number of attack methods. The two main tasks
are real-time detection that an attack is occurring, and forensic analysis of
a past attack. There is a publicly
available dataset of network traces with four different simulated attacks
plus a control baseline with no attacks. Previous knowledge of networking and
security issues will be helpful for this project.
One way to map "the Internet" is to consider the structure of the
backbone router interconnections. Bill Cheswick has been keeping archives
of the daily changes in the roughly 100,000 core reachable routers for over
three years. Even the static dataset from a single day is a difficult
challenge to show comprehensibly, and showing growth and changes over time is
an even more interesting problem. The H3
browser for large graphs is a potential resource. This project should be
feasible without previous knowledge of networking.
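To show changes over time, a first step is usually to work out which links appeared or disappeared between two daily snapshots. The sketch below assumes each snapshot has already been reduced to a list of undirected router-to-router links. That format, the router names, and the function names are hypothetical and are not taken from the archive itself.

```python
def normalize(edges):
    """Treat links as undirected: (a, b) and (b, a) are the same link."""
    return {tuple(sorted(edge)) for edge in edges}

def diff_snapshots(old_edges, new_edges):
    """Return (added, removed) links between two snapshots, each sorted."""
    old, new = normalize(old_edges), normalize(new_edges)
    return sorted(new - old), sorted(old - new)

# Made-up snapshots for two consecutive days.
day1 = [("r1", "r2"), ("r2", "r3"), ("r3", "r4")]
day2 = [("r2", "r1"), ("r3", "r4"), ("r4", "r5")]

added, removed = diff_snapshots(day1, day2)
```

The added and removed sets can then be highlighted on top of a static layout of the unchanged links, which keeps the bulk of the drawing stable from day to day.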
A great set of databases. Using Tableau should be a great way to
get started!
Data
InfoVis Contest 2005: Market Data
InfoVis Contest 2004: Publication History
The InfoVis 2004 contest selected the history of the field as its
theme. The task was supporting the discovery and identification of
major research topics, relationships between members of the community,
trends over time, and so on.
InfoVis Contest 2003: Tree Comparison
The InfoVis 2003 contest focused on tree comparison. This site has not
only data, but also an extensive list of tasks. The three datasets
are: many small binary phylogenetic trees, big taxonomies (200K
nodes), and 70K filesystems with usage logs.
EPA air quality
The AirData Web site gives you access to air pollution data for the entire
United States. Want to know the highest ozone level measured in your
state last year? Ever wonder where air pollution monitoring sites are
located? Are there sources of air pollution in your town? You can find
out here! AirData produces reports and maps of air pollution data
based on criteria that you specify.
AirData presents annual summaries of air pollution data from three EPA databases:
Visual Tcpdump
Intrusion Detection
Internet Backbone
Jeff Klingner's List of Online Databases
http://graphics.stanford.edu/~klingner/online_databases.html
Eamonn Keogh Time Series Data Mining Archive
Heidi Lam Time Series Data list
Books
The following books are on reserve in the CS reading room:
Blogs