From Degree to Job — Circos Visualizes Workforce Transitions
Finding the relationship between a student's major and career field is the topic of "Measuring Transitions Into The Workforce As A Form Of Accountability". The diagrams connect the flow of students from one of 17 fields of study (left) to job sectors (right).
2011 Measuring Transitions into the Workforce as a Form of Accountability SSRN eLibrary ID 1831967
Satyan L Devadoss from Williams College performed a similar analysis of Impact of Major on Career Path for 15600 Williams College Alums.
Circos tackles the connectome
Irimia et al. introduce a circular representation of cortical networks in "Circular representation of human cortical networks for subject and population-level connectomic visualization". The scalability of this circular visualization approach is demonstrated by lucid aggregate visualizations using cortical networks of 50 individuals.
The UCLA group also used the circular connectome visualization to assess differences in brain injury in patients in "Patient-tailored connectomics visualization for the assessment of white matter atrophy in traumatic brain injury", published in Frontiers in Neurology.
A good layman's description of the work can be found at the Neuroskeptic blog.
2012 Circular representation of human cortical networks for subject and population-level connectomic visualization NeuroImage
2012 Patient-tailored connectomics visualization for the assessment of white matter atrophy in traumatic brain injury Frontiers in Neurology 3
Hemolytic–Uremic Syndrome Outbreak
Rasko et al. use Circos to show how the E. coli strain implicated in the German outbreak of hemolytic-uremic syndrome varies from other strains in their New England Journal of Medicine paper, where they find that "the genome of the German outbreak strain can be distinguished from those of other O104:H4 strains because it contains a prophage encoding Shiga toxin 2 and a distinct set of additional virulence and antibiotic-resistance factors."
2011 Origins of the E. coli Strain Causing an Outbreak of Hemolytic-Uremic Syndrome in Germany The New England Journal of Medicine, published ahead of print.
Circos Maps Cancer Landscapes
Nature features an article by Heidi Ledford, The Cancer Genome Challenge, which discusses the progress and challenges of identifying structural variation signatures in cancer genomes.
Circos images are used throughout the piece, taken from the COSMIC project (Catalogue of Somatic Mutations in Cancer).
2010 Big science: The cancer genome challenge Nature 464 (7291) 972-974.
Jonathan Feinberg (IBM) created this perfectly circular wordle for me, using content from the Circos site.
As far as I know, this is the only circular wordle.
All Your Genes Are Belong To Us
Remembering one of the most viral internet memes.
Circos is catching on, too.
Hive Plots - Linear Layout for Network Visualization
Visualizing large networks is hard. Nobody wants to see another hairball, but you want to show your data.
What do you do?
Try our new linear layout for network visualization: the hive plot. This plot takes a fresh approach to drawing networks. It scales well, shows topology, and bases the network layout on meaningful properties.
Power of Round
Circular data tracks naturally support display of information at various resolutions.
Compared to a track at a radius r, a pixel in a track at r/4 will span a region 4x larger. Tracks in the interior of the figure are therefore useful to display low-resolution or summary information.
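The factor of four follows from arc-length geometry. As a short sketch, assume a pixel of fixed width w (a symbol introduced here for illustration) and a data domain mapped uniformly onto angle, with no local scale adjustment. The angle that one pixel subtends at radius r is then

```latex
\Delta\theta(r) = \frac{w}{r}
\qquad\Longrightarrow\qquad
\Delta\theta\!\left(\tfrac{r}{4}\right) = \frac{w}{r/4} = \frac{4w}{r} = 4\,\Delta\theta(r)
```

Because the span of the data domain covered by a pixel is proportional to the angle it subtends, a pixel at r/4 covers four times the region covered by a pixel at r.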
Circos Introduced in the New York Times
My first Circos infographic to be published in the New York Times introduces the idea of sequence similarity curves linking circularly composed ideograms.
Working with David Constantine, I illustrated the similarity between chromosome 1 of mouse, rhesus, chimp, and chicken to that of human.
NYT Article - Mapping the Epigenome
In collaboration with Jonathan Corum from the NYT, Martin Krzywinski created an illustration of data showing methylation on chromosome 22 in a variety of tissues.
The illustration accompanies the article Now: The Rest of the Genome, by Carl Zimmer.
When faced with generating a visual representation of information, it's useful to ask the following.
Even if the answers to these questions seem obvious to you, they help emphasize the purpose of the figure. Sometimes authors confuse the notion of "showing all data" with "sending a clear message". It is unlikely that readers actually want to see all your data. It is better to send a clear message and not show the data than to show the data and hope the reader will arrive at the right conclusion.
One of the ways in which Circos helps is that it slows you down. Circos requires that you think about your data and design its layout before you write the configuration file (there is no interactive interface). This initial process of reflection can be both short and extremely productive in helping you focus your message.
Once you have written your configuration file, it is easy to hide elements, adjust scale (either globally or locally — a unique feature), or apply a different format to your data (visibility, opacity, shape, color and even position) based on dynamic rules.
In other words, you can generate a variety of figures without adjusting either the input data or significant portions of the configuration file. Using dynamic rules, you can draw focus to data positions and/or values; these rules will apply to whatever the input data set is. For example, a single rule can color green all the glyphs in a scatter plot associated with a value greater than 0.5.
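The green-glyph rule mentioned above might look roughly like the following configuration fragment. This is a minimal sketch, not a complete configuration: the data file name, radial positions, glyph settings and default color are assumptions chosen for illustration, and the condition uses the var(value) form found in recent Circos versions.

```conf
<plots>
<plot>
type       = scatter
# hypothetical data file: one "chr start end value" record per line
file       = data/values.txt
# assumed radial position and axis range for this track
r0         = 0.80r
r1         = 0.95r
min        = 0
max        = 1
glyph      = circle
glyph_size = 8
# default color for glyphs that match no rule
color      = grey

<rules>
<rule>
# applies to any data point whose value exceeds 0.5,
# regardless of which input file is used
condition  = var(value) > 0.5
color      = green
</rule>
</rules>

</plot>
</plots>
```

Because the rule tests the value and not a specific data point, swapping in a different data file leaves the rule unchanged.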
Because the configuration can be composed of multiple files, you can mix-and-match configuration blocks. This is very helpful when your configuration is largely fixed (data domain, position of tracks and ticks) with minor changes (min/max values of axes, zoom levels, etc).
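One possible way to organize such a configuration is sketched below. The names ideogram.conf, ticks.conf, track.scatter.conf and axes.full.conf are hypothetical files invented for this example; the etc/ includes and the karyotype path are assumed to be the ones shipped with the Circos distribution.

```conf
# circos.conf: hypothetical top-level file

karyotype = data/karyotype/karyotype.human.txt

# fixed parts of the figure, shared by every variant
<<include ideogram.conf>>
<<include ticks.conf>>

<plots>
<plot>
# fixed track definition: type, data file, radial position
<<include track.scatter.conf>>
# the part that varies between figures: swap this one include
# (for example axes.full.conf for axes.zoom.conf) to change min/max
<<include axes.full.conf>>
</plot>
</plots>

<image>
<<include etc/image.conf>>
</image>

<<include etc/colors_fonts_patterns.conf>>
<<include etc/housekeeping.conf>>
```

Keeping the variable parameters in their own small file means a new figure variant needs only a different include line, while the fixed blocks stay untouched.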
Circos is perfect for data analysis environments in which data is processed in a multi-step pipeline (typically using multiple analysis tools). It can be inserted into the pipeline to produce one (or more) figures automatically.
The circular layout is ideal for showing how different positions within your data domain relate to one another. This relationship can be quantitative (e.g. similarity) or binary (e.g. is/isn't connected). Circos was initially designed to emphasize these kinds of relationships and therefore has many features that help illustrate them fully.
Circos lets you adjust the thickness of the ribbons that represent relationships, set the progression and orientation of the circular segments, and apply twists to the ribbons to indicate the orientation of the link. With these controls you can clearly show a large number of connections between data points (or positions) without exhausting the capacity of the reader.