Circos at the EMBO NGS workshop in Tunis, Sept 15–25.

The terrifying dinosaur corn genome

Amblin Entertainment and Legendary Pictures, the studios that produced Jurassic World, try to inject genome science into the movie. Unfortunately, since we don't quite know how to construct viable genomes of extinct species, much less grow the creatures themselves, we don't know whether the depiction of the science is right. Perhaps theirs is exactly what a genome lab would look like in a dino-building facility.

But we can get fewer things wrong. In the Creation Lab companion website, a Circos image is used to illustrate a triceratops genome.

Unfortunately, this is an image of the B73 Maize reference genome (B73 RefGen_v1), as published in Science's The B73 Maize Genome: Complexity, Diversity, and Dynamics.

Schnable PS, Ware D, Fulton RS et al. 2009 The B73 maize genome: complexity, diversity, and dynamics Science 326 (5956) 1112-1115

Using News Reports to Track Wildlife Black Markets

The international black market in wildlife, alive or dead, is notoriously difficult to track. Hunters and smugglers don’t report their take for the same reasons that drug dealers don’t report profits to the IRS. But if you could actually track those networks, maybe you could do something about them. That’s what sent Nikkita Patel, a veterinary epidemiologist at the University of Pennsylvania, to an unusual source of data on the illegal wildlife trade: the news.


The image shows the illegal global rhinoceros trade network before (top) and after (bottom) a hypothetical targeted disruption. Created with Circos online table viewer.

Circos Maps America’s Restless Interstate Migration Without a Map

Wired has a writeup about migration patterns within the US that shows the data using d3.js chord diagrams, modeled after how Circos shows tabular data.

Circos reaches 500 literature citations

In October 2013 Circos reached a milestone: 500 citations in peer-reviewed literature.

To celebrate, I've made a commemorative poster that features over 400 Circos images from the literature.

citation list | image gallery | press highlights

Circos Interchange Diagrams — Networks and Flow

Zeng et al. introduce a new type of visualization based on Circos, the interchange diagram, in their paper Visualizing Interchange Patterns in Massive Movement Data.

The design is applied to movement data, such as daily trips made by passengers in a city. By incorporating interactivity, this visualization method helps reveal interchange patterns at different spatial scales (between trains, between cities) and time scales (different times of day).

Circos has been used for urban planning before. The town of Caceres in Spain has used Circos to communicate their urban planning strategy.

project website

Zeng W, Fu C-W, Arisona SM et al. 2013 Visualizing Interchange Patterns in Massive Movement Data Computer Graphics Forum 32:271-280

Circos connects to the connectome

Methods to visualize the connectome are reviewed in Craddock et al., and Circos is one of them.

Craddock RC, Jbabdi S, Yan C-G et al. 2013 Imaging human connectomes at the macroscale Nat Meth 10:524-539.

The use of Circos for showing the connectome was introduced by Irimia et al. in Circular representation of human cortical networks for subject and population-level connectomic visualization.

A good layman's description of the work can be found at the Neuroskeptic blog.

Irimia A, Chambers MC, Torgerson CM et al. 2012 Circular representation of human cortical networks for subject and population-level connectomic visualization NeuroImage
Irimia A, Chambers MC, Torgerson CM et al. 2012 Patient-tailored connectomics visualization for the assessment of white matter atrophy in traumatic brain injury Frontiers in Neurology 3

Circos is the Method for Visualizing Translocations

Genomic rearrangements can cause disease and are implicated in many cancers. Being able to see the patterns in these changes across samples and patients is important.

In the review article End-joining, Translocations and Cancer, Bunting and Nussenzweig demonstrate how compositing the genome circularly adds value and clarity to the presentation.

Bunting SF, Nussenzweig A 2013 End-joining, translocations and cancer Nat Rev Cancer

From Degree to Job — Circos Visualizes Workforce Transitions

Finding the relationship between a student's major and career field is the topic of "Measuring Transitions Into The Workforce As A Form Of Accountability". The diagrams connect the flow of students from one of 17 fields of study (left) to job sectors (right).

Schenk TL, Jr. 2011 Measuring Transitions into the Workforce as a Form of Accountability SSRN eLibrary ID 1831967

Satyan L Devadoss from Williams College performed a similar analysis of Impact of Major on Career Path for 15600 Williams College Alums.

Circos tackles the connectome

Irimia et al. introduce circular representation of cortical networks in Circular representation of human cortical networks for subject and population-level connectomic visualization. The scalability of this circular visualization approach is demonstrated by lucid aggregate visualizations using cortical networks of 50 individuals.

The UCLA group also used the circular connectome visualization to assess differences in brain injury between patients, in Patient-tailored connectomics visualization for the assessment of white matter atrophy in traumatic brain injury, published in Frontiers in Neurology.

A good layman's description of the work can be found at the Neuroskeptic blog.

Irimia A, Chambers MC, Torgerson CM et al. 2012 Circular representation of human cortical networks for subject and population-level connectomic visualization NeuroImage
Irimia A, Chambers MC, Torgerson CM et al. 2012 Patient-tailored connectomics visualization for the assessment of white matter atrophy in traumatic brain injury Frontiers in Neurology 3

Hemolytic–Uremic Syndrome Outbreak

Rasko et al. use Circos to show how the E. coli strain implicated in the German outbreak of hemolytic-uremic syndrome varies from other strains in their New England Journal of Medicine paper, where they find that "the genome of the German outbreak strain can be distinguished from those of other O104:H4 strains because it contains a prophage encoding Shiga toxin 2 and a distinct set of additional virulence and antibiotic-resistance factors."

NEJM created an animation that explains the visualizations. The paper was blogged by Pacific Biosciences.

Rasko DA, Webster DR, Sahl JW et al. 2011 Origins of the E. coli Strain Causing an Outbreak of Hemolytic-Uremic Syndrome in Germany The New England Journal of Medicine (published ahead of print)

Circos Maps Cancer Landscapes

Nature features an article by Heidi Ledford, The Cancer Genome Challenge, which discusses the progress and challenges of identifying structural variation signatures in cancer genomes.

Circos images are used throughout the piece, taken from the COSMIC project (Catalogue of Somatic Mutations in Cancer).

Ledford H 2010 Big science: The cancer genome challenge Nature 464 (7291) 972-974.

Linux kernel exploration

Răzvan Musăloiu-E. explored the Linux file system and used Circos to relate the systems (disk-based, optical mediums, flash-based, network-based, cluster-based, memory-based, ancient) to kernel symbols.

Circular Wordle

Jonathan Feinberg (IBM) created this perfectly circular wordle for me, using content from the Circos site.

As far as I know, this is the only circular wordle.

Circos Citation Themes

A wordle created from the words of the over 100 scientific articles that cite Circos.

All Your Genes Are Belong To Us

Remembering one of the most viral internet memes.

Circos is catching on, too.

Circos at VIZBI 2011

Circos was one of the community visualization tool tutorials at VIZBI 2011, at the Broad Institute in Boston.

Circos Helps with Urban Planning

The town of Caceres, Spain, a UNESCO World Heritage Site, used Circos to illustrate the relationships between businesses in their urban planning strategy.

Hive Plots - Linear Layout for Network Visualization

Visualizing large networks is hard. Nobody wants to see another hairball, but you want to show your data.

What do you do?

Try the hive plot, our new linear layout for network visualization. This plot takes a fresh approach to drawing networks. It scales well, shows topology, and bases the network layout on meaningful properties.

Power of Round

Circular data tracks naturally support display of information at various resolutions.

Compared to a track at a radius r, a pixel in a track at r/4 will span a region 4x larger. Tracks in the interior of the figure are therefore useful to display low-resolution or summary information.
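The factor of four follows from arc length. As a rough worked example, assume (these symbols are introduced here for illustration and are not part of the original text) that a track wraps a data domain of total length G around the full circle and that the radius r is measured in pixels, so the track has about 2πr pixels along its circumference:

```latex
\[
  \text{span per pixel at radius } r \;=\; \frac{G}{2\pi r},
  \qquad
  \text{span per pixel at radius } \frac{r}{4} \;=\; \frac{G}{2\pi (r/4)} \;=\; 4 \cdot \frac{G}{2\pi r}.
\]
```

An inner track therefore has fewer pixels to cover the same domain, which is why it suits binned or summary data.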

Circos Introduced in the New York Times

My first Circos infographic to be published in the New York Times introduces the idea of sequence similarity curves linking circularly composed ideograms.

Working with David Constantine, I illustrated the similarity of chromosome 1 of mouse, rhesus, chimp, and chicken to that of human.

One of the smaller panels in the infographic was subsequently used by the Alliance for Lupus Research in their Faces of Lupus II video.

NYT Article - Mapping the Epigenome

In collaboration with Jonathan Corum from the NYT, Martin Krzywinski created an illustration of data showing methylation on chromosome 22 in a variety of tissues.

The illustration accompanies the article Now: The Rest of the Genome, by Carl Zimmer.

Things to Consider

When faced with generating a visual representation of information, it's useful to ask the following.

  • What are the important features and patterns I wish to communicate?
  • What are the unimportant features and patterns, and how can I effectively minimize their visual impact? Can I hide them altogether?
  • How dense (or sparse) is my data?
  • What data resolution is required? Can I bin the data, or show summary statistics instead of the data (e.g. min/avg/max)?
  • What is the length scale of significant variation and non-significant noise?
  • In what media will the figure be shown? How does the resolution of the media compare to the resolution of my data?
  • Do I have the time/energy/motivation to worry about black-and-white reproduction or readers with color blindness?

Even if the answers to these questions seem obvious to you, they help emphasize the purpose of the figure. Sometimes authors confuse the notion of "showing all data" with "sending a clear message"; it is unlikely that readers actually want to see all your data. It is better to send a clear message and not show the data than to show the data and hope the reader will arrive at the right conclusion.
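The question about binning and summary statistics can be made concrete with a small sketch. This is plain Python, not part of Circos; the function name `bin_summary`, the `(position, value)` input format, and the fixed-width binning are all assumptions chosen for illustration.

```python
from statistics import mean


def bin_summary(points, bin_size):
    """Group (position, value) pairs into fixed-width bins.

    Returns one dict per non-empty bin with the min, average and max value,
    sorted by bin start. Empty bins are simply omitted.
    """
    bins = {}
    for position, value in points:
        # Integer division maps each position to the start of its bin.
        start = (position // bin_size) * bin_size
        bins.setdefault(start, []).append(value)
    return [
        {
            "start": start,
            "end": start + bin_size,
            "min": min(values),
            "avg": mean(values),
            "max": max(values),
        }
        for start, values in sorted(bins.items())
    ]


# Hypothetical data: three measurements summarized in bins of width 10.
summary = bin_summary([(0, 1.0), (5, 3.0), (12, 2.0)], bin_size=10)
```

Each output row could then be drawn as a single element in a low-resolution track instead of plotting every raw point.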

A comparison of an ancestral crucifer genome to that of three modern species. The panel on the right is a prototype 3-way comparison, with links relating positions on two modern crucifer genomes that have the same corresponding source in the ancestral genome.

How Circos Helps

it makes you think

One of the ways in which Circos helps is that it slows you down. Circos requires that you think about your data and design its layout before you write the configuration file (there is no interactive interface). This initial process of reflection can be both short and extremely productive in helping focus your message.

it makes visual experimentation easy

Once you have written your configuration file, it is easy to hide elements, adjust scale (either globally or locally — a unique feature), or apply a different format to your data (visibility, opacity, shape, color and even position) based on dynamic rules.

In other words, you can generate a variety of figures without adjusting either the input data or significant portions of the configuration file. Using dynamic rules, you can draw focus to data positions and/or values; these rules will apply to whatever the input data set is. For example, a single rule can color green all the glyphs in a scatter plot associated with a value greater than 0.5.
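The idea of a rule that is independent of the data set can be sketched in a few lines. This is a Python analogy only, not Circos configuration syntax; the names `apply_rules`, `pos`, `value` and `color`, and the choice that the first matching rule wins, are assumptions made for this illustration.

```python
def apply_rules(points, rules):
    """Return styled copies of the points; the input list is left untouched.

    Each rule is a (condition, formatting) pair. In this sketch the first
    rule whose condition holds for a point supplies its formatting.
    """
    styled = []
    for point in points:
        out = dict(point)  # copy so the original data is never modified
        for condition, formatting in rules:
            if condition(point):
                out.update(formatting)
                break
        styled.append(out)
    return styled


# Hypothetical rule: color green every glyph whose value exceeds 0.5.
rules = [(lambda p: p["value"] > 0.5, {"color": "green"})]

# Hypothetical scatter plot data with a default color.
points = [
    {"pos": 100, "value": 0.2, "color": "black"},
    {"pos": 200, "value": 0.9, "color": "black"},
]
styled = apply_rules(points, rules)
```

The same `rules` list can be applied to any other data set with a `value` field, which is the property the paragraph above describes.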

Because the configuration can be composed of multiple files, you can mix-and-match configuration blocks. This is very helpful when your configuration is largely fixed (data domain, position of tracks and ticks) with minor changes (min/max values of axes, zoom levels, etc).

Circos is perfect for data analysis environments in which data is processed in a multi-step pipeline (typically using multiple analysis tools). Because it is driven entirely by configuration files, it can be inserted into the pipeline to produce one or more figures automatically.
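A pipeline step that renders a figure might look like the following minimal sketch. It assumes a `circos` executable is available on the PATH and that the pipeline has already written a configuration file; the helper names and the example path are hypothetical.

```python
import subprocess


def circos_command(conf_path, executable="circos"):
    """Build the command line that runs Circos on one configuration file."""
    return [executable, "-conf", str(conf_path)]


def render(conf_path, executable="circos"):
    """Run Circos and raise CalledProcessError if it exits with an error."""
    subprocess.run(circos_command(conf_path, executable), check=True)


# Hypothetical usage inside a pipeline: one figure per analysis run.
command = circos_command("run1/circos.conf")
```

Keeping command construction separate from execution makes the step easy to log and to check without Circos installed.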

Complex rearrangements of MLL in acute leukemias. These images are part of a poster that accompanies the publication of the MLL recombinome (C. Meyer et al., Leukemia 23, 1490 (Aug, 2009)).

it emphasizes patterns in connections

The circular layout is ideal for showing how different positions within your data domain relate to one another. This relationship can be quantitative (e.g. similarity) or binary (e.g. is/isn't connected). Circos was initially designed to emphasize these kinds of relationships and therefore has many features that help illustrate them fully.

Because you can adjust the thickness of the ribbons that represent relationships, set the progression and orientation of the circular segments, and apply twists to the ribbons to indicate the orientation of a link, you can clearly show a large number of connections between data points (or positions) without exhausting the capacity of the reader.