Circos > Data Visualization

Things to Consider

When faced with generating a visual representation of information, it's useful to ask the following.

  • What are the important features and patterns I wish to communicate?
  • What are the unimportant features and patterns, and how can I effectively minimize their visual impact? Can I hide them altogether?
  • How dense (or sparse) is my data?
  • What data resolution is required? Can I bin the data, or show summary statistics instead of the data (e.g. min/avg/max)?
  • What is the length scale of significant variation and non-significant noise?
  • In what media will the figure be shown? How does the resolution of the media compare to the resolution of my data?
  • Do I have the time/energy/motivation to worry about black-and-white reproduction or readers with color blindness?
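The question about binning and summary statistics can be made concrete with a small sketch. The function below is only an illustration, not part of Circos: the name `bin_summary`, the fixed-width binning scheme, and the sample points are all assumptions. It collapses (position, value) pairs into one min/avg/max row per non-empty bin.

```python
from statistics import mean


def bin_summary(points, bin_size):
    """Group (position, value) pairs into fixed-width bins.

    Returns one (bin_start, bin_end, min, avg, max) tuple per
    non-empty bin, ordered by position.
    """
    if bin_size <= 0:
        raise ValueError("bin_size must be positive")

    bins = {}
    for position, value in points:
        # Integer division assigns each position to its bin index.
        bins.setdefault(position // bin_size, []).append(value)

    summary = []
    for index in sorted(bins):
        values = bins[index]
        start = index * bin_size
        summary.append(
            (start, start + bin_size - 1, min(values), mean(values), max(values))
        )
    return summary


# Hypothetical sample data: five points reduced to three summary rows.
points = [(5, 0.2), (40, 0.6), (95, 0.4), (130, 0.9), (310, 0.1)]
for row in bin_summary(points, 100):
    print(row)
```

Choosing the bin size is where the earlier questions come back in: a bin much narrower than the resolution of the output medium adds detail nobody will see.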

Even if the answers to these questions seem obvious to you, they help emphasize the purpose of the figure. Sometimes authors confuse "showing all data" with "sending a clear message"; it is unlikely that readers actually want to see all your data. It is better to send a clear message and not show the data than to show the data and hope the reader will arrive at the right conclusion.

A comparison of an ancestral crucifer genome to that of three modern species. The panel on the right is a prototype 3-way comparison, with links relating positions on two modern crucifer genomes that have the same corresponding source in the ancestral genome.

How Circos Helps

it makes you think

One of the ways in which Circos helps is that it slows you down. Circos requires that you think about your data and design its layout before you write the configuration file (there is no interactive interface). This initial process of reflection can be both short and extremely productive in helping you focus your message.

it makes visual experimentation easy

Once you have written your configuration file, it is easy to hide elements, adjust scale (either globally or locally — a unique feature), or apply a different format to your data (visibility, opacity, shape, color and even position) based on dynamic rules.

In other words, you can generate a variety of figures without adjusting either the input data or significant portions of the configuration file. Using dynamic rules, you can draw focus to data positions and/or values, and these rules apply to whatever input data set you use. For example, a single rule can color green every glyph in a scatter plot whose value is greater than 0.5.
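To illustrate why a rule is independent of the data it is applied to, here is a minimal Python model of the idea. It is not Circos configuration syntax and does not reproduce how Circos evaluates rules. The names `apply_rules`, `rules`, and `defaults` are hypothetical, and stopping at the first matching rule is a simplifying choice made for this sketch.

```python
def apply_rules(points, rules, defaults):
    """Return one format dict per data point.

    Each point starts with the default format, which is then overridden
    by the first rule whose condition matches (a simplification chosen
    for this sketch).
    """
    formatted = []
    for point in points:
        fmt = dict(defaults)
        for condition, overrides in rules:
            if condition(point):
                fmt.update(overrides)
                break
        formatted.append(fmt)
    return formatted


# One rule: color green every glyph whose value is greater than 0.5.
rules = [
    (lambda p: p["value"] > 0.5, {"color": "green"}),
]
defaults = {"color": "grey", "show": True}

# Two unrelated, made-up data sets formatted by the same rule.
dataset_a = [{"pos": 10, "value": 0.3}, {"pos": 20, "value": 0.8}]
dataset_b = [{"pos": 5, "value": 0.51}, {"pos": 9, "value": 0.5}]

for data in (dataset_a, dataset_b):
    print([fmt["color"] for fmt in apply_rules(data, rules, defaults)])
```

The rule is written once, as a condition plus a set of format overrides, and never mentions a particular data point. That is what lets the same configuration produce a sensible figure when the input changes.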

Because the configuration can be composed of multiple files, you can mix-and-match configuration blocks. This is very helpful when your configuration is largely fixed (data domain, position of tracks and ticks) with minor changes (min/max values of axes, zoom levels, etc).

Circos is well suited to data analysis environments in which data is processed in a multi-step pipeline (typically using multiple analysis tools). Because it is driven entirely by configuration files rather than an interactive interface, it can be inserted into the pipeline to produce one (or more) figures automatically.
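A pipeline step of this kind might look like the sketch below. It rests on several assumptions: that an upstream step yields (chromosome, start, end, value) rows, that the track reads whitespace-separated lines in that order, that a `circos` executable is on the PATH, and that it accepts a `-conf` option naming the configuration file. The helper names, the sample row, and the path `etc/circos.conf` are hypothetical.

```python
import subprocess
import tempfile
from pathlib import Path


def write_scatter_data(rows, path):
    """Write (chromosome, start, end, value) rows, one data point per line."""
    lines = [f"{chrom} {start} {end} {value}" for chrom, start, end, value in rows]
    Path(path).write_text("\n".join(lines) + "\n")


def circos_command(conf_path, executable="circos"):
    """Build the command line for a non-interactive Circos run."""
    return [executable, "-conf", str(conf_path)]


def render(conf_path):
    """Run Circos; raises CalledProcessError if it exits with an error."""
    subprocess.run(circos_command(conf_path), check=True)


# Demonstration with a made-up data point; render() is not called here
# because it needs a working Circos installation and configuration.
with tempfile.TemporaryDirectory() as tmp:
    data_file = Path(tmp) / "scatter.txt"
    write_scatter_data([("hs1", 100, 200, 0.75)], data_file)
    print(data_file.read_text(), end="")

print(circos_command("etc/circos.conf"))
```

In a real pipeline, the step would write the data file to the location the configuration expects and then call `render`, so that a new figure is produced whenever the upstream analysis is rerun.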

Depiction of the MLL recombinome of acute leukemias.
Complex rearrangements of MLL in acute leukemias. These images are part of a poster that accompanies the publication of the MLL recombinome (C. Meyer et al., Leukemia 23, 1490, Aug 2009).

it emphasizes patterns in connections

The circular layout is ideal for showing how different positions within your data domain relate to one another. This relationship can be quantitative (e.g. similarity) or binary (e.g. is/isn't connected). Circos was initially designed to emphasize these kinds of relationships and therefore has many features that help illustrate them fully.

You can adjust the thickness of the ribbons that represent relationships, change the progression and orientation of the circular segments, and apply twists to the ribbons to indicate the orientation of a link. Together, these controls let you clearly show a large number of connections between data points (or positions) without overwhelming the reader.