Circos at the EMBO NGS workshop in Tunis, Sept 15–25.

The terrifying dinosaur corn genome

Amblin Entertainment and Legendary Pictures, the studios that produced Jurassic World, try to inject genome science into the movie. Unfortunately, since we don't quite know how to construct viable genomes of extinct species, much less grow the creatures themselves, we don't know whether the depiction of the science is right. Perhaps theirs is exactly what a genome lab would look like in a dino-building facility.

But, we can get fewer things wrong. In the Creation Lab companion website, a Circos image is used to illustrate a triceratops genome.

Unfortunately, this is an image of the B73 Maize reference genome (B73 RefGen_v1), as published in Science in The B73 Maize Genome: Complexity, Diversity, and Dynamics.

Schnable PS Ware D Fulton RS et al. 2009 The B73 maize genome: complexity, diversity, and dynamics Science 326 (5956) 1112-1115

Using News Reports to Track Wildlife Black Markets

http://www.wired.com/2015/06/using-news-reports-track-wildlife-black-markets/

THE INTERNATIONAL BLACK market in wildlife—alive or dead—is notoriously difficult to track. Hunters and smugglers don’t report their take for the same reasons that drug dealers don’t report profits to the IRS. But if you could actually track those networks, maybe you could do something about them. That’s what sent Nikkita Patel, a veterinary epidemiologist at the University of Pennsylvania, to an unusual source of data on the illegal wildlife trade: the news.

Wired

The image shows the illegal global rhinoceros trade network before (top) and after (bottom) a hypothetical targeted disruption. Created with Circos online table viewer.

Circos on Cancer Discovery Covers

The July 2013 issue cover shows a Circos plot of relative copy number changes in 38 oral squamous cell carcinoma tumors.

The September 2012 issue cover shows a collection of Circos images of somatic mutations in melanoma tumors.

July 2013 Pickering CR, Zhang J, Yoo SY et al. 2013 Integrative genomic characterization of oral squamous cell carcinoma identifies frequent somatic drivers Cancer discovery 3:770-781.

Sep 2012 Dahlman KB, Xia J, Hutchinson K et al. 2012 BRAF(L597) mutations in melanoma are associated with sensitivity to MEK inhibitors Cancer discovery 2:791-797.

Circos charts the placenta transcriptome

Saben et al. use Circos to visualize the transcriptome and gene expression of placenta from 20 healthy women in their article A comprehensive analysis of the human placenta transcriptome.

Saben J, Zhong Y, McKelvey S et al. 2014 A comprehensive analysis of the human placenta transcriptome Placenta 35:125-131.

Circos Maps America’s Restless Interstate Migration Without a Map

Wired has a writeup about migration patterns within the US that shows the data using d3.js chord diagrams, modeled after how Circos shows tabular data.

Circos on cover of UCSF Magazine

The Fall 2013 issue of UCSF Magazine has my Circos illustration of personalized medicine. The human outline motif is incorporated into other design elements in the issue.

The look of the image is inspired by Nature's Encode cover by Carl De Torres.

To learn how to generate the cover and variants, read the Circos Encode Cover Tutorial.

Circos on Cover of Cancer Cell

Yang et al. used network analysis approaches to characterize a subtype of ovarian cancer associated with poor overall survival.

E-cadherin is a protein encoded by the CDH1 gene and is responsible for cell-cell adhesion. Yang linked the expression of E-cadherin to specific miRNAs that influenced the regulatory network singled out in this cancer subtype.

Yang D, Sun Y, Hu L et al. 2013 Integrated analyses identify a master microRNA regulatory network for the mesenchymal subtype in serous ovarian cancer Cancer cell 23:186-199

Circos reaches 500 literature citations

In October 2013 Circos reached a milestone - 500 citations in peer-reviewed literature.

To celebrate, I've made a commemorative poster that features over 400 Circos images from the literature.


Circos deals with 8 Gb Rye Genome

Because of its large 8 Gb genome, the genomic analysis of rye has lagged behind other cereals.

To address this, Martis et al. established a linear gene order model for 72% of the rye genes based on synteny information from rice, sorghum and B. distachyon.

Although it appears that six major translocations shaped the modern rye genome, highly dissimilar conserved syntenic gene content, gene sequence diversity signatures, and phylogenetic networks were found for individual rye syntenic blocks.

Martis MM, Zhou R, Haseneyer G et al. 2013 Reticulate Evolution of the Rye Genome Plant Cell

Circos Stages Mesolithic to Neolithic Transition

Bollongino et al. present evidence of a slow transition from Mesolithic hunter-gatherer groups to Neolithic farmers.

Previous theories that the foragers disappeared shortly after the arrival of farmers are at odds with palaeogenetic and isotopic data analysis from Neolithic human skeletons from the Blätterhöhle burial site in Germany. Instead of an abrupt transition, the data suggest a more complex pattern of coexistence that persisted for over 2000 years.

Bollongino R, Nehlich O, Richards MP et al. 2013 2000 years of parallel societies in Stone Age Central Europe Science 342:479-481.

Circos in 54 million pixels

Ruddle et al. demonstrate their commodity hardware 54 million pixel data display in exploring copy number variation data.

Ruddle RA, Fateen W, Treanor D et al. 2013. Leveraging Wall-sized High-Resolution Displays for Comparative Genomics Analyses of Copy Number Variation. In IEEE Symposium on Biological Data Visualization, Atlanta, GA.

Circos Tracks CO2 Emissions

Kanemoto et al. report on the disturbing trend of emissions leakage, in which developed countries are displacing emissions-intensive production offshore.

The report confirms previous findings that, after adjusting for trade, developed countries' emissions have increased, not decreased. A connection is made to the kind of emissions displacement that has already occurred for air pollution, where, despite aggressive legislation in major emitters, total global air pollution emissions have increased.

The conclusion warns us that "if regulatory policies do not account for embodied imports, global emissions are likely to rise even if developed countries emitters enforce strong national emissions targets."

Kanemoto K, Moran D, Lenzen M et al. 2013 International trade undermines national emission reduction targets: New evidence from air pollution Global Environmental Change

Circos Round — Lotus Sacred

The pleasing roundness of Circos is used by Ming et al. to depict the sacred lotus genome in the publication "Genome of the long-living sacred lotus (Nelumbo nucifera Gaertn.)".

The sacred lotus has religious significance in both Buddhism and Hinduism and has been used as a food and herbal medicine product in Asia for over 7,000 years. Its seeds have exceptional longevity, remaining viable for as long as 1,300 years.

The plant is known for its exceptional water repellency, known as the lotus effect. The latter property is due to the nanoscopic closely packed protuberances of its self-cleaning leaf surface, which have been adapted for the manufacture of a self-cleaning industrial paint, Lotusan.

Ming R, Vanburen R, Liu Y et al. 2013 Genome of the long-living sacred lotus (Nelumbo nucifera Gaertn.) Genome Biol 14:R41.

6.9e11 g of oil and Circos was there

Rivers et al. describe the effects of the Deepwater Horizon blowout on the microbial blooms of petroleum-degrading bacteria.

By sequencing 66 million community transcripts, the identity of metabolically active microbes and their roles in petroleum consumption were revealed.

Rivers AR, Sharma S, Tringe SG et al. 2013 Transcriptional response of bathypelagic marine bacterioplankton to the Deepwater Horizon oil spill The ISME journal

Plants Love Circos

Circos frequently appears in plant literature, twice on the cover of Plant Biotechnology Journal in the last year.

Rai KM, Singh SK, Bhardwaj A et al. 2013 Large-scale resource development in Gossypium hirsutum L. by 454 sequencing of genic-enriched libraries from six diverse genotypes Plant biotechnology journal

Bekele WA, Wieckhorst S, Friedt W et al. 2013 High-throughput genomics in sorghum: from whole-genome resequencing to a SNP screening array Plant biotechnology journal

Circos has appeared 8 times each in the Plant Journal and Plant Cell.

Circos for R

Zhang et al. implement Circos in R.

Same round shape you expect. And now, in everyone's favourite open source statistics and data analysis environment.

CRAN RCircos package

Zhang H, Meltzer P, Davis S 2013 RCircos: an R package for Circos 2D track plots BMC Bioinformatics 14:244.

Circos Interchange Diagrams — Networks and Flow

Zeng et al. introduce a new type of visualization based on Circos, the interchange diagram, in their paper Visualizing Interchange Patterns in Massive Movement Data.

The design is applied to displaying movement data, such as daily trips made by passengers in a city. By incorporating interactivity, this visualization method helps in understanding interchange patterns at different spatial scales (between trains, between cities) and time scales (different times of day).

Circos has been used for urban planning before. The town of Caceres in Spain has used Circos to communicate their urban planning strategy.

project website

Zeng W, Fu C-W, Arisona SM et al. 2013 Visualizing Interchange Patterns in Massive Movement Data Computer Graphics Forum 32:271-280

Circos connects to the connectome

Methods to visualize the connectome are reviewed in Craddock et al., and Circos is one of them.

Craddock RC, Jbabdi S, Yan C-G et al. 2013 Imaging human connectomes at the macroscale Nat Meth 10:524-539.

The use of Circos for showing the connectome was introduced by Irimia et al. in Circular representation of human cortical networks for subject and population-level connectomic visualization.

A good layman's description of the work can be found at the neurosceptic blog.

Irimia A, Chambers MC, Torgerson CM et al. 2012 Circular representation of human cortical networks for subject and population-level connectomic visualization NeuroImage

Irimia A, Chambers MC, Torgerson CM et al. 2012 Patient-tailored connectomics visualization for the assessment of white matter atrophy in traumatic brain injury Frontiers in Neurology 3

Circos is the Method for Visualizing Translocations

Genomic rearrangements can cause disease and are implicated in many cancers. Being able to see the patterns in these changes across samples and patients is important.

In the review article End-joining, Translocations and Cancer, Bunting and Nussenzweig demonstrate how compositing the genome circularly adds value and clarity to the presentation.

Bunting SF, Nussenzweig A 2013 End-joining, translocations and cancer Nat Rev Cancer

Circos Paints Chromosomes of Capsella Rubella

Slotte et al. use Circos to show the genomic structures, chromosome painting and comparative genomic mapping in C. rubella, A. lyrata and A. thaliana.

Their figure illustrates how Circos is effective at showing two-way comparisons of syntenic structure. For three-way comparison, consider hive plots.

Slotte T, Hazzouri KM, Agren JA et al. 2013 The Capsella rubella genome and the genomic consequences of rapid mating system evolution Nat Genet

Circos on the Cover Of Journal of Pathology

The June 2013 issue of the Journal of Pathology features a pair of Circos plots on the cover. The images are from the paper by Weier et al. describing TMPRSS2 and ERG rearrangements in prostate cancer.

"TMPRSS2–ERG rearrangements occur in approximately 50% of prostate cancers and therefore represent one of the most frequently observed structural rearrangements in all cancers."

Weier C, Haffner MC, Mosbruger T et al. 2013 Nucleotide resolution analysis of TMPRSS2 and ERG rearrangements in prostate cancer J Pathol 230:174-183.

Circos on the Cover Of Nature's Asian Journal of Andrology

The May 2013 Special Issue of Asian Journal of Andrology presents the outcomes from the Sixth Annual Forum on Prostate Disease (6th FPD), which was held on June 8-9, 2012 in Shanghai, China [source: nature.com]. The cover art for the issue shows a Circos plot of 90 significantly recurrent molecular alterations in prostate cancer from an analysis of 372 prostate tumors discussed in the Wyatt et al. review article.

The review summarizes the current state of understanding of prostate cancer, "including the sentinel role of copy number variation, the growing spectrum of oncogenic fusion genes, the potential influence of chromothripsis, and breakthroughs in defining mutation-associated subtypes. Increasing evidence suggests that genomic lesions frequently converge on specific cellular functions and signalling pathways, yet recurrent gene aberration appears rare".

Wyatt AW, Mo F, Wang Y et al. 2013 The diverse heterogeneity of molecular alterations in prostate cancer identified through next-generation sequencing Asian J Androl 15:301-308.

What is Circos?

Circular visualization

Circos is a software package for visualizing data and information. It visualizes data in a circular layout — this makes Circos ideal for exploring relationships between objects or positions. There are other reasons why a circular layout is advantageous, not the least being the fact that it is attractive.

Circos is ideal for creating publication-quality infographics and illustrations with a high data-to-ink ratio, richly layered data and pleasant symmetries. You have fine control over each element in the figure to tailor its focus points and detail to your audience.

Circular genome and data visualization with Circos
Images created with Circos, illustrating links, ribbons, tiles and a variety of 2D data tracks. If it's round, Circos can probably do it (more images).

Circos is flexible. Although originally designed for visualizing genomic data, it can create figures from data in any field—from genomics to visualizing migration to mathematical art. If you have data that describes relationships or multi-layered annotations of one or more scales, Circos is for you.

Circos can be automated. It is controlled by plain-text configuration files, which makes it easily incorporated into data acquisition, analysis and reporting pipelines (a data pipeline is a multi-step process in which data is analyzed by multiple and typically independent tools, each passing their output as the input to the next step).

Popular and Pretty

Have you noticed how beautifully everyday science and technology is rendered in movies? Information is delivered seamlessly from interfaces oozing with style and function. While others complain that the movie doesn't get the science facts right, I contrarily note that it doesn't get the science look right. No busy scientist is able to make such great design and typeface choices!

An interactive hologram of the periodic table of elements in Iron Man 2
Experiments in movies are beautiful.

Sadly, the reality of cutting-edge science reveals a grimmer picture, replete with incomprehensible figures, illegible color combinations and awkward typefaces. This is due in large part to the fact that the people in charge of the science are too busy with the science to worry about figures. It is therefore important for designers, artists and other visual creatives to continue providing working scientists with tools that are useful, effective and ... pretty. One example of this kind of knowledge transfer is the Brewer palettes. The scientists will thank you, the press will thank you, as will the public and policy makers, who are ultimately asked to digest the results.

Circos attempts to bring a different aesthetic to science and strike a balance between flexibility and ease-of-use. Circos makes no assumptions about your data, uses extremely simple input data format, and makes image creation and customization easy. It's helping to make science look better, one figure at a time.

Circos has appeared in many publications, both scientific and general. It has changed the way the scientific community visualizes genomic alterations (changes in a genome over time, or differences between two or more genomes). One timely application of this approach is creating effective figures showing how cancer genomes differ from healthy ones (e.g. COSMIC: Census of Somatic Mutations in Cancer).

The biological scientific community has adopted Circos wholeheartedly. By now, Circos has appeared on the covers of both Nature and Science publications, which are the world's top scientific journals.

Circos in Bioinformatics - Circoletto
Circos in Genome Biology - Evolution of an adenocarcinoma in response to selection by targeted kinase inhibitors
Circos in Nature - A first look at entire human methylomes
Circos in Science - The B73 Maize Genome: Complexity, Diversity, and Dynamics
Circos in PNAS - A Nitrospira metagenome illuminates the physiology and evolution of globally important nitrite-oxidizing bacteria
Circos in American Scientist - Genetics and the Shape of Dogs
Circos in Genome Research - Automated identification of conserved synteny after whole-genome duplication
Circos in Conde Nast Portfolio - 23andme
Circos in Wired - Getting Lost
Circos in PLoS - U87MG Decoded: The Genomic Sequence of a Cytogenetically Aberrant Human Cancer Cell Line
Circos in Plant Cell - Fast Diploidization in Close Mesopolyploid Relatives of Arabidopsis
Circos in New York Times - Now, for the rest of the genome
New insights to the MLL recombinome of acute leukemias
Circos in SEED

My images created with Circos have appeared in a variety of publications, including Wired, New York Times, Conde Nast Portfolio, and American Scientist.

In genomics, scientific journals like Science, Nature, PLoS, Genome Research and others have published papers that used Circos images (Circos citations).

Circos has been published in
A sampling of published images from New York Times, Conde Nast Portfolio, American Scientist, Nature and various books (see more).

Scriptable and Automatable - Get Your *NIX Geek On

Creation of images is controlled through a plain-text configuration file; there is no interactive user interface. This approach to configuration should be very familiar to you if you have UNIX experience.

If you're used to pointing (and clicking), you're in for both a surprise and a treat and, initially, perhaps for a little bit of frustration. It's ok, don't worry. Although Circos' barrier to entry is higher than most applications you may have used, once you become comfortable with Circos and gain experience in its use, you will see benefits from Circos' approach and will be able to convert the time you invested into learning Circos into great-looking figures.

Image creation can be completely automated — you can write scripts to generate both data and configuration file and make a call to Circos to generate the image — making Circos suitable for incorporation into data analysis pipelines and applications. In this way, Circos is similar to gnuplot.
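As a rough illustration of that kind of automation, here is a minimal Python sketch that builds and runs one Circos command per configuration file. Starting Circos with `circos -conf` is the usual invocation; the `-outputdir` and `-outputfile` flags, the `conf/` and `img/` directory names and the helper function names are assumptions made for this example, so check them against `circos -help` on your own installation.

```python
import shutil
import subprocess
from pathlib import Path


def circos_command(conf: Path, outdir: Path, outfile: str) -> list[str]:
    """Build the argument list for a single Circos run."""
    return [
        "circos",
        "-conf", str(conf),
        "-outputdir", str(outdir),    # assumed flag, see lead-in
        "-outputfile", outfile,       # assumed flag, see lead-in
    ]


def render(conf: Path, outdir: Path, outfile: str) -> bool:
    """Run Circos if it is on PATH; return True when the run succeeded."""
    if shutil.which("circos") is None:
        return False  # Circos is not installed, so there is nothing to run
    outdir.mkdir(parents=True, exist_ok=True)
    result = subprocess.run(circos_command(conf, outdir, outfile))
    return result.returncode == 0


if __name__ == "__main__":
    # Hypothetical pipeline step: one image per configuration file in conf/.
    for conf in sorted(Path("conf").glob("*.conf")):
        ok = render(conf, Path("img"), conf.stem + ".png")
        print(conf.name, "ok" if ok else "skipped or failed")
```

Keeping the command construction separate from the call makes it easy to log or dry-run the exact command line before any image is drawn.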

Dynamic Formatting with Run-Time Rules

Most aspects of the output image can be adjusted using dynamic rules, which format elements of the figure based on data values. This feature allows a variety of images to be created without changing the input data or configuration file.

By using run-time rules, defined in the configuration files, you can control how elements of the figure are drawn based on data values.

This feature is extremely powerful and uniquely suited for visual analytics. For example, for a given data track (e.g. histogram) you can ask that all bins with values >10 are colored blue, or more generally you can color the bins by value using your own color scheme. Rules can be chained. For example, later in the rule chain, you can ask that any blue bins that fall within a specific position range be hidden.
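The example in the paragraph above can be sketched as a small Python helper that writes out a `<rules>` block for a plot's configuration. This is only an illustration: the helper names are made up, the position range is arbitrary, and the second rule repeats the value test as a stand-in for "blue bins". It also assumes the behavior described in the Circos tutorials, where rules are tested in order and testing stops at the first match unless the rule sets `flow = continue`.

```python
def rule_block(condition: str, flow_continue: bool = False, **params: str) -> str:
    """Format one <rule> block: a condition plus the parameters it sets."""
    lines = ["<rule>", f"condition = {condition}"]
    lines += [f"{key} = {value}" for key, value in params.items()]
    if flow_continue:
        # Assumed semantics: keep testing later rules after this one matches.
        lines.append("flow = continue")
    lines.append("</rule>")
    return "\n".join(lines)


def rules_block(*rules: str) -> str:
    """Wrap rule blocks, in order, inside a <rules> block."""
    return "\n".join(["<rules>", *rules, "</rules>"])


# Rule 1: bins with value > 10 are drawn blue, then evaluation continues.
# Rule 2: those same bins are hidden if they fall in a chosen position range.
EXAMPLE = rules_block(
    rule_block("var(value) > 10", flow_continue=True, fill_color="blue"),
    rule_block(
        "var(value) > 10 && var(start) >= 20000000 && var(end) <= 30000000",
        show="no",
    ),
)

if __name__ == "__main__":
    print(EXAMPLE)
```

The printed block is meant to be pasted inside the `<plot>` block of the track it should affect.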

Circos is great data. Your data!

Who should care?

If you are a researcher, analyst, data geek, art director, illustrator or visual artist who is seeking to explore or communicate a data set, or to think outside the box (and inside a circle), Circos is worth looking into.

What is it for?

Circos can be used to display any kind of information. It's particularly suitable for layering different data sets to create highly informative infographics with texture and visual appeal. Circos can make low-resolution bitmaps, suitable for basic web-based reporting, as well as publication-quality images with a lot of bling (but I mean legible, clear and informative bling!).

Circos is great for genomic data. Your genomic data!
Circos has features that make it ideal for drawing genomic information. Shown here are ChIP-Seq, chr 22 methylation, whole-genome methylation, multi-species comparison, human genome variation and self-similarity and MLL recombinome (see more).

Circos was initially designed for displaying genomic data (particularly cancer genomics and comparative genomics) and molecular biology. It has specific features that address typical challenges in drawing these kinds of data, which tend to be very sparse and encompass a large number of length scales.

Circos has been cited by over 500 scientific articles. This wordle is created from the words of their titles. The most common words are genome, analysis, sequence, human, comparative and cancer.

Only for genomics? No!

Data is data. Circos is flexible. There is nothing about Circos that is specific to genomics — it just happens that I work in genomics and therefore the tool has been applied to this field.

Circos can illustrate genomic rearrangements, where a relationship between two elements (genomic positions) represents a structural fusion. Circos can also visually represent the flow of refugees, where a relationship between two elements (countries) represents the extent of ingress and egress.
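To make the idea of "a relationship between two elements" concrete, here is a small Python sketch that writes non-genomic data in the plain-text shapes Circos uses for genomes: one karyotype line per element (`chr - ID LABEL START END COLOR`) and one link line per relationship (two sets of `ID START END` coordinates). The element names, sizes and flow values below are invented purely for illustration.

```python
def karyotype_lines(sizes: dict[str, int], color: str = "grey") -> list[str]:
    """One karyotype line per element, sized by its total volume."""
    return [f"chr - {name} {name} 0 {size} {color}" for name, size in sizes.items()]


def link_lines(flows: list[tuple[str, int, int, str, int, int]]) -> list[str]:
    """One link line per relationship: source span followed by target span."""
    return [" ".join(str(field) for field in flow) for flow in flows]


if __name__ == "__main__":
    # Hypothetical elements A and B (for example, two countries).
    sizes = {"A": 100, "B": 60}
    # Hypothetical flow: 40 units leaving A, arriving at B.
    flows = [("A", 0, 40, "B", 0, 40)]
    print("\n".join(karyotype_lines(sizes)))
    print("\n".join(link_lines(flows)))
```

Whether the elements are chromosomes or countries makes no difference to the file format; only the identifiers and coordinate ranges change.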

Circos is great for general data. Your general data!
Circos is suitable for showing any kind of relationships among data. Shown here are car purchase trends, chemical reactivity, and dating trends, among others (see more).

To name a few, Circos has been used to visualize customer flow in the auto industry, volume of courier shipments, database schemas, and presidential debates.

How is it different?

My purpose in creating Circos was not so much to create yet another way to draw data, but rather to create a tool that can help make data look beautiful. The compactness of the circular form is inherently more appealing than a linear layout. Although some figures are ideally suited for a square layout, most of the time a circular figure can match or exceed efficiency in delivering information, have a higher data-to-ink ratio and sit more tightly on the page.

It is easy to plot, format and layer your data with Circos. A large variety of plot and feature parameters are customizable, helping you make the image that best communicates your data. You supply your data to Circos as plain-text files, tell Circos what you want plotted using the configuration file, and then create the image.
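The workflow in the last sentence (data as plain text, a configuration file that says what to plot) might look like the following Python sketch. It writes a histogram track as `chromosome start end value` lines and a minimal configuration modeled on the layout used in the Circos tutorials. The karyotype path, the `etc/` include files and the radius values are assumptions about a standard Circos installation, and the function names are hypothetical.

```python
import tempfile
from pathlib import Path

# Minimal configuration; {track} is filled in with the data file name.
CONF_TEMPLATE = """\
karyotype = data/karyotype/karyotype.human.txt

<ideogram>
<spacing>
default = 0.005r
</spacing>
radius = 0.90r
thickness = 20p
fill = yes
</ideogram>

<plots>
<plot>
type = histogram
file = {track}
r0 = 0.70r
r1 = 0.88r
</plot>
</plots>

<image>
<<include etc/image.conf>>
</image>

<<include etc/colors_fonts_patterns.conf>>
<<include etc/housekeeping.conf>>
"""


def write_histogram(path: Path, bins) -> None:
    """Write one 'chromosome start end value' line per bin."""
    lines = [f"{chrom} {start} {end} {value}" for chrom, start, end, value in bins]
    path.write_text("\n".join(lines) + "\n")


def write_config(path: Path, track_file: str) -> None:
    """Write a configuration file that plots track_file as a histogram."""
    path.write_text(CONF_TEMPLATE.format(track=track_file))


if __name__ == "__main__":
    workdir = Path(tempfile.mkdtemp())
    # Made-up values for two 1 Mb bins on human chromosome 1.
    write_histogram(workdir / "hist.txt", [("hs1", 0, 999999, 12.5), ("hs1", 1000000, 1999999, 3.0)])
    write_config(workdir / "circos.conf", "hist.txt")
    print("wrote", workdir)
```

Creating the image is then a matter of running Circos against the generated configuration file.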

A panel comparing each chromosome of the dog genome with the entire human genome. The Circos American Scientist cover is based on these data.
Each panel shows the sequence similarity between one dog chromosome, placed at the bottom half of the circle, with the entire human genome, placed at the top. These data were combined into a single figure which appeared on the cover of the American Scientist. The image accompanies the article Genetics and the Shape of Dogs by Elaine Ostrander (see more).

Useful to you?

How do you know whether Circos can be useful to you? First, look at published images and see what others are doing with Circos (for other images, see sample image archive). For examples of Circos' capabilities, see the tutorial images. These image sets will give you an idea of the types of data visualizations that Circos can create.

Circos can be used to generate visualizations of tabular data.
A collection of images created by users of the online tableviewer utility, which uses Circos to visualize tabular information.

For quick exposure to Circos, try the online tableviewer, which is an instance of Circos designed to visualize tabular data. You can upload a table (e.g. exported from a spreadsheet) and have it drawn à la Circos. If you don't have any data (who these days doesn't?), you can choose to use pre-generated or random tabular data.

To learn how Circos can be used in specific applications, browse the walkthrough guides which spend some time telling you about features and applications, use in genomics and application to table visualization.

To get your feet wet and hands dirty, download Circos and read the tutorials, or dive into a full course on Circos.


Brief History

Circos was originally conceived for visualizing genomic data such as alignments and structural variation. Over time, support was added for 2D data tracks such as line, scatter, heatmap and histogram plots.

As Circos' popularity grew — sparked by a New York Times full-page infographic — it started to be used for visualizing other data, not just genomics.

Future of Circos

I work on Circos in a passive-aggressive manner - sometimes passive sometimes aggressive. I welcome your comments.

Visit the Circos forum or contact Martin Krzywinski if you would like to report a bug, request a feature or share the ways in which you are using, or hope to use, Circos.

License and Use

Circos is free software, licensed under GPL.

Circos is written in Perl, can be deployed on any operating system for which Perl is available (e.g. Windows, Mac OS X, Linux and other UNIX flavours) and produces bitmap (PNG) and vector (SVG) images using plain text configuration and input files.
