Use the latest version of Circos and read Circos best practices—these list recent important changes and identify sources of common problems.
If you are having trouble, post your issue to the Circos Google Group and include all files and detailed error logs. Please do not email me directly unless it is urgent—you are much more likely to receive a timely reply from the group.
Don't know what question to ask? Read Points of View: Visualizing Biological Data by Bang Wong, myself and invited authors from the Points of View series.
This is a very important document—well worth reading. Chances are if you have asked a question in the Google Group, I have sent you to read this first.
Follow these tips to avoid common errors and to increase the legibility and modularity of your configuration files.
The configuration file is passed through several parsing steps, which import files, merge blocks, and replace values. To see the configuration data structure that Circos will use, use -cdump
.
> circos -cdump ... $VAR1 = { anglestep => 0.5, anti_aliasing => 1, ... };
The format is a Perl data structure, with {}
indicating a hash and []
a list.
The configuration dump is long. To obtain a list of a specific block, provide the block name.
> circos -cdump ideogram $VAR1 = { fill => 1, radius => '0.90r', spacing => { default => '0.005r' }, stroke_color => 'dgrey', stroke_thickness => '2p', thickness => '20p' };
To list parameters that match a regular expression,
> circos -cdump block:regex > circos -cdump :regex
See Runtime Parameter Tutorial for details.
You can change any configuration parameter without editing the configuration file by using the -param
flag. The syntax here is -param BLOCK/.../PARAMETER=VALUE
. For example,
> circos -param ideogram/show=no -param image/radius=2000p
This is useful to toggle things on/off for testing. For example, to turn all <plot> tracks off,
> circos -param plots/show=no
This is the equivalent of adding the following to the configuration file
<plots> show = no <plot> ... </plot> ... </plots>
See Runtime Parameter Tutorial for details.
Circos can produce debug messages from different components of the code. By default you will see only summary debug messages and, if your image takes more than 30 seconds to create, you will see timing information.
This is set by the debug_group
and debug_auto_timer_report
parameters in etc/housekeeping.conf
.
To include messages from other parts of the code, use debug_group
.
# include file I/O messages
> circos -debug_group io
To get a list of debug groups,
> circos -debug_group
See Debugging Tutorial for details.
Do not write your configuration files from scratch.
Do not use the configuration of the example/
as a template. It is quite complex.
Use the Quick Start Tutorials as templates for your configuration file.
Some files are conventionally included from the Circos distribution directory. Do not copy them outside of the Circos directories, unless you need to alter them.
etc/colors_fonts_patterns.conf etc/housekeeping.conf etc/image.conf data/karyotype/*
The contents of these files often change with new versions.
Colors, fonts and patterns should be imported from etc/colors_fonts_patterns.conf
.
# circos.conf <<include etc/colors_fonts_patterns.conf>>
This file, which can be found in etc/
in the distribution, automatically brings in etc/colors.conf
, etc/fonts.conf
, and etc/patterns.conf
, which you do not need to import individually.
Make sure that you're not defining colors twice by using this file and importing from etc/colors.conf
.
Circos comes with a large number of predefined colors. The best way to add your own is to append new color definitions to the <colors> block as follows
# circos.conf # your own colors will be merged and overwrite # any default color definitions included in # the <colors> block from # etc/colors_fonts_patterns.conf <colors> myred = 211,121,111 myblue = 85,143,190 </colors> # color, font, pattern defaults -- this is required <<include etc/colors_fonts_patterns.conf>>
The <colors> block imported from etc/colors_fonts_patterns.conf
will be merged with the one you've added. The order in which these appear in the configuration file does not matter.
See Configuration File Tutorial for details.
Every track type has some default variables set. To see these, look in etc/track/*
in the Circos distribution. To undefine a parameter
<plot> type = tile # removes the definition of this parameter stroke_thickness = undef ... </plot>
If you don't like the idea of these defaults, you can either comment the track_defaults
parameter in etc/housekeeping.conf
, or better yet undefine it in your configuration by overwriting the parameter (see below about doing this).
<<include etc/housekeeping.conf>>
track_defaults* = undef
Circos has many different parameters and keeping track of them can initially be a challenge. I suggest that you import default values, as done in the tutorials, by including configuration snippets from the Circos distribution.
For example, the <image> block, which defines output file location, size, color, etc, is populated conveniently by importing etc/image.conf
,
# circos.conf <image> <<include etc/image.conf>> </image>
Sometimes you'll want to redefine some of the parameters set by the included file. To do this, use the *
suffix syntax
# circos.conf <image> <<include etc/image.conf>> file* = myfile.png radius* = 1000p </image>
The value of any parameter named p
will be overwritten by the value of p*
. Similarly, p**
overwrites the value of code (p*), and so on. This is particularly useful if you want to adjust some system parameters temporarily
# circos.conf <<include etc/housekeeping.conf>> anti_aliasing* = no
See Configuration File Tutorial for details.
Use a separate directory for each Circos image. Put configuration files in etc/
and data in data/
. For an example, see the example/
directory, included in the Circos distribution.
Here's what a typical Circos project directory might look like,
cancer_snp/ etc/ circos.conf ticks.conf ideogram.conf ... data/ cnv.txt genes.txt ...
Circos will automatically look for circos.conf
as the main configuration file. Therefore, I suggest you name your file to this. If you have a one-to-one mapping between directory and image, there won't be any ambiguity. It is better to name the directory after the project, rather than configuration file. For example, instead of
circos/ circos_cancersnp.conf ...
I suggest
circos/ cancer_snp/ etc/ circos.conf
By having your configuration file named circos.conf
, you can quickly run Circos like this
# switch to the image project directory > cd cancer_snp > ls data/ etc/ # Circos will automatically look for circos.conf in several locations (including etc/), # You don't need to use the command-line flag "-conf etc/circos" > circos ... > ls data/ etc/ circos.png circos.svg
The default location of the output file is the same as directory of the configuration file. This is set in etc/image.generic.conf
dir = conf(configdir)
If you want to change the name of the output file, use -outputfile
and -outputdir
.
# ./snp.png > circos -outputfile snp.png # /tmp/snp.png > circos -outputfile snp.png -outputdir /tmp # /tmp/circos.png > circos -outputdir /tmp
As much as possible, avoid the use of absolute path names in the configuration files. Relative paths will be interpreted relative to your current directory. If Circos cannot find the file, it will look in a few other places, such as relative to the path of the configuration file as well as the Circos distribution.
For example, if you have
/user/bob/circos/cancersnp/ etc/ circos.conf data/ snp.txt
Then in circos.conf
you only need
file = data/snp.txt
and not
file = /user/bob/circos/cancersnp/data/snp.txt
Avoiding the use of absolute paths will make your Circos project files portable.
When a file name is defined using a relative path, Circos will attempt to look for it various places, constructed by combining three paths.
dir1/dir2/dir3 where dir1 = current_directory | configuration_directory | circos_binary_directory dir2 = . | .. | ../.. | ../../../ dir3 = . | etc | data
For example, some of the possible search locations are
current_directory/../ current_directory/../data current_directory/../etc current_directory/../../ current_directory/../../data current_directory/../../etc configuration_directory/ circos_binary_directory/../../etc
It's important to realize that this automatic searching is happening. You can see it in action by using the -debug_group io
command-line flag. For example, here I am running the example, found in example/
, from ~/tmp
directory
>: cd ~/tmp > circos -debug_group io -conf ~/work/circos/svn/example/etc/circos.conf ... # here Circos is trying to find the data/mm.bundle.txt file # try relative to current directory debuggroup io 6.31s trying /home/martink/tmp/data/mm.bundle.txt debuggroup io 6.31s trying /home/martink/tmp/etc/data/mm.bundle.txt debuggroup io 6.31s trying /home/martink/tmp/data/data/mm.bundle.txt debuggroup io 6.31s trying /home/martink/tmp/../data/mm.bundle.txt debuggroup io 6.31s trying /home/martink/tmp/../etc/data/mm.bundle.txt debuggroup io 6.31s trying /home/martink/tmp/../data/data/mm.bundle.txt debuggroup io 6.31s trying /home/martink/tmp/../../data/mm.bundle.txt debuggroup io 6.31s trying /home/martink/tmp/../../etc/data/mm.bundle.txt debuggroup io 6.31s trying /home/martink/tmp/../../data/data/mm.bundle.txt debuggroup io 6.31s trying /home/martink/tmp/../../../data/mm.bundle.txt debuggroup io 6.31s trying /home/martink/tmp/../../../etc/data/mm.bundle.txt debuggroup io 6.31s trying /home/martink/tmp/../../../data/data/mm.bundle.txt # try relative to configuration directory debuggroup io 6.31s trying /home/martink/work/circos/svn/example/etc/data/mm.bundle.txt debuggroup io 6.31s trying /home/martink/work/circos/svn/example/etc/etc/data/mm.bundle.txt debuggroup io 6.31s trying /home/martink/work/circos/svn/example/etc/data/data/mm.bundle.txt debuggroup io 6.31s trying /home/martink/work/circos/svn/example/etc/../data/mm.bundle.txt # found it! debuggroup io 6.31s data/mm.bundle.txt found in /home/martink/work/circos/svn/example/etc/../data/mm.bundle.txt
If you are using reference genomes, use the karyotype files that come bundled in the distribution in data/karyotype
.
karyotype = data/karyotype/karyotype.human.txt
Avoid unnecessary duplication and do not copy these files to your working directories. If you have custom karyotype files that you use often, place them in a central location.
karyotype = /path/to/my/karyotype/custom.txt
If you are plotting multiple genomes, it's best to keep their karyotype files separate and then specify them as a list.
karyotype = data/karyotype/karyotype.human.txt,data/karyotype/karyotype.mouse.txt
See Data Files Tutorial for details.
Use the one-line link format instead of the two line format.
# YES hs1 100 200 hs5 500 750 id=5 # NO link001 hs1 100 200 id=5 link001 hs5 500 750 id=5
See Data Files Tutorial for details.
In old versions these blocks required names. This is no longer needed.
# YES <link> ... # NO <link segdup> ...
If all your <plot> or <link> blocks share the same parameters, place them in the parent <plots> or <links> block.
<plots> color = black <plot> # color inherited from outer block ... </plot> <plot> # color inherited from outer block ... </plot> <plot> # override color color = blue ... </plot> </plots>
You can add cytogenetic bands to your ideograms, as described in the Ideograms Tutorial. These are regional annotations that (a) cover the ideogram entirely and (b) do not overlap. If you attempt to use the cytogenetic band functionality for other data, make sure that they meet these criteria.
Do not use the band function for showing genomic annotations like repeats or genes. Instead, use <plot> block of type=highlight
.
<plot> type = highlight file = repeats.txt r0 = dims(ideogram,radius_inner) r1 = dims(ideogram,radius_outer) </plot>
When drawing data on a 3 Gb genome, it is easy to quickly exceed the resolution of the output medium. Consider that a typical 3,000 x 3,000 pixel Circos image will have an ideogram circumference of about 6,000 pixels. The size of each pixel will be about 0.5 Mb on the genome.
If your data is sampled more finely than this, you will not see any structure in it because multiple data points will be assigned to the same pixel. In general, if you have more than 10,000 data points in any track, consider downsampling.
When drawing images on full mamallian genomes, I suggest starting with 5 Mb data bins. You can use the tools/resample
script to resample your data. This script is designed for moderate data sizes. If you have millions of data points, you're better off using something like bedtools.
For guidelines about matching data and output resolution, see Limits of Human Visual Acuity and Consequences on Sequence Visualization.
Rules no longer require the importance
parameter. They are evaluated in the order that they appear in the configuration file. The parameter is only used if you want the rules to be evaluated in some other order.
_FIELD_
syntax is deprecatedRules should use var(FIELD)
syntax, not _FIELD_
. For example,
# YES condition = var(chr) eq "hs1" # NO condition = _CHR_ eq "hs1"