When generating a Circos plot, the formatting of the data to be represented is a crucial step. Here are some pointers on how to avoid the dreadful *** CIRCOS ERROR ***.
All data files must be in text format. For instance, using R, I would generate a myData.txt file that I would then call within a specific plot block (<plot>…</plot>). Data files are used for 2-dimensional graphical representations (histogram, scatter plot, heatmap, tiles), labels (which are technically also a type of graphical representation) and links. To know how to format your file, you must first determine how you want this data to be illustrated.
Type of data representation | Graphs | Labels | Links |
Columns needed | chr start end val | chr start end label | chr1 start1 end1 chr2 start2 end2 |
Example of data, line 1 | chr1 1000 1199 1.00 | chr1 11873 14409 DDX11L1 | chr1 486 769 chr15 10026 10033 |
Example of data, line 2 | chr1 1200 1399 15.00 | chr1 14361 29370 WASH7P | chr1 3426 3938 chr15 10021 10026 |
Example of data, line 3 | chr1 1400 1599 -2.00 | chr1 17368 17436 MIR6859-1 | chr1 5763 6268 chr15 10021 10026 |
Other parameters, color for instance, can be added after the last column (after the val, label or end2 column for graph, label and link files respectively), but for now we will work with the basic formatting. Note that the process is very similar with or without additional parameters.
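For the curious, here is a minimal sketch of what adding such a parameter could look like in R. It assumes the extra column holds `name=value` text such as `color=red`; the toy values, the `options` column name and the red/blue rule are made up for illustration, and which options a given track type actually honors is something to check in the Circos documentation.

```r
# Toy graph data (chr, start, end, val); the values are made up for illustration.
graph_df <- data.frame(
  chr   = c("chr1", "chr1", "chr1"),
  start = c(1000, 1200, 1400),
  end   = c(1199, 1399, 1599),
  val   = c(1.00, 15.00, -2.00),
  stringsAsFactors = FALSE
)

# Assumption: negative values get "color=blue", all other values get "color=red".
# The new column is placed last, after the val column.
graph_df$options <- ifelse(graph_df$val < 0, "color=blue", "color=red")

print(graph_df)
```

The column name itself does not matter here, since the file is exported without column names.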
Now that we know how we want to represent our data, we can start to format it. Import your raw data in R as a new data frame.
> data_df <- read.table("myRawData.txt", header = TRUE, sep = "\t", as.is = TRUE)
Then, work with your data as you usually would. For example, you could compute means or standard deviations, or run a statistical test and keep only the significant values. This step is completely up to you!
It is important that you keep track of the chromosome and position of a given value. A good way to do so is to put all the data in a table with one column for the chromosome and two more for the start and end positions. The other columns are at your discretion.
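As an illustration, here is a small sketch of that kind of analysis on a toy table. Everything in it is hypothetical: the replicate columns `rep1` to `rep3`, the one-sample t-test and the 0.05 cutoff are only stand-ins for whatever analysis you actually run. The point is simply that the chr, start and end columns travel along with every computed value.

```r
# Hypothetical raw data: one row per genomic window, three replicate measurements.
data_df <- data.frame(
  chr   = c("chr1", "chr1", "chr1"),
  start = c(1000, 1200, 1400),
  end   = c(1199, 1399, 1599),
  rep1  = c(0.9, 14.8, -2.1),
  rep2  = c(1.1, 15.3, -1.8),
  rep3  = c(1.0, 14.9, -2.1),
  stringsAsFactors = FALSE
)
rep_cols <- c("rep1", "rep2", "rep3")

# Per-window mean and standard deviation, stored next to the coordinates.
data_df$mean <- rowMeans(data_df[, rep_cols])
data_df$sd   <- apply(data_df[, rep_cols], 1, sd)

# Assumption: a one-sample t-test against 0 for each window, as an example test.
data_df$p_value <- apply(data_df[, rep_cols], 1, function(x) t.test(x)$p.value)

# Keep only the windows below an arbitrary significance cutoff.
sig_df <- data_df[data_df$p_value < 0.05, ]

print(sig_df[, c("chr", "start", "end", "mean", "sd", "p_value")])
```

Because the filtering is done on whole rows, each remaining value still carries its chromosome and positions.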
When you are ready to save your data to a new file, create a new table with all the columns required by your desired representation (graphs, labels or links), as illustrated in the table above. If you were already working with a table, just make sure that the columns are in the right order. To export your data, you can use this single line of code:
> write.table(myDataTable, file = "myData.txt", row.names = FALSE, col.names = FALSE, sep = "\t", quote = FALSE)
myDataTable is the name of the table you want to export, while myData.txt is the name of the resulting text file. The Circos format does not allow row labels, hence row.names = FALSE. Column labels are accepted to some extent, but only with the exact wording. To avoid any possible errors, I suggest that you also export your table without its column names (col.names = FALSE). The sep argument ensures that the entries on every line are separated by a tab, which is preferred by Circos. Finally, setting quote to FALSE removes the quotes ("") around any string, i.e. chromosomes and labels. This last argument is essential to keep Circos from crashing.
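To see what these arguments do in practice, here is a self-contained sketch. The toy table and its deliberately shuffled columns are made up, and the file goes to a temporary directory (an assumption of this example) so that nothing in your working directory gets overwritten.

```r
# Toy table with the columns in the wrong order on purpose.
myDataTable <- data.frame(
  val   = c(1.00, 15.00, -2.00),
  chr   = c("chr1", "chr1", "chr1"),
  end   = c(1199, 1399, 1599),
  start = c(1000, 1200, 1400),
  stringsAsFactors = FALSE
)

# Put the columns in the order required for a graph file: chr start end val.
myDataTable <- myDataTable[, c("chr", "start", "end", "val")]

# Export without row names, column names or quotes, using tabs as separators.
out_file <- file.path(tempdir(), "myData.txt")
write.table(myDataTable, file = out_file, row.names = FALSE,
            col.names = FALSE, sep = "\t", quote = FALSE)

# Read the raw lines back to see exactly what was written to the file.
written_lines <- readLines(out_file)
print(written_lines)
```

Each printed line should hold four tab-separated fields with no header and no quotes. Note that R drops trailing zeros when writing numbers, so 1.00 is written as 1.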
To be safe, double-check your file before using it in a configuration file.
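Opening the file in a text editor is often enough, but a few checks can also be scripted. The sketch below is only one way to do it: it writes a small label file to a temporary directory (so that the example runs on its own), reads it back and tests a handful of conditions. The file name `myLabels.txt` and the list of checks are my own choices, not something required by Circos.

```r
# Write a small toy label file so that this example is self-contained.
check_file <- file.path(tempdir(), "myLabels.txt")
writeLines(c("chr1\t11873\t14409\tDDX11L1",
             "chr1\t14361\t29370\tWASH7P",
             "chr1\t17368\t17436\tMIR6859-1"),
           check_file)

# Read the file back the way it was exported: no header, tab-separated.
check_df <- read.table(check_file, header = FALSE, sep = "\t", quote = "",
                       comment.char = "", stringsAsFactors = FALSE)

# A few basic checks for a label file (chr start end label).
checks <- c(
  four_columns      = ncol(check_df) == 4,
  numeric_positions = is.numeric(check_df[[2]]) && is.numeric(check_df[[3]]),
  start_before_end  = all(check_df[[2]] <= check_df[[3]]),
  no_quotes         = !any(grepl('"', readLines(check_file), fixed = TRUE))
)
print(checks)
```

Any FALSE in the result points to something worth fixing before the file goes into a configuration file.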
Many *** CIRCOS ERROR *** messages can be avoided once you know how to properly format your data files!