Predict overrepresented motif complexes in any genome-wide set of regulatory regions.
Jankowski A., Prabhakar S., Tiuryn J.: TACO: a general-purpose tool for predicting cell-type–specific transcription factor dimers. BMC Genomics 2014, 15:208.
Jankowski A., Szczurek E., Jauch R., Tiuryn J., Prabhakar S.: Comprehensive prediction in 78 human cell lines reveals rigidity and compactness of transcription factor dimers. Genome Res. 2013, 23:1307-18.
If you have any questions or comments, please contact Aleksander Jankowski.
TACO is actively developed; for the latest version, see github.com/ajank/taco.
TACO, or Transcription factor Association from Complex Overrepresentation, is a program to predict overrepresented motif complexes in any genome-wide set of regulatory regions.
The latest packaged version of TACO is 1.0. The release package, TACO-1.0.tar.gz, contains both the source code and example specifications. It is licensed under the GNU General Public License.
TACO is written in C++ and should run on any Unix-like operating system, such as Linux or Mac OS X. To compile it, run make. After a successful compilation, the executable file src/taco can be copied to a system-wide directory, such as /usr/local/bin.
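The build and install steps can be sketched as a small shell function. This is only an illustration: the function name build_and_install_taco and its arguments are hypothetical, and it assumes that make is run in the top-level directory of the unpacked release package, so that the executable appears at src/taco as described above.

```shell
# Hypothetical helper (not part of TACO): build in SRC_DIR and copy the
# executable to DEST_DIR (default: /usr/local/bin).
# Usage: build_and_install_taco SRC_DIR [DEST_DIR]
build_and_install_taco() {
    src_dir=$1
    dest_dir=${2:-/usr/local/bin}

    # Assumption: the Makefile lives in the top-level directory of the package.
    (cd "$src_dir" && make) || return 1

    # The compiled executable is expected at src/taco.
    if [ ! -x "$src_dir/src/taco" ]; then
        echo "src/taco was not produced by make" >&2
        return 1
    fi

    # Copy the executable; a system-wide directory may require root privileges.
    mkdir -p "$dest_dir" && cp "$src_dir/src/taco" "$dest_dir/"
}
```

For example, build_and_install_taco ./TACO-1.0 "$HOME/bin" would install into a per-user directory without root privileges; the directory name TACO-1.0 is assumed here from the name of the release package.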
TACO makes use of R library functions. If R is not installed, or was not built as a library, install the standalone R math library instead. It is available in the package libRmath-devel or r-mathlib (depending on the system distribution). If you encounter any problems with the compilation, please contact the author.
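As a rough guide to which package name applies, the following sketch prints a suggested install command based on the package manager it finds. The function name suggest_rmath_install is hypothetical, and the mapping is an assumption: r-mathlib is the package name on Debian-based systems, and libRmath-devel is the name on Fedora-like systems. Other distributions may differ.

```shell
# Hypothetical helper (not part of TACO): print a suggested command for
# installing the standalone R math library. Nothing is installed.
suggest_rmath_install() {
    if command -v apt-get >/dev/null 2>&1; then
        # Assumption: Debian-based systems ship the library as r-mathlib.
        echo "sudo apt-get install r-mathlib"
    elif command -v dnf >/dev/null 2>&1; then
        # Assumption: Fedora-like systems ship it as libRmath-devel.
        echo "sudo dnf install libRmath-devel"
    elif command -v yum >/dev/null 2>&1; then
        echo "sudo yum install libRmath-devel"
    else
        echo "No known package manager found; install libRmath-devel or r-mathlib manually" >&2
        return 1
    fi
}
```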
In the release package, a few example specification files are provided. To repeat the analyses, you will need:
We provide example lists of UW and Duke open chromatin datasets, as well as the respective URLs of the narrowPeak files to be downloaded from the ENCODE Project. To download these files, go to the wgEncodeUwDnase_hg19 or wgEncodeOpenChromDnase_hg19 subdirectory and run wget -i urls.list.
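The download step can be repeated for both subdirectories with a short loop. This is a minimal sketch: the function name fetch_narrowpeaks is hypothetical, and it assumes that each directory passed to it contains a urls.list file, as in the release package.

```shell
# Hypothetical helper (not part of TACO): download the files listed in
# urls.list inside each given directory.
# Usage: fetch_narrowpeaks DIR [DIR...]
fetch_narrowpeaks() {
    for dir in "$@"; do
        if [ ! -f "$dir/urls.list" ]; then
            echo "No urls.list in $dir" >&2
            return 1
        fi
        # Run wget inside the directory so files are saved next to urls.list.
        (cd "$dir" && wget -i urls.list) || return 1
    done
}
```

Assuming the current directory is the one that holds both subdirectories, fetch_narrowpeaks wgEncodeUwDnase_hg19 wgEncodeOpenChromDnase_hg19 would fetch both sets of narrowPeak files.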
We also provide an example list of K562 ChIP-seq peaks, and the respective URLs, in a similar manner. To repeat this analysis, you will also need, for each dataset, the top 5 motifs found in the ChIP-seq peaks using MEME. They can be downloaded from Factorbook or generated locally.
Using a set of UW open chromatin datasets in mouse, we generated a comprehensive list of 186 cooperativity predictions; see the graphical view (44 MB PDF file) and the tabular description of the underlying motif complexes. For a comprehensive list of 603 cooperativity predictions in human, please refer to our previous work.
The full documentation of TACO is provided on a separate page.