======================================================================
An Atlas of Nonribosomal Peptide Synthetases and Polyketide Synthases
H. Wang et al. (2014), submitted
URL: http://npgc.biocenter.helsinki.fi/downloads/ (download)
URL: http://npgc.biocenter.helsinki.fi/ (web server)
======================================================================
The data are presented in three tab-separated tables:
species_table.txt:
- species identifier
- scientific name
- number of clusters
- NCBI BioProject identifier
- taxonomic lineage
cluster_table.txt:
- cluster identifier
- nucleotide sequence accession number
- species identifier
- classification of cluster type
- number of proteins in cluster
- number of domains in cluster
- start of locus
- end of locus
protein_table.txt:
- NCBI locus tag
- cluster identifier
- internal check
- names of domains in protein
- start-end coordinates of domains in protein
- FASTA header of amino acid sequence
- amino acid sequence
- length of amino acid sequence
- FASTA header of nucleotide sequence
- nucleotide sequence
- length of nucleotide sequence
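
A minimal sketch of how the three tables might be read, assuming each file has no header row, one record per line, and fields in the order listed above. The column labels (`species_id`, `cluster_id`, and so on) and the helper names `read_table` and `group_by` are illustrative and do not come from the files themselves. If a file does start with a header line, that line would need to be skipped first.

```python
# Column labels below are illustrative names for the fields listed above;
# they are assumptions, not headers taken from the files.
SPECIES_COLUMNS = [
    "species_id", "scientific_name", "n_clusters", "bioproject_id", "lineage",
]
CLUSTER_COLUMNS = [
    "cluster_id", "nucleotide_accession", "species_id", "cluster_type",
    "n_proteins", "n_domains", "locus_start", "locus_end",
]
PROTEIN_COLUMNS = [
    "locus_tag", "cluster_id", "internal_check", "domain_names",
    "domain_coordinates", "protein_header", "protein_sequence",
    "protein_length", "nucleotide_header", "nucleotide_sequence",
    "nucleotide_length",
]


def read_table(lines, columns):
    """Yield one dict per tab-separated line, keyed by the given labels."""
    for number, line in enumerate(lines, start=1):
        line = line.rstrip("\r\n")
        if not line:
            continue  # skip blank lines
        fields = line.split("\t")
        if len(fields) != len(columns):
            # Fail loudly when the number of fields differs from the
            # assumed layout, rather than silently mislabelling values.
            raise ValueError(
                f"line {number}: expected {len(columns)} fields, "
                f"got {len(fields)}"
            )
        yield dict(zip(columns, fields))


def group_by(rows, key):
    """Group row dicts into lists keyed by the value of one column."""
    groups = {}
    for row in rows:
        groups.setdefault(row[key], []).append(row)
    return groups
```

Under these assumptions, passing an open handle on cluster_table.txt to `read_table` with `CLUSTER_COLUMNS` and grouping the result on `"species_id"` would collect the clusters of each species; the same pattern with `PROTEIN_COLUMNS` and `"cluster_id"` would collect the proteins of each cluster. Lines are split by hand rather than with the `csv` module so that long sequence fields do not run into that module's default field size limit.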
======================================================================
Three additional data files are provided for convenience:
protein_table_small.txt (omits the sequence columns of protein_table.txt):
- NCBI gene accession number
- cluster identifier
- internal check
- names of domains in protein
- start-end coordinates of domains in protein
- FASTA header of amino acid sequence
- length of amino acid sequence
- FASTA header of nucleotide sequence
- length of nucleotide sequence
atlas_proteins.fasta:
- FASTA-formatted amino acid sequences of all genes in the database
atlas_nucleotide.fasta:
- FASTA-formatted nucleotide sequences of all genes in the database
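
A minimal sketch of reading either FASTA file, assuming the conventional layout: a header line starting with `>` followed by one or more sequence lines. The function names `read_fasta` and `sequence_lengths` are illustrative. Whether the headers in these files match the FASTA header columns of the protein tables is an assumption that has not been checked here.

```python
def read_fasta(lines):
    """Yield (header, sequence) pairs from FASTA-formatted lines."""
    header = None
    chunks = []
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            # A new header closes the previous record, if there was one.
            if header is not None:
                yield header, "".join(chunks)
            header = line[1:]
            chunks = []
        elif header is not None:
            # Sequence may be wrapped over several lines; collect them all.
            # Any sequence text before the first header is ignored.
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def sequence_lengths(records):
    """Map each FASTA header to the length of its sequence."""
    return {header: len(sequence) for header, sequence in records}
```

If the headers do correspond, the lengths computed this way could be compared with the length columns of protein_table_small.txt as a consistency check.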
======================================================================