Peptide blast is a database of peptide sequences derived from text mining of MEDLINE abstracts and manual curation of full-text PDF articles. Using this approach, we have assembled a large and diverse collection of peptide sequences with uses in the biological, therapeutic and diagnostic fields.
Peptides take part in a variety of processes in the body, acting as hormones, receptors and membrane-translocating agents, as well as substrates for many enzymes. They can also be used as inhibitors of other proteins and enzymes, for imaging cellular processes, or to screen for potential drug candidates.
The database can be searched with a number of BLAST tools, such as tBLASTn (protein sequence searched against translated nucleotide sequences), BLASTx (translated nucleotide sequence searched against protein sequences) and PSI-BLAST (protein sequence searched iteratively against protein sequences, using a position-specific scoring matrix, or PSSM, built from the hits of each round). Of these, tBLASTn is the tool suited to finding homologous protein-coding regions in unannotated nucleotide sequences such as expressed sequence tags (ESTs) or draft genome records (HTGs), located in the BLAST databases est and htgs, respectively.
tBLASTn is particularly useful for short cDNA sequences such as ESTs and for draft genome records, because these do not contain annotated coding sequences: each database sequence is translated in all six reading frames before it is compared with the protein query. It can also be useful for locating sequences that encode particular protein regions, such as transmembrane domains, which are often difficult to find in larger cDNA databases.
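To illustrate what "translated nucleotide sequences" means in practice, the following is a minimal Python sketch of six-frame translation using the standard genetic code. It is not the tBLASTn implementation: the function names (translate, six_frame_translations, frames_containing) are hypothetical, and the exact substring test stands in for the scored alignment that a real search performs.

```python
from itertools import product

# Standard genetic code, with codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def translate(dna):
    """Translate complete codons only; unknown codons (e.g. with N) become X."""
    codons = (dna[i:i + 3] for i in range(0, len(dna) - 2, 3))
    return "".join(CODON_TABLE.get(codon, "X") for codon in codons)


def six_frame_translations(dna):
    """Return {frame: protein} for frames +1..+3 and -1..-3."""
    dna = dna.upper()
    reverse = dna.translate(COMPLEMENT)[::-1]  # reverse complement strand
    frames = {}
    for offset in range(3):
        frames[offset + 1] = translate(dna[offset:])
        frames[-(offset + 1)] = translate(reverse[offset:])
    return frames


def frames_containing(peptide, dna):
    """Toy stand-in for a search: frames whose translation contains the peptide exactly."""
    translations = six_frame_translations(dna)
    return sorted(f for f, protein in translations.items() if peptide in protein)


if __name__ == "__main__":
    for frame, protein in sorted(six_frame_translations("ccATGGCCAAATAA").items()):
        print(frame, protein)
```

Because the reading frame of an EST or draft genome record is not known in advance, all six translations have to be considered, which is why a peptide query can match a nucleotide record that carries no annotation at all.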
BLAST searches for similar sequences by seeding the search with a small number of short matches (words) between the query and a database sequence. Each seed is then extended in both directions for as long as the alignment score keeps improving. The highest-scoring extensions are then used to generate a local alignment between the query and database sequences.
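The seed-and-extend idea can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions, not the BLAST algorithm itself: seeds are exact word matches only, scoring is a flat +1/-1 rather than a substitution matrix, extension is ungapped, and the function names and the x_drop cutoff value are chosen here for illustration.

```python
from collections import defaultdict


def seed_hits(query, subject, word_size=3):
    """Yield (query_pos, subject_pos) for every exact word shared by both sequences."""
    index = defaultdict(list)
    for i in range(len(query) - word_size + 1):
        index[query[i:i + word_size]].append(i)
    for j in range(len(subject) - word_size + 1):
        for i in index.get(subject[j:j + word_size], ()):
            yield i, j


def extend(query, subject, i, j, word_size=3, match=1, mismatch=-1, x_drop=2):
    """Extend a seed without gaps in both directions.

    Each direction stops once the running score falls more than x_drop
    below the best score seen so far, and is trimmed back to that best point.
    Returns (score, query_start, subject_start, length).
    """
    def walk(qi, sj, step):
        best = score = 0
        best_len = length = 0
        while 0 <= qi < len(query) and 0 <= sj < len(subject):
            score += match if query[qi] == subject[sj] else mismatch
            length += 1
            if score > best:
                best, best_len = score, length
            elif best - score > x_drop:
                break
            qi += step
            sj += step
        return best, best_len

    right_score, right_len = walk(i + word_size, j + word_size, 1)
    left_score, left_len = walk(i - 1, j - 1, -1)
    score = word_size * match + left_score + right_score
    return score, i - left_len, j - left_len, left_len + word_size + right_len


def best_local_alignment(query, subject, word_size=3):
    """Return the highest-scoring ungapped extension over all seeds, or None."""
    best = None
    for i, j in seed_hits(query, subject, word_size):
        hit = extend(query, subject, i, j, word_size)
        if best is None or hit[0] > best[0]:
            best = hit
    return best


if __name__ == "__main__":
    print(best_local_alignment("MKTAYIAKQR", "GGMKTAYLAKQRPP"))
```

For the toy pair above, every seed lies on the same diagonal and the extension spans the single I/L mismatch, giving a score of 8 over a 10-residue ungapped alignment that starts at query position 0 and subject position 2.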