Web-based Molecular Biology Tools

Molecular biology is the study of biological macromolecules, particularly DNA and proteins, at the structural and functional level. Many free resources on the Internet support the study of these molecules. The following is a list of some of these web-based tools, each with a brief description; some of the wording is taken from the tool's own site. This list is not comprehensive, but it is meant to provide a good starting point for researchers. Some resources appear in more than one category.

General Sites

  • BYU DNA Sequencing Center Resources
    The DNA Sequencing Center (DNASC) at Brigham Young University has created an online page of additional resources.
  • DBGET
    DBGET is a simple database retrieval system for finding and obtaining specific entries of diverse databases. Here a database is simply considered a sequential collection of entries, which may be stored in a single file or multiple files. Because each entry of a database is given a unique identifier, molecular biology databases in the world can be retrieved uniformly by the combination of the database name and the identifier.
  • European Bioinformatics Institute
    European Bioinformatics Institute (EBI) is a center for research and services in bioinformatics. The Institute manages databases of biological data including nucleic acid, protein sequences and macromolecular structures.
  • Expasy
    A molecular biology server dedicated to the analysis of protein and nucleic acid sequences. Its protein identification and characterization tools include:

    • Identification and characterization with peptide mass fingerprinting data
    • Identification and characterization with MS/MS data
    • Identification with isoelectric point, molecular weight and/or amino acid composition
    • Other prediction or characterization tools, MS data (visualization, quantitation, analysis, etc.), and 2-DE data (image analysis, data publishing, etc.).
  • Java based Molecular Biologist’s Workbench
    This site contains a workbench of tools for DNA and protein analysis: data entry, data manipulation, data analysis, genetic and functional site mapping, and primer design.
  • National Center for Biotechnology Information
    NCBI’s mission is to develop new information technologies to aid in the understanding of fundamental molecular and genetic processes that control health and disease. The site links to the GenBank database and to data-mining tools including BLAST, COGs, MapViewer, LocusLink, UniGene, ORF Finder, Electronic PCR, VAST search, CCAP, Human-Mouse Homology maps, VecScreen, and the Cancer Genome Anatomy Project. It also provides access to Entrez, a retrieval system for searching several linked databases, including PubMed, nucleotide sequences, protein sequences, structures, genomes, population data sets, Online Mendelian Inheritance in Man, taxonomy, 3D domains, ProbeSet, and online books.
  • National Center for Genome Resources
    The National Center for Genome Resources (NCGR) contains information and links to various genome-related projects.
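
The DBGET entry above describes retrieval by the combination of a database name and a unique entry identifier. The following is a minimal Python sketch of that idea only, not DBGET's actual implementation or API. The database names, the entries, the `fetch` helper, and the `database:identifier` key syntax are all hypothetical and chosen for illustration.

```python
# Toy model of uniform retrieval by (database name, entry identifier).
# All names and data below are made up for illustration.
from typing import Dict, Optional

# Each database is treated as a collection of entries with unique identifiers.
DATABASES: Dict[str, Dict[str, str]] = {
    "demo_nucleotide": {"N0001": "ATGGCCATTGTAATGGGCCGC"},
    "demo_protein": {"P0001": "MAIVMGR"},
}


def fetch(key: str) -> Optional[str]:
    """Return the entry addressed by 'database:identifier', or None if absent."""
    db_name, sep, entry_id = key.partition(":")
    if not sep or not db_name or not entry_id:
        raise ValueError("expected a key of the form 'database:identifier'")
    # The same lookup works for every database, which is the point of the scheme.
    return DATABASES.get(db_name, {}).get(entry_id)


if __name__ == "__main__":
    print(fetch("demo_protein:P0001"))
    print(fetch("demo_nucleotide:missing"))
```

Because every entry is addressed the same way, one lookup routine can serve any number of databases without knowing how each one is stored.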

Nucleic Acid Sequencing Tools

  • Biosyn Gizmo Tools
    Bundle of databases (siRNA, protein, peptide antigen) and tools, including a Bioinformatic Glossary, Genetic Code Table, Nucleic Acids and Protein Calculations, and an Oligo Properties Calculator.
  • BLASTN
    Searches for sequence similarity between your sequence and those in the databases. BLASTN searches a nucleotide query against DNA sequences; BLASTX translates your nucleotide sequence in all six reading frames and searches the translations against protein sequences.
  • Codon Usage Database
    Presents a query box for searching an organism's codon usage table. Searches can use the organism's Latin name or any substring of it. Useful for designing primers and probes.
  • Sequence Manipulation Suite (SMS)
    The Sequence Manipulation Suite in BioSyn’s Gizmo Tools is a collection of JavaScript programs for generating, formatting, and analyzing short DNA and protein sequences. It is commonly used by molecular biologists, for teaching, and for program and algorithm testing.
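
The BLASTX entry above mentions translating a nucleotide sequence in all six reading frames. The following is a minimal Python sketch of that step, assuming the standard genetic code and an unambiguous A/C/G/T input. It is not the code used by BLAST or by any tool listed here, and the function names are illustrative.

```python
from itertools import product

# Standard genetic code, with codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return dna.upper().translate(COMPLEMENT)[::-1]


def translate(dna: str) -> str:
    """Translate complete codons from position 0; unknown codons become 'X'."""
    dna = dna.upper()
    return "".join(
        CODON_TABLE.get(dna[i:i + 3], "X") for i in range(0, len(dna) - 2, 3)
    )


def six_frame_translation(dna: str) -> dict:
    """Translate three frames on the given strand and three on its reverse complement."""
    frames = {}
    for strand, seq in (("+", dna), ("-", reverse_complement(dna))):
        for offset in range(3):
            frames[f"{strand}{offset + 1}"] = translate(seq[offset:])
    return frames


if __name__ == "__main__":
    for frame, protein in six_frame_translation("ATGGCCTAA").items():
        print(frame, protein)
```

Stop codons are shown as `*`. A search tool would then compare each of the six translations against a protein database.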

Genomic Resources

  • GenomeNet
    GenomeNet is a Japanese network of database and computational services for genome research and related research areas in molecular and cellular biology. GenomeNet was established in September 1991 under the Human Genome Program (HGP) of the Ministry of Education, Science, Sports and Culture (MESSC).
  • National Center for Genome Resources
    The National Center for Genome Resources (NCGR) contains information and links to various genome-related projects.
  • SoftBerry
    Softberry, Inc. is a leading developer of software tools for genomic research. Their primary areas of interest and expertise are:

    • Genome annotation
    • Functional site identification in DNA and proteins
    • Sequence database management
    • Genome comparison
    • Expression data analysis
    • Protein structure prediction
    • Protein compartment (destination) prediction
  • UCSC Genome Browser
    The University of California, Santa Cruz (UCSC) Genome Browser website contains the reference sequence and working draft assemblies for a large collection of genomes.
  • dbGaP (NCBI)
    The database of Genotypes and Phenotypes (dbGaP) was developed to archive and distribute the results of studies that have investigated the interaction of genotype and phenotype. Such studies include genome-wide association studies, medical sequencing, molecular diagnostic assays, as well as association between genotype and non-clinical traits.
  • Ensembl
    The Ensembl project produces genome databases for vertebrates and other eukaryotic species, and makes this information freely available online.

Protein Sequence Analysis Tools

  • Expasy
    A molecular biology server dedicated to the analysis of protein and nucleic acid sequences. Its protein identification and characterization tools include:

    • Identification and characterization with peptide mass fingerprinting data
    • Identification and characterization with MS/MS data
    • Identification with isoelectric point, molecular weight and/or amino acid composition
    • Other prediction or characterization tools, MS data (visualization, quantitation, analysis, etc.), and 2-DE data (image analysis, data publishing, etc.).
  • FramePlot
    Predicts protein-coding regions in bacterial DNA.
  • MPEx
    Membrane Protein Explorer (MPEx) is a tool for exploring the topology and other features of membrane proteins by means of hydropathy plots based upon thermodynamic principles.
  • PredictProtein
    PredictProtein is an Internet service for sequence analysis and the prediction of protein structure and function. Users submit protein sequences or alignments; PredictProtein returns multiple sequence alignments, PROSITE sequence motifs, low-complexity regions (SEG), nuclear localization signals, regions lacking regular structure (NORS), and predictions of secondary structure, solvent accessibility, globular regions, transmembrane helices, coiled-coil regions, structural switch regions, disulfide bonds, sub-cellular localization, and functional annotations. Upon request, fold recognition by prediction-based threading, CHOP domain assignments, and predictions of transmembrane strands and inter-residue contacts are also available.
  • ProDom
    ProDom is a protein domain family database constructed automatically by clustering homologous segments. The ProDom building procedure MKDOM2 is based on recursive PSI-BLAST searches [ALTS2]. The source protein sequences are non-fragmentary sequences derived from SWISS-PROT and TrEMBL databases.
  • ProtScale
    ProtScale allows you to compute and represent the profile produced by any amino acid scale on a selected protein. An amino acid scale is defined by a numerical value assigned to each type of amino acid. The most frequently used scales are the hydrophobicity or hydrophilicity scales and the secondary structure conformational parameters scales, but many other scales exist which are based on different chemical and physical properties of the amino acids. This program provides 57 predefined scales entered from the literature.
  • Sequence Manipulation Suite (SMS)
    The Sequence Manipulation Suite in BioSyn’s Gizmo Tools is a collection of JavaScript programs for generating, formatting, and analyzing short DNA and protein sequences. It is commonly used by molecular biologists, for teaching, and for program and algorithm testing.
  • Worldwide Protein Data Bank (wwPDB)
    The wwPDB maintains a single Protein Data Bank Archive of macromolecular structural data that is freely and publicly available to the global community.
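
The ProtScale entry above describes computing the profile that an amino acid scale produces along a protein. The following is a minimal Python sketch of a sliding-window profile, assuming the Kyte-Doolittle hydropathy scale and an unweighted window average. It is not ProtScale's own code; the function name and the window size of 9 are illustrative choices, and MPEx uses different, thermodynamically derived scales.

```python
# Kyte-Doolittle hydropathy values, one number per amino acid type.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def scale_profile(protein: str, scale: dict, window: int = 9) -> list:
    """Return (position, mean score) pairs for each full window.

    Positions are 1-based and refer to the residue at the window center.
    Raises KeyError if the sequence contains a residue missing from the scale.
    """
    if window < 1 or window % 2 == 0:
        raise ValueError("window must be a positive odd number")
    protein = protein.upper()
    half = window // 2
    profile = []
    for center in range(half, len(protein) - half):
        segment = protein[center - half:center + half + 1]
        mean_score = sum(scale[aa] for aa in segment) / window
        profile.append((center + 1, round(mean_score, 3)))
    return profile


if __name__ == "__main__":
    # Hypothetical peptide: a hydrophobic stretch flanked by charged residues.
    for position, score in scale_profile("KKDELLIVLAVIGLLKKRD", KYTE_DOOLITTLE):
        print(position, score)
```

Plotting the scores against position gives the kind of hydropathy plot used to look for candidate membrane-spanning segments; any other per-residue scale can be passed in place of `KYTE_DOOLITTLE`.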

3D Macromolecular Structure Tools

  • Cn3D
    Cn3D is a helper application for web browsers that allows you to view 3-dimensional structures from NCBI’s Entrez retrieval service. Cn3D runs on Windows, Mac, and Unix. Cn3D simultaneously displays structure, sequence, and alignment, and now has powerful annotation and alignment editing features.
  • DeepView
    Swiss-PdbViewer (aka DeepView) is an application with a user-friendly interface that lets you analyze several proteins at the same time. The proteins can be superimposed in order to deduce structural alignments and compare their active sites or any other relevant parts. Amino acid mutations, H-bonds, angles, and distances between atoms are easy to obtain thanks to the intuitive graphic and menu interface.
  • POV-Ray
    POV-Ray is a ray-tracing program. When it is used with Swiss-PdbViewer, the rendered output images appear much sharper and the colors are more vivid.
  • RasMol
    Protein Explorer, a RasMol derivative, is easy-to-use, powerful software for looking at macromolecular structure and its relation to function. It runs on Windows or Mac computers. RasMol users will find its menus very familiar, and it understands RasMol commands. It is fast enough that a protein or DNA molecule can be rotated interactively to show its 3D structure.
  • RCSB Protein Data Bank
    The RCSB PDB provides a variety of tools and resources for studying the structures of biological macromolecules and their relationships to sequence, function, and disease. This site offers tools for browsing, searching, and reporting that utilize the data resulting from ongoing efforts to create a more consistent and comprehensive archive. The Research Collaboratory for Structural Bioinformatics (RCSB) is a non-profit consortium dedicated to improving our understanding of the function of biological systems through the study of the 3-D structure of biological macromolecules.

Phylogeny Tools

  • PHYLIP
    PHYLIP is a free package of programs for inferring phylogenies. It is distributed as source code, documentation files, and a number of different types of executables.
  • TreeView
    TreeView is a simple program for displaying phylogenies on Apple Macintosh and Windows PCs. It can be used to view PHYLIP-generated phylogeny trees.

 

Greg Nelson, Chemical and Life Sciences Librarian, Brigham Young University

We welcome your comments and suggestions. If you have a resource that you would like to see highlighted, please leave us a comment.
