Model Organisms: Functional and Comparative Genomics

A model organism is a non-human species that is extensively studied to understand particular biological phenomena, with the expectation that discoveries made in the model organism will provide insight into the workings of other organisms. Model organisms are in vivo models and are widely used to research human disease when human experimentation would be unfeasible or unethical. This strategy is made possible by the common descent of all living organisms and by the conservation of metabolic and developmental pathways and genetic material over the course of evolution. Studying model organisms can be informative, but care must be taken when extrapolating from one organism to another.

Selecting a model organism

Models are those organisms with a wealth of biological data that make them attractive to study as examples for other species or natural phenomena that are more difficult to study directly. Continual research on these organisms focuses on a wide variety of experimental techniques and goals at many different levels of biology, from ecology, behavior, and biomechanics down to the functional scale of individual tissues, organelles, and proteins. Organisms used to investigate DNA are classed as genetic models (with short generation times, such as the fruit fly and nematode worm), experimental models, and genomic parsimony models, which are studied for their pivotal position in the evolutionary tree. Historically, model organisms include a handful of species with extensive genomic research data, such as the NIH model organisms.

Often, model organisms are chosen because they are amenable to experimental manipulation. This usually includes characteristics such as a short life cycle, available techniques for genetic manipulation (inbred strains, stem cell lines, and methods of transformation), and non-specialist living requirements. Sometimes the genome arrangement facilitates sequencing of the model organism's genome, for example by being very compact or having a low proportion of junk DNA (e.g. yeast, Arabidopsis, or pufferfish).

When researchers look for an organism to use in their studies, they look for several traits. Among these are size, generation time, accessibility, manipulation, genetics, conservation of mechanisms, and potential economic benefit. As comparative molecular biology has become more common, some researchers have sought model organisms from a wider assortment of lineages on the tree of life.

Use of model organisms

There are many model organisms. One of the first model systems for molecular biology was the bacterium Escherichia coli, a common constituent of the human digestive system. Several of the bacterial viruses (bacteriophage) that infect E. coli also have been very useful for the study of gene structure and gene regulation (e.g. phages Lambda and T4). However, bacteriophages are not organisms because they lack metabolism and depend on functions of the host cells for propagation.

In eukaryotes, several yeasts, particularly Saccharomyces cerevisiae ("baker's" or "budding" yeast), have been widely used in genetics and cell biology, largely because they are quick and easy to grow. The cell cycle in a simple yeast is very similar to the cell cycle in humans and is regulated by homologous proteins. The fruit fly Drosophila melanogaster is studied, again, because it is easy to grow for an animal, has various visible congenital traits, and has polytene (giant) chromosomes in its salivary glands that can be examined under a light microscope. The roundworm Caenorhabditis elegans is studied because it has well-defined developmental patterns involving fixed numbers of cells, and it can be rapidly assayed for abnormalities.

Important model organisms

Viruses

Viruses include:

  • Phage Lambda
  • Phi X 174 - its genome was the first DNA genome ever to be sequenced. The genome is a circle of 11 genes, 5386 base pairs in length.
  • Tobacco mosaic virus

Prokaryotes

Prokaryotes include:

  • Escherichia coli (E. coli) - this common, Gram-negative gut bacterium is the most widely used organism in molecular genetics.
  • Bacillus subtilis - an endospore-forming Gram-positive bacterium.
  • Caulobacter crescentus - a bacterium that divides into two distinct cell types; used to study cellular differentiation.
  • Mycoplasma genitalium - a minimal organism.
  • Vibrio fischeri - used to study quorum sensing, bioluminescence, and animal-bacterial symbiosis with the Hawaiian bobtail squid.
  • Synechocystis - a photosynthetic cyanobacterium widely used in photosynthesis research.
  • Pseudomonas fluorescens - a soil bacterium that readily diversifies into different strains in the lab.

Eukaryotes

Eukaryotes include:

Protists

  • Chlamydomonas reinhardtii - a unicellular green alga used to study photosynthesis, flagella and motility, regulation of metabolism, cell-cell recognition and adhesion, response to nutrient deprivation, and many other topics. Chlamydomonas reinhardtii has well-studied genetics, with many known and mapped mutants and expressed sequence tags, and there are advanced methods for genetic transformation and selection of genes. Sequencing of the Chlamydomonas reinhardtii genome was reported in October 2007. A Chlamydomonas genetic stock center exists at Duke University, and an international Chlamydomonas research interest group meets on a regular basis to discuss research results. Chlamydomonas is easy to grow on an inexpensive defined medium.
  • Dictyostelium discoideum is used in molecular biology and genetics (its genome has been sequenced), and is studied as an example of cell communication, differentiation, and programmed cell death.
  • Tetrahymena thermophila - a free living freshwater ciliate protozoan.
  • Emiliania huxleyi - a unicellular marine coccolithophore alga, extensively studied as a model phytoplankton species.
  • Thalassiosira pseudonana - a unicellular marine diatom alga, extensively studied as a model marine diatom since its genome was published in 2004.

Fungi

  • Ashbya gossypii, cotton pathogen, subject of genetics studies (polarity, cell cycle)
  • Aspergillus nidulans, mold subject of genetics studies
  • Coprinus cinereus, mushroom (genetic studies of mushroom development, genetic studies of meiosis)
  • Cunninghamella elegans is a fungal model of mammalian drug metabolism.
  • Neurospora crassa - orange bread mold (genetic studies of meiosis, metabolic regulation, and circadian rhythm)
  • Saccharomyces cerevisiae, baker's yeast or budding yeast (used in brewing and baking)
  • Schizophyllum commune - model for mushroom formation.
  • Schizosaccharomyces pombe, fission yeast, (cell cycle, cell polarity, RNAi, centromere structure and function, transcription)
  • Ustilago maydis, dimorphic yeast and plant pathogen of maize (dimorphism, plant pathogen, transcription)

Plants

  • Arabidopsis thaliana, currently the most popular model plant. This herbaceous dicot, a member of the Brassicaceae family, is closely related to the mustard plant. Its small stature and short generation time facilitate rapid genetic studies, and many phenotypic and biochemical mutants have been mapped. Arabidopsis was the first plant to have its genome sequenced. Its genome sequence, along with a wide range of information concerning Arabidopsis, is maintained by the TAIR database.
    (Plant physiology, Developmental biology, Molecular genetics, Population genetics, Cytology, Molecular biology)
  • Selaginella moellendorffii is a remnant of an ancient lineage of vascular plants and key to understanding the evolution of land plants. It has a small genome size (~110Mb) and its sequence was released by the Joint Genome Institute in early 2008. (Evolutionary biology, Molecular biology)
  • Brachypodium distachyon is an emerging experimental model grass that has many attributes that make it an excellent model for temperate cereals. (Agronomy, Molecular biology, Genetics)
  • Lotus japonicus, a model legume used to study the symbiosis responsible for nitrogen fixation. (Agronomy, Molecular biology)
  • Lemna gibba is a rapidly growing aquatic monocot, one of the smallest flowering plants. Lemna growth assays are used in ecotoxicology to evaluate the toxicity of chemicals to plants. Because it can be grown in pure culture, microbial action can be excluded. Lemna is being used as a recombinant expression system for economical production of complex biopharmaceuticals. It is also used in education to demonstrate population growth curves.
  • Maize (Zea mays L.) is a cereal grain. It is a diploid monocot with 10 large chromosome pairs, easily studied with the microscope. Its genetic features, including many known and mapped phenotypic mutants and a large number of progeny per cross (typically 100-200) facilitated the discovery of transposons ("jumping genes"). Many DNA markers have been mapped and the genome has been sequenced. (Genetics, Molecular biology, Agronomy)
  • Medicago truncatula is a model legume, closely related to the common alfalfa. Its rather small genome is currently being sequenced. It is used to study the symbiosis responsible for nitrogen fixation. (Agronomy, Molecular biology)
  • Mimulus is a model genus used in evolutionary and functional genomics studies. It belongs to the family Phrymaceae and comprises ca. 120 species.
  • Tobacco BY-2 cells are a suspension cell line from tobacco (Nicotiana tabacum), useful for general plant physiology studies at the cellular level. The genome of this particular cultivar is not expected to be sequenced in the near future, but sequencing of wild-type Nicotiana tabacum is in progress. (Cytology, Plant physiology, Biotechnology)
  • Rice (Oryza sativa) is used as a model for cereal biology. It has one of the smallest genomes of any cereal species, and sequencing of its genome is finished. (Agronomy, Molecular biology)
  • Physcomitrella patens is a moss increasingly used for studies on development and molecular evolution of plants. It is so far the only non-vascular plant (and so the only "primitive" plant) with its genome completely sequenced. Moreover, it is currently the only land plant with efficient gene targeting that enables gene knockout. The resulting knockout mosses are stored and distributed by the International Moss Stock Center. (Plant physiology, Evolutionary biology, Molecular genetics, Molecular biology)
  • Populus is a genus used as a model in forest genetics and woody plant studies. It has a small genome size, grows very rapidly, and is easily transformed. The genome sequence of poplar (Populus trichocarpa) is publicly available.
  • See also Chlamydomonas reinhardtii, above under Protists.

Animals

Invertebrates

  • Amphimedon queenslandica, a demosponge from the phylum Porifera used as a model for evolutionary developmental biology and comparative genomics

  • Arbacia punctulata, the purple-spined sea urchin, classical subject of embryological studies
  • Aplysia, a sea slug, whose ink release response serves as a model in neurobiology and whose growth cones serve as a model of cytoskeletal rearrangements
  • Branchiostoma floridae, a species commonly known as amphioxus or lancelet from the subphylum Cephalochordata of the phylum Chordata used as a model for understanding the evolution of nonchordate deuterostomes, invertebrate chordates, and vertebrates
  • Caenorhabditis elegans, a nematode, usually called C. elegans - an excellent model for understanding the genetic control of development and physiology. C. elegans was the first multicellular organism whose genome was completely sequenced
  • Ciona intestinalis, a sea squirt
  • Daphnia spp., small planktonic crustaceans, highly sensitive to pollution, used for evaluating environmental toxicity of chemicals on aquatic invertebrates.
  • Drosophila, usually the species Drosophila melanogaster - a kind of fruit fly, famous as the subject of genetics experiments by Thomas Hunt Morgan and others. It is easily raised in the lab and has rapid generations, easily induced mutations, and many observable mutations. Recently, Drosophila has been used for neuropharmacological research. (Molecular genetics, Population genetics, Developmental biology)
  • Euprymna scolopes, the Hawaiian bobtail squid, model for animal-bacterial symbiosis, bioluminescent vibrios
  • Hydra (genus), a Cnidarian, is the model organism to understand the processes of regeneration and morphogenesis, as well as the evolution of bilaterian body plans
  • Loligo pealei, a squid, subject of studies of nerve function because of its giant axon (nearly 1 mm diameter, roughly a thousand times larger than typical mammalian axons)
  • Macrostomum lignano, a free-living marine flatworm, a model organism for the study of stem cells, regeneration, ageing, gene function, and the evolution of sex. Easily raised in the lab, with a short generation time, indeterminate growth, and complex behaviour
  • Mnemiopsis leidyi, from the phylum Ctenophora (comb jelly) used as a model for evolutionary developmental biology and comparative genomics
  • Nematostella vectensis, a sea anemone from the phylum Cnidaria used as a model for evolutionary developmental biology and comparative genomics
  • Oikopleura dioica, an appendicularian, a free-swimming tunicate (or urochordate)
  • Oscarella carmela, a homoscleromorph sponge (phylum Porifera) used as a model in evolutionary developmental biology
  • Parhyale hawaiensis, an amphipod crustacean used in evolutionary developmental (evo-devo) studies, with an extensive toolbox for genetic manipulation
  • Platynereis dumerilii, a marine polychaete annelid that evolved very slowly and therefore retained many ancestral features
  • Pristionchus pacificus, a roundworm used in evolutionary developmental biology in comparative analyses with C. elegans
  • Schmidtea mediterranea, a freshwater planarian; a model for regeneration and development of tissues such as the brain and germline
  • Stomatogastric ganglion of various arthropod species; a model for motor pattern generation seen in all repetitive motions
  • Strongylocentrotus purpuratus, the purple sea urchin, widely used in developmental biology
  • Symsagittifera roscoffensis, a flatworm, subject of studies of bilaterian body plan development
  • Tribolium castaneum, the flour beetle - a small, easily kept darkling beetle used especially in behavioural ecology experiments
  • Trichoplax adhaerens, a very simple free-living animal from the phylum Placozoa used as a model in evolutionary developmental biology and comparative genomics
  • Tubifex tubifex, an oligochaete used for evaluating environmental toxicity of chemicals on aquatic and terrestrial worms.

Vertebrates

  • Guinea pig (Cavia porcellus) - used by Robert Koch and other early bacteriologists as a host for bacterial infections, hence a byword for "laboratory animal" even though less commonly used today
  • Chicken (Gallus gallus domesticus) - used for developmental studies, as it is an amniote and excellent for micromanipulation (e.g. tissue grafting) and over-expression of gene products
  • Cat (Felis silvestris catus) - used in neurophysiological research
  • Dog (Canis lupus familiaris) - an important respiratory and cardiovascular model, also contributed to the discovery of classical conditioning.
  • Hamster - first used to study kala-azar (leishmaniasis)
  • Mouse (Mus musculus) - the classic model vertebrate. Many inbred strains exist, as well as lines selected for particular traits, often of medical interest, e.g. body size, obesity, muscularity. (Quantitative genetics, Molecular evolution, Genomics)
  • Lamprey - spinal cord research
  • Medaka (Oryzias latipes, the Japanese ricefish) - an important model in developmental biology, with the advantage of being much sturdier than the traditional zebrafish
  • Rat (Rattus norvegicus) - particularly useful as a toxicology model; also particularly useful as a neurological model and source of primary cell cultures, owing to the larger size of organs and suborganellar structures relative to the mouse. (Molecular evolution, Genomics)
  • Rhesus macaque (Macaca mulatta) - used for studies on infectious disease and cognition
  • Cotton rat (Sigmodon hispidus) - formerly used in polio research
  • Zebra finch (Taeniopygia guttata) - used in the study of the song system of songbirds and the study of non-mammalian auditory systems
  • Takifugu (Takifugu rubripes, a pufferfish) - has a small genome with little junk DNA
  • The African clawed frog (Xenopus laevis) - used in developmental biology because of its large embryos and high tolerance for physical and pharmacological manipulation
  • Zebrafish (Danio rerio, a freshwater fish) - has a nearly transparent body during early development, which provides unique visual access to the animal's internal anatomy. Zebrafish are used to study development, toxicology and toxicopathology, specific gene function and roles of signaling pathways.

Model organisms used for specific research objectives

Sexual selection and sexual conflict

  • Callosobruchus maculatus, the bruchid beetle
  • Chorthippus parallelus, the meadow grasshopper
  • Coelopidae - seaweed flies
  • Diopsidae - stalk-eyed flies
  • Drosophila spp. - fruit flies
  • Macrostomum lignano, a sand flatworm
  • Gryllus bimaculatus, the field cricket
  • Scathophaga stercoraria, the yellow dung fly

Hybrid zones

  • Bombina bombina and Bombina variegata
  • Podisma spp. in the Alps
  • Caledia captiva (Orthoptera) in eastern Australia

Ecological genomics

  • Daphnia pulex, an environmental indicator model organism

Table of model genetic organisms

This table indicates the status of the genome sequencing project for each organism as well as whether the organism exhibits homologous recombination.

Organism                             Genome sequenced   Homologous recombination
Prokaryote
  Escherichia coli                   Yes                Yes
Eukaryote, unicellular
  Dictyostelium discoideum           Yes                Yes
  Saccharomyces cerevisiae           Yes                Yes
  Schizosaccharomyces pombe          Yes                Yes
  Chlamydomonas reinhardtii          Yes                No
  Tetrahymena thermophila            Yes                Yes
Eukaryote, multicellular
  Caenorhabditis elegans             Yes                Difficult
  Drosophila melanogaster            Yes                Difficult
  Arabidopsis thaliana               Yes                No
  Physcomitrella patens              Yes                Yes
Vertebrate
  Danio rerio                        Yes                Yes
  Mus musculus                       Yes                Yes
  Xenopus laevis (and X. tropicalis) Yes                No
  Homo sapiens (not a model organism) Yes               Yes
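
For readers who want to query the table programmatically, the rows can be held in a small data structure. The following is a minimal Python sketch; the names ModelOrganism, ORGANISMS, and with_recombination are hypothetical and chosen only for illustration. The values are transcribed from the table above, with Homo sapiens left out because it is not a model organism.

```python
from typing import NamedTuple


class ModelOrganism(NamedTuple):
    """One row of the table; field names are illustrative assumptions."""
    name: str
    group: str
    genome_sequenced: bool
    homologous_recombination: str  # "Yes", "No", or "Difficult", as in the table


# Rows transcribed from the table above (Homo sapiens omitted).
ORGANISMS = [
    ModelOrganism("Escherichia coli", "Prokaryote", True, "Yes"),
    ModelOrganism("Dictyostelium discoideum", "Eukaryote, unicellular", True, "Yes"),
    ModelOrganism("Saccharomyces cerevisiae", "Eukaryote, unicellular", True, "Yes"),
    ModelOrganism("Schizosaccharomyces pombe", "Eukaryote, unicellular", True, "Yes"),
    ModelOrganism("Chlamydomonas reinhardtii", "Eukaryote, unicellular", True, "No"),
    ModelOrganism("Tetrahymena thermophila", "Eukaryote, unicellular", True, "Yes"),
    ModelOrganism("Caenorhabditis elegans", "Eukaryote, multicellular", True, "Difficult"),
    ModelOrganism("Drosophila melanogaster", "Eukaryote, multicellular", True, "Difficult"),
    ModelOrganism("Arabidopsis thaliana", "Eukaryote, multicellular", True, "No"),
    ModelOrganism("Physcomitrella patens", "Eukaryote, multicellular", True, "Yes"),
    ModelOrganism("Danio rerio", "Vertebrate", True, "Yes"),
    ModelOrganism("Mus musculus", "Vertebrate", True, "Yes"),
    ModelOrganism("Xenopus laevis", "Vertebrate", True, "No"),
]


def with_recombination(organisms, status="Yes"):
    """Return the names of organisms whose recombination status matches."""
    return [o.name for o in organisms if o.homologous_recombination == status]


if __name__ == "__main__":
    print(with_recombination(ORGANISMS, "Difficult"))
```

Filtering on "Difficult" returns the two invertebrate genetic models, C. elegans and D. melanogaster, which matches the table.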

Comparative genomics

Comparative genomics is the study of the relationship of genome structure and function across different biological species or strains. It attempts to take advantage of the information provided by the signatures of selection to understand the function of genomes and the evolutionary processes that act on them. While it is still a young field, it holds great promise to yield insights into many aspects of the evolution of modern species. The sheer amount of information contained in modern genomes (3.2 gigabases in the case of humans) requires that the methods of comparative genomics be automated. Gene finding is an important application of comparative genomics, as is the discovery of new, non-coding functional elements of the genome.

Comparative genomics exploits both similarities and differences in the proteins, RNA, and regulatory regions of different organisms to infer how selection has acted upon these elements. Those elements that are responsible for similarities between different species should be conserved through time (stabilizing selection), while those elements responsible for differences among species should be divergent (positive selection). Finally, those elements that are unimportant to the evolutionary success of the organism will be unconserved (selection is neutral).
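
The idea that conserved elements stand out when species are compared can be illustrated with a per-column conservation score over a multiple sequence alignment. The following is a minimal Python sketch, not a method taken from the text: the function name column_conservation and the toy alignment are assumptions made for illustration, gap characters are treated like any other symbol, and a high score is only consistent with stabilizing selection rather than proof of it.

```python
from collections import Counter


def column_conservation(aligned_seqs):
    """Fraction of sequences sharing the most common symbol in each column."""
    if not aligned_seqs:
        return []
    length = len(aligned_seqs[0])
    if any(len(seq) != length for seq in aligned_seqs):
        raise ValueError("sequences must be aligned to equal length")
    scores = []
    for column in zip(*aligned_seqs):
        # Count of the single most frequent symbol in this column.
        top_count = Counter(column).most_common(1)[0][1]
        scores.append(top_count / len(column))
    return scores


# Toy alignment invented for illustration; not real sequence data.
alignment = ["ACGTAC", "ACGTTC", "ACGAGC"]
scores = column_conservation(alignment)

if __name__ == "__main__":
    for position, score in enumerate(scores):
        print(position, round(score, 2))
```

In the toy alignment, the first three columns and the last are identical in all sequences, while the fifth column differs in every sequence and gets the lowest score.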

One of the important goals of the field is the identification of the mechanisms of eukaryotic genome evolution. It is however often complicated by the multiplicity of events that have taken place throughout the history of individual lineages, leaving only distorted and superimposed traces in the genome of each living organism. For this reason comparative genomics studies of small model organisms (for example the model Caenorhabditis elegans and closely related Caenorhabditis briggsae) are of great importance to advance our understanding of general mechanisms of evolution.

Having come a long way from its initial use in finding functional proteins, comparative genomics is now concentrating on finding regulatory regions and siRNA molecules. Recently, it has been discovered that distantly related species often share long conserved stretches of DNA that do not appear to code for any protein (see conserved non-coding sequence). One such ultra-conserved region, which was stable from chicken to chimp, has undergone a sudden burst of change in the human lineage and is found to be active in the developing brain of the human embryo.
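
A search for long conserved stretches between two species can be sketched as a sliding-window identity scan over a pairwise alignment. The Python example below is a simplified illustration under stated assumptions: the function name conserved_windows, the window size, the identity threshold, and the two toy sequences are all hypothetical, and the input is assumed to be already aligned. Real analyses use much longer windows and dedicated alignment tools.

```python
def conserved_windows(seq_a, seq_b, window=5, min_identity=0.9):
    """Return (start, identity) for each window meeting the identity threshold.

    seq_a and seq_b are assumed to be pre-aligned strings of equal length.
    A position counts as a match only if both symbols agree and are not gaps.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to equal length")
    hits = []
    for start in range(len(seq_a) - window + 1):
        pairs = zip(seq_a[start:start + window], seq_b[start:start + window])
        matches = sum(1 for a, b in pairs if a == b and a != "-")
        identity = matches / window
        if identity >= min_identity:
            hits.append((start, identity))
    return hits


# Toy aligned sequences invented for illustration; not real data.
species_1 = "ACGTACGTTT"
species_2 = "ACGTACCAAA"

if __name__ == "__main__":
    print(conserved_windows(species_1, species_2, window=5, min_identity=1.0))
```

With these toy inputs, only the windows starting at positions 0 and 1 are fully identical; every later window overlaps the divergent tail.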

Computational approaches to genome comparison have recently become a common research topic in computer science. A public collection of case studies and demonstrations is growing, ranging from whole genome comparisons to gene expression analysis. This has encouraged the introduction of different ideas, including concepts from systems and control, information theory, string analysis, and data mining. It is anticipated that computational approaches will become and remain a standard topic for research and teaching, while multiple courses will begin training students to be fluent in both computer science and genomics.
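
As one example of a string-analysis idea applied to genome comparison, two sequences can be compared without alignment by measuring how many short substrings (k-mers) they share. This technique is not named in the text and is offered here only as a hedged sketch; the function names kmers and jaccard_similarity and the choice of k are assumptions for illustration.

```python
def kmers(seq, k):
    """Return the set of all substrings of length k in seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def jaccard_similarity(seq_a, seq_b, k=3):
    """Jaccard index of the two k-mer sets: shared k-mers over all k-mers."""
    set_a, set_b = kmers(seq_a, k), kmers(seq_b, k)
    if not set_a and not set_b:
        # Both sequences are shorter than k; treat them as trivially identical.
        return 1.0
    return len(set_a & set_b) / len(set_a | set_b)


if __name__ == "__main__":
    # Toy sequences invented for illustration; not real data.
    print(jaccard_similarity("ACGT", "ACGA", k=2))
```

A value near 1 indicates that the sequences share most of their k-mers, and a value near 0 indicates that they share few. The measure is crude, since it ignores k-mer order and multiplicity, but it scales to large inputs more easily than full alignment.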
