Saturday, December 31, 2011

Intronless genes in teleost fish genomes

A recent study by German scientists revealed that the Takifugu rubripes, Tetraodon nigroviridis, Oryzias latipes, Gasterosteus aculeatus and Danio rerio genomes comprise, respectively, 2.83%, 3.42%, 4.49%, 4.35% and 4.02% single-exon genes (SEGs). These SEGs encode a variety of protein families, including claudins, olfactory receptors and histones, that are essential for various biological functions. Annotation of three Dicentrarchus labrax chromosomes revealed 78 (5.30%) intronless genes. Comparisons with G. aculeatus showed that SEG composition and order varied significantly among corresponding chromosomes, even for those with nearly complete synteny. More than half of the SEGs identified in most of the species have at least one orthologous multiple-exon gene in the same genome, which provides insight into their possible origin by retrotransposition. Although the species belong to the same lineage, the fraction of predicted SEGs varied significantly between the genomes analyzed, and only a low fraction of proteins (4.1%) is conserved among all five species. Furthermore, the inter-specific distribution of SEGs and the functional categories shared by species did not reflect their phylogenetic relationships. These results indicate that new SEGs are continuously and independently generated after species divergence over evolutionary time, as evidenced by the phylogenetic results for single-exon claudin genes. The results of this study provide strong support for the idea that retrotransposition followed by tandem duplication is the most probable event that can explain the expansion of SEGs in eukaryotic organisms.

The study was published in Marine Genomics, 2011, 4(2):109-19.

Friday, September 23, 2011

First non-avian reptile whole genome sequence unveiled: Anolis carolinensis

Amniotes, the first truly terrestrial vertebrates, diverged from other animals some 320 million years ago to form the mammalian and reptilian lineages. Until now, however, the only representatives of the reptile branch to be sequenced were birds: the chicken, the turkey and the zebra finch.

Scientists from the USA, the UK and Sweden recently reported the genome sequence of the North American green anole lizard, Anolis carolinensis. They found that A. carolinensis microchromosomes are highly syntenic with chicken microchromosomes, yet do not exhibit the high GC and low repeat content that are characteristic of avian microchromosomes. Comparative gene analysis shows that amniote egg proteins have evolved significantly more rapidly than other proteins. An anole phylogeny resolves basal branches to illuminate the history of their repeated adaptive radiations.

The genome sequence of A. carolinensis allows a deeper understanding of amniote evolution. Filling this important reptilian node with a sequenced genome has revealed derived states in each major amniote branch and has helped to illuminate the amniote ancestor.

The research is published in the latest issue of Nature.

Wednesday, September 21, 2011

Whole genome sequence: Anolis carolinensis lizard

Scientists found Anolis to have the most compositionally homogeneous genome of all amniotes sequenced thus far, a homogeneity exceeding that of the frog Xenopus. Isochores are large regions of relatively homogeneous nucleotide composition and are present in the genomes of all mammals and birds that have been sequenced to date. GC-rich isochores, with shorter introns and higher gene density, are reported in those genomes but have disappeared from the Anolis genome. Using genic GC as a proxy for isochore structure so as to compare with other vertebrates, the researchers found that GC content has substantially decreased in the lineage leading to Anolis since it diverged from the common ancestor of Reptilia ∼275 MYA. This perhaps reflects weakened or reversed GC-biased gene conversion, a non-adaptive substitution process that is thought to be important in the maintenance and trajectory of isochore evolution.

Results demonstrate that GC composition in Anolis is not associated with important features of genome structure, including gene density and intron size, in contrast to patterns seen in mammal and bird genomes.
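For readers who want to see what "compositional homogeneity" means in practice, here is a minimal Python sketch. It is not the method used in the paper; the function names and the fixed-window approach are assumptions made purely for illustration. It computes the GC fraction of a sequence and of consecutive windows along it. In a compositionally homogeneous genome the per-window values would vary little, whereas isochore-rich genomes would show long runs of high or low values.

```python
def gc_content(sequence):
    """Return the fraction of G and C among the A/C/G/T bases of a sequence."""
    counted = [base for base in sequence.upper() if base in "ACGT"]
    if not counted:
        raise ValueError("sequence contains no A, C, G or T bases")
    gc = sum(1 for base in counted if base in "GC")
    return gc / len(counted)


def windowed_gc(sequence, window):
    """Return GC fractions for consecutive, non-overlapping windows.

    A trailing piece shorter than `window` is ignored. A window made up
    entirely of ambiguous bases (for example N) raises ValueError.
    """
    return [
        gc_content(sequence[start:start + window])
        for start in range(0, len(sequence) - window + 1, window)
    ]


if __name__ == "__main__":
    toy = "GGGGCCCCAAAATTTTGCGCATAT"  # toy sequence, not real data
    print(gc_content(toy))
    print(windowed_gc(toy, 8))
```

Real analyses would read sequences from FASTA files and use much larger windows; the toy string above only shows the mechanics.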

The findings are published in the latest issue of Genome Biology and Evolution.

Alternative Splicing Switch

Adapted from Cell doi:10.1016/j.cell.2011.08.023
Alternative splicing (AS) is a key process underlying the expansion of proteomic diversity and the regulation of gene expression. Scientists from Canada and the USA identified an evolutionarily conserved embryonic stem cell (ESC)-specific AS event that changes the DNA-binding preference of the forkhead family transcription factor FOXP1. An ESC-specific splicing switch in FOXP1 transcripts produces the FOXP1-ES isoform. FOXP1-ES has distinct DNA-binding properties compared to the canonical FOXP1 isoform. FOXP1-ES stimulates key pluripotency genes and represses many differentiation genes. FOXP1-ES is required for ESC pluripotency and for efficient induced pluripotent stem cell (iPSC) reprogramming.

These results reveal a pivotal role for an AS event in the regulation of pluripotency through the control of critical ESC-specific transcriptional programs.

The findings are published in the recent issue of Cell

Mouse genomic variation

Adapted from Nature 477, 289-294.
Scientists from the USA, the UK and Germany recently reported on genomic variation in the mouse and its effect on phenotypes and gene regulation. They reported genome sequences of 17 inbred strains of laboratory mice and identified almost ten times more variants than previously known. By identifying candidate functional variants at 718 quantitative trait loci, the scientists showed that the molecular nature of functional variants and their position relative to genes vary according to the effect size of the locus. These sequences provide a starting point for a new era in the functional analysis of a key model organism.

The research outcomes are published in the latest issue of Nature.

Tuesday, August 16, 2011

Whole genome sequence of Atlantic cod

Recently, scientists from Norway presented the genome sequence of Atlantic cod (Gadus morhua). The genome assembly was obtained exclusively by 454 sequencing of shotgun and paired-end libraries, and automated annotation identified 22,154 genes. The genome sequence provided evidence for complex thermal adaptations in its haemoglobin gene cluster and an unusual immune architecture compared to other sequenced vertebrates. Atlantic cod has lost the genes for MHCII, CD4 and invariant chain (Ii), which are conserved features of the adaptive immune system of jawed vertebrates and are essential for the function of this pathway. These observations affect fundamental assumptions about the evolution of the adaptive immune system and its components in vertebrates.

The study is published in the latest issue of Nature

Monday, April 18, 2011

Medaka Hd-rR: Whole Genome Sequencing Project

Sequencing of the medaka genome was started at the Academia Sequencing Center of the National Institute of Genetics (NIG) in mid 2002. The project was supported by group grant Genome Science (Grant-in-Aid for Scientific Research on Priority Areas supported by the Ministry of Education, Culture, Sports, Science and Technology of Japan).
The sequencing was conducted by the whole-genome shotgun strategy using the southern inbred strain Hd-rR. The genome was assembled from 13.8 million reads obtained from whole-genome shotgun plasmid, fosmid, and bacterial artificial chromosome (BAC) libraries. The total size of the assembled contigs was 700.4 megabases (Mb). Half of the assembled nucleotides lie in scaffolds of at least 1.41 Mb and in contigs of at least 9.8 kilobases (kb); these lengths are the N50 values. This contiguity is sufficient to characterize the genomic structures of genes.
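Since N50 comes up in almost every genome announcement, here is a small Python sketch of how it is usually calculated. The function name and the toy lengths are my own and are not taken from the medaka project. The idea is to sort the scaffold (or contig) lengths from longest to shortest and walk down the list until the running total reaches half of the assembly size; the length at which that happens is the N50.

```python
def n50(lengths):
    """Return the N50 of a collection of scaffold or contig lengths.

    N50 is the length L such that sequences of length >= L together
    contain at least half of all bases in the assembly.
    """
    if not lengths:
        raise ValueError("lengths must not be empty")
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length


if __name__ == "__main__":
    toy_scaffolds = [2, 2, 2, 3, 3, 4, 8, 8]  # toy values, not medaka data
    print(n50(toy_scaffolds))
```

With the toy values above, the two longest scaffolds (8 + 8 = 16) already hold half of the 32 bases, so the N50 is 8.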

Four versions of the medaka genome sequence data, named 200406, 200506, version 0.9, and version 1.0, have been released to the public to provide users with timely information. The first two versions had shorter scaffolds that were not anchored on the medaka chromosomes because they were built in 2004 and 2005, before genetic markers were available. Versions 0.9 and 1.0 were created in 2006, when comprehensive genetic markers were available, so that about 90% of their scaffolds and ultracontigs were located on the twenty-four medaka chromosomes. Versions 0.9 and 1.0 were built from identical contigs and scaffolds, but the assembly of version 1.0 is longer than that of version 0.9 because more genetic markers could be used to generate version 1.0. Version 0.9 remains open to the public because most of the data analysis in the medaka genome paper published in Nature (2007) was based on version 0.9.

The University of Tokyo Medaka Genome Browser (UTGB Medaka) is a web-based genome database browser that provides various information related to the medaka genome, including assembly sequences, genes, clones, and genome sequences homologous to those of other species.

Friday, April 15, 2011

Wild-type zebrafish: Whole Genome Sequence

To date, the main genomics work has focused on lab-grown zebrafish strains that functionally represent genetic clones; little or no true genetic diversity was captured in these initial genome studies. Therefore, to address questions related to genomic variation, IGIB sequenced a wild-type strain of zebrafish caught directly from water bodies in India.

The Institute of Genomics and Integrative Biology (IGIB), a constituent laboratory of the Council of Scientific and Industrial Research (CSIR), has completed the whole genome sequencing of a wild-type strain of zebrafish (Danio rerio). This work marked India’s entry into the arena of whole genome sequencing of animals.

The zebrafish genome is about half the size of the human genome, containing about 1700 million DNA base pairs. The research team at IGIB generated over 89 gigabases of DNA sequence in two months, resulting in over 20X coverage of the zebrafish genome. Solexa/Illumina sequencing technology was employed for sequencing the wild-type strain. This next-generation sequencing technology enables massively parallel sequencing of millions of genomic fragments ranging from 36 to 76 base pairs, which are then mapped back to the reference genome. This enormous exercise was made possible by the CSIR supercomputing facility at IGIB. Mayo Clinic, Rochester, USA, will join this CSIR-led project for genome annotation.
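As a quick aside on the arithmetic, fold coverage is simply the number of sequenced bases divided by the genome size. The Python sketch below (the function name is mine, chosen for illustration) applies this to the two figures quoted above. The raw ratio comes out at roughly 52, which is comfortably "over 20X". The announcement does not say how its 20X figure was derived, so any explanation for the gap, such as counting only reads that mapped to the reference, would be an assumption on my part.

```python
def fold_coverage(sequenced_bases, genome_size):
    """Return average fold coverage: total sequenced bases per genome base."""
    if genome_size <= 0:
        raise ValueError("genome_size must be positive")
    return sequenced_bases / genome_size


if __name__ == "__main__":
    sequenced = 89e9     # "over 89 gigabases", as quoted in the post
    genome = 1700e6      # "about 1700 million base pairs", as quoted in the post
    print(f"raw coverage is about {fold_coverage(sequenced, genome):.1f}X")
```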

Zebrafish, a four-centimeter-long fish native to Indian rivers, has attracted considerable scientific interest worldwide, primarily as a non-mammalian vertebrate model organism. Zebrafish share many features of the human system. Using new advancements in sequencing technologies alongside cutting-edge bioinformatics capabilities, Indian scientists explored genetic variation by comparing the genomes of a single wild-type zebrafish parent and approximately 100 of its offspring, which were bred and phenotyped at IGIB. The whole genome sequencing of these approximately 100 offspring is expected to be completed by 2012.

Monday, April 11, 2011

Rainbow Trout Physical and Genetic Maps Integrated

A first-generation integrated map of the rainbow trout genome is available at http://www.genome.clemson.edu/activities/projects/rainbowTrout and was published in BMC Genomics.

The collaborative study, led by Dr. Yniv Palti, a research geneticist at the USDA ARS in Kearneysville, extends the group's earlier work on the BAC-based physical map and the INRA and NCCCWA genetic maps for the most widely cultivated cold freshwater fish.

Microsatellites isolated from BAC end sequences, together with PCR superpools for library screening, were used to identify BACs that harbor previously mapped markers. The resulting integrated map is composed of 238 BAC contigs anchored to chromosomes of the genetic map. It covers more than 10% of the genome across segments from all 29 chromosomes.

This map adds to the growing rainbow trout genomics resources, which include various genetic maps, physical maps, an EST database and a transcriptome.

Friday, April 8, 2011

Genetically modified Atlantic salmon on US dinner tables.

AquaBounty is developing advanced-hybrid salmon, trout, and tilapia designed to grow faster than traditional fish. AquAdvantage® Salmon (AAS) grows twice as fast as wild Atlantics, reaching market weight in a year and a half instead of three. Mature AAS are indistinguishable from their conventional counterparts. This advancement provides a compelling economic benefit to farmers (reduced growing cycle) as well as enhancing the economic viability of inland operations, thereby diminishing the need for ocean pens. AAS are also reproductively sterile, which eliminates the threat of interbreeding amongst themselves or with native populations, a major recent concern in dealing with fish escaping from salmon farms.


The fish contains a single copy of a DNA sequence that includes code for a Chinook salmon growth hormone and regulatory sequences derived from Chinook salmon and the eel-like ocean pout. Whereas Atlantic salmon normally stop growing in the winter, the GM fish produces growth hormones throughout the year. Developer AquaBounty Technologies, based in Waltham, Massachusetts, has spent more than a decade shepherding the fish towards approval in a new regulatory landscape.

Cynoglossus semilaevis: Whole Genome Sequence

On July 31, 2010, the Yellow Sea Fisheries Research Institute (YSFRI) of the Chinese Academy of Fishery Sciences and BGI (formerly known as the Beijing Genomics Institute, headquartered in Shenzhen) jointly announced the complete sequencing and assembly of the Cynoglossus semilaevis genome. It is the first member of the Pleuronectiformes to be fully sequenced and also the first fish genome to be sequenced in China.
C. semilaevis (half-smooth tongue sole) is a marine fish. It has a ZW sex determination system in which females have heteromorphic sex chromosomes, and females grow 2-4 times faster than males. The C. semilaevis genome project was jointly launched by YSFRI and BGI in December 2009, with the aim of sequencing the whole genome using next-generation sequencing technology. The genome was successfully assembled by BGI using its self-developed assembly and analysis tools. The research revealed that the C. semilaevis genome size is 520 Mb (520 million nucleotides), containing 9.5% repeat sequences. More than 20,000 genes were found, of which more than 18,000 could be matched to homologous genes in other species. The remaining 2,000 or so genes have no identified homolog.
The C. semilaevis genome sequencing has generated an enormous database of genetic information that can be used to understand the genetic basis of important traits such as sex determination. The joint research is part of BGI’s "1000 Plant and Animal Reference Genomes Project" which plans to decode 1000 genomes of plants and animals of great economic and scientific value.

Monday, April 4, 2011

Takifugu rubripes: Whole Genome Sequence

The fifth fugu genome assembly (v5) was made available by the Institute of Molecular and Cell Biology (IMCB) in July 2010. The fugu (Takifugu rubripes or Fugu rubripes) genome project was initiated in 1989. In 1993, researchers showed that the fugu genome is only 390 Mb, yet it contains a repertoire of genes similar to that of humans, which makes it useful for discovering genes and gene regulatory elements in the human genome. Fugu is the second vertebrate genome to be sequenced, the first being the human genome. A 'draft' sequence of the fugu genome was determined by the International Fugu Genome Consortium in 2002 using the 'whole-genome shotgun' sequencing strategy. In the latest version, some gaps in the fugu assembly v4 have been filled and the scaffolds have been organized into chromosomes based on a genetic map of the fugu (a collaborative project between IMCB and the University of Tokyo). The v5 assembly comprises 7,118 scaffolds covering 392 Mb. About 72% of the assembly (281,557,002 bp) is organized into 22 chromosomes. Another 14% of the assembly (55,560,038 bp) is assigned to chromosomes, but the orientation and order of the scaffolds are not known (Chr_n_un). The remaining 14% of the assembly (54,753,918 bp) is concatenated into a single sequence (Chr_un).
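The percentages quoted for the v5 assembly can be checked from the base-pair counts alone. The short Python sketch below does this; the dictionary labels and the function name are my own, and only the three bp figures come from the post. The three parts sum to 391,870,958 bp, in line with the stated 392 Mb, and the shares round to 72%, 14% and 14%.

```python
# Base-pair counts quoted in the post for fugu assembly v5.
ASSEMBLY_V5 = {
    "ordered on 22 chromosomes": 281_557_002,
    "assigned but unordered (Chr_n_un)": 55_560_038,
    "unassigned (Chr_un)": 54_753_918,
}


def assembly_fractions(parts):
    """Return each part's share of the summed assembly size."""
    total = sum(parts.values())
    return {name: size / total for name, size in parts.items()}


if __name__ == "__main__":
    print(f"total: {sum(ASSEMBLY_V5.values()):,} bp")
    for name, share in assembly_fractions(ASSEMBLY_V5).items():
        print(f"{name}: {share:.1%}")
```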

Friday, April 1, 2011

Characid fish scatter seeds

Throughout Amazonia, overfishing has decimated populations of fruit-eating fishes, especially the large-bodied characid, Colossoma macropomum. During lengthy annual floods, frugivorous fishes enter vast Amazonian floodplains, consume massive quantities of fallen fruits and egest viable seeds. Extensive mobility of frugivorous fish could result in extremely effective, multi-directional, long-distance seed dispersal.

Scientists from USA and Peru tracked fine-scale movement patterns and habitat use of wild Colossoma, and seed retention in the digestive tracts of captive individuals. At least 5 per cent of seeds are predicted to disperse 1700–2110 m, farther than dispersal by almost all other frugivores reported in the literature. Additionally, seed dispersal distances increased with fish size, but overfishing has biased Colossoma populations to smaller individuals. Thus, overexploitation probably disrupts an ancient coevolutionary relationship between Colossoma and Amazonian plants.

The findings are published in the latest Nature | Research Highlights and in the latest issue of Proceedings of The Royal Society: Biological Sciences

Summer of Code 2011 by Google


Google Summer of Code is a global program that offers student developers stipends to write code for various open source projects. Historically, the program has brought together over 4,500 students with over 300 open source projects to create millions of lines of code. The program, which kicked off in 2005, is now in its seventh year. If you are interested in learning more about the projects worked with in the past, check out the 2006, 2007, 2008, 2009, and 2010 program pages.

Student applications are now being accepted for Google Summer of Code 2011. The application period begins on March 28, 2011 and ends on April 8, 2011 at 19:00 UTC. For full details, see the program timeline.

Important dates are:
March 28: 19:00 UTC
Student application period opens.
April 8: 19:00 UTC
Student application deadline.
Interim Period:
Mentoring organizations review and rank student proposals; where necessary, mentoring organizations may request further proposal detail from the student applicant.
April 22:
1.    All mentors must be signed up and all student proposals matched with a mentor - 07:00 UTC
2.    Student ranking/scoring deadline. Please do not add private comments with a nonzero score or mark students as ineligible (unless doing so as part of resolving duplicate accepted students) after this deadline - 17:00 UTC
3.    IRC meeting to resolve any outstanding duplicate accepted students - 19:00 UTC
April 25: 19:00 UTC
Accepted student proposals announced on the Google Summer of Code 2011 site.


Thursday, March 31, 2011

BOSC 2011: Bioinformatics Open Source Conference, Vienna, Austria, 15-16 July 2011

Bioinformatics Open Source Conference (BOSC) is held annually in conjunction with ISCB's meeting ISMB. BOSC 2011 will be held in conjunction with ISMB/ECCB 2011. BOSC 2011 is sponsored by the Open Bioinformatics Foundation (O|B|F), a non-profit group dedicated to promoting the practice and philosophy of Open Source software development within the biological research community.

Abstracts for talks and posters are invited for the following sessions:
  • Approaches to parallel processing.
  • Cloud-based approaches to improving software and data accessibility.
  • The Semantic Web in open source bioinformatics.
  • Data visualization.
  • Tools for next-generation sequencing.
  • Other Open Source software.
Important Dates:
  • April 18, 2011: Deadline for submitting abstracts to BOSC 2011
  • May 9, 2011: Notifications of accepted abstracts emailed to corresponding authors
  • July 13-14, 2011: Codefest 2011 programming session
  • July 15-16, 2011: BOSC 2011
  • July 17-19, 2011: ISMB 2011

Wednesday, March 30, 2011

FishBase: A global information system on fishes

FishBase is a relational database with information catering to different professionals such as research scientists, fisheries managers, zoologists and many more. FishBase on the web contains practically all fish species known to science. It includes descriptions of 31,600 species, 279,100 common names in hundreds of languages, 49,300 pictures, and references to 44,200 works in the scientific literature.

FishBase was developed at the WorldFish Center in collaboration with the Food and Agriculture Organization of the United Nations (FAO) and many other partners, and with support from the European Commission (EC). Since 2001, FishBase has been supported by a consortium of nine research institutions. The consortium consists of:
  1. Leibniz-Institut für Meereswissenschaften (IFM)

The genetics of koi fish

The commitment of numerous generations of Japanese koi farmers has given us the modern koi, with its more than 100 types and their distinctive color and pattern variations, some more desired than others.
The intrinsic qualities of koi, as loved by keepers, breeders and enthusiasts, are governed by an intricate set of genetic and ecological elements.

Modern day koi breeders have broad access to scientific research. The study of genetics in general requires the study of visible characteristics in offspring. Applying conventional techniques, it typically takes 20 generations of devoted, well-planned, selective breeding to be able to establish qualities of preferred character in koi. Results have to be painstakingly recorded and that, unfortunately, is a tradition followed by very few, if any, conventional koi farmers.

Research on koi genetics has been slow because koi take around two to three years to reach maturity. Carp can mature in a much shorter time, but koi breeders have inadvertently slowed the rate at which the fish reach maturity even further by breeding for improved body conformation in order to produce large, show-winning fish. They would never use a female that develops gonads at the age of one year.

Recently, advancements in analytical techniques for genome investigation have sped up genetic research. These techniques were applied to study the genetic variability of the koi stock of Niigata’s Yamakoshi region, where a relatively high mortality rate during the larval period had been observed. Koi were obtained from all the significant breeders in the region and examined. The study found not only that genetic variability inside the Niigata population was low, but also that the genetic distance between Kohaku, Sanke and Showa was small, suggesting that these favored varieties originated from a small founding population.

Electric Eel Genome

Sequencing the complete electric eel genome would be a boon to research on everything from energy production and storage to tissue regeneration, according to some scientists.

Six American researchers wrote a review, published in the Journal of Fish Biology, calling for dense, seven- to 11-fold shotgun sequencing of the electric eel genome - a move they said would provide information about more than 95 percent of the fish’s genome as well as its genetic scaffold.

Electric eels, Electrophorus electricus, can generate bioelectricity from chemical food energy using specialized electric organs. These contain electrically-charged cells that, in turn, house precisely regulated ion channels and receptors. Together, this system lets electric eels generate electrical pulses ranging from weak, millivolt discharges to strong zaps up to 600 volts.

Another advantage of this particular creature is its ability to regenerate some tissues and organs - including its spinal cord - after injury. Peeking into the eel’s genome may explain this, as well as its complex evolution and neurophysiology.

Monday, March 28, 2011

Workshop on Molecular Evolution: North America, 24 July - 6 August 2011


The Workshop on Molecular Evolution consists of a series of lectures, demonstrations and computer laboratories that cover various aspects of molecular evolution. Demonstrations and consultations on the use of computer programs and packages such as BLAST, BEAST, Clustal W and Clustal X, FASTA, FigTree, GARLI, Genealogical Sorting Index, LAMARC, MAFFT, Migrate-n, MrBayes, PAML, PAUP*, and SeaView will be provided. The course is designed for established investigators, postdoctoral scholars, and advanced graduate students with prior experience in molecular evolution and related fields. Admission is limited and highly competitive, with admissions decisions determined by an international committee.

Topics to be covered include:
  • Databases and sequence matching: database searching: protein sequence versus protein structure; homology; mathematical, statistical, and theoretical aspects of sequence database searches
  • Phylogenetic analysis: theoretical, mathematical and statistical bases; sampling properties of sequence data; Bayesian analysis; hypothesis testing
  • Maximum likelihood theory and practice in phylogenetics and population genetics: coalescent theory; maximum likelihood estimation of population genetic parameters
  • Molecular evolution integrated at organism and higher levels: population biology; biogeography; ecology; systematics and conservation
  • Molecular evolution and development: gene duplication and divergence; gene family organization; coordinated expression in evolution
  • Comparative genomics: genome content; genome structure; genome evolution
  • Molecular evolution integrated at lower levels: biochemistry; cell biology; physiology; relationship of genotype to phenotype

Venue: Colorado State University, Fort Collins, Colorado, USA
Application form and complete schedule are available online.

Coding sequence evolution in salmonids

The combination of next-generation sequencing with a comparative genomics approach appears particularly promising for yielding detailed insights into questions in evolutionary biology. Very recently, Canadian scientists applied this strategy to investigate patterns of nucleotide substitution in five species of the salmonid family (Salmo salar, Oncorhynchus mykiss, Salvelinus fontinalis, Salvelinus namaycush, Coregonus clupeaformis). They compared this information with other fishes for which genome information is available (Esox lucius, Danio rerio) in order to infer the role of natural selection in the evolution of protein-coding genes.

The results of the study warrant further investigation of the putative role of positive selection in the process of adaptive divergence in salmonids. The findings are published in the latest issue of Molecular Biology and Evolution.