GENOME Research: April 2011

Monday, April 18, 2011

Medaka Hd-rR: Whole Genome Sequencing Project

Sequencing of the medaka genome began at the Academia Sequencing Center of the National Institute of Genetics (NIG) in mid-2002. The project was supported by the group grant Genome Science (a Grant-in-Aid for Scientific Research on Priority Areas from the Ministry of Education, Culture, Sports, Science and Technology of Japan).

The sequencing used the whole-genome shotgun strategy on the southern inbred strain Hd-rR. The genome was assembled from 13.8 million reads obtained from whole-genome shotgun plasmid, fosmid, and bacterial artificial chromosome (BAC) libraries. The total size of the assembled contigs was 700.4 megabases (Mb). Half of the assembled nucleotides lie in scaffolds of at least 1.41 Mb, and half lie in contigs of at least 9.8 kilobases (kb); these lengths are the N50 values. This contiguity is sufficient to characterize the genomic structures of genes.
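For readers unfamiliar with N50, here is a minimal Python sketch of how the statistic is commonly computed. The function name `n50` and the toy lengths are illustrative assumptions; this is not the medaka project's own code or data.

```python
def n50(lengths):
    """Return the N50 of a collection of sequence lengths.

    N50 is the length L such that sequences of length >= L
    together contain at least half of all assembled bases.
    """
    if not lengths:
        raise ValueError("need at least one sequence length")
    ordered = sorted(lengths, reverse=True)  # longest first
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length


# Toy scaffold lengths (hypothetical values, not medaka data).
toy_scaffolds = [80, 70, 50, 40, 30, 20, 10]
print(n50(toy_scaffolds))
```

Applied to the real scaffold and contig lengths, the same calculation would yield the 1.41 Mb and 9.8 kb figures quoted above.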

Four versions of the medaka genome sequence data, named 200406, 200506, version 0.9, and version 1.0, have been released to the public to provide users with timely information. The first two versions had shorter scaffolds that were not anchored on the medaka chromosomes because they were built in 2004 and 2005, before genetic markers were available. Versions 0.9 and 1.0 were created in 2006, when comprehensive genetic markers were available, so about 90% of their scaffolds and ultracontigs could be placed on the twenty-four medaka chromosomes. Versions 0.9 and 1.0 were built from identical contigs and scaffolds, but the version 1.0 assembly is longer than that of version 0.9 because more genetic markers could be used to generate it. Version 0.9 remains open to the public because most of the data analysis in the medaka genome paper published in Nature (2007) was based on it.

The University of Tokyo Medaka Genome Browser (UTGB Medaka) is a web-based genome database browser that provides a range of information on the medaka genome, including assembly sequences, genes, clones, and genome sequences homologous to those of other species.

Friday, April 15, 2011

Wild type Zebrafish: Whole Genome Sequence

To date, genomics work has focused mainly on lab-grown zebrafish strains that are functionally genetic clones; little or no true genetic diversity was captured in these initial genome studies. To address questions about genomic variation, IGIB therefore sequenced a wild-type strain of zebrafish caught directly from water bodies in India.

The Institute of Genomics and Integrative Biology (IGIB), a constituent laboratory of the Council of Scientific and Industrial Research (CSIR), has completed the whole genome sequencing of a wild-type strain of zebrafish (Danio rerio). This work marked India’s entry into the arena of whole genome sequencing of animals.

The zebrafish genome is about half the size of the human genome, containing about 1,700 million DNA base pairs. The research team at IGIB generated over 89 gigabases of DNA sequence in two months, resulting in over 20X coverage of the zebrafish genome. Solexa/Illumina sequencing technology was used to sequence the wild-type strain. This next-generation technology enables massively parallel sequencing of millions of genomic fragments, producing reads of 36 to 76 base pairs that are then mapped back to the reference genome. The exercise was made possible by the CSIR supercomputing facility at IGIB. Mayo Clinic, Rochester, USA, will join this CSIR-led project for genome annotation.
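Fold coverage is conventionally estimated as total sequenced bases divided by genome size. The short Python sketch below applies that formula to the two figures quoted in this post; the function name is an illustrative assumption. The result is a raw upper bound, and the post does not say how the "over 20X" figure was derived, so no explanation of the difference is offered here.

```python
def fold_coverage(total_bases, genome_size):
    """Raw sequencing depth: total bases sequenced / genome size."""
    if genome_size <= 0:
        raise ValueError("genome size must be positive")
    return total_bases / genome_size


# Figures quoted in the post: 89 gigabases sequenced, 1,700 Mb genome.
SEQUENCED_BASES = 89_000_000_000
ZEBRAFISH_GENOME_BP = 1_700_000_000

raw_depth = fold_coverage(SEQUENCED_BASES, ZEBRAFISH_GENOME_BP)
print(f"raw depth: {raw_depth:.1f}X")
```

The raw depth comes out at roughly 52X, which is consistent with the post's statement of "over 20X" coverage.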

Zebrafish, a four-centimeter-long fish native to Indian rivers, has attracted considerable scientific interest worldwide, primarily as a non-mammalian vertebrate model organism. Zebrafish share many features of the human system. Using new sequencing technologies alongside cutting-edge bioinformatics capabilities, Indian scientists are exploring genetic variation by comparing the genomes of a single wild-type zebrafish parent and approximately 100 of its offspring, which were bred and phenotyped at IGIB. Whole genome sequencing of these offspring is expected to be completed by 2012.

Monday, April 11, 2011

Rainbow Trout Physical and Genetic Maps Integrated

A first-generation integrated map of the rainbow trout genome is available at http://www.genome.clemson.edu/activities/projects/rainbowTrout and has been published in BMC Genomics.

The collaborative study, led by Dr. Yniv Palti, a research geneticist at the USDA ARS in Kearneysville, extends earlier work on the BAC-based physical map and on the INRA and NCCCWA genetic maps of the most widely cultivated cold freshwater fish.

The team used microsatellites isolated from BAC end sequences, together with PCR superpools for library screening, to identify BACs that harbor previously mapped markers. The resulting integrated map is composed of 238 BAC contigs anchored to chromosomes of the genetic map. It covers more than 10% of the genome across segments from all 29 chromosomes.

This map adds to the growing rainbow trout genomics resources, which include various genetic maps, physical maps, an EST database, and a transcriptome.

Friday, April 8, 2011

Genetically modified Atlantic salmon on US dinner tables.

AquaBounty is developing advanced-hybrid salmon, trout, and tilapia designed to grow faster than traditional fish. AquAdvantage® Salmon (AAS) grows twice as fast as wild Atlantics, reaching market weight in a year and a half instead of three. Mature AAS are indistinguishable from their conventional counterparts. This advancement provides a compelling economic benefit to farmers (reduced growing cycle) as well as enhancing the economic viability of inland operations, thereby diminishing the need for ocean pens. AAS are also reproductively sterile, which eliminates the threat of interbreeding amongst themselves or with native populations, a major recent concern in dealing with fish escaping from salmon farms.

The fish contains a single copy of a DNA sequence that includes code for a Chinook salmon growth hormone and regulatory sequences derived from Chinook salmon and the eel-like ocean pout. Whereas Atlantic salmon normally stop growing in the winter, the GM fish produces growth hormones throughout the year. Developer AquaBounty Technologies, based in Waltham, Massachusetts, has spent more than a decade shepherding the fish towards approval in a new regulatory landscape.

Cynoglossus semilaevis: Whole Genome Sequence

On July 31, 2010, the Yellow Sea Fisheries Research Institute (YSFRI) of the Chinese Academy of Fishery Sciences and BGI (formerly known as the Beijing Genomics Institute, headquartered in Shenzhen) jointly announced the complete sequencing and assembly of the Cynoglossus semilaevis genome. It is the first member of the Pleuronectiformes to be fully sequenced and also the first fish genome to be sequenced in China.

C. semilaevis (half-smooth tongue sole) is a marine fish. It has a ZW sex determination system in which females carry the heteromorphic chromosome pair and grow 2-4 times faster than males. The C. semilaevis genome project was jointly launched by YSFRI and BGI in December 2009, with the aim of sequencing the whole genome using next-generation sequencing technology. The genome was successfully assembled by BGI using its self-developed assembly and analysis tools. The research revealed that the C. semilaevis genome is 520 Mb (520 million nucleotides) in size and contains 9.5% repeat sequences. More than 20,000 genes were found, of which more than 18,000 have identifiable homologs in other species. The remaining 2,000 or so genes have no identified homolog.

The C. semilaevis genome sequencing has generated an enormous database of genetic information that can be used to understand the genetic basis of important traits such as sex determination. The joint research is part of BGI’s "1000 Plant and Animal Reference Genomes Project" which plans to decode 1000 genomes of plants and animals of great economic and scientific value.

Monday, April 4, 2011

Takifugu rubripes: Whole Genome Sequence

The fifth fugu genome assembly (v5) was made available by the Institute of Molecular and Cell Biology (IMCB) in July 2010. The fugu (Takifugu rubripes or Fugu rubripes) genome project was initiated in 1989. In 1993, researchers showed that the fugu genome is only 390 Mb yet contains a repertoire of genes similar to that of humans, which makes it useful for discovering genes and gene regulatory elements in the human genome. Fugu was the second vertebrate genome to be sequenced, the first being the human genome. A ‘draft’ sequence of the fugu genome was determined by the International Fugu Genome Consortium in 2002 using the 'whole-genome shotgun' sequencing strategy.

In the latest version, some gaps in fugu assembly v4 have been filled and the scaffolds have been organized into chromosomes based on a genetic map of the fugu (a collaborative project between IMCB and the University of Tokyo). The v5 assembly comprises 7,118 scaffolds covering 392 Mb. About 72% of the assembly (281,557,002 bp) is organized into 22 chromosomes. Another 14% of the assembly (55,560,038 bp) is assigned to chromosomes, but the order and orientation of the scaffolds are not known (Chr_n_un). The remaining 14% of the assembly (54,753,918 bp) is concatenated into a single sequence (Chr_un).
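As a quick arithmetic check, the three base-pair counts quoted for the v5 assembly can be summed and converted to percentages. The Python sketch below does only that; the dictionary keys and function name are illustrative labels, not identifiers from the IMCB release.

```python
# Base-pair counts as quoted in the post for fugu assembly v5.
ASSEMBLY_PARTS = {
    "ordered_on_chromosomes": 281_557_002,
    "Chr_n_un": 55_560_038,  # assigned to a chromosome, order/orientation unknown
    "Chr_un": 54_753_918,    # unassigned, concatenated into one sequence
}


def assembly_shares(parts):
    """Return (total bp, {part name: percentage of total})."""
    total = sum(parts.values())
    shares = {name: 100 * size / total for name, size in parts.items()}
    return total, shares


total_bp, shares = assembly_shares(ASSEMBLY_PARTS)
print(f"total: {total_bp:,} bp")
for name, pct in shares.items():
    print(f"{name}: {pct:.1f}%")
```

The parts sum to 391,870,958 bp, which rounds to the stated 392 Mb, and the shares round to 72%, 14%, and 14%.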

Friday, April 1, 2011

Characid fish scatter seeds

Throughout Amazonia, overfishing has decimated populations of fruit-eating fishes, especially the large-bodied characid, Colossoma macropomum. During lengthy annual floods, frugivorous fishes enter vast Amazonian floodplains, consume massive quantities of fallen fruits and egest viable seeds. Extensive mobility of frugivorous fish could result in extremely effective, multi-directional, long-distance seed dispersal.

Scientists from the USA and Peru tracked fine-scale movement patterns and habitat use of wild Colossoma, and measured seed retention in the digestive tracts of captive individuals. At least 5 per cent of seeds are predicted to disperse 1700–2110 m, farther than dispersal by almost all other frugivores reported in the literature. Additionally, seed dispersal distances increased with fish size, but overfishing has biased Colossoma populations towards smaller individuals. Thus, overexploitation probably disrupts an ancient coevolutionary relationship between Colossoma and Amazonian plants.

The findings are published in the latest Nature | Research Highlights and in the latest issue of Proceedings of the Royal Society: Biological Sciences.

Summer of Code 2011 by Google

Google Summer of Code is a global program that offers student developers stipends to write code for various open source projects. Historically, the program has brought together over 4,500 students with over 300 open source projects to create millions of lines of code. The program, which kicked off in 2005, is now in its seventh year. If you are interested in learning more about past projects, check out the 2006, 2007, 2008, 2009, and 2010 program pages.

Student applications are now being accepted for Google Summer of Code 2011. The application period begins March 28, 2011 and ends April 8 at 19:00 UTC. For full details, see the program timeline.

Important dates are:

March 28: 19:00 UTC	Student application period opens.
April 8: 19:00 UTC	Student application deadline.
Interim Period:	Mentoring organizations review and rank student proposals; where necessary, mentoring organizations may request further proposal detail from the student applicant.
April 22:
1. All mentors must be signed up and all student proposals matched with a mentor - 07:00 UTC
2. Student ranking/scoring deadline. Please do not add private comments with a nonzero score or mark students as ineligible (unless doing so as part of resolving duplicate accepted students) after this deadline - 17:00 UTC
3. IRC meeting to resolve any outstanding duplicate accepted students - 19:00 UTC
April 25: 19:00 UTC	Accepted student proposals announced on the Google Summer of Code 2011 site.