Dinesh Gupta
International Center for Genetic Engineering and Biotechnology, New Delhi, India.

ABSTRACT

There are several ways of representing genomic features for whole genome sequences emerging from annotation projects. Of these, the GFF (General Feature Format) has emerged as a widely accepted and portable flat file format for storing genome annotations. With growing interest in genome annotation projects, the need for reference annotations at the coordinate and sequence level has greatly increased the use of GFF files. We have developed GFF-Ex to automate and speed up the extraction of features from feature files. Along with sequence-level extraction of the features described within the feature file, GFF-Ex also assigns the most probable boundaries to features (introns, intergenic regions, etc.) that are not explicitly annotated or given coordinates, and exports the corresponding sequence information. Its scripts, which combine shell and Perl, speed up this tedious process considerably compared with modules written solely in Perl or Python. GFF-Ex can also be integrated effectively with any genome annotation pipeline, making annotation more specific and sensitive.

Availability: GFF-Ex is developed for platforms supporting UNIX-based file systems and is provided with source code and documentation. The package is freely available at http://bioinfo.icgeb.res.in/gff.
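
To illustrate what assigning boundaries to unannotated features can involve, the following is a minimal Python sketch that infers intron coordinates as the gaps between annotated exons of the same parent transcript. It is not the GFF-Ex implementation, which uses shell and Perl. The function names (parse_exons, infer_introns, slice_sequence), the sample records, and the use of a GFF3-style Parent attribute are all assumptions made for this example, and strand is ignored.

```python
from collections import defaultdict


def parse_exons(gff_lines):
    """Group exon intervals by (seqid, parent) from tab-separated GFF lines.

    Assumes nine columns and a GFF3-style 'Parent=' attribute (hypothetical input).
    """
    exons = defaultdict(list)
    for line in gff_lines:
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and comments/directives
        cols = line.rstrip("\n").split("\t")
        if len(cols) != 9 or cols[2] != "exon":
            continue  # only exon records are used in this sketch
        attrs = dict(
            field.split("=", 1) for field in cols[8].split(";") if "=" in field
        )
        parent = attrs.get("Parent", "unknown")
        exons[(cols[0], parent)].append((int(cols[3]), int(cols[4])))
    return exons


def infer_introns(exon_intervals):
    """Return the gaps between sorted exons as 1-based, inclusive intervals."""
    if not exon_intervals:
        return []
    ordered = sorted(exon_intervals)
    introns = []
    covered_end = ordered[0][1]
    for start, end in ordered[1:]:
        if start > covered_end + 1:
            introns.append((covered_end + 1, start - 1))
        covered_end = max(covered_end, end)  # tolerate overlapping exons
    return introns


def slice_sequence(sequence, start, end):
    """Extract a 1-based, inclusive interval from a sequence string."""
    return sequence[start - 1:end]


# Hypothetical sample records, not taken from any real annotation.
sample = [
    "chr1\tsrc\texon\t100\t200\t.\t+\t.\tID=e1;Parent=tx1",
    "chr1\tsrc\texon\t301\t400\t.\t+\t.\tID=e2;Parent=tx1",
    "chr1\tsrc\texon\t501\t600\t.\t+\t.\tID=e3;Parent=tx1",
]

for key, intervals in parse_exons(sample).items():
    print(key, infer_introns(intervals))
```

Intergenic regions could be derived in the same way by applying infer_introns to gene intervals on each sequence instead of exon intervals within a transcript.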
