Bioinformatics

Parse GTF file in Python


As bioinformatics engineers, we often need to work with GTF (Gene Transfer Format) files. Here we will go over how to use gtfparse to read data from a GTF file.

An example GTF file can be downloaded from Ensembl: ftp://ftp.ensembl.org/pub/release-100/gtf/homo_sapiens/Homo_sapiens.GRCh38.100.gtf.gz

The GTF file looks like this:

#!genome-build GRCh38.p13
#!genome-version GRCh38
#!genome-date 2013-12
#!genome-build-accession NCBI:GCA_000001405.28
#!genebuild-last-updated 2019-06
1 havana gene 11869 14409 . + . gene_id "ENSG00000223972"; gene_version "5"; gene_name "DDX11L1"; gene_source "havana"; gene_biotype "transcribed_unprocessed_pseudogene";
1 havana transcript 11869 14409 . + . gene_id "ENSG00000223972"; gene_version "5"; transcript_id "ENST00000456328"; transcript_version "2"; gene_name "DDX11L1"; gene_source "havana"; gene_biotype "transcribed_unprocessed_pseudogene"; transcript_name "DDX11L1-202"; transcript_source "havana"; transcript_biotype "processed_transcript"; tag "basic"; transcript_support_level "1";
1 havana exon 11869 12227 . + . gene_id "ENSG00000223972"; gene_version "5"; transcript_id "ENST00000456328"; transcript_version "2"; exon_number "1"; gene_name "DDX11L1"; gene_source "havana"; gene_biotype "transcribed_unprocessed_pseudogene"; transcript_name "DDX11L1-202"; transcript_source "havana"; transcript_biotype "processed_transcript"; exon_id "ENSE00002234944"; exon_version "1"; tag "basic"; transcript_support_level "1";
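
To make the layout concrete: each data line has nine columns (seqname, source, feature, start, end, score, strand, frame, attributes), and the last column is a list of key "value"; pairs. Below is a minimal standard-library sketch of reading one such line. It is not how gtfparse works internally; the helper name parse_gtf_line is made up for illustration, and it assumes Ensembl-style quoted attribute values. Real GTF files are tab-separated, so the whitespace fallback is only there for lines copied from a web page like this one.

```python
import re

# The eight fixed columns that precede the attributes column
GTF_COLUMNS = ["seqname", "source", "feature", "start", "end",
               "score", "strand", "frame"]

# Matches attribute pairs such as: gene_id "ENSG00000223972";
ATTRIBUTE_RE = re.compile(r'(\S+)\s+"([^"]*)"\s*;')


def parse_gtf_line(line):
    """Return a dict for one GTF data line, or None for comment/blank lines."""
    line = line.strip()
    if not line or line.startswith("#"):
        return None

    # Prefer tabs (the real format); fall back to whitespace splitting
    fields = line.split("\t", 8)
    if len(fields) < 9:
        fields = line.split(None, 8)
    if len(fields) < 9:
        raise ValueError("expected 9 columns, got %d" % len(fields))

    record = dict(zip(GTF_COLUMNS, fields[:8]))
    record["start"] = int(record["start"])
    record["end"] = int(record["end"])
    # If a key repeats (e.g. several "tag" entries), the last value wins
    record["attributes"] = dict(ATTRIBUTE_RE.findall(fields[8]))
    return record
```

Calling parse_gtf_line on the first gene line above gives a record whose "feature" is "gene" and whose attributes include gene_name "DDX11L1"; header lines starting with # return None.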

Let’s say we want to extract the gene_name values and save them to a text file. First, we need to install gtfparse.

pip install gtfparse

Then we read the gtf file.
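
A minimal sketch of this step, under a few assumptions: the file has been downloaded and decompressed to Homo_sapiens.GRCh38.100.gtf (a local file name chosen here for illustration), and we only want rows whose feature is "gene". read_gtf is the loader provided by gtfparse; depending on the installed version it may return a pandas or a polars DataFrame, so the sketch only uses column access and to_list(), which both offer. The function names gene_names_from_columns and read_gene_names are hypothetical helpers, not part of gtfparse.

```python
def gene_names_from_columns(features, names):
    """Keep the gene_name of every row whose feature is 'gene'.

    Empty or missing names (empty string, None, NaN) are skipped.
    """
    return [
        name
        for feature, name in zip(features, names)
        if feature == "gene" and isinstance(name, str) and name
    ]


def read_gene_names(gtf_path):
    """Load a GTF file with gtfparse and return the gene names in file order."""
    # Imported here so the helper above can be used without gtfparse installed
    from gtfparse import read_gtf

    df = read_gtf(gtf_path)
    # Plain Python lists keep the rest of the code DataFrame-agnostic
    return gene_names_from_columns(df["feature"].to_list(),
                                   df["gene_name"].to_list())


if __name__ == "__main__":
    # Assumed local path to the decompressed Ensembl file
    gene_names = read_gene_names("Homo_sapiens.GRCh38.100.gtf")
    print(gene_names[:5])
```

Filtering on the "gene" feature avoids collecting the same name again from every transcript and exon row of that gene.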

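Note: In this example, we ignore gene names like “AL627309.2” (clone-based names with a numeric suffix); in the output below they appear only as the prefix, e.g. AL627309.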
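
One way to read the note above is that the part after the dot is dropped and each resulting name is written only once, which would match the sample output where AL627309 appears a single time. The sketch below follows that reading; it is a guess at the behaviour, not the original code, and the helper names and the output file name gene_names.txt are made up for illustration.

```python
def strip_version_suffix(name):
    """'AL627309.2' -> 'AL627309'; names without a dot are returned unchanged."""
    return name.split(".", 1)[0]


def unique_in_order(names):
    """Drop duplicates while keeping the first occurrence of each name."""
    return list(dict.fromkeys(names))


def save_gene_names(names, out_path):
    """Write one cleaned gene name per line and return the cleaned list."""
    cleaned = unique_in_order(strip_version_suffix(n) for n in names)
    with open(out_path, "w") as handle:
        for name in cleaned:
            handle.write(name + "\n")
    return cleaned


if __name__ == "__main__":
    # A few hand-written names standing in for the list read from the GTF file
    example = ["DDX11L1", "AL627309.1", "AL627309.3", "CICP27", "AL627309.2"]
    print(save_gene_names(example, "gene_names.txt"))
```

Using dict.fromkeys keeps the names in the order they occur in the file, which a plain set would not guarantee.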

The result is saved to a text file, one gene name per line:

DDX11L1
WASH7P
MIR6859-1
MIR1302-2HG
MIR1302-2
FAM138A
OR4G4P
OR4G11P
OR4F5
AL627309
CICP27
RNU6-1100P
FO538757
WASH9P
MIR6859-2
AP006222
RPL23AP24
AL732372
WBP1LP7
OR4F29
CICP7
U6
AL669831
AC114498
MTND1P23
MTND2P28
MTCO1P12

Hope it helps.

~~PEACE~~

A passionate automation engineer who strongly believes in “A man can do anything he wants if he puts in the work”.
