bpy12. Using egquery EUtil in Biopython

We can use Bio.Entrez.egquery to find the number of hits for a query term in each of the Entrez databases.

Set Entrez.email to your own email address. The query term here is ‘asthma’. Biopython returns a dictionary of length 2, with the result under the key ‘eGQueryResult’. The value stored under ‘eGQueryResult’ is a list containing one dictionary per database.
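To make that shape concrete, here is a hand-written stand-in for the parsed record that runs without a network connection. The two counts are copied from the sample output in this post; the key names "Term", "DbName" and "Status" are assumptions about what else the parsed result carries, since the script here only relies on ‘MenuName’ and ‘Count’.

```python
# Hand-built stand-in for the object returned by Entrez.read() after an
# egquery call. Only "eGQueryResult", "MenuName" and "Count" are confirmed
# by the script in this post; "Term", "DbName" and "Status" are assumptions.
mock_record = {
    "Term": "asthma",
    "eGQueryResult": [
        {"DbName": "pubmed", "MenuName": "PubMed", "Count": "149142", "Status": "Ok"},
        {"DbName": "mesh", "MenuName": "MeSH", "Count": "11", "Status": "Ok"},
    ],
}

# Two top-level keys, as described in the text.
print(len(mock_record))

# Each entry of the list is a dictionary describing one database.
first = mock_record["eGQueryResult"][0]
print(first["MenuName"], first["Count"])

# Counts are strings, so convert them before doing arithmetic or comparisons.
print(int(first["Count"]) > 0)
```

Because the counts are strings, the script converts them with int() before testing for zero.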

The result is iterated over, with row being the dictionary for the current database. Usually, we are only interested in two of its keys: the database name ‘MenuName’ and the count ‘Count’.

We populate list X with the counts and list Y with the database names, both as strings. Finally, we print each database name with its count, skipping databases whose count is zero.


# bpy12.py
from __future__ import print_function, division
from Bio import Entrez

# Replace the address below with your own email address.
Entrez.email = "[email protected]"
handle = Entrez.egquery(term="asthma")
record = Entrez.read(handle)
handle.close()

X = []  # counts, as strings
Y = []  # database names
for row in record["eGQueryResult"]:
    # row is the dictionary for one database
    for i in row:
        if i == 'MenuName':
            Y.append(row[i])
        elif i == 'Count':
            X.append(row[i])

for i in range(len(X)):
    # skip databases with no hits
    if int(X[i]) == 0:
        continue
    print("%20s\t%s" % (Y[i], X[i]))

# PubMed 149142
# PubMed Central 106753
# MeSH 11
# Books 7165
# PubMed Health 1461
# OMIM 270
# Site Search 75
# Nucleotide 32446
# GSS 91
# EST 210
# Protein 2558
# Genome 1
# Structure 390
# dbVar 9
# Gene 1034
# SRA 8245
# BioSystems 125
# UniGene 1
# Conserved Domains 22
# PopSet 7
# GEO Profiles 731699
# GEO DataSets 4417
# HomoloGene 18
# PubChem Compound 122
# PubChem Substance 2125
# PubChem BioAssay 3515
# NLM Catalog 2610
# Probe 34
# dbGaP 2387
# BioProject 209
# BioSample 12542
