A Critical Guide to the neXtProt knowledgebase: querying using SPARQL
This Critical Guide in the Introduction to Bioinformatics series briefly outlines how to explore the neXtProt human protein database using SPARQL. Although text indexing has made database contents more accessible, combining search criteria for specific content permits more powerful querying and provides a means of mining the information stored in databases. This Guide illustrates the use of the SPARQL semantic query language to interrogate neXtProt and other databases that provide SPARQL endpoints.
Specifically, the Guide introduces the concept of database ‘semantic triples’, and examines features of the neXtProt data model. On reading this Guide, and completing the exercises, users will be able to: i) identify key entities within the neXtProt data model; ii) explain what these entities represent, what information they contain and what the information is used for; iii) identify key SPARQL syntax elements; iv) understand SPARQL tutorial examples; and v) write a SPARQL query to retrieve entries matching specific criteria.
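As a rough illustration of the 'semantic triples' idea and of combining search criteria, the following minimal Python sketch stores a few subject-predicate-object triples and matches them against patterns containing variables, much as a SPARQL WHERE clause does. The entry identifiers, predicate names and values are all hypothetical and are not taken from the neXtProt data model; the matcher is a toy, not how a SPARQL endpoint is implemented.

```python
# Toy triple store: each fact is a (subject, predicate, object) triple.
# All names below are hypothetical, NOT neXtProt's real vocabulary.
TRIPLES = [
    ("entry:P1", "locatedIn", "nucleus"),
    ("entry:P1", "expressedIn", "liver"),
    ("entry:P2", "locatedIn", "nucleus"),
    ("entry:P2", "expressedIn", "brain"),
    ("entry:P3", "locatedIn", "mitochondrion"),
    ("entry:P3", "expressedIn", "liver"),
]


def match_pattern(pattern, triple, bindings):
    """Unify one pattern with one triple; return extended bindings or None."""
    new = dict(bindings)
    for term, value in zip(pattern, triple):
        if term.startswith("?"):
            # A variable must keep the same value everywhere it appears.
            if new.setdefault(term, value) != value:
                return None
        elif term != value:
            return None
    return new


def query(patterns, triples):
    """Return every set of variable bindings that satisfies all patterns."""
    solutions = [{}]
    for pattern in patterns:
        solutions = [
            extended
            for bindings in solutions
            for triple in triples
            if (extended := match_pattern(pattern, triple, bindings)) is not None
        ]
    return solutions


# Loosely analogous to a SPARQL query of the form (hypothetical predicates):
#   SELECT ?entry WHERE { ?entry :locatedIn "nucleus" .
#                         ?entry :expressedIn "liver" . }
results = query(
    [("?entry", "locatedIn", "nucleus"), ("?entry", "expressedIn", "liver")],
    TRIPLES,
)
found = sorted(b["?entry"] for b in results)
print(found)  # ['entry:P1']
```

Because both patterns share the variable `?entry`, only subjects that satisfy both criteria are returned, which is the kind of combined querying that plain text search does not offer.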
Keywords: Human protein database, Introduction bioinformatics, Introduction neXtProt, neXtProt data model, RDF triples, Semantic triples, SPARQL queries, SPARQL syntax, Training material
Target audience: Beginners
Remote created date: 2019-06-06
Scientific topics: Database management