Documentation

Identification scheme

Identification of data entities

The canonical URI for uniquely identifying data entities is of the form:

http://identifiers.org/[collection]/[entity]

Here, [collection] must be replaced with the namespace of a data collection, and [entity] with the identifier of the entity as assigned by the original data provider.

Examples of such URIs:

Those URIs should be used in most cases, as they directly identify the data.
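As an illustration of the pattern above, the following minimal Python sketch assembles a canonical URI from its two parts. The helper name data_entity_uri is hypothetical and not part of any Identifiers.org client; the collection/entity pairs are taken from examples used elsewhere in this documentation.

```python
BASE = "http://identifiers.org"


def data_entity_uri(collection: str, entity: str) -> str:
    """Build a canonical URI of the form http://identifiers.org/[collection]/[entity]."""
    collection = collection.strip()
    entity = entity.strip()
    if not collection or not entity:
        raise ValueError("both a collection namespace and an entity identifier are required")
    return f"{BASE}/{collection}/{entity}"


print(data_entity_uri("pubmed", "22140103"))  # http://identifiers.org/pubmed/22140103
print(data_entity_uri("ec-code", "1.1.1.1"))  # http://identifiers.org/ec-code/1.1.1.1
```

The sketch does no validation of the identifier against the data collection; that check is performed by the service itself.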

Identification of Registry records

Registry records are identified and referenced via URIs of the form: http://info.identifiers.org/[collection]/[entity].

Those should only be used to identify and retrieve metadata provided by the Registry or access information in RDF/XML. Those URIs should NOT be used for identifying data entries.
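The two URI forms differ only in the host name, so one can be derived from the other. The sketch below shows one way to do this in Python; the function name to_registry_uri is hypothetical.

```python
from urllib.parse import urlsplit, urlunsplit

DATA_HOST = "identifiers.org"
REGISTRY_HOST = "info.identifiers.org"


def to_registry_uri(data_uri: str) -> str:
    """Turn a canonical data-entity URI into the URI of its Registry record."""
    parts = urlsplit(data_uri)
    if parts.netloc != DATA_HOST:
        raise ValueError(f"not a canonical Identifiers.org URI: {data_uri}")
    # Only the host changes; the [collection]/[entity] path is kept as-is.
    return urlunsplit(parts._replace(netloc=REGISTRY_HOST))


print(to_registry_uri("http://identifiers.org/ec-code/1.1.1.1"))
# http://info.identifiers.org/ec-code/1.1.1.1
```

The result is suitable for retrieving Registry metadata only; the canonical form remains the one to use for identifying data.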

Response formats

Currently supported formats

Format name    Accepted Internet media types
(X)HTML        text/html, application/xhtml+xml
RDF/XML        application/rdf+xml

Content negotiation

Content negotiation is a mechanism defined in the HTTP specification that makes it possible to serve different versions of a document at the same URI. Identifiers.org is able to handle content negotiation.

Response format specified in URL

In addition to supporting content negotiation, one can specify the requested response format in the URL, via a suffix:
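A minimal sketch of the suffix approach follows. It assumes the suffix is appended to the Registry record URI after a dot, and that rdf is the suffix for RDF/XML (this suffix appears in the error-handling example of this documentation); the helper name with_format_suffix is hypothetical, and no other suffix is assumed.

```python
def with_format_suffix(registry_uri: str, suffix: str = "rdf") -> str:
    """Append a response-format suffix to a Registry record URI.

    Assumption: the format is selected by '.<suffix>' at the end of the URI.
    """
    return f"{registry_uri.rstrip('/')}.{suffix}"


print(with_format_suffix("http://info.identifiers.org/ec-code/1.1.1.1"))
# http://info.identifiers.org/ec-code/1.1.1.1.rdf
```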

Testing

To run some tests, you can use cURL. For example, the following command queries http://info.identifiers.org/ec-code/1.1.1.1 and requests a response encoded in RDF/XML:

    curl -H "Accept: application/rdf+xml" "http://info.identifiers.org/ec-code/1.1.1.1"
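The same request can be sketched with Python's standard library. The function names build_rdf_request and fetch_rdf are hypothetical; the Accept header value is the RDF/XML media type listed under the supported formats.

```python
from urllib.request import Request, urlopen


def build_rdf_request(url: str) -> Request:
    """Prepare a GET request that asks for RDF/XML through content negotiation."""
    return Request(url, headers={"Accept": "application/rdf+xml"})


def fetch_rdf(url: str, timeout: float = 10.0) -> bytes:
    """Send the request and return the raw response body."""
    with urlopen(build_rdf_request(url), timeout=timeout) as response:
        return response.read()


request = build_rdf_request("http://info.identifiers.org/ec-code/1.1.1.1")
print(request.get_method(), request.full_url, request.get_header("Accept"))
```

Calling fetch_rdf performs the actual network request; building the request separately makes the negotiated header easy to inspect first.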

Custom requests: usage of profiles

Work is in progress to allow users to create profiles in the Registry. With a profile, users will be able to select the data collections they use and preselect a preferred resolving location (or resource) for each of them. To resolve identifiers, users can then append the profile shortname to the URI, which resolves directly to the recorded preferred location. For example, http://identifiers.org/pubmed/22140103?profile=demo directly displays the identified publication using Europe PubMed Central.

Such profiles may be either 'public' or 'private', and hence available to other users or not, respectively. In the latter case, direct resolution of identifiers requires a key.

The infrastructure already supports this new feature (hence the examples using the 'demo' and 'most_reliable' profiles), but work is still in progress to provide a user interface to manage those profiles. We will send an announcement once this feature is available to all.

Error handling

If an error occurs (due to a malformed query or a server issue), a human-readable message is returned to the user and the response uses the appropriate HTTP status code.

Invalid entity identifier

If the identifier provided is not valid for the data collection, a 400 Bad Request response will be issued. For example: http://identifiers.org/uniprot/P123456

Unknown data collection

If the data collection is not registered in the Registry, a 404 Not Found response will be issued. For example: http://identifiers.org/foo/12345

Requested format is unavailable

Currently content negotiation is only supported for the Registry's content. If the format type is unavailable, a 404 Not Found response will be issued. For example: http://identifiers.org/uniprot/P12345.rdf

Server error

If there is an issue on the server side, a 500 Internal Server Error response will be issued. If you encounter such an error, please report it. This will allow us to investigate and try to fix the underlying issue.
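A client can map the status codes described in this section to a short explanation. The sketch below is one possible approach using Python's standard library; the names STATUS_MEANINGS, explain_status and check_uri are hypothetical, and the messages paraphrase this documentation rather than the text returned by the server.

```python
from urllib.error import HTTPError
from urllib.request import urlopen

# Status codes documented in this section, with a paraphrased meaning.
STATUS_MEANINGS = {
    400: "the identifier is not valid for the data collection",
    404: "unknown data collection, or requested format unavailable",
    500: "server-side issue; please report it",
}


def explain_status(code: int) -> str:
    """Return the documented meaning of an HTTP error status code."""
    return STATUS_MEANINGS.get(code, "status code not described in this documentation")


def check_uri(uri: str, timeout: float = 10.0) -> str:
    """Request a URI and summarise the outcome (performs a network call)."""
    try:
        with urlopen(uri, timeout=timeout) as response:
            return f"{response.status}: resolved"
    except HTTPError as error:
        return f"{error.code}: {explain_status(error.code)}"


print(explain_status(400))
print(explain_status(404))
```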

Example queries

Identifiers.org allows unique, persistent and unambiguous identification of various kinds of concepts.

Identification of data entities

Identifiers.org allows identification of a single data entity:

Identification of records in the Registry

Identifiers.org allows identification of a single record within the Registry:

Identification of data collections

Identifiers.org allows identification of data collections (these URIs currently display information directly from the Registry):

Identification of resources

Identifiers.org allows identification of resources (or physical locations):

Response format

A specific response format for a Registry entry can be requested by using an extension:

Note: when used as identifiers, URIs follow the canonical form and do not contain any extension or the info sub-domain.
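The note above can be expressed as a small normalisation step. The following sketch strips the info sub-domain, a known format extension, and any query string. It assumes .rdf is the only extension to remove (other extensions are not listed here), and the names KNOWN_EXTENSIONS and canonical_uri are hypothetical. Only known extensions are stripped because identifiers such as 1.1.1.1 legitimately contain dots.

```python
from urllib.parse import urlsplit, urlunsplit

# Assumption: '.rdf' is the only format extension handled by this sketch.
KNOWN_EXTENSIONS = (".rdf",)


def canonical_uri(uri: str) -> str:
    """Reduce a Registry or customised URI to the canonical identifier form."""
    parts = urlsplit(uri)
    host = "identifiers.org" if parts.netloc == "info.identifiers.org" else parts.netloc
    path = parts.path
    for extension in KNOWN_EXTENSIONS:
        if path.endswith(extension):
            path = path[: -len(extension)]
            break
    # The query string (for example a profile parameter) is dropped.
    return urlunsplit((parts.scheme, host, path, "", parts.fragment))


print(canonical_uri("http://info.identifiers.org/ec-code/1.1.1.1.rdf"))
# http://identifiers.org/ec-code/1.1.1.1
```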

Customised queries

It is possible to request the resolution to use a specific resource. This is done using the resource parameter:

Profiles make it possible to define specific subsets of the complete Registry in which each selected data collection has one preferred resource. This allows one to retrieve information directly from an Identifiers.org URI using one predefined physical location. If a profile is specified, the recorded preferred resource will be used:

A special profile, called most_reliable, has a preferred resource recorded for every data collection. This preferred resource is always the most reliable resource available for the data collection. If several resources tie for this title with the same uptime, one is selected at random at query time.
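The selection rule described for most_reliable can be sketched as follows. This is an illustration of the stated behaviour, not the Registry's actual implementation; the function name pick_most_reliable, the resource names and the uptime figures are all hypothetical.

```python
import random
from typing import Mapping, Optional


def pick_most_reliable(uptimes: Mapping[str, float],
                       rng: Optional[random.Random] = None) -> str:
    """Return the resource with the highest uptime, breaking ties at random."""
    if not uptimes:
        raise ValueError("at least one resource is required")
    chooser = rng or random
    best = max(uptimes.values())
    # Every resource sharing the best uptime is an equally valid candidate.
    candidates = sorted(name for name, uptime in uptimes.items() if uptime == best)
    return chooser.choice(candidates)


# Hypothetical resources and uptime percentages, for illustration only.
print(pick_most_reliable({"resource_a": 99.0, "resource_b": 97.5}))  # resource_a
print(pick_most_reliable({"resource_a": 99.9, "resource_b": 99.9, "resource_c": 90.0}))
```

The second call returns either resource_a or resource_b, and the result may differ between calls.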

In case the profile is private, a key parameter must be provided in the URL.
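The resource, profile and key parameters described in this section can be assembled into a query string as sketched below. The function name customised_uri and the values myprofile and secret are hypothetical; whether resource and profile may be combined in one request is not stated here, so the sketch simply adds whichever parameters are given.

```python
from typing import Optional
from urllib.parse import urlencode


def customised_uri(collection: str, entity: str, *,
                   resource: Optional[str] = None,
                   profile: Optional[str] = None,
                   key: Optional[str] = None) -> str:
    """Build an Identifiers.org URI with optional customisation parameters."""
    uri = f"http://identifiers.org/{collection}/{entity}"
    params = {}
    if resource is not None:
        params["resource"] = resource
    if profile is not None:
        params["profile"] = profile
    if key is not None:
        # Only needed when the profile is private.
        params["key"] = key
    return f"{uri}?{urlencode(params)}" if params else uri


print(customised_uri("pubmed", "22140103", profile="demo"))
# http://identifiers.org/pubmed/22140103?profile=demo
print(customised_uri("pubmed", "22140103", profile="most_reliable"))
# http://identifiers.org/pubmed/22140103?profile=most_reliable
```

For a private profile, a call such as customised_uri("pubmed", "22140103", profile="myprofile", key="secret") adds the key parameter after the profile.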

SPARQL Endpoint

The Identifiers.org SPARQL endpoint allows the conversion of URIs from one given scheme to the equivalent alternative ones. It was developed specifically with semantic data integration in mind, where one often needs to consume heterogeneous datasets which use different types of URIs. This service relies on the URI schemes recorded in the Registry. If you find a URI scheme which is not yet listed, please report it to us, either via the 'suggest modifications' link on the relevant data collection page of the Registry, or directly by emailing us.

Implementation

This SPARQL endpoint is implemented using the Sesame openRDF platform. SPARQL query results are generated on the fly from the Registry's database content. Consequently, this SPARQL endpoint does not allow you to list all the content of the database.
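Because results are generated on the fly, a query needs to start from a known URI. The sketch below prepares such a request following the standard SPARQL protocol (query sent as the query parameter of a GET request). It assumes the endpoint address http://identifiers.org/services/sparql used in the second example query and assumes the endpoint can return JSON results; the function name build_same_as_request is hypothetical.

```python
from urllib.parse import urlencode
from urllib.request import Request

# Assumption: endpoint address as used in the federated query examples.
ENDPOINT = "http://identifiers.org/services/sparql"


def build_same_as_request(uri: str, endpoint: str = ENDPOINT) -> Request:
    """Prepare a SPARQL request asking for the URIs equivalent to the given one."""
    query = (
        "PREFIX owl: <http://www.w3.org/2002/07/owl#>\n"
        f"SELECT ?otherURIs WHERE {{ <{uri}> owl:sameAs ?otherURIs . }}"
    )
    url = f"{endpoint}?{urlencode({'query': query})}"
    # Assumption: the endpoint supports the standard JSON results format.
    return Request(url, headers={"Accept": "application/sparql-results+json"})


request = build_same_as_request("http://identifiers.org/uniprot/P12345")
print(request.full_url)
```

Passing the prepared request to urllib.request.urlopen sends it; the block itself performs no network access.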

SPARQL query examples

1. For all the cross-references associated with elements of type SBML species in the model Edelstein1996 - EPSP ACh event, retrieve the relevant descriptions from Bio2RDF.

Run this query at the BioModels SPARQL endpoint

    PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
    PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
    PREFIX owl: <http://www.w3.org/2002/07/owl#>
    PREFIX xsd: <http://www.w3.org/2001/XMLSchema#>
    PREFIX dc: <http://purl.org/dc/elements/1.1/>
    PREFIX dcterms: <http://purl.org/dc/terms/>
    PREFIX foaf: <http://xmlns.com/foaf/0.1/>
    PREFIX skos: <http://www.w3.org/2004/02/skos/core#>
    PREFIX sbmlrdf: <http://identifiers.org/biomodels.vocabulary#>

    SELECT DISTINCT ?species ?annotation ?description WHERE {
       <http://identifiers.org/biomodels.db/BIOMD0000000001> sbmlrdf:species ?species .
       ?species <http://biomodels.net/biology-qualifiers#isVersionOf> ?annotation .

       SERVICE <http://dev.identifiers.org/services/sparql>{
          ?annotation owl:sameAs ?otherURIs .
       }

       SERVICE <http://bioportal.bio2rdf.org/sparql>{
          ?otherURIs dcterms:description ?description .
       }

    }
    LIMIT 10

2. A SPARQL query that combines information in UniProt and Wikipathways.

Run this query at the UniProt SPARQL endpoint

    PREFIX up: <http://purl.uniprot.org/core/>
    PREFIX taxon: <http://purl.uniprot.org/taxonomy/>
    PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
    PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
    PREFIX faldo: <http://biohackathon.org/resource/faldo#>
    PREFIX owl: <http://www.w3.org/2002/07/owl#>

    SELECT ?protein ?diseaseComment ?pathway ?pathwayName ?x ?z ?otherIRIs WHERE {
       ?protein up:annotation/up:disease/rdfs:comment ?diseaseComment .

       SERVICE <http://identifiers.org/services/sparql>{
          ?protein owl:sameAs ?otherIRIs .
       }

       SERVICE <http://sparql.wikipathways.org/> {
          ?x <http://purl.org/dc/terms/isPartOf> ?pathway ;
             ?z ?otherIRIs .
          ?pathway <http://purl.org/dc/elements/1.1/title> ?pathwayName
       }
    }
    LIMIT 10