Federated queries

One major advantage of linked open data is that we are not limited to the information stored in our own graph. Because we have matched most of our values to existing Wikidata items (where possible), we can write queries that link to other knowledge bases such as Wikidata [or others that also link to Wikidata items]. Since we have location information such as places of publication or narrative locations of the novels, and Wikidata provides coordinates for many locations, we can write a federated query to retrieve those coordinates and visualize the locations on a map.

We plan to change the prefixes in our MiMoTextBase soon, so that the Wikidata prefixes no longer have to be changed. However, the default setting for each Wikibase instance is to use the Wikidata prefixes, so for now we rename the actual Wikidata prefixes as shown below. First, define the prefixes you are going to use for the other knowledge base in the query:

PREFIX wid: <http://www.wikidata.org/entity/> #wikidata wd
PREFIX widt: <http://www.wikidata.org/prop/direct/> #wikidata wdt

Within the WHERE part you can query any items you like, as long as they have a P13 (exact match with Wikidata) property. So in our example:

?item wdt:P32 ?nar_loc. #a novel has a narrative location
?nar_loc wdt:P13 ?wikidataEntityLink. #the narrative location has an exact match

Next, using the SERVICE keyword with the Wikidata SPARQL endpoint, we can retrieve any information listed on the matching Wikidata entry. To write the triple, we now need to use Wikidata's own property identifiers. Here, P625 is the coordinate location of the Wikidata entity.

SERVICE <https://query.wikidata.org/sparql> {
  ?wikidataEntityLink widt:P625 ?coordinateLocation.
}

Example: Show all narrative places (using the coordinate locations property of Wikidata)
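Put together, the pieces described above could look like the following sketch. It assumes the query is run on the MiMoTextBase SPARQL endpoint, that the wdt: prefix is predefined there for the local graph, and that the values of P13 are stored as Wikidata entity IRIs; none of this is guaranteed, so adjust as needed.

```sparql
# Sketch only. Assumptions: wdt: is predefined by the MiMoTextBase endpoint,
# and P13 values are Wikidata entity IRIs (not plain strings).
PREFIX widt: <http://www.wikidata.org/prop/direct/>   # renamed Wikidata wdt

SELECT DISTINCT ?nar_loc ?wikidataEntityLink ?coordinateLocation
WHERE {
  ?item wdt:P32 ?nar_loc .                   # a novel has a narrative location
  ?nar_loc wdt:P13 ?wikidataEntityLink .     # the location has an exact match in Wikidata

  # This part is evaluated remotely by the Wikidata endpoint
  SERVICE <https://query.wikidata.org/sparql> {
    ?wikidataEntityLink widt:P625 ?coordinateLocation .   # Wikidata: coordinate location
  }
}
```

Note that the same variable (?wikidataEntityLink) appears both outside and inside the SERVICE block; this shared variable is what joins the local results to the Wikidata results.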

Federated query
: SPARQL can be used to express queries across diverse data sources, whether the data is stored natively as RDF or viewed as RDF via middleware.


… MiMoTextBase: data.mimotext.uni-trier.de
… SPARQL-Endpoint: query.mimotext.uni-trier.de
… MiMoText project site: https://mimotext.uni-trier.de

If you are new to SPARQL, you can begin with the (short) Tutorial, which gives you an overview of how to write basic queries based on examples in the MiMoTextBase. It is meant as an introduction for beginners and cannot give you in-depth knowledge of SPARQL; for that, the helpful resources may be of use.

If you already have some SPARQL knowledge and are interested in the MiMoTextBase and its content on the authors, novels, spaces or themes of the French novel from 1751 to 1800, you can have a look at the Links.

Within GOING FURTHER there are some queries that give overviews of the data, such as dates of publication or themes changing over time, and that compare the different sources of the data in the MiMoTextBase. They come with some interpretation of the results, which could show the potential of these initial questions for further research.

If you want more detailed information about the structure and the aims of our tutorial, you can find it in the introduction of the tutorial. Information on the infrastructure and the models behind the MiMoTextBase can be found here.

An empty result table can have several causes. A simple first check is whether you have spelled the variables identically in the SELECT and the WHERE part of the query.

Another reason could be that the query is too specific. Because of the way its sources were compiled, not every item in the MiMoTextBase has a value for every property. So it can be helpful to wrap some of the properties in your query in an OPTIONAL clause.
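As a minimal sketch of this idea, the query below keeps every narrative location even if it has no Wikidata match. It uses only the two properties mentioned in the federated query section (P32 and P13) and assumes that wdt: is predefined on the MiMoTextBase endpoint.

```sparql
# Sketch: wdt: is assumed to be predefined by the MiMoTextBase endpoint.
SELECT ?item ?nar_loc ?wikidataEntityLink
WHERE {
  ?item wdt:P32 ?nar_loc .                              # required: narrative location
  OPTIONAL { ?nar_loc wdt:P13 ?wikidataEntityLink . }   # optional: exact match in Wikidata
}
```

Without OPTIONAL, all rows whose location lacks a P13 statement would be dropped; with it, ?wikidataEntityLink is simply left empty for those rows.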

If you run into this error message, you probably have to group your results. In the example below, we use the COUNT function but did not add GROUP BY.

Query to retrieve count of published works per author

The solution is easy: every variable in the SELECT part that is not aggregated must be listed in GROUP BY, so we have to group by ?authorName. We can then order the results in descending order via ORDER BY DESC(?count) and set a LIMIT of 20 to get the top 20.

Query to retrieve authors with most novels published (top 20): https://tinyurl.com/2b9af2zt
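A grouped version of such a count could look like the sketch below. The property number P5 for "author" is an assumption made for illustration only; please look up the actual identifier in the list of properties. The sketch also assumes that wdt: is predefined on the MiMoTextBase endpoint and that labels are stored as rdfs:label.

```sparql
# Sketch only. ASSUMPTION: wdt:P5 stands for "author"; verify the identifier
# in the list of properties before using this query.
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>

SELECT ?authorName (COUNT(?item) AS ?count)
WHERE {
  ?item wdt:P5 ?author .               # hypothetical: work has an author
  ?author rdfs:label ?authorName .
  FILTER(LANG(?authorName) = "en")     # one label language only
}
GROUP BY ?authorName                   # required because ?authorName is not aggregated
ORDER BY DESC(?count)
LIMIT 20
```

Leaving out the GROUP BY line reproduces the aggregation error described above.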

Sometimes a query returns so many results that result generation slows down or some visualizations become hard to read. In those cases you can add a LIMIT clause to get only the top x items, or a HAVING clause with a COUNT condition if you want only results that lie above a certain threshold.
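The following sketch shows a HAVING threshold on narrative locations (P32, as used in the federated query section). The threshold of 10 is an arbitrary value chosen for illustration, and wdt: is assumed to be predefined on the MiMoTextBase endpoint.

```sparql
# Sketch: keep only narrative locations that occur in more than 10 items.
# The threshold 10 is arbitrary; wdt: is assumed to be predefined.
SELECT ?nar_loc (COUNT(?item) AS ?count)
WHERE {
  ?item wdt:P32 ?nar_loc .
}
GROUP BY ?nar_loc
HAVING (COUNT(?item) > 10)
ORDER BY DESC(?count)
```

HAVING filters on the aggregated value after grouping, whereas LIMIT simply cuts off the result list after a fixed number of rows.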

If some items appear more often in the results than they should, make sure that you filter the labels for a single language (FR, EN or DE). The graph is multilingual, so without a filter the output contains one row per label language.
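A language filter could be sketched as follows, assuming that labels are stored as rdfs:label with language tags and that wdt: is predefined on the MiMoTextBase endpoint.

```sparql
# Sketch: return each narrative location once, with its French label only.
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>

SELECT DISTINCT ?nar_loc ?nar_locLabel
WHERE {
  ?item wdt:P32 ?nar_loc .
  ?nar_loc rdfs:label ?nar_locLabel .
  FILTER(LANG(?nar_locLabel) = "fr")   # use "en" or "de" for the other languages
}
```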

You might be looking for the right identifier for a property, novel, author, theme or location. The simplest way is to go to data.mimotext.uni-trier.de and use the search function, typing in the label (for example “London” or “about” or “philosophy”). The numerical identifier of the property or the item is visible in the URL or after the name of the item or the property.

For convenience, we also provide lists of themes, locations and properties with their numerical identifiers in the knowledge graph.

List of properties

Query: This is a list of all the properties used in this graph
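A query of this kind could be sketched as below. It relies on the standard Wikibase ontology class wikibase:Property and therefore lists all properties defined in the Wikibase instance, which may differ slightly from the properties actually used in statements.

```sparql
# Sketch: list all properties defined in the Wikibase instance with English labels.
PREFIX wikibase: <http://wikiba.se/ontology#>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>

SELECT ?property ?propertyLabel
WHERE {
  ?property a wikibase:Property ;
            rdfs:label ?propertyLabel .
  FILTER(LANG(?propertyLabel) = "en")
}
ORDER BY ?property
```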

List of themes

For a list of all thematic concepts in the graph, see this query, which lists all thematic concepts and their Q-Identifier, ordered by occurrence.

Query: Show all themes with corresponding Q-Identifier and occurrence in the graph

List of locations

For a list of all narrative places in the graph, see this query, which lists all narrative places and their Q-Identifier, ordered by occurrence.

Query: Show all narrative locations with corresponding Q-Identifier & occurrence

These queries list themes or locations ordered by occurrence. We recommend choosing items or properties that have a reasonable number of connections in the graph, so that your results contain enough data points.

There are several possible reasons for a slowdown or a timeout of your query. The number of results may be very high, so you might limit the results to check whether the syntax of the query is correct. This is done with the LIMIT clause, which tells the query engine where to stop: if you add, for example, LIMIT 100 at the end of your query, it will stop after 100 results. This can be helpful for debugging.

Keywords that can slow down a query are DISTINCT and ORDER BY. One strategy is to comment them out to see whether they are the cause.
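While debugging, a query might temporarily look like the sketch below, again using P32 from the federated query section and assuming that wdt: is predefined on the MiMoTextBase endpoint.

```sparql
# Sketch of a query in "debugging mode".
SELECT ?item ?nar_loc        # DISTINCT removed for the test run
WHERE {
  ?item wdt:P32 ?nar_loc .
}
# ORDER BY ?nar_loc          # commented out to see whether sorting is the bottleneck
LIMIT 100                    # stop after 100 results
```

Once the query runs quickly and returns plausible rows, DISTINCT and ORDER BY can be restored one at a time to find out which of them causes the slowdown.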

If you have not used Wikidata, the SPARQL syntax or the RDF format before, we recommend the Wikidata SPARQL Tutorial, Wikidata:SPARQL queries examples or this Wikidata Query Service Tutorial by Wikimedia Israel as helpful resources. Furthermore, we recommend Bob DuCharme’s book Learning SPARQL as well as his blog.

DuCharme, Bob. Learning SPARQL. Sebastopol, CA: O’Reilly Media, 2013. http://ebookcentral.proquest.com/lib/uni-trier/detail.action?docID=1250020.