Queries about secondary literature, references and quotations

Alongside the statements mined from the literary works and the bibliographic metadata, the team annotated statements about authors and literary works found in various secondary publications. These annotations were then converted to linked open data ("lodified") and imported into the MiMoTextBase. This allowed us to enrich the graph with the following statements:

  • for author-items: topic interest (P47)
  • for literary work-items: about (P36)

Through reification, these statements carry statements of their own:

  • stated in (P18): reference to the scholarly work
  • quotation (P42): string containing the quotation from which the statement was derived

To retrieve references using SPARQL, the declared prefixes have to be extended. A more detailed explanation of the prefixes can be found elsewhere, e.g. here for Wikidata. For more information about references, see here.

We do not only want to use the prefix mmdt:, which lets a property point directly to its value; we also want to address the whole statement group, because references are attached to the statement rather than to the value.
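As a rough sketch, the extended prefix block and a query that walks from an item through the statement to its reference might look as follows. The namespace IRIs are assumptions modelled on the usual Wikibase RDF layout, and the prefix names mmp: and mmps: are introduced here for illustration; please check them against the prefixes used in the linked queries.

```sparql
# Assumed namespace IRIs, following the standard Wikibase RDF layout.
PREFIX mmd:  <http://data.mimotext.uni-trier.de/entity/>          # items (Q...)
PREFIX mmdt: <http://data.mimotext.uni-trier.de/prop/direct/>     # item -> value, directly
PREFIX mmp:  <http://data.mimotext.uni-trier.de/prop/>            # item -> statement node (assumed prefix name)
PREFIX mmps: <http://data.mimotext.uni-trier.de/prop/statement/>  # statement node -> value (assumed prefix name)
PREFIX mmpr: <http://data.mimotext.uni-trier.de/prop/reference/>  # reference node -> value
PREFIX prov: <http://www.w3.org/ns/prov#>                         # statement node -> reference node

SELECT ?work ?topic ?ref
WHERE {
  ?work mmp:P36 ?statement .                # the whole "about" statement, not just its value
  ?statement mmps:P36 ?topic .              # the value of that statement
  ?statement prov:wasDerivedFrom ?refnode . # the reference attached to the statement
  ?refnode mmpr:P18 ?ref .                  # stated in: the scholarly work
}
LIMIT 10
```

With mmdt:P36 alone, the query would reach ?topic but would have no statement node from which to follow prov:wasDerivedFrom.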

With the following query, we can display the labels of all secondary literature used in the MiMoTextBase. Click here to open it in a new tab.
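A minimal sketch of such a query is given here; it is not necessarily identical to the linked one. The mmpr: namespace IRI is an assumption based on the usual Wikibase layout.

```sparql
PREFIX mmpr: <http://data.mimotext.uni-trier.de/prop/reference/>  # assumed IRI
PREFIX prov: <http://www.w3.org/ns/prov#>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>

SELECT DISTINCT ?ref ?refLabel
WHERE {
  ?statement prov:wasDerivedFrom ?refnode .  # any statement that has a reference
  ?refnode mmpr:P18 ?ref .                   # stated in: the scholarly work
  ?ref rdfs:label ?refLabel .                # one row per label language
}
ORDER BY ?refLabel
```

DISTINCT is used because the same scholarly work is usually referenced by many statements.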

Please note: As we are addressing only the reference in this query, it would be possible to combine these lines

?statement prov:wasDerivedFrom ?refnode.
?refnode mmpr:P18 ?ref.

using property paths into one line:

?statement prov:wasDerivedFrom/mmpr:P18 ?ref.

With the next query, we want to explore all items, either literary work-items or author-items, for which at least one reference to scholarly literature was made.

How many references to scholarly works does each author have? The results are sorted in descending order.
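One way to phrase this count is sketched below. The namespace IRIs and the English label filter are assumptions, and the line numbering of this sketch need not match the linked query; the line that selects authors is the one carrying the occupation comment.

```sparql
PREFIX mmd:  <http://data.mimotext.uni-trier.de/entity/>          # assumed IRI
PREFIX mmdt: <http://data.mimotext.uni-trier.de/prop/direct/>     # assumed IRI
PREFIX mmpr: <http://data.mimotext.uni-trier.de/prop/reference/>  # assumed IRI
PREFIX prov: <http://www.w3.org/ns/prov#>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>

SELECT ?item ?itemLabel (COUNT(?ref) AS ?count)
WHERE {
  ?item mmdt:P11 mmd:Q11 . # item has occupation author
  ?item ?property ?statement .                    # any statement on the item
  ?statement prov:wasDerivedFrom/mmpr:P18 ?ref .  # that is referenced by a scholarly work
  ?item rdfs:label ?itemLabel .
  FILTER(LANG(?itemLabel) = "en")                 # assumes English labels exist
}
GROUP BY ?item ?itemLabel
ORDER BY DESC(?count)
```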

If you want to ask the same for the literary works, just change line 11 from

?item mmdt:P11 mmd:Q11. # item has occupation author

to

?item mmdt:P2 mmd:Q2. # item is instance of literary work

or click here.

For each secondary literature-item in the MiMoTextBase, count how many references / quotations it has in total (please note: the same quotation can occur more than once if it is connected to more than one work, author or statement). See query here.

The following queries focus on the statements of the individual works.

The first query retrieves novels and the counts of their associated P36 (about)-statements that are referenced at least three times by any scholarly work.
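The sketch below follows one reading of this description: for each novel and topic, count the references on the about-statement and keep only those with at least three. The namespace IRIs and the prefix names mmp: and mmps: are assumptions, and the linked query may group differently.

```sparql
PREFIX mmd:  <http://data.mimotext.uni-trier.de/entity/>          # assumed IRI
PREFIX mmdt: <http://data.mimotext.uni-trier.de/prop/direct/>     # assumed IRI
PREFIX mmp:  <http://data.mimotext.uni-trier.de/prop/>            # assumed prefix and IRI
PREFIX mmps: <http://data.mimotext.uni-trier.de/prop/statement/>  # assumed prefix and IRI
PREFIX prov: <http://www.w3.org/ns/prov#>

SELECT ?work ?topic (COUNT(?refnode) AS ?refCount)
WHERE {
  ?work mmdt:P2 mmd:Q2 .                     # item is instance of literary work
  ?work mmp:P36 ?statement .                 # about-statement of the work
  ?statement mmps:P36 ?topic .               # the topic the work is about
  ?statement prov:wasDerivedFrom ?refnode .  # one row per reference on the statement
}
GROUP BY ?work ?topic
HAVING (COUNT(?refnode) >= 3)
ORDER BY DESC(?refCount)
```

HAVING filters on the aggregated value after grouping, which a FILTER inside the WHERE part cannot do.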

In the next query, the works are narrowed down to those that deal with the theme of “libertinism”. In addition, we ask for the literature in which the topic is referenced and for the corresponding quotations.

Next, we connect the topic interest-statements of the author-items with the about-statements of the novels. With this query, we retrieve all topic interests of one author and show the novels in which these topic interests are represented. The query also shows the respective quotations for the about-statements of the novels as well as the quotations for the matching topic interest-statements of the author. The quotations are shown only if they differ between the topic interest-statement and the about-statement.

If you additionally want to combine the references and quotations, so that all quotations for topic interest and all quotations for about are each shown as one string, check this query.

As the quotation statements are of the datatype “string”, it is possible to run a string search within the quotations. For example, if you want to search for mentions of “Voltaire” in either the about-statements of the novels or the topic interest-statements of the authors, you can add the line FILTER(CONTAINS(?quotation, "Voltaire")), see query here.
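A minimal sketch of such a string search, assuming the namespace IRIs below and the prefix name mmp: for the statement-level properties:

```sparql
PREFIX mmp:  <http://data.mimotext.uni-trier.de/prop/>            # assumed prefix and IRI
PREFIX mmpr: <http://data.mimotext.uni-trier.de/prop/reference/>  # assumed IRI
PREFIX prov: <http://www.w3.org/ns/prov#>

SELECT ?item ?quotation
WHERE {
  ?item (mmp:P36|mmp:P47) ?statement .       # about- or topic interest-statement
  ?statement prov:wasDerivedFrom ?refnode .
  ?refnode mmpr:P42 ?quotation .             # quotation attached to the reference
  FILTER(CONTAINS(?quotation, "Voltaire"))   # case-sensitive substring match
}
```

CONTAINS is case-sensitive; wrapping both arguments in LCASE() would be one way to make the search case-insensitive.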

If you are interested in the temporal development of the topics written about in the secondary literature, this can be displayed as a bar chart. For this purpose, in addition to the topics of the primary texts and their references, the year of publication of the secondary literature is queried. As secondary literature is not available for every year in the period 1902-2008, the year is additionally extracted from the ?date variable using BIND(STR(YEAR(?date)) AS ?year) and converted to a string in order to avoid larger gaps in the representation. See query here.
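The year extraction can be tried in isolation. The sketch below does not touch the graph at all; the two dates are illustrative literals chosen here and are not taken from the data.

```sparql
PREFIX xsd: <http://www.w3.org/2001/XMLSchema#>

SELECT ?date ?year
WHERE {
  # Illustrative values only; in the real query ?date comes from the graph.
  VALUES ?date {
    "1902-01-01T00:00:00Z"^^xsd:dateTime
    "2008-01-01T00:00:00Z"^^xsd:dateTime
  }
  BIND(STR(YEAR(?date)) AS ?year)  # YEAR() yields an integer, STR() turns it into a plain string
}
```

Because ?year is a string, a chart treats the years as categories rather than as points on a continuous time axis.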


… MiMoTextBase: data.mimotext.uni-trier.de
… SPARQL-Endpoint: query.mimotext.uni-trier.de
… MiMoText project site: https://mimotext.uni-trier.de

If you are new to SPARQL, you can go through the (short) Tutorial, which will give you an overview of how to write basic queries based on examples in MiMoTextBase. It is meant to give newcomers an introduction to SPARQL, but it cannot give you in-depth knowledge of SPARQL; these resources may help you with that.

If you already have some SPARQL knowledge and are interested in MiMoTextBase and its content on authors, novels, spaces or themes of the French novel in 1751-1800, you can have a look at the links.

Within GOING FURTHER, there are queries that give overviews of the data, such as dates of publication or themes changing over time, and that compare the different sources of the data in MiMoTextBase. They come with some interpretation of the results, which may show the potential of these initial questions for further research.

If you want more detailed information about the structure and the aims of our tutorial, you can find it in the introduction of the tutorial. Information on the infrastructure and the models behind MiMoTextBase can be found here.

Having no results in the result table can have different reasons. A simple solution is to check whether the variables are spelled the same in the SELECT and the WHERE part of the query.

Another reason could be that the query is too specific. Because of its sources, not all items in MiMoTextBase contain information on all properties. So it can be helpful to wrap some of the properties in your query in an OPTIONAL clause, see here.
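As a small illustration, the sketch below lists literary works and adds their topics only where an about-statement exists. The namespace IRIs are assumptions based on the usual Wikibase layout.

```sparql
PREFIX mmd:  <http://data.mimotext.uni-trier.de/entity/>       # assumed IRI
PREFIX mmdt: <http://data.mimotext.uni-trier.de/prop/direct/>  # assumed IRI

SELECT ?work ?topic
WHERE {
  ?work mmdt:P2 mmd:Q2 .              # item is instance of literary work
  OPTIONAL { ?work mmdt:P36 ?topic }  # keep the work even if it has no about-statement
}
LIMIT 100
```

Without OPTIONAL, every work lacking an about-statement would drop out of the results; with it, ?topic is simply left empty for those works.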

If you run into this error message, you probably have to group items. In the example below, we use the COUNT function but have forgotten to add GROUP BY.

Query to retrieve count of published works per author:

The solution is easy: we have to group the results by ?authorName. We can then get the results in descending order via order by desc(?count) and set a limit of 20 to get the top 20.

Query to retrieve authors with most novels published (top 20):
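A sketch of the corrected query is given below. The property ID P5 for "author" is an assumption and should be checked against the list of properties, as should the namespace IRIs.

```sparql
PREFIX mmd:  <http://data.mimotext.uni-trier.de/entity/>       # assumed IRI
PREFIX mmdt: <http://data.mimotext.uni-trier.de/prop/direct/>  # assumed IRI
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>

SELECT ?authorName (COUNT(?item) AS ?count)
WHERE {
  ?item mmdt:P2 mmd:Q2 .            # item is instance of literary work
  ?item mmdt:P5 ?author .           # assumed property ID for "author"
  ?author rdfs:label ?authorName .
}
GROUP BY ?authorName                # leaving this line out triggers the aggregation error
ORDER BY DESC(?count)
LIMIT 20
```

Every non-aggregated variable in the SELECT part, here ?authorName, has to appear in GROUP BY.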

Sometimes a query returns many results, which can slow down the result generation or impair the readability of some visualizations. In those cases you could add the LIMIT operation (see here) to get only the top x items, or the HAVING COUNT operation (see here) if you only want results that lie above a certain threshold.

If some of the items appear more often in the results than they should, make sure you filter all labels for one language (FR, EN, DE). The graph is multilingual, so without a filter the output contains the labels in all languages within the graph, see here.
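A minimal sketch of such a language filter, assuming the mmdt: namespace IRI below:

```sparql
PREFIX mmdt: <http://data.mimotext.uni-trier.de/prop/direct/>  # assumed IRI
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>

SELECT DISTINCT ?topic ?topicLabel
WHERE {
  ?work mmdt:P36 ?topic .           # work is about topic
  ?topic rdfs:label ?topicLabel .
  FILTER(LANG(?topicLabel) = "en")  # keep one label per topic instead of one per language
}
```

Each label variable in a query needs its own filter of this kind.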

If you're looking for the right identifier for properties, novels, authors, themes or locations, the simplest way is to visit data.mimotext.uni-trier.de and type in the label (for example “London” or “about” or “philosophy”) in the search bar. The numerical identifier of the property or the item is visible in the URL or behind the name of the item or the property.

You can also consult our lists of themes, locations and properties and their numerical identifier in the knowledge graph below.

List of properties

Query: Retrieve a list of all the properties used in this graph

List of themes

For a list of all thematic concepts in the graph, see this query which lists all thematic concepts and their Q-identifier, ordered by occurrence:

List of locations

For a list of all narrative places in the graph, see this query which lists all narrative places and their Q-identifier, ordered by occurrence:

These queries list themes or locations ordered by occurrence. We recommend using items or properties which have a certain number of connections in the graph, in order to get good results (with enough data points).

There are several possible reasons for a slowdown or a timeout of your query. It could be that the quantity of results is very high, so you might limit the results to check if the syntax of the query is OK. This is done by using the LIMIT parameter. LIMIT tells the query engine where to stop, so if you insert for example LIMIT 100 at the end of your query, it will stop after 100 results. This can be helpful for debugging.

Keywords that can slow down a query are DISTINCT and ORDER BY. One strategy is to comment them out to see whether they are the cause of the slowdown.

If you have not used Wikidata, the SPARQL syntax or the RDF format before, we can recommend the Wikidata SPARQL Tutorial, Wikidata:SPARQL queries examples, the SPARQL Playground or this Wikidata Query Service Tutorial by Wikimedia Israel as helpful resources. Furthermore, we can recommend Bob DuCharme’s book "Learning SPARQL" as well as his blog:

DuCharme, Bob. Learning SPARQL. Sebastopol, CA: O’Reilly Media, 2013. http://ebookcentral.proquest.com/lib/uni-trier/detail.action?docID=1250020.