Wikidata general overview
General overview of the Wikidata universe for newcomers to get more familiar and self-reliant in this (sometimes) confusing ecosystem. Collected and curated by KB, national library of the Netherlands.
This page is a textual summary of the (Dutch language) course Guide to Wikidata for employees of KB, national library of the Netherlands, held on 6 June 2023. The (rather long) full slide deck for this course is available on Wikimedia Commons and Zenodo as PDFs.
In October 2024 the section on Wikidata for research, science and cultural heritage was expanded, based on the corresponding section of the presentation Wikidata Workshop - Theoretical part - Maastricht University - 15 October 2024 (also see Zenodo).
Contact
This page is maintained by Olaf Janssen, Wikimedia coordinator of KB. See his Wikidata user page and expert page on kb.nl for contact details.
Reuse and licensing
This overview can be reused freely and openly. It is available under the CC-BY 4.0 license, so attribution is required. Use something like:
Wikidata general overview, Olaf Janssen & KB national library of the Netherlands, https://github.com/KBNLwikimedia/Wikidata-General-Overview/

See also
For more in-depth insights into how Wikidata is used specifically by/in/for the services of KB, national library of the Netherlands, see
Latest updates
Latest update: 12 October 2024 (updated section ‘Wikidata for research, science and cultural heritage’)
Objectives of this page
1) To provide a broad overview of the Wikidata landscape, including both the technical and non-technical (social, community) aspects.
2) To foster and stimulate more self-reliance in exploring the Wikidata universe
1) What is Wikidata?
- Various answers: A collection of structured data - A linked open database - Something to do with the semantic web? - Data for people and machines - A sister/brother of Wikipedia - Data for everyone - Public counterpart to commercial big tech data collections - A hub for 1000s of other databases - A free public utility for LOD - A worldwide enthusiastic community - The coolest LOD project ever! - Linked data, but understandable - LOD, but without the complexity of LOD - A training tool to develop yourself in the field of (linked open) data - An (academic) research object
- Prompts for ChatGPT, 02-05-2023: “What can you tell me about Wikidata?” and “Can you give me 7 perspectives on Wikidata?”
- Random Wikidata items. Prompt for ChatGPT, 02-05-2023: “Can you give me 50 random ‘obscure’ things described in Wikidata”
- In addition to being a broad database, Wikidata can also be very specific, e.g. 892 Van Gogh paintings (as of 04-10-2024)
- Wikidata contains structured descriptions of 114M things, starting from October 2012 (source, per 5 Oct 2024)
- Things included in Wikidata: Scientific articles - People - Animals and plants - Events - Countries, provinces, municipalities, cities, villages - Streets, roads, squares - Buildings - Vehicles - Companies and institutions - Works of art (film, music, paintings, etc.) - Chemical substances - Astronomical objects - Genes - etc.
- Wikidata items with geo locations, June 2023. Addshore, CC0, via Wikimedia Commons
2) The principles of Wikidata
See Wikidata:Introduction and Wikidata:Philosophy
1) Structured descriptions of things
2) Central storage (vs distributed), no data silos
3) Multilingual (200+ languages): Description of the Eiffel Tower, check the Dutch, English, Portuguese and Japanese interfaces.
4) Linked data
- Things, not strings - No plain text, but clickable links
- Interconnected: Eiffel Tower and Gustave Eiffel
- Connected to external databases (external IDs): https://www.skyscrapercenter.com/building/wd/9410
5) Open & free
- Free, no trackers, no ads
- No copyright or database rights, CC0 license
- Everyone can reuse data: query, share, copy, edit, download, sell, etc.
- Everyone may contribute/edit data, add, improve, delete, merge, etc. –> Community, see below
6) Community
- International, 23K editors (Oct 2024)
- Under the banner of the Wikimedia Foundation –> sister project of Wikipedia, Commons etc.
- More in section 7: Who creates Wikidata?
7) For humans and machines
- Human readable, human writable –> Data available via GUIs in HTML, https://www.wikidata.org/wiki/Q1526131 (KB)
- Machine readable, machine writable –> Data available via APIs in JSON, XML/RDF, CSV etc.:
- https://www.wikidata.org/w/api.php?action=wbgetentities&ids=Q1526131&format=json
- https://www.wikidata.org/w/api.php?action=wbgetentities&ids=Q1526131&props=labels&format=xml
8) Or, formulated slightly differently: a free public utility for data - One-stop shop for linked open data - LOD, but understandable
3) How things are described in Wikidata
Overview: Help:About data
- “Mount Everest is the highest point in the world” : Earth (Q2) (item) –> highest point (P610) (property) –> Mount Everest (Q513) (value)
- Triple: Earth (Q2) –> highest point (P610) –> Mount Everest (Q513)
- KB on Wikidata: Concept URI: http://www.wikidata.org/entity/Q1526131 (http: and /entity/), redirecting to https://www.wikidata.org/wiki/Q1526131 (https: and /wiki/)
1) Unique identifier (Qxxxxxx)
2) Multilingual fingerprint: Label, Description, Aliases (Also known as)
3) Statements
4) Qualifiers
5) Source references
6) External identifiers: descriptions of KB in external databases
- In summary: Simplified Wikidata data model
- Version history of Koninklijke Bibliotheek (Q1526131): https://www.wikidata.org/w/index.php?title=Q1526131&action=history –> Every edit is stored and can be rolled back. Everything is public, forever!
- Further reading & help: How does Wikidata work? + Wikidata:Glossary
- Wikidata intro videos on YouTube: Searches for Introduction Wikidata, Wikidata 101 and Wikidata help
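The triple and the item structure described in this section can be sketched in a few lines of Python. This is only an illustrative, heavily simplified model: the class and field names (`Item`, `Statement`, `triples`) are made up for this sketch and are not Wikidata's actual internal data model or JSON layout.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Statement:
    """One claim about an item: property -> value, plus optional context."""
    property_id: str          # e.g. "P610" (highest point)
    value: str                # e.g. "Q513" (Mount Everest)
    qualifiers: tuple = ()    # extra (property, value) pairs refining the claim
    references: tuple = ()    # sources backing the claim


@dataclass
class Item:
    """Simplified item: unique identifier, multilingual fingerprint, statements."""
    qid: str
    labels: dict = field(default_factory=dict)        # language code -> label
    descriptions: dict = field(default_factory=dict)  # language code -> description
    aliases: dict = field(default_factory=dict)       # language code -> list of aliases
    statements: list = field(default_factory=list)

    def triples(self):
        """Yield (item, property, value) triples for every statement."""
        for statement in self.statements:
            yield (self.qid, statement.property_id, statement.value)

    def concept_uri(self):
        """Concept URI: note 'http' and '/entity/'."""
        return f"http://www.wikidata.org/entity/{self.qid}"


# "Mount Everest is the highest point in the world"
earth = Item(qid="Q2", labels={"en": "Earth"})
earth.statements.append(Statement("P610", "Q513"))

print(list(earth.triples()))  # [('Q2', 'P610', 'Q513')]
print(earth.concept_uri())    # http://www.wikidata.org/entity/Q2
```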
4) How to discover Qs and Ps in Wikidata
5) How to request data from Wikidata
Overview: Wikidata:Data access
1) Searching in a web browser
2) HTML content in web browser
- Linked Data Interface (URI): provides access to individual Q-entities via URI: http://www.wikidata.org/entity/Q???
- Concept URI = the reference to the ‘thing’ in the real world. For instance Koninklijke Bibliotheek (Q1526131): http://www.wikidata.org/entity/Q1526131 (note ‘http’ and ‘entity’ instead of ‘https’ and ‘wiki’)
- Content negotiation: https://wikidata.org/wiki/Special:EntityData/Q1526131. When accessing a resource in the Special:EntityData namespace, the special page applies content negotiation to determine the output format.
- Result in web browser: the reference to the description of the KB: https://www.wikidata.org/wiki/Q1526131 (note the ‘https’ and ‘wiki’)
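As a rough sketch of the difference between the concept URI, the page URL and a content-negotiated request, the snippet below only builds the URLs and the request object; it does not contact Wikidata. The helper names are invented for this example, and the `application/json` MIME type is just one possible choice for the `Accept` header.

```python
import urllib.request

ENTITY_PREFIX = "http://www.wikidata.org/entity/"


def qid_from_concept_uri(uri):
    """Extract the Q-number from a concept URI (http + /entity/)."""
    if not uri.startswith(ENTITY_PREFIX):
        raise ValueError(f"not a Wikidata concept URI: {uri}")
    return uri[len(ENTITY_PREFIX):]


def page_url(qid):
    """URL of the human-readable description (https + /wiki/)."""
    return f"https://www.wikidata.org/wiki/{qid}"


def negotiated_request(qid, mime_type="application/json"):
    """Prepare a Special:EntityData request; the Accept header drives
    content negotiation. The request is built but not sent."""
    url = f"https://www.wikidata.org/wiki/Special:EntityData/{qid}"
    return urllib.request.Request(url, headers={"Accept": mime_type})


qid = qid_from_concept_uri("http://www.wikidata.org/entity/Q1526131")
req = negotiated_request(qid)
print(page_url(qid))
print(req.full_url, req.get_header("Accept"))
# To actually fetch the data (needs network access):
#   urllib.request.urlopen(req).read()
```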
3) Non-HTML content in web browser
- See https://www.wikidata.org/wiki/Wikidata:Data_access#Details_2
- Request full Wikidata items in various output formats directly from the Q-number via a Special:EntityData URL.
- The output can be obtained in seven different formats: HTML, JSON, JSON-LD, RDF, NT, TTL or N3 and PHP
- If you don’t want to depend on content negotiation (e.g. to view non-HTML content in a web browser), you can explicitly request an alternative format by appending a format suffix to the URL, e.g. to retrieve JSON: https://www.wikidata.org/wiki/Special:EntityData/Q1526131.json.
- Other available formats are JSON-LD, RDF, NT, TTL or N3 and PHP.
- Equivalent URLs for these requests use the format argument, e.g. Special:EntityData?id=Q1526131&format=rdf.
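The two URL styles above (format suffix versus format argument) can be generated with a small helper. This is a minimal sketch: the function names are hypothetical, and the lowercase suffix spellings in `SUFFIXES` (for instance `jsonld` for JSON-LD) are an assumption to verify against Wikidata:Data access.

```python
from urllib.parse import urlencode

BASE = "https://www.wikidata.org/wiki/Special:EntityData"
# Assumed suffix spellings for the non-HTML formats listed above.
SUFFIXES = ("json", "jsonld", "rdf", "nt", "ttl", "n3", "php")


def entity_data_url(qid, fmt):
    """Suffix style: .../Special:EntityData/Q1526131.json"""
    if fmt not in SUFFIXES:
        raise ValueError(f"unknown format suffix: {fmt}")
    return f"{BASE}/{qid}.{fmt}"


def entity_data_url_with_argument(qid, fmt):
    """Argument style: .../Special:EntityData?id=Q1526131&format=rdf"""
    return f"{BASE}?{urlencode({'id': qid, 'format': fmt})}"


for fmt in SUFFIXES:
    print(entity_data_url("Q1526131", fmt))
print(entity_data_url_with_argument("Q1526131", "rdf"))
```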
4) Enriched web browser interfaces
- Regular Wikidata interface (KB): http://www.wikidata.org/entity/Q1526131
- Reasonator Wikidata GUI: https://reasonator.toolforge.org/?q=Q1526131
- SQID Wikidata GUI: https://sqid.toolforge.org/#/view?id=Q1526131
5) MediaWiki Action API
- MediaWiki Action API - the common API for all Wikimedia projects (Wikidata, Wikipedia, Wikimedia Commons etc.)
- Base URL/API endpoint: https://wikidata.org/w/api.php
- Examples of API calls (bottom of the page)
- Examples KB (Q1526131):
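A minimal sketch of how such an Action API call could be assembled and its JSON answer read, using the `wbgetentities` parameters from the example URLs earlier on this page. The helper names are invented here, and `sample_response` is a hand-written stand-in shaped like a `wbgetentities` answer, not real output from Wikidata.

```python
from urllib.parse import urlencode

API_ENDPOINT = "https://www.wikidata.org/w/api.php"


def wbgetentities_url(qids, props=None, fmt="json"):
    """Build a wbgetentities URL; multiple values are joined with '|'."""
    params = {"action": "wbgetentities", "ids": "|".join(qids)}
    if props:
        params["props"] = "|".join(props)
    params["format"] = fmt
    return f"{API_ENDPOINT}?{urlencode(params)}"


def label_from_response(response, qid, language):
    """Pick one label out of a parsed wbgetentities JSON response."""
    return response["entities"][qid]["labels"][language]["value"]


url = wbgetentities_url(["Q1526131"], props=["labels"])
print(url)

# Hand-written stand-in for the parsed JSON (illustration only).
sample_response = {
    "entities": {
        "Q1526131": {
            "labels": {"nl": {"language": "nl", "value": "Koninklijke Bibliotheek"}}
        }
    }
}
print(label_from_response(sample_response, "Q1526131", "nl"))
```

To run this against the live API, the URL can be passed to `urllib.request.urlopen` and the body parsed with `json.loads`; sending a descriptive User-Agent header is advisable for automated requests.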
6) Wikidata REST API
- Wikidata REST API - a specialized API for Wikidata
- Advantage: Cleaner, flatter structure in response data
- Base URL/API endpoint: https://www.wikidata.org/w/rest.php/wikibase/v0/
- Documentation (OpenAPI Swagger) –> GET to request data
- Examples KB (Q1526131):
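For comparison with the Action API, a GET URL for the REST API might be built as sketched below. The route segments (`entities/items/{id}` and sub-resources such as `labels`) follow the OpenAPI documentation as far as we know; treat them as assumptions and check the Swagger page before relying on them. The helper name is hypothetical.

```python
REST_BASE = "https://www.wikidata.org/w/rest.php/wikibase/v0"

# Sub-resources assumed to exist under an item route.
ITEM_PARTS = ("labels", "descriptions", "aliases", "statements", "sitelinks")


def rest_item_url(qid, part=None):
    """URL for a whole item, or for one part of it (e.g. only the labels)."""
    url = f"{REST_BASE}/entities/items/{qid}"
    if part is None:
        return url
    if part not in ITEM_PARTS:
        raise ValueError(f"unknown item part: {part}")
    return f"{url}/{part}"


print(rest_item_url("Q1526131"))
print(rest_item_url("Q1526131", "labels"))
```

Because each part has its own route, a client can ask for just the labels or just the statements, which is one reason the response data is flatter than the Action API equivalent.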
7) Datadumps
8) Wikidata Query Service
Wikidata principles = Central data storage + all kinds of topics + everything connected –> Ask anything with SPARQL
- SPARQL = Language to search in Wikidata (and other LOD databases)
- SPARQL query examples:
- Want to get started with SPARQL in Wikidata yourself?
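As an illustration of asking a question with SPARQL, the sketch below wraps the "highest point of Earth" triple from section 3 in a query and builds a request URL for the public query endpoint. Nothing is sent over the network; `sample_results` is a hand-written stand-in in the standard SPARQL JSON results layout, not real output, and the helper names are invented for this example.

```python
from urllib.parse import urlencode

SPARQL_ENDPOINT = "https://query.wikidata.org/sparql"

# Which item is the highest point (P610) of Earth (Q2)?
QUERY = """
SELECT ?point ?pointLabel WHERE {
  wd:Q2 wdt:P610 ?point .
  SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }
}
"""


def sparql_url(query):
    """GET URL for the query service, asking for JSON results."""
    return f"{SPARQL_ENDPOINT}?{urlencode({'query': query, 'format': 'json'})}"


def values(results, variable):
    """Collect the values bound to one variable in SPARQL JSON results."""
    return [row[variable]["value"] for row in results["results"]["bindings"]]


print(sparql_url(QUERY))

# Hand-written stand-in for a parsed JSON answer (illustration only).
sample_results = {
    "results": {
        "bindings": [
            {"point": {"type": "uri", "value": "http://www.wikidata.org/entity/Q513"}}
        ]
    }
}
print(values(sample_results, "point"))  # ['http://www.wikidata.org/entity/Q513']
```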
6) How to add data to Wikidata
1) Single item - via GUI
2) Single item - via API
3) In Bulk - OpenRefine
- OpenRefine: “From Excel sheet to Wikidata”
- https://www.openrefine.org
- See the dedicated KB OpenRefine-Wikidata Introduction Workshop, a practical 90-minute workshop (in Dutch) to learn how to work with OpenRefine and Wikidata at a basic level.
4) In Bulk - QuickStatements
7) Who creates Wikidata?
- Photo of ±1% of all people globally working on Wikidata, Pierre-Selim Huard, CC BY 4.0 via Wikimedia Commons.
- Prompts for ChatGPT:
- “What can you tell me about the Wikidata community?” - 02-05-2023
- “What can you tell me about the mindset of/behind the Wikidata community?” - 04-10-2024
- Wikidata = Sister project/community of Wikipedia, Wikimedia Commons etc.: Common core values are: Open/free knowledge, Collaboration, Knowledge sharing for the benefit of all
- Wikidata in other Wikimedia projects
- Wikidata Statistics: 114M items (since Oct 2012) - 2.25B edits - 23K editors (dated 05-10-2024)
- Roles within the Wikidata community can include: Data editor - Data modeler - Writer of documentation/howtos - Bug reporter - Content domain expert - Software & tools developer - SPARQL expert - Translator - Ambassador - Event organizer/supporter - Writer of project proposals/funding - Conflict mediator - Liaison between community and GLAMs - Communication/outreach - Teacher/trainer/speaker - And many more. –> It’s not only about the ‘hard data’ roles!
- Wikidata community events + Events archive + Media files related to Wikidata events. See also Category:Wikidata Events
- WikidataCon: WikidataCon 2021 + WikidataCon 2019 + Media files related to WikidataCon.
- Wikidata workshops & trainings examples: Atelier Wikidata à Montréal - Wikidata and Wikibase Workshop 2019 - Wikidata for beginners - EuropeanaTech 2018 Wikidata workshop day
- Wikidata hackathon & editathon examples: Wiki Techstorm 2018 - Wikidata Zurich Hackathon 2019 - GLOW Edithaton
- Wikidata Birthdays: Wikidata’s 10th birthday homepage - Events to celebrate Wikidata’s 10th birthday worldwide - Wikidata 7th birthday presents
- Wikidata swag: T-shirts, socks, caps, pens, wallets, stickers, beer openers, key chains etc.
8) Projects
9) Wikidata for research, science and cultural heritage
- Unsorted articles
- Scientometrics / scholarly communication
- Life and biomedical sciences
- Wikidata: A platform for data integration and dissemination for the life sciences and beyond, Elvira Mitraka, Andra Waagmeester, Sebastian Burgstaller-Muehlbacher, Lynn M. Schriml, Andrew I. Su and Benjamin M. Good (2015)
- Wikidata as a knowledge graph for the life sciences, Waagmeester et al. (2020), eLife. 9.
- A protocol for adding knowledge to Wikidata: aligning resources on human coronaviruses, Waagmeester, A., Willighagen, E.L., Su, A.I. et al., BMC Biol 19, 12 (2021).
- Biological pathway abstractions: from two-dimensional drawings to multidimensional linked data, Waagmeester, A. S., Maastricht University (2024) - This doctoral thesis has 415 mentions of Wikidata!
- Wikidata and the bibliography of life, Page, R.D.M., PeerJ 10:e13712 (2022) - See also https://vimeo.com/451179359
- Astronomy
- Mathematics
- Language technology / Computational linguistics / AI / LLMs / NLP
- Fine-tuned LLMs Know More, Hallucinate Less with Few-Shot Sequence-to-Sequence Semantic Parsing over Wikidata, Xu, Silei et al. (2023).
- Refining Wikidata Taxonomy using Large Language Models, Peng, Yiwen, Thomas Bonald and Mehwish Alam (2024)
- User-friendly Comparison of Similarity Algorithms on Wikidata, Ilievski, Filip, Pedro A. Szekely, Gleb Satyukov and Amandeep Singh (2021)
- Widaug. Data augmentation for named entity recognition using Wikidata, Pablo Calleja, Alberto Sanchez and Oscar Corcho (2023)
- Improving Candidate Retrieval with Entity Profile Generation for Wikidata Entity Linking, Tuan Lai, Heng Ji and ChengXiang Zhai (2022)
- GLAM / cultural heritage
- Representation of (female) scientists
- More scientific articles related to Wikidata
10) Where to find help (passively and actively)
11) How to proceed?
- Create an account on Wikidata
- Get started with Wikidata yourself
- Check the other Wikimedia related courses of the KB
- Wikidata & KB, an overview, an overview of how Wikidata is used by/in/for both the linked open datasets (thesauri) and public domain heritage collections of the KB.
- OpenRefine-Wikidata Introduction Workshop, a practical 90-minute workshop (in Dutch) to learn how to work with OpenRefine and Wikidata at a basic level.
- Wikibase resources, a collection of resources, overviews, links and knowledge related to Wikibase, collected and curated by KB.
- Wikidata & SPARQL workshop (2024, to be announced)
- Questions? Looking for support or extra explanations? Contact Olaf Janssen, see above for his details