GLAMorousToHTML

Technical notes (under construction)

Latest update: 16 September 2024

This page gives more information about:

  1. The structure of the GLAMorousToHTML repository, its files and folders
  2. A short description of their functions (see the docstrings for more detailed functional descriptions)
  3. How to run this repo yourself
  4. Change log
  5. Features to be added

Repository structure and functional descriptions

What are the main files and folders in this repo, and what do they do?

Main folder

category_logo_dict.json, category_logo_dict_nde.json

build_html.py

build_excel.py

analytics.py

wikidata_cache.json

README.md - this file

pagetemplate.html

GLAMorous_MediacontributedbyKoninklijkeBibliotheek_Wikipedia_Mainnamespace_10012024.html

Subfolders


Running the scripts yourself

To follow. <!–If you want to run this script for your own Commons category and create HTML and Excel overviews for your own institution, you can clone or download the repo and run it on your own machine. You will need to make a few simple adaptations to the existing code to make it work for the Commons category of your choice. These are:

3) In setup.py, change

That’s all. You should now be able to run the main GLAMorousToHTML script. The generated HTML page will be added to the site/ folder and the Excel file to the data/ folder.

In case you can’t get the script up and running, please open an issue in this repo.

Configuration of GLAMorous

The script relies on the XML output of the GLAMorous tool, which needs to be configured so that it only lists pages from Wikipedia

1) that are in the main namespace (a.k.a. Wikipedia articles) (&ns0=1)

2) and not pages from Wikimedia Commons, Wikidata or other Wikimedia projects (projects[wikipedia]=1)

The base URL thus looks like https://glamtools.toolforge.org/glamorous.php?doit=1&use_globalusage=1&ns0=1&projects[wikipedia]=1&format=xml&category=. The Commons category of interest needs to be added to the end, omitting the Category: prefix. This base URL is defined (and can be adapted if necessary) in the xml_base_url variable in GLAMorousToHTML_functions.process_category.
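
As a minimal sketch of the step described above, the snippet below appends a Commons category name to the base URL. The helper name build_glamorous_url is hypothetical and is not taken from the repository; replacing spaces with underscores and percent-encoding the name are assumptions about how the category should be written in the URL.

```python
from urllib.parse import quote

# Base URL as quoted in the text above; the category name is appended to it.
XML_BASE_URL = (
    "https://glamtools.toolforge.org/glamorous.php"
    "?doit=1&use_globalusage=1&ns0=1&projects[wikipedia]=1&format=xml&category="
)


def build_glamorous_url(category: str, base_url: str = XML_BASE_URL) -> str:
    """Return the GLAMorous XML URL for a Commons category (hypothetical helper)."""
    name = category.strip()
    # Omit the "Category:" prefix, as the text above requires.
    if name.lower().startswith("category:"):
        name = name[len("category:"):]
    # Assumption: spaces are written as underscores, then percent-encoded.
    name = name.strip().replace(" ", "_")
    return base_url + quote(name)


if __name__ == "__main__":
    print(build_glamorous_url("Category:Media contributed by Koninklijke Bibliotheek"))
```

The function only builds the URL string; fetching and parsing the XML is left to the existing code in the repo.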

The depth of the GLAMorous output (where ‘0’ means no subcategories are read) is specified in the depth (=fourth) variable in category_logo_dict.json. See the section on Repo structure below for more info.

category_logo_dict.json

1) Adapt category_logo_dict.json to your own needs, making sure the existing syntax is maintained.

2) Add a small logo of the institution (256x256 px or so) as a .png or .jpg to the site/logos folder, and add the filename “icon_xxxxx.png/jpg” to the JSON file.

–>


Change log (needs updating)

xx April 2024

14 March 2024

29 February 2024

14 February 2024


Features to add