What I need should really be programmed in Lua, but not by me at the moment, I'm afraid.
Using Python or Java, I guess it would be something like this:-
Stage 1.
- go to the page with the index of tags and follow each link
- if there is no page (404, page does not exist), create that page; this is a simple REST POST
- on the new page, create the links back to the referring page and to this page (simple template)
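
A rough Python sketch of Stage 1, using only the standard library. Everything named here is an assumption for illustration: the helper names, the two-line template, and the idea that the wiki accepts a plain POST of HTML to the missing page's own URL. A real wiki will have its own REST endpoint and authentication.

```python
import urllib.error
import urllib.request
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkCollector(HTMLParser):
    """Collects the href of every <a> tag found on a page."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.hrefs.append(href)


def extract_links(index_url, index_html):
    """Return absolute URLs for every link on the tag index page."""
    parser = LinkCollector()
    parser.feed(index_html)
    return [urljoin(index_url, href) for href in parser.hrefs]


def page_exists(url):
    """Return False on a 404; any other HTTP error is re-raised."""
    try:
        with urllib.request.urlopen(url):
            return True
    except urllib.error.HTTPError as err:
        if err.code == 404:
            return False
        raise


def render_tag_page(tag_url, referring_url):
    """Hypothetical template: a link back to the referrer and a self link."""
    return (
        f'<p>Referred from: <a href="{referring_url}">{referring_url}</a></p>\n'
        f'<p>This page: <a href="{tag_url}">{tag_url}</a></p>\n'
    )


def create_page(tag_url, body):
    """Assumed REST call: POST the page body to the page's own URL."""
    request = urllib.request.Request(
        tag_url,
        data=body.encode("utf-8"),
        method="POST",
        headers={"Content-Type": "text/html; charset=utf-8"},
    )
    with urllib.request.urlopen(request) as response:
        return response.status


def fill_missing_tag_pages(index_url, index_html,
                           exists=page_exists, create=create_page):
    """Follow each link on the index and create any page that is missing."""
    created = []
    for tag_url in extract_links(index_url, index_html):
        if not exists(tag_url):
            create(tag_url, render_tag_page(tag_url, index_url))
            created.append(tag_url)
    return created
```

The exists and create parameters are there so the loop can be tried out with stand-in functions before pointing it at a live site.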
Stage 2.
- go through each post and look for unique terms. Why can't Lucene or Solr do this, using a crawler? A simpler solution would be to just place all the text into a map, but that loses a lot of information. Hmm...
- use these terms, filtered by common use down to a sensible number, as tags.
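
The "place all text into a map" idea from Stage 2 could look something like this in Python. It is only a sketch under my own assumptions: the tokenizer regex, the max_doc_ratio cutoff for "common use", and the limit are made-up parameters, and it deliberately ignores word order and context, which is exactly the information loss mentioned above.

```python
import re
from collections import Counter

# Assumed tokenizer: lowercase words of two or more characters.
WORD_RE = re.compile(r"[a-z][a-z0-9]+")


def tokenize(text):
    """Split a post into lowercase terms."""
    return WORD_RE.findall(text.lower())


def suggest_tags(posts, max_doc_ratio=0.5, limit=10):
    """Count terms across all posts, drop the ones that appear in too many
    posts (treated as common words), and return the most frequent rest."""
    term_counts = Counter()  # total occurrences of each term
    doc_counts = Counter()   # number of posts each term appears in
    for text in posts:
        terms = tokenize(text)
        term_counts.update(terms)
        doc_counts.update(set(terms))

    max_docs = max_doc_ratio * len(posts)
    candidates = [
        (term, count)
        for term, count in term_counts.items()
        if doc_counts[term] <= max_docs
    ]
    # Most frequent first; ties broken alphabetically for stable output.
    candidates.sort(key=lambda pair: (-pair[1], pair[0]))
    return [term for term, _ in candidates[:limit]]
```

With only a handful of posts, ordinary words will still slip through the filter; a stop-word list or a larger set of posts would be needed before the result is usable as tags.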