HHK2HTML Tutorial: Preserve Index Structure When Exporting .hhk Files
Overview
- Purpose: Convert Microsoft HTML Help Index files (*.hhk) into HTML pages while retaining the original index hierarchy, anchors, and link relationships.
- Typical users: Technical writers, documentation engineers, software maintainers converting legacy CHM help to web documentation.
- Date: February 7, 2026
Quick concepts
- .hhk format: Plain-text, HTML-like file containing index entries as nested <UL> and <LI> elements; each entry carries a Name (display label) and a Local link (e.g., “topic.htm#anchor”).
- Preserve index structure: Maintain parent/child ordering, page anchors, and any display labels so the web version mirrors the original help index navigation.
Step-by-step tutorial
- Inspect the .hhk
- Open in a text editor to confirm structure (look for <UL>, <LI>, <OBJECT>, and <param>).
- Extract entries
- Parse the .hhk to collect nodes with fields: label, target (local link), and depth (nesting level).
- Reasonable default: treat each <UL> start tag as depth+1 and each </UL> as depth-1.
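The depth rule above can be sketched with the standard-library `html.parser` module, which avoids any third-party dependency. This is a minimal sketch, not a complete .hhk reader: the `HhkParser` class and the `SAMPLE_HHK` text are hypothetical names made up for illustration, and the parser keeps only the first Name and Local value per entry.

```python
from html.parser import HTMLParser

# Hypothetical sample in the usual .hhk shape (not taken from a real project).
SAMPLE_HHK = """
<UL>
<LI><OBJECT type="text/sitemap">
<param name="Name" value="Install">
<param name="Local" value="install.htm#top">
</OBJECT>
<UL>
<LI><OBJECT type="text/sitemap">
<param name="Name" value="Windows">
<param name="Local" value="install.htm#win">
</OBJECT>
</UL>
</UL>
"""


class HhkParser(HTMLParser):
    """Collect (label, target, depth) tuples; top-level entries get depth 1."""

    def __init__(self):
        super().__init__()
        self.depth = 0
        self.entries = []
        self._params = None  # params of the <object> currently being read

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)  # tag and attribute names arrive lowercased
        if tag == "ul":
            self.depth += 1
        elif tag == "object" and attrs.get("type") == "text/sitemap":
            self._params = {}
        elif tag == "param" and self._params is not None:
            key = (attrs.get("name") or "").lower()
            # Simplification: keep only the first value seen for each name.
            self._params.setdefault(key, attrs.get("value") or "")

    def handle_endtag(self, tag):
        if tag == "ul":
            self.depth = max(0, self.depth - 1)
        elif tag == "object" and self._params is not None:
            label = self._params.get("name")
            if label:
                target = self._params.get("local", "")
                self.entries.append((label, target, self.depth))
            self._params = None


parser = HhkParser()
parser.feed(SAMPLE_HHK)
entries = parser.entries
```

Objects of any other type, such as a site-properties block at the top of the file, are skipped because they never open a params dictionary.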
- Normalize targets
- Resolve relative paths to your output folder.
- Keep anchors (e.g., #anchor) intact.
- If multiple entries point to the same target, preserve the duplicates as separate index items.
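One possible way to normalize targets is to split off the fragment, resolve the file part against the output folder, and then reattach the fragment unchanged. In this sketch the `normalize_target` helper and the `docs` folder name are assumptions for illustration; backslashes are converted because .hhk files authored on Windows may use them.

```python
import posixpath
from urllib.parse import urldefrag


def normalize_target(target, base_dir="docs"):
    """Resolve a Local value under base_dir and keep any #fragment intact."""
    path, fragment = urldefrag(target.replace("\\", "/"))
    resolved = posixpath.normpath(posixpath.join(base_dir, path))
    return f"{resolved}#{fragment}" if fragment else resolved


# Duplicates stay as separate items: the list is mapped, never de-duplicated.
targets = ["html\\topic.htm#anchor", "html\\topic.htm#anchor", "./a/../b.htm"]
normalized = [normalize_target(t) for t in targets]
```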
- Generate HTML index page(s)
- Option A (single index page): recreate the nested <ul>/<li> structure reflecting depths; each item is an <a href="target">Label</a> link.
- Option B (split pages): group entries by first-level section into separate HTML files and include a master index linking to them.
- Optionally include JavaScript/CSS for collapsible sections and to highlight the current topic.
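For the single-page option, a flat list of (label, target, depth) tuples can be turned back into nested lists by tracking how many levels are currently open. The `render_index` function below is a hypothetical sketch under that assumption; it clamps any depth that skips a level so the output stays well-formed, and it escapes labels and hrefs.

```python
from html import escape


def render_index(entries):
    """Render (label, target, depth) tuples as nested <ul>/<li> markup."""
    out = []
    open_depth = 0  # number of <ul> levels currently open
    for label, target, depth in entries:
        # Tolerate skipped levels: never go more than one level deeper.
        depth = max(1, min(depth, open_depth + 1))
        if depth > open_depth:
            out.append("<ul>")
            open_depth += 1
        else:
            # Close the previous item, plus one list and parent item per level.
            out.append("</li>" + "</ul></li>" * (open_depth - depth))
            open_depth = depth
        out.append(f'<li><a href="{escape(target)}">{escape(label)}</a>')
    out.append("</li></ul>" * open_depth)
    return "".join(out)


entries = [
    ("Install", "install.htm#top", 1),
    ("Windows", "install.htm#win", 2),
    ("Usage", "usage.htm", 1),
]
html_index = render_index(entries)
```

Child lists are placed inside the parent's <li>, which is what lets a collapsible-section script toggle them per item.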
- Preserve anchors and link behavior
- Do not strip or alter “#anchor” fragments.
- If converting topics to new filenames, create a redirect map (old → new) and use it to rewrite hrefs.
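Rewriting hrefs through a redirect map could look like the sketch below. The `rewrite_target` helper and the example filenames are hypothetical; the point is that only the file part is looked up, so the fragment passes through untouched.

```python
from urllib.parse import urldefrag


def rewrite_target(target, redirect_map):
    """Swap the file part via redirect_map (old -> new); keep the #fragment."""
    path, fragment = urldefrag(target)
    new_path = redirect_map.get(path, path)  # unmapped files are left alone
    return f"{new_path}#{fragment}" if fragment else new_path


redirect_map = {"topic.htm": "guide/topic.html"}
rewritten = rewrite_target("topic.htm#anchor", redirect_map)
```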
- Maintain accessibility & SEO
- Use semantic lists (<ul>/<li>) and descriptive link text.
- Add title attributes only when they add value.
- Testing
- Verify links open correct pages and jump to anchors.
- Check nested structure visually and via DOM inspector.
- Test across browsers and in static-site generators (if used).
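Part of the link check can be automated before opening a browser. The sketch below assumes the topic pages are already loaded into a dictionary of filename to HTML text (the `pages` argument and both names `AnchorCollector` and `find_broken_targets` are hypothetical); it reports targets whose page is missing or whose fragment has no matching `id` or `<a name>`. Matching is exact and case-sensitive, which may be stricter than some viewers.

```python
from html.parser import HTMLParser
from urllib.parse import urldefrag


class AnchorCollector(HTMLParser):
    """Collect values usable as fragment targets: any id, plus <a name>."""

    def __init__(self):
        super().__init__()
        self.anchors = set()

    def handle_starttag(self, tag, attrs):
        for key, value in attrs:
            if value and (key == "id" or (key == "name" and tag == "a")):
                self.anchors.add(value)


def find_broken_targets(targets, pages):
    """Return targets whose page is missing or whose fragment is undefined."""
    broken = []
    for target in targets:
        path, fragment = urldefrag(target)
        if path not in pages:
            broken.append(target)
            continue
        if fragment:
            collector = AnchorCollector()
            collector.feed(pages[path])
            if fragment not in collector.anchors:
                broken.append(target)
    return broken


pages = {"install.htm": '<h1 id="top">Install</h1><a name="win"></a>'}
targets = ["install.htm#top", "install.htm#win", "install.htm#mac", "usage.htm"]
broken = find_broken_targets(targets, pages)
```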
- Automation suggestions
- Use a small script in Python (BeautifulSoup) or Node.js (cheerio) to parse .hhk and emit HTML.
- Include a dry-run option to output a mapping CSV of original → new targets before writing files.
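A dry run might simply build the mapping as CSV text and print it instead of writing any HTML. In this sketch the `mapping_csv` function, the column names, and the sample data are all assumptions chosen for illustration.

```python
import csv
import io


def mapping_csv(entries, redirect_map):
    """Return CSV text listing label, original target, and rewritten target."""
    buf = io.StringIO()
    writer = csv.writer(buf, lineterminator="\n")
    writer.writerow(["label", "original", "new"])
    for label, target, _depth in entries:
        # Split on the first '#'; sep is '' when there is no fragment.
        path, sep, fragment = target.partition("#")
        new_target = redirect_map.get(path, path) + sep + fragment
        writer.writerow([label, target, new_target])
    return buf.getvalue()


entries = [("Install", "install.htm#top", 1), ("Usage", "usage.htm", 1)]
report = mapping_csv(entries, {"install.htm": "guide/install.html"})
```

Reviewing this report first makes it easier to catch a wrong rename before any file is overwritten.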
Minimal Python example (parsing idea)
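```python
from bs4 import BeautifulSoup

# .hhk files are often saved in a Windows code page; adjust the encoding if needed.
with open("index.hhk", "r", encoding="utf-8", errors="replace") as f:
    soup = BeautifulSoup(f.read(), "html.parser")

for obj in soup.find_all("object"):
    name = obj.find("param", {"name": "Name"})
    local = obj.find("param", {"name": "Local"})
    # Skip objects that lack either param (e.g., a site-properties block).
    if name is not None and local is not None:
        print(name["value"], local["value"])
```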
Common pitfalls & fixes
- Broken relative paths: resolve them against the CHM extraction root.
- Duplicate or missing anchors: ensure each anchor exists, exactly once, in its topic page, or create in-page redirects.
- Encoding issues: read the .hhk with the correct charset (often a Windows ANSI code page such as Windows-1252, or UTF-8).
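When the charset is unknown, one cautious approach is to read the file as bytes and try a short list of encodings in order. The sketch below assumes UTF-8 first and Windows-1252 second; the right code page depends on the locale the help project was authored in, so treat the `fallbacks` tuple and the `decode_hhk` name as placeholders.

```python
def decode_hhk(raw, fallbacks=("utf-8-sig", "cp1252")):
    """Decode .hhk bytes, returning (text, encoding_used)."""
    for encoding in fallbacks:
        try:
            return raw.decode(encoding), encoding
        except UnicodeDecodeError:
            continue
    # latin-1 maps every byte to a character, so this last resort cannot fail.
    return raw.decode("latin-1"), "latin-1"


text_a, used_a = decode_hhk("Café".encode("utf-8"))
text_b, used_b = decode_hhk("Café".encode("cp1252"))
```

Typical use would be `decode_hhk(open(path, "rb").read())`, with the returned encoding name logged so a wrong guess is easy to spot.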
Deliverable checklist
- Parsed index entries CSV (label, target, depth)
- Single or multi-page HTML index that mirrors nesting
- Redirect map for renamed topics
- Test plan confirming anchors and hierarchy