
HTMLtoMD
Convert HTML to Markdown: view a single page, or crawl and convert an entire site.
- Converts HTML to Markdown:
- view a single web page (or local html file) as markdown
- links and images are preserved inline with link text / alt text and absolute url
- an efficient way to archive a site if you only need to save the text content
- save the archive with each page as a separate file or consolidated into a single markdown file
- currently free
- Taps into PeacockMedia's long expertise in crawling websites and parsing HTML (Integrity, Scrutiny, WebScraper)
- crawls and converts an entire website and saves it to disk
- provide the home or starting url and let HTMLtoMD do the rest
- the archive is opened in a file browser for you to view
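As an illustration of what "links and images are preserved inline with link text / alt text and absolute url" means in practice, here is a minimal Python sketch. It is not HTMLtoMD's engine; the class `MiniMarkdown` and the function `html_to_markdown` are hypothetical names invented for this example, and it only handles headings, paragraphs, links and images, using the standard library.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class MiniMarkdown(HTMLParser):
    """Hypothetical, minimal HTML-to-Markdown converter (illustration only)."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url  # URL of the page, used to resolve relative links
        self.out = []             # pieces of Markdown text, joined at the end
        self.href = None          # absolute URL of the link currently open, if any

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag in ("h1", "h2", "h3", "h4", "h5", "h6"):
            # <h2> becomes "## ", and so on
            self.out.append("\n\n" + "#" * int(tag[1]) + " ")
        elif tag == "p":
            self.out.append("\n\n")
        elif tag == "a" and attrs.get("href"):
            # Resolve the href against the page URL so the link is absolute
            self.href = urljoin(self.base_url, attrs["href"])
            self.out.append("[")
        elif tag == "img":
            src = urljoin(self.base_url, attrs.get("src") or "")
            self.out.append(f"![{attrs.get('alt') or ''}]({src})")

    def handle_endtag(self, tag):
        if tag == "a" and self.href is not None:
            # Close the inline link: [link text](absolute url)
            self.out.append(f"]({self.href})")
            self.href = None

    def handle_data(self, data):
        self.out.append(data)


def html_to_markdown(html, base_url):
    """Convert an HTML string to Markdown, resolving URLs against base_url."""
    parser = MiniMarkdown(base_url)
    parser.feed(html)
    parser.close()
    return "".join(parser.out).strip()


if __name__ == "__main__":
    page = '<h1>Title</h1><p>See <a href="/docs">the docs</a>.</p>'
    print(html_to_markdown(page, "https://example.com/a/page.html"))
```

The relative `href` and `src` values are resolved against the page's own URL, so the saved Markdown still points at the right resources when it is read away from the original site.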
Category:Developer Tools / Utilities
System Requirements
macOS 10.14 or higher, on Apple Silicon or Intel.
History
Version 2.2 - Aug 2024
- Adds support for <code> html element
- removes spurious spaces which surrounded emphasised or strong text
- updates application icon
Version 2.1 - Jul 2024
- Improves parsing of headings
Version 2.0.2 - Jul 2024
- Updated build for Apple Silicon / Intel, 10.14 upwards
- Fixes a bug which caused the engine to stall on certain pages
Version 2.0.1 - Oct 2019
- Built and notarized for running on 10.14/10.15 (Mojave/Catalina) and supporting dark mode.
- Updated to most recent version of the Integrity crawling engine
- Adds About / Help box
Version 2.0 - May 2018
- Updates to the Integrity v8 crawling engine
- Adds option to download images
- Adds option to save each page as a separate markdown file (as before) or consolidate them all into one long markdown file
- Adds option to ignore headers / footers / navs
- Adds option for save dialog to open when scan completes (this option is on by default in order to be consistent with behaviour of version 1)
- Adds options to include link urls and image urls in the markdown
- Fixes issue where extraneous dashes could appear within the markdown
- Fixes bug causing some text to be missing from the markdown immediately following comments in the html
- Other improvements / fixes relating to markdown conversion
- Sorts out some problems re drag / drop and pasting to the starting points view
Version 1.3 - May 2015
Improvements to workflow:
- a paste now works whether you're at the starting points screen or not
- if you've copied from a browser or an email, HTMLtoMD will attempt to take that off the clipboard as HTML, avoiding the need to view source before copying
Enhancements / fixes:
- adds automatic update check
- fixes problems with pause/continue button when crawling a site
Version 1.2 (no longer beta) - April 2015
- Fixes bug which was causing a crash if unusually long urls existed in links on a page
- Minor interface changes
- Web download version of the app is not sandboxed; sandboxing was sometimes causing problems when crawling a site locally (alert message said 'you do not have permission to view this file')
- Out of beta and adds in-app purchase for website crawling functionality
Version 1.1 (Still beta) - November 2014
- Adds 'Starting point' screen. In addition to giving the url of a page or starting point for a crawl, you can paste html or drag & drop a file
- Return to the starting point screen any time (except when a site scan is running - pause that first) with cmd-2
Version 1.0 - November 2014
First beta release