Integrity
Version History
v4.1.1 Released April 13
Adds 'bad links only' checkbox. It was already possible to toggle between bad links and all links using the menu or toolbar button; the new checkbox makes the option more obvious if the toolbar is not showing, and matches Scrutiny
Adds filter drop-down list (All, Internal, External, Images) and search box above all views
If flagging blacklisted urls, then the highlight colour used is orange or the warning colour (was red or bad link colour). Not an error so inappropriate to use an error colour.
v4.0.4 Released February 13
Fixes problems creating black/whitelist rules on first run with no settings saved
Correctly sets window to edited (dirty spot in red button) when black/whitelist rules are changed, triggering prompt to save when switching settings
Version 4.0.3
Small fixes, full release February 2013
Version 4
released January 2013
Major improvements to the data storage and engine meaning that even small sites will crawl more quickly and large sites will crawl very much more quickly without slowing down or losing responsiveness
When stop button is pressed, all open threads are abandoned, and then recreated if 'continue' is pressed. Gives a much better user experience
Routines for 'by page' view re-written to avoid apparent hanging at the end of the crawl of a big site
Adds new settings to Preferences, allows setting of some limits - default to 200,000 links. Offering the option of limiting the crawl of a large site (maybe better achieved by using blacklist / whitelist rules) but also a safety valve to prevent crashing due to running out of resources when crawling very large sites
If starting crawl within a directory, crawl is limited to that directory, ie crawl will go down a directory structure but not up. This matches users' expectations. Previously, crawl extended to all pages in the same domain
Blacklist and whitelist boxes replaced by a more user-friendly table of rules (existing data will be presented in the new way)
Moves 'check links on custom error pages' to settings rather than global preferences
Increases maximum number of threads from 30 to 40 (will improve crawling for some sites) with the default now 12 rather than 7. Extreme left (labelled 'fewer') is still a single thread.
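The directory-limited crawl described for Version 4 can be sketched roughly as below. This is only an illustration of the idea (go down a directory structure but not up), not Integrity's actual code; the function names `crawl_scope` and `in_scope` and the example.com urls are hypothetical.

```python
from urllib.parse import urlsplit
import posixpath


def crawl_scope(start_url):
    """Return (host, directory) that a crawl starting at start_url is limited to."""
    parts = urlsplit(start_url)
    path = parts.path or "/"
    # A path ending in "/" is already a directory; otherwise drop the file name.
    # Assumption: a last segment without a trailing slash is treated as a file.
    directory = path if path.endswith("/") else posixpath.dirname(path)
    if not directory.endswith("/"):
        directory += "/"
    return parts.netloc.lower(), directory


def in_scope(start_url, candidate_url):
    """True if candidate_url is on the same host and at or below the start directory."""
    host, directory = crawl_scope(start_url)
    candidate = urlsplit(candidate_url)
    candidate_path = candidate.path or "/"
    return candidate.netloc.lower() == host and candidate_path.startswith(directory)
```

With a starting url of http://example.com/docs/index.html, a page under /docs/ is followed while a page in the site root is not. Starting at the root of a domain behaves as before, covering every page in that domain.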
Version 3.9.1 / 3.9.2 / 3.9.3
Small fixes
Version 3.9
released September 2012
New view - 'Links by page' shows hierarchical view of your site's pages with its links below
All statuses are shown for redirected links rather than just the final one
Sorting available on all tables
efficiencies making crawl quicker and more memory-efficient, increasing the maximum size of site that can be crawled in one go
Blacklisted urls can be flagged (option added to preferences)
Adds 'Clear and Re-start' to File menu
Fixes context help 'i' button for timeout and delay fields
German localisation removed as had become out of date
Version 3.8.6
released August 2012
Fix to avoid problem experienced sometimes when pasting in a url from elsewhere
Links relative to scheme eg //domain.com (see http://www.ietf.org/rfc/rfc3986.txt section 4.2) handled better - previously problem if the page's base href was given in this format
Fixes last used settings not being saved properly
Toolbar pause button removed and role now taken by Go button
Uses alternating rows in tables
fixes redirected urls (3xx) not being highlighted yellow
Removes good colour from Preferences (to allow for stripey views)
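The scheme-relative links mentioned under 3.8.6 (RFC 3986 section 4.2) take their scheme from the page they appear on. A minimal sketch using Python's standard `urljoin` follows, including the case where the base href itself is scheme-relative. The helper name `resolve_link` and the example urls are assumptions for illustration, not Integrity's implementation.

```python
from urllib.parse import urljoin


def resolve_link(page_url, href, base_href=None):
    """Resolve href as a browser would, honouring an optional <base href>."""
    # The base href may itself be relative (e.g. //static.example.com/assets/),
    # so it is resolved against the page url first.
    base = urljoin(page_url, base_href) if base_href else page_url
    return urljoin(base, href)
```

For example, a link to //cdn.example.org/x.png found on an https page resolves to https://cdn.example.org/x.png.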
Version 3.8.5
released June 2012
Adds support for telephone links such as tel: and skype: (now recognised and skipped rather than reported as an error)
Fixes bug relating to crawling local sites introduced in 3.8.4
Fixes problem with crawling local sites if they are stored in the root Library folder
Fixes bug causing special characters such as ü, ö, ä in page title or link text being altered to u, o, a when exported. All exports (.dot, .csv, .tdl, .html) now export using utf-8 character encoding. Note that in line with web standards (RFC 1738) Integrity and Scrutiny don't support non-ascii characters in urls
Version 3.8.4
released May 2012
Fixes problem of xml sitemap not reading user's setting for update frequency
Fixes a bug which could cause hanging or crashes in certain circumstances
Fixes problem with thread counting, faster crawling
Version 3.8.3
released May 2012
Fixes spurious text appearing in 'Link text' for links on images where the image's alt = '' (empty string)
Fixes bug preventing proper construction of urls where base href = "/"
Fixes bug affecting checking of broken images where image has src = "" and improved handling of empty quotes if that option is switched on
Fixes problem of crawl or 'recheck broken links' not always finishing properly
Fixes potential crash under certain circumstances (involving redirect, url having trailing slash and settings set to ignore trailing slashes)
Default link check timeout shortened to 30s
Fixes bug preventing images from being found if 'src' doesn't follow 'img' in the html
Version 3.8.1
released April 2012
Fixes comma or trailing comma in blacklist fields preventing proper crawl
Adds preference to trim leading or trailing spaces or mismatched quotes from a url
Fixes global prefs not being saved properly
When crawling locally, fixes 'file is directory' status being included in bad links
Some fixes to the 're-check bad links'. (Was causing crash sometimes since last release)
Highlighting link on page feature is switchable between highlighting and simply visiting page. Default is the latter.
Fixes problem of throbber sometimes continuing to turn when crawl or re-check has finished
Version 3.8
released March 2012
Adds 'Ignore trailing slash' button to settings, can be set per site, set to 'yes' by default
Fixes a problem preventing crawling of pages if braces { } are present in the url
When crawling local files, directories are not reported as an error (as long as the directory exists)
'Customize' added to toolbar (although this has been dropped by Apple from Lion 10.7 onwards so will only appear in 10.4 -> 10.6)
Options for sitemap update frequency 'daily', 'weekly', 'monthly' etc altered to lowercase for compliance with the sitemap standard
Two versions now maintained, one built for distribution via web (10.4 - 10.7 supported) and one certified and built for distribution via App Store (10.5 to 10.latest supported). The latter will have a .1 at the end of the version number in the About box, eg 3.7.5.1 is the App Store version. Both remain free
App Store version has Lion features such as full-screen mode
Version 3.7.5
released March 2012
Fixes bug preventing settings from being saved
Small changes for compliance with App Store
Version 3.7.4
released February 2012
Sends referrer header field for every request (other than the starting url) - this fixes a very small number of odd bugs
'Open local file' is added to the File menu. Functionality to crawl a site locally or import a list of links did exist in previous versions and was documented, but wasn't very accessible as it relied on a drag and drop into the starting url field (which still works and is to be improved in a future version)
Clears data from flat link view before starting a new crawl
Improves re-check broken links - now correctly uses as many threads as are set in settings and fixes problem preventing it from finishing every time. Also small fix to prevent it going into a loop if button pressed when there are no bad links
Adds background image and installation instructions to dmg file
Fixes bug preventing links to w3c being checked properly
Fixes a small memory leak
Fixes bug preventing crawl from finishing properly if user tries to highlight link on page before link has been checked
Fixes bug preventing date stamp from being written properly every time
Fixes problem of link text not showing in main link table for certain sites by trimming whitespace characters from around link text
Version 3.7.3
released November 2011
Links to subdomains can be considered as internal rather than external. ie peacockmedia.co.uk and www.peacockmedia.co.uk are considered the same site (which is not necessarily true but most people would expect) and therefore both are followed. Adds checkbox in global preferences to switch this option. Default is on. With the option on, Integrity will discover more links (and potentially more bad links) on certain websites. Option needs to be switched off if you wish to deliberately limit your crawl to one subdomain
Fixes memory problem, helping application to deal with larger sites
Bug fix and small improvement to 'my sites' drawer
Closing main window quits application after 'are you sure' dialogue
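The subdomain option added in 3.7.3 could be approximated as below. This is a simplified sketch under stated assumptions: a link counts as internal when one host name is the other with extra labels in front. The function name `is_internal` and its parameter are hypothetical, and the real rule Integrity applies may differ.

```python
from urllib.parse import urlsplit


def is_internal(start_url, link_url, treat_subdomains_as_internal=True):
    """Decide whether link_url belongs to the same site as start_url."""
    start = (urlsplit(start_url).hostname or "").lower()
    link = (urlsplit(link_url).hostname or "").lower()
    if not start or not link:
        return False
    if start == link:
        return True
    if not treat_subdomains_as_internal:
        return False
    # One host is a subdomain of the other, e.g. www.peacockmedia.co.uk
    # and peacockmedia.co.uk. Sibling subdomains are not matched here.
    return link.endswith("." + start) or start.endswith("." + link)
```

With the option switched off, only an exact host match is treated as internal, which is how a crawl can be deliberately limited to one subdomain.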
Version 3.7.2
released October 2011
Exports .dot file (standard format used by graphing applications) which can be opened as a visualisation in third-party graphing apps. includes colour to indicate levels. Accessed via File>Export or a new toolbar button added via 'Customize toolbar...'
Fixes problems with 'Re-check broken links' and 'Re-check this link'
Fixes 'on page as title / url' preference (broken in last version)
Adds 'Getting started' to the Help menu and splash screen
replaces 'Bad links' icon with a more suitable one (previous one looks like 'delete')
Fixes glitch with 'Inspect selected' button when flat view is showing
Version 3.7.1
released October 2011
Single version compatible with OSX 10.4 Tiger through to 10.7 Lion (minimum Intel / ppc 10.4)
(since v3.6, an older version, v3.5 was offered to Tiger users)
Improvements to user interface: toolbar - customisation includes space and flexible space, contents of settings tab move to fill the space as main window is resized
Fixes problem of user not being able to get main window open again if closed
Fixes bug causing base href not to be discovered which could lead to many improperly-constructed relative urls
Fixes distance column in flat view
Version 3.7
released August 2011
OSX 10.7 Lion compatible
Improves 'My Sites' - allows the same url to be saved more than once with different settings.
Version 3.6
released May 2011
Ability to import list of links, either html format or plain text list
Online manual linked from Help menu, includes instructions for crawling sites locally and importing a list of links
Moves list of sites from drop-down list to 'my sites' pop-out drawer
'last checked' date and status is stored and displayed
Last used settings are saved and visible on launch
Minimum system requirements now Intel / 10.5
Version 3.5.4
released March 2011
Fixes bug relating to empty href's.
improved reporting of link text and page titles which contain non-ascii characters.
Version 3.5.3
released January 2011
efficiency improvements (using internal cache rather than copying data, object retention / release)
Version 3.5.2
released November 2010
New option to allow 'not followed' links to be excluded from sitemap.
Fixes bug preventing Integrity from recognising a link if it has a carriage return immediately after the a.
Version 3.5.1
released October 2010
German localisation added.
Fixes bug causing crashes if internet connection fails or isn't stable.
Allows copy of url from 'on page' column of link inspector (as per filenames, requires two single-clicks to select the url - note that a double-click opens the page and attempts to highlight the link on the page using a style set in Preferences).
Fixes bug causing crawl to stop if starting url is redirected.
Version 3.4.1
released October 2010
Fixes bug causing random crashes introduced with major changes in 3.4
Version 3.4
released October 2010
Better string handling for urls and link text - makes running more efficient and correctly displays link text which includes non-ascii (non-English) characters.
Reduced background status logging also makes for faster running.
Fixes bug preventing sorting of flat view with 'bad links only' showing.
Fixes bug preventing generation of flat view if 'bad links only' showing when crawl finishes.
Other small fixes.
Version 3.3.6a
released September 2010
Fixes bug which caused instability with certain sites when using more threads.
Version 3.3.6
released September 2010
Fixes bug causing random crashes, especially when losing internet connection
Adds option to highlight missing link urls (where href = "#" or "" )
Version 3.3.5
released July 2010
Fixes bug preventing 'highlight link on page' feature working properly.
Fixes bug preventing crawling if comment terminated with more than two dashes eg '--->'
Fixes bug which prevented proper crawling if return or other characters were present inside </script> tag.
Version 3.3.4
released June 2010
Fixes bug which prevented proper crawling if return characters were present inside the <a> tag.
Version 3.3.3
released May 2010
Fixes bug which could cause crashing if using a custom user-agent string.
Context help added for some options.
Version 3.3.2
released May 2010
Minor improvements when checking sites on a local drive; improves adding 'file://' before crawling, and fixes bug preventing proper crawling.
Version 3.3.1
released April 2010
Adds setting - 'don't check external links' - makes crawl faster if you only need to generate a sitemap.
Version 3.3
released January 2010
Checks distance of each url from home page. Can be displayed as a column in Integrity's table views and exported files. See Preferences to switch this column on or off.
Generates XML sitemap. Note that the sitemap will be generated according to settings for the url crawled. (ie it is important to have settings like 'page titles are unique' or 'ignore querystrings' set correctly). Priority can be filled in automatically based on distance from home page.
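The 'distance from home page' introduced in 3.3 can be pictured as the smallest number of clicks needed to reach a page. A breadth-first search over the link graph is one common way to compute that; the sketch below assumes this approach and uses a hypothetical `link_distances` function and a toy graph, as the changelog does not say how Integrity calculates it.

```python
from collections import deque


def link_distances(graph, home):
    """Return {page: clicks from home} for every page reachable from home.

    graph maps each page to the list of pages it links to.
    """
    distances = {home: 0}
    queue = deque([home])
    while queue:
        page = queue.popleft()
        for target in graph.get(page, ()):
            # The first visit in a breadth-first search is the shortest route.
            if target not in distances:
                distances[target] = distances[page] + 1
                queue.append(target)
    return distances
```

A value like this is what a sitemap priority could then be derived from, with pages closer to the home page given a higher priority.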
Version 3.2
released November 2009
Changes to the user interface. Current url is displayed in a combo box along with the 'go' button at the top of the main window. The settings for the current url (previously called 'current config') are now displayed in the default tab of the main window. Flat and sortable views are now switched using tab buttons at the bottom of the main window.
Option for checking broken images added. Image urls are denoted by [img src] in the link text column.
Bug fix - alt text is now correctly shown (if it exists) in the link text column when the link contains an image rather than text. For example: [linked image]:NHS Direct
Some improvements to saving / deleting of settings for current site.
Auto-complete added to main url combo box. However, this only works if you type the 'http' or 'www' or however the saved url starts.
Progress indicator added for 'recheck this link'. Response time and time stamp are also correctly updated.
Help and donate links updated.
Automatic checking for updates. Checks for updates on startup. If a new version is available, informs user and invites visit to download page.
Version 3.1.2
released October 2009
Explicitly doesn't handle cookies (random behaviour previously).
Will now pick up links within imagemap area tags.
Version 3.1.1
released March 2009
Fixes bug which stopped further crawling if initial page is redirected.
Small efficiency/speed improvement.
Fixes bug which could register incorrect links if a request is redirected more than once.
Version 3.1
released December 2008
Time stamp logged for each link checked
Views are now customisable - show or hide columns as you like. (Exported files reflect visible columns.)
"Redirected" no longer shows in status column as the information is available in its own column
New application icon with less transparency
Version 3.02
released December 2008
Fixes bug related to unquoted href's
Unique page titles option (was new with v3.0 - crawls site faster and more accurately if you set this option and if your page titles *are* unique) now defaults to off for existing configs; defaulting to on was causing confusion.
Version 3.01
released December 2008
Fixes bug preventing proper crawling of framesets
Fixes problem with pause/continue button
Fixes problem with About panel
Version 3
released December 2008
Adds 'Inspect Bad Links' to View menu (opens the first bad link in the link inspector)
Adds 'Next Bad Link' button to link inspector (moves the link inspector to the next bad link if there is one)
Adds two new tools to the toolbar for 'Inspect bad links' and 'Inspect selected link' and a 'Customise Toolbar...' menu item
Adds highlighting feature - double-click an 'On page' from the list in the link inspector, Integrity will open selected page and highlight selected link with coloured background or coloured border.
Adds drop-down lists to preferences allowing you to choose the style of the highlighting (border / background, style and width of border)
Adds 'Archive pages while crawling' checkbox to preferences (archives pages while crawling - asks you for a save location when crawl is finished).
Version 2.2.2
released September 2008
If the link is around an image rather than text, the 'link text' columns will display [img]: and the alt text of the image.
'Redirected to' column added to flat view.
Changes to button bar including addition of export as html, csv and text (tdl) buttons. Now properly autosaves user customisation.
More information in the status display - now also shows how many bad links have been found
Version 2.2.1
released September 2008
Was generating the 'flat view' multiple times, giving the impression of 'hanging' after crawling large sites using lots of threads. Bug fixed, and progress bar added.
Version 2.2
released July 2008
Server response time is logged. This is the time taken between Integrity sending the request and receiving the first response. This may not reflect the actual server response time if Integrity is running a large number of threads, or if the internet connection is busy
When Integrity has finished running, a 'flat' view is available, that can be sorted by any of the columns
Global preferences and current config are now combined into one tabbed window
Standard customisable toolbar added and main window rearranged. Stop is now renamed 'Pause'
Version 2.1
released June 2008
Crawls local files (drag the file into the 'starting URL' box)
Version 2.0 (beta)
Architecture / Logic changed. This fixes thread-safety issues (ie v1.x crashing on faster machines when using larger number of threads). Architecture change also makes v2 faster.
Now handles sites built using frames.
Max number of threads increased. This was limited in version 1.6.6 as a quick-fix to thread-safety issues. Max number of threads (when slider is in 'more' position) is now 29, was 7.
'Threads' are no longer really separate threads owned by Integrity, but simultaneous asynchronous requests.
Version 1.6.11
released May 2008
Fixes bug which was causing some links to be skipped on certain pages. Integrity's parser was getting confused sometimes by javascript on pages containing 'less than' and 'greater than' operators.
Other small fixes and efficiencies.
Version 1.6.10
released April 2008
Progress indicators added to export functions.
Link info window now shows all occurrences of a link alongside the link text for each occurrence.
Version 1.6.9
released April 2008
Fixes bug related to trimming which randomly prevented complete crawling of whole site.
Revised handling of incorrectly nested quotes - now correctly allows for apostrophes as part of url ( "/pdf/Educators'_Guide" ).
Help menu now links to support pages of peacockmedia.co.uk, 'Donate' menu option added.
Version 1.6.8
released April 2008
Routines for trimming whitespace, querystring etc rewritten in pure C, improving efficiency.
Better handling of incorrectly nested single/double quotes ( href = "http://..' )
Now correctly handles base href's which don't give a scheme (assumes http://)
Better trimming of whitespace, ie carriage returns and other control characters in unexpected places in the middle of <a ..> tags
Shows how many times a link occurs, not just how many pages it appears on (ie it may appear multiple times on same page).
Version 1.6.7
released March 2008
Fixes bug which prevented links being found on a page if the end of a comment and an 'end script' tag were adjacent to each other ( --></script> )
Version 1.6.6
released March 2008
Sends user-agent string in header - default is "integrity/1.6" but this can be changed (see Preferences) if your site needs Integrity to appear to be a recognised browser.
Other fixes and efficiencies.
Version 1.6.5
released November 2007
'whitelists' and 'blacklists' from the config are no longer case-sensitive.
some problems with mcms zref fixed. zrefs are now shown when good links are hidden.
links which are not checked because they are in the blacklist, are treated as good links. They are hidden when good links are hidden and are given no colour label.
"Hide good links" button has now become "Show bad links only". This subtle change means that links which have not been checked will not show and improves running.
Small fixes and efficiencies.
Version 1.6.4
released October 2007
Fixes problem with tab-delimited file export
Both tab-delimited and comma-separated exports are 'flat', ie each 'on page url' has its own row
Fixes crashes or problems caused by carriage returns or whitespace present within a quoted href (yes, some html has really unexpected features)
Ignores Javascript (anything between <script> tags)
More object retention fixes and small efficiencies
Version 1.6.2
released September 2007
'on page url' will now recognise 'http://peacockmedia.co.uk' and 'http://peacockmedia.co.uk/' as the same link. Therefore a broken link may more correctly be reported on a lower number of pages and the whole application is a little more efficient.
Recognises and reports 'zref' links, a difficult-to-find link inserted by Microsoft Content Management Server
other small efficiencies and fixes.
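The 1.6.2 change that treats a url with and without the trailing slash after the domain as the same page amounts to normalising each url before comparing. A small sketch of that idea, assuming a hypothetical `page_key` helper rather than Integrity's actual routine:

```python
from urllib.parse import urlsplit, urlunsplit


def page_key(url):
    """Return a normalised form of url for 'is this the same page?' comparisons."""
    parts = urlsplit(url)
    # An empty path and "/" both mean the root of the site.
    path = parts.path or "/"
    # Scheme and host are case-insensitive; the fragment never reaches the server.
    return urlunsplit((parts.scheme.lower(), parts.netloc.lower(), path, parts.query, ""))
```

Using the normalised key when recording which pages a link appears on means the two spellings of the home page are counted once rather than twice.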
Version 1.6.1
released July 2007
Some changes to improve stability
Version 1.6
released July 2007
Adds user-definable colour labels (see Preferences). A 'good link' is defined as server response code 2xx, redirected links include any 3xx code, a bad link is a 4xx code, and an 'error' is a 5xx server code or any other error.
Menu item added View > Info for Current Item (command-I), shows link inspector palette (previously only available via double-click in the main table).
Fixes bug causing crash if no internet connection.
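The colour label categories defined for 1.6 map directly onto ranges of status codes. The sketch below restates that mapping in code; the function name `colour_label` and the use of `None` for a request that produced no status code are assumptions made for the example.

```python
def colour_label(status_code):
    """Map a server response code to the link categories described for v1.6."""
    if status_code is None:
        # No response at all (timeout, no connection): "any other error".
        return "error"
    if 200 <= status_code <= 299:
        return "good"
    if 300 <= status_code <= 399:
        return "redirected"
    if 400 <= status_code <= 499:
        return "bad"
    # 5xx server codes and anything unexpected.
    return "error"
```

Each category can then be given its own user-chosen colour in Preferences.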
Version 1.5
released 28 May 2007
Supports base href.
Can now export tab-delimited text file along with CSV, plain text and HTML.
Improved HTML export - link urls are presented as links.
Adds 'Only follow links containing...' field.
Fixes bug allowing some 'commented out' urls to be tested.
Fixes bug preventing inspector window opening when some links double-clicked.
Preferences window added: allows choice of displaying 'on page' as url or page title.
Config Starting URL drop-down list behaviour improved.
Version 1.4.2
released May 21 2007
No longer parses and extracts links from error pages (eg 404 pages).
Now handles spaces in URLs (as long as correctly contained in single or double quotes).
Version 1.4
released April 22 2007
Fixes a problem in some earlier versions which prevented all links being found on some pages
HTML character entities in links are now 'un-encoded' (eg '&amp;' is replaced with '&') before link is checked.
If link appears on more than one page, main table now shows actual number of pages rather than "multiple"
'Re-Check Bad Links' feature added (under File menu)
Fixes problem with export to CSV for some sites.
NB. early copies of 1.4 give the version number as 1.3.1 in about box.
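The 'un-encoding' of HTML character entities added in 1.4 is needed because an href in page source is written with entities, while the request must use the real characters. Python's standard `html.unescape` does the same job and is used here purely as an illustration; the wrapper name `href_to_url` is hypothetical.

```python
import html


def href_to_url(raw_href):
    """Convert an href as written in HTML source into the url to request."""
    # Entities such as &amp; are replaced by the characters they stand for.
    return html.unescape(raw_href.strip())
```

So an href written as page.php?a=1&amp;b=2 is checked as page.php?a=1&b=2.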
Version 1.3.1
released April 7 2007
Fixes problem with the 'don't check URLs containing' feature which didn't work properly in v1.3
Fixes problem which caused some links to be missed
Small improvement to the stop button
Version 1.3
released April 6 2007
'This page only' checkbox added.
Status display more accurately shows number of links done.
Programme flow, thread safety and object retention improvements. Cures an instability which seemed to be related to websites which have large collections of external links and/or setting a larger number of threads.
Fixes bug preventing some link text from being recorded properly.
For some file types which may be larger files (pdf, mpg, mp3, jpg) the parser no longer sends an http request to check the 'Content-Type', speeding up the crawl time.
Version 1.2
released March 29 2007
Now tolerant to excessively long hrefs (previously hrefs over 1000 characters would break an internal limit and cause the application to crash).
Timeout can now be set in the config window. Using a very large number of threads can obviously make timeouts more likely and so the timeout figure can now be increased accordingly.
The link inspector window (double-click an entry in the main table) now shows the 'on page' list in a form which is clickable. A double-click will open the page in question.
The HTML report now shows the 'on page' column as links to the page in question.
Version 1.1
released March 26 2007
Link text shows up for more links - link text is still only held once regardless of how many instances of that link are found on the site, but if a link has no text (eg image link), then that will not overwrite the existing link text.
Ignores javascript links as well as mailto links.
Fixes bug triggered by a return within the tag.
Fixes bug which could prevent all links being found on certain pages.
Version 1.0
released March 25 2007
First non-beta release, free and not set to expire. Not generally released, but provided to 2 magazine coverdiscs.
Version 0.5 (Beta)
released March 22 2007
corrected problem which allowed cached data to be checked - new data is now requested every time.
Fixes bug which could prevent some links being found if javascript present in page.
Version 0.4 (Beta)
released March 21 2007
Bug fixed which prevented some relative URLs from being formed correctly
Displays better information about any redirected urls. The final status code shown is the status for the final (redirected to) URL
Link text included as column in main table
Change to programme flow and a number of small refinements and efficiency improvements meaning that the application remains responsive throughout larger crawls.
Bug fixed which prevented some configs saving properly
Version 0.3 (Beta)
released March 7 2007
Improved interface, added 'Continue' button, allows Integrity to be paused and re-started.
Exporting - results can be exported as HTML, CSV or plain text.
Version 0.2 (Beta)
released March 1 2007
Fixes bug preventing Integrity from following links where html is all uppercase.
Version 0.1 (Beta)
released Feb 2007
