The big problem with web archive sites is that they must comply with DMCA takedowns on a regular basis, which makes them susceptible to censorship.
The big problem with saving the page yourself is that it can't be verified; nobody trusts you.
This made me think: what if a site was created that does nothing but archive and share hashes? The site downloads the web page and all its content like a normal archiver would, but instead of sharing the page, it simply stores and shares the hash and deletes the content.
A user would create the archive on their own, in a format that matches the archive verification site. The user would then ask the site to verify and archive the hash. The user would not upload their version of the archive; the site would generate it on its own. The user would then check that their hash matches the verification site's hash (which it should if everything went smoothly).
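To make the flow concrete, here is a minimal sketch of what the site side could look like. Everything in it is an assumption for illustration: the names `HashRegistry`, `HashRecord`, and `fetch` are made up, SHA-256 is just one reasonable choice of hash, and the in-memory dict stands in for whatever storage a real site would use.

```python
import hashlib
import hmac
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class HashRecord:
    url: str
    sha256: str
    recorded_at: str  # ISO 8601 timestamp in UTC


class HashRegistry:
    """Hypothetical verification site: keeps hashes, never the content."""

    def __init__(self):
        self._records = {}  # url -> list of HashRecord

    def archive(self, url, fetch):
        # `fetch` is a placeholder for whatever builds the site's own
        # single-file archive of the page; it must return bytes.
        content = fetch(url)
        digest = hashlib.sha256(content).hexdigest()
        del content  # drop our reference; only the digest is stored
        record = HashRecord(url, digest, datetime.now(timezone.utc).isoformat())
        self._records.setdefault(url, []).append(record)
        return record

    def verify(self, url, archive_bytes):
        # True if these bytes match any hash recorded for this URL.
        digest = hashlib.sha256(archive_bytes).hexdigest()
        return any(
            hmac.compare_digest(digest, record.sha256)
            for record in self._records.get(url, [])
        )
```

The user would hash their own local archive the same way and compare it with the record the site returns. Anyone who later receives a shared copy could run the same check against the site without the site ever hosting the page.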
The user would then be able to freely share this web page and avoid censorship, and anyone who wants could verify the shared page as accurate by hashing it and comparing the result against the verification site. It would be trusted just as archive.is and the rest are assumed to be accurate about the contents of their archives, except that the verification site only vouches for the hash.
I wanted to bounce this idea off people here.
The key point here is whether or not a hash is considered copyrighted content. If I hash a proprietary file, website, or anything else proprietary, is the hash still protected by the original file's copyright? Everything I've been able to find on this says no.
The other issue is matching the format between the user's browser and the verification site, so that the hashes line up in the first place. MAFF and MHT look like they are dead. This would require a good amount of work, but there are already extensions that save a web page into a single file:
addons.mozilla.org
chrome.google.com
There aren't as many as I thought there would be, but they do exist. Save Page WE is GPLv2, so the method it uses to create the single-page archive could be copied verbatim on the verification site. Alternatively, a new format could be created for use on both the verification site and the user's browser via a WebExtension. I'm thinking of something like a tar of the output of the browser's "save complete page" (the folder and the file), though I'm not sure that would line up across browsers.
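For the tar idea, one catch is that a plain tar embeds timestamps, owners, and file order, so two people archiving identical files would still get different hashes. Below is a rough sketch of a normalized tar, assuming Python's standard `tarfile` module; the function name `deterministic_tar_sha256` is made up for illustration. It only normalizes the container. It does nothing about browsers rewriting the saved HTML differently, which is the harder part of the problem.

```python
import hashlib
import io
import os
import tarfile


def deterministic_tar_sha256(paths, root):
    """Tar files/folders with normalized metadata and return the SHA-256."""
    # Collect every regular file; empty directories are ignored.
    members = []
    for path in paths:
        if os.path.isdir(path):
            for dirpath, _dirnames, filenames in os.walk(path):
                for name in filenames:
                    members.append(os.path.join(dirpath, name))
        else:
            members.append(path)

    def arcname(p):
        # Path relative to `root`, always with forward slashes.
        return os.path.relpath(p, root).replace(os.sep, "/")

    # Fixed ordering so the input order and filesystem order do not matter.
    members.sort(key=arcname)

    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w", format=tarfile.GNU_FORMAT) as tar:
        for path in members:
            info = tarfile.TarInfo(name=arcname(path))
            info.size = os.path.getsize(path)
            info.mtime = 0      # ignore the real modification time
            info.mode = 0o644   # ignore the real permissions
            info.uid = info.gid = 0
            info.uname = info.gname = ""
            with open(path, "rb") as f:
                tar.addfile(info, f)
    return hashlib.sha256(buf.getvalue()).hexdigest()
```

With this, the saved `.html` file and its companion folder would hash the same on any machine as long as the file contents and relative names are byte-identical. Whether different browsers actually produce identical files is the open question.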
Is this idea worth exploring? Do you think anyone would bother to use it, or would people just continue to deal with the censorship on archive.is and the like?