
©2024 Poal.co


A new "Archive" button now appears under post titles, and a few selected trusted users can request an archive of posts that are worth archiving.

Pauline archives a web page as a single HTML file with all the images embedded as base64, then posts a sticky comment with the archive link, and finally sends a notification to both the archiver and the OP.

Here's an example: https://poal.co/s/Biden/482270
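For anyone curious how the "single HTML file with base64 images" part can work in principle, here is a minimal Python sketch. It is not Pauline's actual code: the names `inline_images`, `to_data_uri`, and the `fetch` callback are made up for illustration, and the regex only handles plain quoted `src` attributes on `<img>` tags.

```python
import base64
import mimetypes
import re
from typing import Callable

# Simplified pattern: an <img> tag with a quoted src attribute.
# Real pages (srcset, CSS backgrounds, unquoted attributes) need a proper parser.
IMG_SRC = re.compile(r'(<img\b[^>]*?\bsrc=)(["\'])(.*?)\2', re.IGNORECASE | re.DOTALL)


def to_data_uri(url: str, data: bytes) -> str:
    """Encode raw image bytes as a data: URI, guessing the MIME type from the URL."""
    mime, _ = mimetypes.guess_type(url)
    mime = mime or "application/octet-stream"
    encoded = base64.b64encode(data).decode("ascii")
    return f"data:{mime};base64,{encoded}"


def inline_images(html: str, fetch: Callable[[str], bytes]) -> str:
    """Replace every <img src="..."> with an embedded base64 data URI.

    `fetch` is a hypothetical callback that returns the bytes behind a URL,
    so the download strategy stays outside this sketch.
    """
    def replace(match: re.Match) -> str:
        prefix, quote, url = match.groups()
        if url.startswith("data:"):
            return match.group(0)  # already embedded, leave untouched
        return f"{prefix}{quote}{to_data_uri(url, fetch(url))}{quote}"

    return IMG_SRC.sub(replace, html)
```

Calling `inline_images(page_html, fetch)` with a `fetch` that downloads each URL returns HTML that no longer depends on the original image hosts, at the cost of a larger file (base64 adds roughly a third to the image size).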


(post is archived)

[–] 1 pt

Fantastic stuff!

Not sure how you're rendering the pages, but if you're using something like Selenium, you may be able to include an ad blocker to remove site annoyances such as popups before rendering the final HTML file.

Awesome

[–] 1 pt

I'm using a headless Chrome engine.

So adding a popup and ad blocker is in the works.

The only problem is with websites that use Cloudflare: I need to write a function to randomize the User-Agent, as CF seems to block the archiving after a while.
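A rough Python sketch of what such a function could look like, assuming Chrome is launched from the command line. The `USER_AGENTS` pool and the `chrome_args` helper are hypothetical, the User-Agent strings are only examples, and whether rotating them is enough to keep Cloudflare from blocking is untested here.

```python
import random

# Example User-Agent strings only; a real pool should track current browser releases.
USER_AGENTS = [
    "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 "
    "(KHTML, like Gecko) Chrome/120.0.0.0 Safari/537.36",
    "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 "
    "(KHTML, like Gecko) Chrome/120.0.0.0 Safari/537.36",
    "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 "
    "(KHTML, like Gecko) Chrome/120.0.0.0 Safari/537.36",
]


def chrome_args(url: str, rng=random) -> list[str]:
    """Build headless Chrome arguments with a randomly chosen User-Agent.

    `rng` can be the `random` module or a seeded `random.Random` instance,
    which makes the choice reproducible when debugging.
    """
    user_agent = rng.choice(USER_AGENTS)
    return [
        "--headless=new",               # run Chrome without a visible window
        f"--user-agent={user_agent}",   # override the default User-Agent
        "--dump-dom",                   # print the rendered DOM to stdout
        url,
    ]
```

The returned list would be appended to the Chrome binary path (which varies by system) and passed to something like `subprocess.run`, so each archive request goes out with a different User-Agent.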

[–] 0 pt (edited)

You could look at cfscrape (github.com) for some inspiration.

Edit: this could help you out (github.com)

[–] 0 pt

Thanks! Saw that earlier, but I'm busy preparing the Christmas Eve dinner menu and stuff ;)

I'll get back to it in a few days.