Challenge
Compress the entire Bee Movie script so that it fits on a single A4 page and is still readable by a human with the naked eye.
My approach was to count word frequencies and add a word to a dictionary only if it appears often enough for its length that substituting it actually saves characters. (I had a function that decides this by weighing the characters the dictionary entry costs against the characters saved across all substitutions of that word.) The dictionary is written after the text, separated from it by an empty line. The keys are upper-case base-36 numbers (so they take up fewer characters), and I substituted them in wherever their corresponding words appeared in the text, which was all lower case and had every non-alphanumeric character stripped except apostrophes.
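A minimal sketch of that idea in Python, in case it helps anyone get started. It is not my exact code: the function names, the "KEY word" dictionary layout, the greedy ordering by count times length, and the rule that skips any key that already occurs as a word in the text (so digit-only words can't be mistaken for keys) are all assumptions made for illustration.

```python
import re
from collections import Counter

DIGITS = "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ"

def to_base36(n):
    """Encode a non-negative integer as an upper-case base-36 string."""
    if n == 0:
        return "0"
    out = ""
    while n:
        n, r = divmod(n, 36)
        out = DIGITS[r] + out
    return out

def normalise(text):
    """Lower-case the text and keep only runs of letters, digits and apostrophes."""
    return re.findall(r"[a-z0-9']+", text.lower())

def saves_characters(word, count, key):
    # Every substitution saves len(word) - len(key) characters.
    # Assumed cost of one dictionary entry: "KEY word" plus a line break,
    # i.e. roughly len(key) + len(word) + 2 characters.
    return count * (len(word) - len(key)) > len(key) + len(word) + 2

def compress(text):
    words = normalise(text)
    counts = Counter(words)
    present = set(words)
    table = {}
    n = 0
    # Greedy: the words that occupy the most characters get the shortest keys.
    for word, count in sorted(counts.items(), key=lambda kv: -kv[1] * len(kv[0])):
        key = to_base36(n)
        while key in present:  # a digit-only key could collide with a real word
            n += 1
            key = to_base36(n)
        if saves_characters(word, count, key):
            table[word] = key
            n += 1
    body = " ".join(table.get(w, w) for w in words)
    dictionary = "\n".join(f"{key} {word}" for word, key in table.items())
    return body + "\n\n" + dictionary  # dictionary follows an empty line

def decompress(blob):
    body, _, dictionary = blob.partition("\n\n")
    lookup = dict(line.split(" ", 1) for line in dictionary.splitlines() if line)
    return " ".join(lookup.get(token, token) for token in body.split(" "))

sample = "according to all known laws of aviation " * 5
packed = compress(sample)
print(len(" ".join(normalise(sample))), "->", len(packed))
```

Because the text is all lower case and the keys are upper case, a key can only clash with a word when it consists of digits alone, which is why the sketch checks for that one case and nothing else.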
The original text is too long to post here, so pastebin!
The compressed text is only just over the character limit for a post here, so pastebin too, I guess. :L
The original was 46888 characters long and the compressed result was 35513. That's about 24% gone.
I challenge you to do this better.
Bigger font size perhaps.
I got it onto one page with narrow page margins (1.27 cm on all sides), in Times New Roman at 4.5 pt.
Can be read with the naked eye. I'd call it a success.
Shouldn't be hard to do this better though. You could go lossy and remove all the vowels, I guess; you can still read the vast majority of English text without vowels. This is by no means the answer. It's just a start.
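For anyone who wants to try the lossy route, here is a rough sketch of vowel stripping. I haven't run it on the script; the function name and the keep_first option (which keeps each word's first letter so that words like "a" and "I" don't vanish entirely) are my own assumptions, not part of the approach above.

```python
import re

def strip_vowels(text, keep_first=True):
    """Lossy: drop a, e, i, o, u from every word of the lower-cased text."""
    def shrink(match):
        word = match.group(0)
        # Optionally protect the first letter so short words survive.
        head, tail = (word[:1], word[1:]) if keep_first else ("", word)
        return head + re.sub(r"[aeiou]", "", tail)
    return re.sub(r"[a-z0-9']+", shrink, text.lower())

print(strip_vowels("according to all known laws of aviation"))
```

Setting keep_first=False saves a few more characters but makes words that consist only of vowels disappear, which is probably a step too far for readability.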