Anybody notice the GAN captchas?

Has anybody else noticed the GAN-generated images in the Google captchas? Boats, trains, buses, planes, so far for me.

At a glance, they look like regular images. But usually if you look closely, a bunch will appear wrong. An airplane had the engine at 90 degrees to the fuselage. Buses and trains have wording that is gibberish. The boats have sails that come out in weird directions.

6 Likes

Too many times it has asked me for bicycles and I got buses and fences.

4 Likes

You know you can get 3/4 of the reCAPTCHA right and the remaining 1/4 wrong and still get away with it?

2 Likes

Yeah, I knew I could get away with not marking a few. But which is better for feeding the beast bad data? Marking the obviously wrong ones as “yep, that’s a bus”? Or not marking the fake buses? Or marking the GAN ones as real and not marking some of the real boats?

Do we really know what the REAL purpose of reCAPTCHA is?

1 Like

It could be from borked panoramic shots? I haven't really seen many weird captchas.

It could also be for catching bots trained on machine learning?

We really can't tell if it's working as intended unless someone in the forum actually works on those…

1 Like

I have seen this much more often on hCaptcha; for a while, there were many planes with two tail sections both landing into the other.

1 Like

I always thought it was training neural nets.

Or validating.

2 Likes

Haven’t noticed it, only captchas of crosswalks and traffic.

I’m surprised captchas haven’t implemented typographic or other neural net attacks
Also good resources

1 Like

I heard that originally, when it was words, it was to help with digitising books: text that scanned oddly or was damaged was presented to the human alongside a deliberately warped text that verified you were actually reading.

Since then, with the pictures, I assumed it was for training ML and AI or refining their edge cases so they can more easily recognise normal things in odd situations.

Maybe they have moved on to generating images now and are letting the humans be test subjects to see if they can fool us with their new generation's accuracy.

1 Like

It just keeps on getting creepier, ain't it?

2 Likes

From what I’ve heard, captchas are essentially a way of deterring lazy bots and getting free labour to label datasets. It did initially start from a large collection of scanned books and journals which Google were trying to transcribe automatically; they managed to build a large optical character recognition dataset with this technique, which allowed them to create the best algorithms at the time. Solving a captcha then was non-trivial, but now it’s really just a question of throwing the right model and amount of compute power at it.

Supposing they have a model that correctly labels the images with 60% accuracy, they can make the assumption that any set of answers that matches about 60% of their model’s predictions was given by a human whose own accuracy is closer to 90%-100%. This isn’t great if they are dealing with single samples, as there will be bots and inaccurate guesses, but they can have the exact same image be labelled by hundreds of test subjects, which means they can then do some statistical analysis over larger samples to get a more robust guess. It basically enables them to precisely label enormous datasets using free labour, given they already have a model that performs better than random guessing.
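
To make the idea above a bit more concrete, here is a rough Python sketch. To be clear, this is only my guess at how such filtering and voting could work, not how reCAPTCHA actually does it: the function names, the 0.6 threshold and the toy data are all made up for illustration. It keeps only respondents whose answers agree with the weak model often enough, then takes a majority vote per image among those respondents.

```python
from collections import Counter

def agreement_with_model(answers, model_predictions):
    """Fraction of one respondent's answers that match the model's guesses."""
    shared = [img for img in answers if img in model_predictions]
    if not shared:
        return 0.0
    matches = sum(answers[img] == model_predictions[img] for img in shared)
    return matches / len(shared)

def aggregate_labels(responses, model_predictions, min_agreement=0.6):
    """Majority-vote label per image, using only 'trusted' respondents.

    responses: {respondent_id: {image_id: label}}
    min_agreement: hypothetical threshold, set here to the model's accuracy.
    """
    trusted = [
        answers for answers in responses.values()
        if agreement_with_model(answers, model_predictions) >= min_agreement
    ]
    votes = {}
    for answers in trusted:
        for img, label in answers.items():
            votes.setdefault(img, Counter())[label] += 1
    return {img: counts.most_common(1)[0][0] for img, counts in votes.items()}

# Toy data (invented): the model is right on 3 of 5 images, i.e. 60% accurate.
truth = {"a": "bus", "b": "boat", "c": "plane", "d": "train", "e": "bus"}
model = {"a": "bus", "b": "boat", "c": "plane", "d": "bus", "e": "boat"}

responses = {
    "human1": dict(truth),                 # all correct, agrees with model on 3/5
    "human2": dict(truth),                 # all correct
    "human3": {**truth, "d": "bus"},       # one mistake, agrees with model on 4/5
    "bot": {"a": "boat", "b": "bus", "c": "train", "d": "bus", "e": "plane"},
}

labels = aggregate_labels(responses, model)
print(labels)
```

In this toy run the bot agrees with the model on only one image out of five, so its answers are dropped, and the vote among the remaining three respondents recovers the correct label even for the two images the model got wrong. With real traffic you would need far more respondents per image and some handling for ties, which this sketch ignores.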