The part about bad Keras<->TensorFlow.js interop is classic TensorFlow. Using TF always felt like using a bunch of vaguely related tools put under the same umbrella rather than an integrated, streamlined product.
Actually, I'll extend that to saying every open source Google library/tool feels like that.
The appropriate response by 4chan to this: simplify the human's work, given that the captcha is easy to solve with NNs anyway. We are at a point where designing very hard captchas is highly likely to increase human annoyance without decreasing machine solvability.
Following the links to the captcha solving service, you can read profiles of the humans doing the work, where it's pitched as more ethical than them working in hazardous factories!
It might be worth noting that this captcha, along with the harder version the OP encountered, is not the hardest one that 4chan can serve. There is a still harder version which is sent to less trustworthy IPs. I imagine it would still be tractable with computer vision. This in part misses the point though, since 4chan has been continuously altering their captcha since it was released, making it difficult to create a permanent solution that won't be broken down the road.
Datacenter IPs can’t even post at all, never mind needing to solve a CAPTCHA. That’s why the accusations of “VPN shill” are usually wrong, as is the assumption of anonymity – 4chan is in fact one of the least anonymous sites on the internet. The optional username feature gives it a veneer of anonymity, but the strict IP requirements ensure almost every post is attributable to a residential internet connection, and reliably associable with other posts from that same connection.
Some datacenter IPs can post fine, mostly just not those belonging to any large hosting company. I would mention a list of ones I know aren't blocked, but, well, that might get them blocked.
That’s surprising to me. I assumed they were using some service (like Cloudflare) with an updated list of non-residential IP addresses.
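For anyone wondering what list-based blocking looks like mechanically, here is a minimal sketch. It is not how 4chan or Cloudflare actually do it; the `BLOCKED_RANGES` list and the `is_blocked` helper are hypothetical, and the ranges are reserved documentation networks (RFC 5737), not real hosting company ranges.

```python
import ipaddress

# Hypothetical blocklist. These are RFC 5737 documentation ranges,
# used here only as placeholders for "large hosting company" ranges.
BLOCKED_RANGES = [
    ipaddress.ip_network("192.0.2.0/24"),
    ipaddress.ip_network("198.51.100.0/24"),
]


def is_blocked(addr: str) -> bool:
    """Return True if addr falls inside any range on the blocklist."""
    ip = ipaddress.ip_address(addr)
    return any(ip in net for net in BLOCKED_RANGES)


if __name__ == "__main__":
    for candidate in ("192.0.2.15", "203.0.113.7"):
        print(candidate, "blocked" if is_blocked(candidate) else "allowed")
```

A hand-maintained list like this would explain why smaller hosts slip through: only ranges someone bothered to add get blocked.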
I’ve only ever tried to post through Cloudflare WARP (or Apple Private Relay, which is also Cloudflare but different exit IP range). Once I realized that didn’t work, I thought maybe it wasn’t worth posting at all :) I don’t like the idea of my ISP having any suspicion I posted to 4Chan (even if it’s technically https yadda yadda…)
That’s attributable with the right warrant and correlation with other data available to the ISP.
CGNAT is not an anonymity mechanism – at best it may be a very crude one, but the carriers will make extra effort to remove that anonymity through logging, retention, and segmentation.
That’s true, but to be fair my original comment also said posts would be reliably associable with other posts from the same IP. With CGNAT, that association will be slightly less reliable, but not meaningfully so. The segment of the population who posts on 4chan is so small that there is negligible chance of two 4chan users sharing an exit IP and time window. Even with non-overlapping time windows, the population will be small enough for stylometry (and other factors) to remove any remaining ambiguity.
I need to manipulate the data a bit, because right now it's just raw, unaligned foreground/background images with solutions. I need to do the alignment and save them as images rather than JSON files. I'll do that when I have the time.
I can only imagine how much worse they'll make the captcha for users once stuff like this picks up speed, all the while being ineffective against the bots.
Captchas are broken, forever. There is no way to prevent bots without also preventing a bottom tier of human users (visually impaired people, old people, or just impatient people). Like this xkcd [1] comic suggests, we need to just focus on rewarding and punishing specific behavior, regardless of whether the agent is human or not.
> The official TensorFlow-to-TFJS model converter doesn't work on Python 3.12. This doesn't seem to really be documented, and the error messages thrown when you try to use it on Python 3.12 are non-obvious. I tried an older version of Python (3.10) on a hunch, using PyEnv, and it worked like a charm.
Amazing. And then people wonder why "just use python 2" is still a thing.
Yeah, whenever I need to write a quick script and have no time to suffer "$library needs python 3.x, where x must be > $value and <= $value2, and not a prime except when that ends in a 3, except on leap days"
2 is stable and does not change from under you. Which is what you want in a programming language.
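For the converter complaint quoted above, a small guard at the top of a script at least makes the failure obvious. This is only a sketch based on that one report: 3.10 worked and 3.12 did not, and nothing here is checked against the converter's own documentation. The `converter_status` helper and both sets are hypothetical names.

```python
import sys

# Taken from the quoted report only: 3.10 worked, 3.12 did not.
# Every other version is untested here and reported as "unknown".
REPORTED_WORKING = {(3, 10)}
REPORTED_BROKEN = {(3, 12)}


def converter_status(version=None):
    """Classify a (major, minor) Python version against the quoted report."""
    if version is None:
        version = sys.version_info[:2]
    key = tuple(version[:2])
    if key in REPORTED_WORKING:
        return "reported working"
    if key in REPORTED_BROKEN:
        return "reported broken"
    return "unknown"


if __name__ == "__main__":
    major, minor = sys.version_info[:2]
    print(f"Python {major}.{minor}: {converter_status()}")
```

Calling this before invoking the converter beats decoding a non-obvious traceback after the fact.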
It's not like bots aren't already bypassing these CAPTCHAs. One author writing a blog post about how they accomplished what spammers and bots have been doing for ages isn't going to change anything.
I just opened 4chan and after the initial Cloudflare bot detection I was told to register an email or wait 15 minutes before I was allowed to even obtain a CAPTCHA. Looks like they're already taking a layered approach to combat bots.