Hacker Newsnew | past | comments | ask | show | jobs | submit | bornfreddy's commentslogin

> The key: proxy_ssl_verify off — the new server’s SSL cert is valid for the domain, not for the IP address. Disabling verification here is fine because we control both ends.

Yeah - no, it's not. They made the MitM attack possible with this change. The exposure was limited to those 5 minutes, but it should have been a known risk.

Also not certain how they could check the apps on the new server with the read-only database, while it was a replica?

Still, nice to hear it succeeded, the reasons sound very familiar.


By continuously testing competitors and local LLMs? The reason for rising prices is that they (Anthropic) probably realized that they have reached a ceiling of what LLMs are capable of, and while it's a lot, it is still not a big moat and it's definitely not intelligence.

Anything but the simplest tooling is not transferable between model generations, let alone completely different families.

> Anything but the simplest tooling is not transferable between model generations, let alone completely different families.

It is transferable-yes, you will get issues if you take prompts and workflows tuned for one model and send them to another unchanged. But, most of the time, fixing it is just tinkering with some prompt templates

People port solutions between models all the time. It takes some work, but the amount of work involved is tractable

Plus: this is absolutely the kind of task a coding agent can accelerate

The biggest risk is if your solution is at the frontier of capability, and a competing model (even another frontier model) just can’t do it. But a lot of use cases, that isn’t the case. And even if that is the case today, decent odds in a few more months it won’t be


Yep. My approach has been, if I can’t reliably get something to 90+% with a flash / nano / haiku, then it’s not viable for any accuracy critical work. (I don’t know of or have the luck of having any other work.) Starting out with the pro / opus for any production classification work has always been a trick.

Ha. Sounds a lot like the one 10x vs. predictable mediocre guys with a scaffolding of processes. Aim high and hit or miss or try to grind predictably and continuously. Same with humans and depends on the loss you can afford.

If you're talking about APIs and SDKs, whether direct API calls or driving tools like Claude code or codex with human out of the loop, I think that's actually fairly straightforward to switch between the various tools.

If you're talking about output quality, then yeah, that's not as easy. But for product outputs (building a customer service agent or something like that), having a well-designed eval harness and doing testing and iteration can get you some degree of convergence between the models of similar generations. Coding is similar (iterate, measure), but less easy to eval.


For most tasks, at some future date, isn't there going to be some ambient baseline of capabilities you can get per $/tok, starting at ~0 for OSS models, such that eventually all tooling gets trivially transferable?

It's not that hard to make it generic. It does take a little work, but really it boils down to figuring out how to make things work with the "dumbest" model in your set.

Note that it is very likely this market can't sustain this level of competition for long. We are all still chasing the carrot of AGI, while hardware costs skyrocket.

Is there a way one can do DTP using LLMs? InDesign only integrates image generating, if I'm not mistaken. .

LLMs still seem to really struggle with layout. These design tools seem to work well for designs that flow naturally like webpages.

But try and design say an “An A4 poster with a hero image, main text saying this, details next to the image and fine print at bottom” and you end up with pretty poor results.


Or use mc (midnight commander).

This is actually where LLMs could be in advantage. Any code which is not clean (i.e. could be obfuscated) will trigger alarms and deeper inspection. It is much more difficult to create a good "underhanded" exploit that LLM will miss than it is to do the same for humans, imho.

LLMs are vulnerable to prompt injection attacks, so I'm not sure they are in advantage.

So in webmail, when you upload an image / file to attach it to an email, you expect it to be renamed? I don't.

Have you tried moving on from Google, and preferably not to Apple?

Yes, it’s trivial. What are you having difficulty with? There are plenty of threads here on HN about this

If you think it's trivial you must not be paying attention. You cannot keep your data from Google. Government websites include google tracking. Google drives past your house to take photos and sniff your wifi traffic. Your employer hands your data over to google. Your doctor hands your data over to google. Your bank hands your data over to google. You can limit how much you actively and voluntarily give them, but you can't free yourself from them entirely and still function in society.

Trivial? Ha! Way to say that you never tried it. Either that, or that you don't care for things like push notifications. Yes, most of the things work, but not nearly all of them.

+1 for Netguard, it is awesome. A bit clumsy UI, but indispensible.

Yeah, there are lots of pages that don't show the (google) map if you don't have google services enabled on your android phone. Not sure if this is something that could be solved on browser level though? I'm quite certain that these pages still work on iphones...

True. Sidenote: they are still however push notifications provider, so good luck getting rid of them completely (unless you're fine with not getting the notifications). MicroG is awesome wrt. that as you can turn it on/off as you wish, and it just works. GrapheneOS however only supports Google services in sandbox, but the notifications work sporadically IME (maybe because I keep turning them off and on... not sure). So... Pick your poison.

Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: