The worst crime for me is Liquid Glass on the Apple Watch.
All the menus now lag on my Watch Ultra 2. Where interaction used to be smooth, the random lags make it annoying and inconvenient to use. (I have to focus my attention on the UI state instead of following an automatic procedure from muscle memory.)
The battery sometimes randomly drains within less than a day.
There are absolutely no benefits to the new visual effects.
The watch was my favorite Apple device because it helped me reduce screen time on my phone. Now it is a source of anger.
The biggest threats to innovation are the giants with the deepest pockets.
Only 5% of ChatGPT traffic is paid; 95% is given away for free.
Gemini CLI has a generous free tier for developers, and it is easy for startups to get Gemini credits for free. These giants can afford to dump prices for a long time, until the smaller players starve.
How do you compete with that as a small lab? How do you get users when bigger models are free?
At least the Chinese labs are scrappy and determined. They are the small David here, IMO.
If you want reliable WiFi at home, get yourself Ubiquiti access points and throw away TP-Link. The issue is not the protocol. After many years of unplugging my TP-Link router and plugging it back in, I know that they are cursed.
I think it's more a case of: don't use cheap consumer-grade hardware in any kind of remotely demanding scenario.
I have "enterprise" TP-Link equipment in my house, which I bought 3 years ago, and I'm very happy with it. In particular I'm using:
- 4x EAP245 Access Points
- 1x SG3428 Switch (the APs came with PoE injectors and I wanted a fanless switch, which is why the switch is not PoE-enabled)
I rent out a room on my property and have my tenant on a separate VLAN to the main house. I also have my IoT devices on a separate VLAN.
I use a generic PC with pfSense as my "router".
My only complaint is that their Omada Controller software doesn't want to run as a Windows service (I'm not interested in trying to manage a Linux box). Fortunately, it's not required at all, but it is useful for centralized configuration management and for facilitating handover of WiFi clients between APs.
Before I moved into my current large-ish place, I used "cheap" ISP-supplied TP-Link routers with WiFi, and aside from limited speed capabilities, they were 100% reliable for me. In particular I used the following two models:
- TL-WR840N
- Archer C20
I also use a few cheap (but again fully reliable) 5-port and 8-port TP-Link gigabit switches, for example under my desk in my office to allow both my laptop and desktop to share the single CAT6 cable to the room.
Before buying the "enterprise" TP-Link equipment I considered Ubiquiti, but the TP-Link stuff was less expensive and I liked the controller being optional. Considering that reliability had been a non-issue with all my past TP-Link equipment, I was happy to "take a risk".
It's tough. On one hand, TP-Link has some weird issues. On the other, I spent a while debugging an issue with their engineers and they seem to actually care about improving things. Maybe the lower price is worth it sometimes.
Or better yet, Aruba Instant On. Enterprise hardware for SOHO money, truly plug and play, rock solid, no tinkering involved (which can be both good and bad depending on one's goals).
Can relate. I went through various consumer-ish access points until I got tired of it and splurged on a Ubiquiti.
Haven't had any WiFi problems since. To the point that I don't remember what WiFi standard my home is on :)
Too bad they may or may not have given up on the cloud connectivity requirement. I've been told (even on here) they have, but I've also been told that you can disable it after setup instead of setting up without any stinking cloud.
Say, did Ubiquiti stuff work during the AWS outage?
As far as I know, devices like access points only need the controller for configuration or monitoring. Once they are configured, they work completely without it.
What does that mean, that the controller won't work without a cloud account? :)
It doesn't really matter which part in the chain requires the stinking cloud as long as they sneak it in somewhere.
Speaking of which, I just bought a Razer mouse again because I've read they gave up on the login requirement for configuring your blinkenlights. But they didn't. They invented a 'guest login' instead.
You can run the controller locally or in the cloud. When it is running locally you have the option to tunnel to it through the cloud, but otherwise it runs completely locally.
AFAICT, the controller is needed for fast roaming of clients.
I have the UAP-AC-LR and if you only need basic AP functionality you can even configure it from the phone (no cloud). 99% sure no cloud stuff is needed when self-hosting the controller. May have changed since then, though.
Yes, you can just not enable it; it is disabled by default. Maybe be a bit careful if you register everything through the app on your phone, though. I think that makes it easy to enable the cloud tunnelling.
Indeed. You need the controller running before you can configure APs, switches, and gateways.
Better yet, buy actual enterprise gear, even if it's a generation behind. You can find solid Aruba, Ruckus, and Cisco kit on eBay going for decent prices.
Anecdote time: I've had their BE550 WiFi 7 router for over a year now and it's been rock solid. It easily does 2Gbps over WiFi, I've never had to reset it once despite having 40+ devices connected all the time, and the 4x 2.5Gbps ports are super useful. It is one of their more expensive devices, so maybe that's why, but generally it's been very, very solid.
I own two consumer-grade Deco XE75 access points, which I bought several years ago as the most cost-effective 6E-compatible access points available. They have proven to be exceptionally reliable.
Although I have had significant WiFi issues in the past, I don't see a need to replace these devices even now that WiFi 7 is available.
My Deco seems to be really poor at picking a channel at boot. When I have performance issues, running the optimizer almost always moves it to a different channel and solves them.
I just ran the optimiser after reading your post and it changed the 2.4GHz channel - I think the main reason I've been so happy is that the laptops and smartphones I primarily use support 6GHz where there is no interference and channel selection doesn't matter.
Interesting tech choices. I am also always on the hunt for React alternatives.
But the lack of type safety and static analysis usually leads to brittle templates. Stuff that could be expressed in verifiable code is compressed into annotations and some custom markup. You need to manually re-test all templates whenever you make any change to the models (a small sketch of what I mean is below). How do you deal with that?
Btw, very cool project. Deployments for simple projects are a huge time sink.
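To make that type-safety point concrete, here is a tiny, hypothetical Python sketch (the names are made up and have nothing to do with this project's actual templating): the string template keeps referencing a renamed model field and only fails at render time, while the equivalent typed code is flagged immediately by a type checker.

```python
from dataclasses import dataclass
from string import Template

@dataclass
class User:
    full_name: str  # hypothetically renamed from `name`; the template below still says $name
    email: str

# Untyped path: field names live inside a string, so a model change
# only blows up when the template is actually rendered.
UNTYPED_TEMPLATE = Template("Hello $name <$email>")

def typed_greeting(user: User) -> str:
    # Typed path: mypy or any IDE flags `user.name` the moment the model changes.
    return f"Hello {user.full_name} <{user.email}>"

user = User(full_name="Ada Lovelace", email="ada@example.com")
print(typed_greeting(user))                        # works, and is statically checked
# UNTYPED_TEMPLATE.substitute(vars(user))          # raises KeyError: 'name' at runtime
```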
I did just that and I ended up horribly regretting it.
The project had to be coded in Rust, which I kind of understand but had never worked with. Drunk on AI hype, I gave it step-by-step tasks and watched it produce the code. The first warning sign was that the code never compiled on the first attempt, but I ignored this, mesmerized by the magic of the experience.
Long story short, it gave me quick initial results despite my language handicap. But the project quickly turned into an overly complex, hard-to-navigate, brittle mess. I ended up reading the Rust in Action book and spending two weeks cleaning and simplifying the code. I had to learn how to configure the entire toolchain, understand various Cargo deps and the ecosystem, set up CI/CD from scratch, and so on. There is no way around that.
It was Claude Code Opus 4.1 instead of Codex but IMO the differences are negligible.
We used to have private bridges and private roads, and that made travel expensive for everyone. Now, internet search is kind of like a bridge that leads clients to businesses, and Google decides the tolls.
Government-controlled Internet search would definitely be horrible. But I wonder if there is a path towards more competition in this landscape: maybe ISPs could somehow provide free search as part of the Internet service fee? Can we have more specialized, niche search engines? Can governments be asked to break up the Google search monopoly?
Part of the problem here is that demand for classic search has dropped quite a bit, with LLM-based search finding wide usage among ordinary people.
You could single out Google for it, as the DoJ and some other entities are doing, but even in that case someone else, such as OpenAI or Perplexity, would take that place with the same dynamics.
Also, while building search is complex, it's not as unfathomable as it's made out to be; see [1], where an ML engineer built a production-grade search engine in 2 months with their own ingest, indexing, and storage infrastructure.
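To illustrate how small the indexing core can be, here is a toy sketch of the ingest -> index -> query loop (nothing to do with the engine in [1], which obviously also handles crawling, ranking, persistent storage, and so on):

```python
from collections import defaultdict

# Toy corpus standing in for ingested documents.
DOCS = {
    1: "rust borrow checker ownership",
    2: "python asyncio event loop",
    3: "rust async runtime tokio",
}

# Indexing: map each term to the set of documents containing it.
index = defaultdict(set)
for doc_id, text in DOCS.items():
    for term in text.lower().split():
        index[term].add(doc_id)

def search(query):
    """Boolean AND retrieval: return doc ids containing every query term."""
    terms = query.lower().split()
    if not terms:
        return set()
    results = index[terms[0]].copy()
    for term in terms[1:]:
        results &= index[term]
    return results

print(search("rust async"))  # {3}
```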
If end users find Google's bridges good enough, there isn't really much fight to be had unfortunately. Kagi is an example of competition, but the question is how many people actually take issue with how Google picks the results they show.
Anti-trust is needed when a company is interfering with competition.
I think there is no cost to switching search providers. Android is the one place where Google controls the OS, and two taps get me to a list of search providers in Chrome, with 5 choices. It's not clear how to add more providers, though.
I don't think having new search engines would be a challenge. The problem is attracting new users and making them realize there is a benefit in diversifying what they use. There are already other search engines, but the majority still use Google.
Definitely seems to have exacerbated the anti-trust issue. Search was search, but now they can insert AI and short-circuit the route to other products.
I just realized that Opus 4 is the first model that produced "beautiful" code for me. Code that is simple, easy to read, not polluted with comments, no unnecessary crap, just pretty, clean and functional. I had my first "wow" moment with it in a while.
That being said it occasionally does something absolutely stupid. Like completely dumb. And when I ask it "why did you do this stupid thing", it replies "oh yeah, you're right, this is super wrong, here is an actual working, smart solution" (proceeds to create brilliant code)
> Code that is simple, easy to read, not polluted with comments, no unnecessary crap, just pretty, clean and functional
I get that with most of the better models I've tried, although I'd probably personally favor OpenAI's models overall. I think a good system prompt is probably the best way there, rather than relying on some "innate" "clean code" behavior of specific models. This is a snippet of what I use today for coding guidelines: https://gist.github.com/victorb/1fe62fe7b80a64fc5b446f82d313... (a rough sketch of how that plugs in is below).
> That being said it occasionally does something absolutely stupid. Like completely dumb
That's a bit tougher, but you have to carefully read through exactly what you said, and try to figure out what might have led it down the wrong path, or what you could have said in the first place for it to avoid that. Try to work that into your system prompt, then slowly build up the system prompt so every one-shot gets closer and closer to being perfect on the first try.
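As a rough illustration of "work the guidelines into your system prompt", here is a minimal sketch assuming the OpenAI Python SDK and an API key in the environment; the guideline text is an invented placeholder, not the gist linked above.

```python
from openai import OpenAI

# Invented example guidelines; in practice this is your own reusable snippet.
CODING_GUIDELINES = """\
- Prefer small, pure functions with descriptive names.
- No commented-out code and no redundant comments.
- Handle errors explicitly; never swallow exceptions silently.
"""

client = OpenAI()  # reads OPENAI_API_KEY from the environment

def ask_for_code(task: str, model: str = "gpt-4o") -> str:
    """Send the same guidelines as the system prompt on every request."""
    response = client.chat.completions.create(
        model=model,
        messages=[
            {"role": "system",
             "content": f"You are a senior engineer. Follow these guidelines:\n{CODING_GUIDELINES}"},
            {"role": "user", "content": task},
        ],
    )
    return response.choices[0].message.content

if __name__ == "__main__":
    print(ask_for_code("Write a function that parses ISO-8601 dates."))
```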
My issue is that every time I've attempted to use Opus 4 to solve any problem, I would burn through my usage cap within a few minutes without having solved it, because it misunderstood things about the context and I hadn't gotten the prompt quite right yet.
With Sonnet, at least I don't run out of usage before I actually get it to understand my problem scope.
I've also experienced the same, except it produced the same stupid code all over again. I usually use one model (doesn't matter which) until it starts chasing its tail, then I feed the code to a different model to have it fix the first model's mistakes.
This is actually not true. I'm getting traffic from ChatGPT and Perplexity to my website, which is fairly new, launched just a few months ago. Our pages rarely rank in the top 4, but the AI answer engines manage to find them anyway. And I'm talking about traffic with UTM params / referrals from ChatGPT, not their scraper bots.
If ChatGPT is scraping the web, why can't they link tokens to their sources? Being able to cite where they learned something would explode the value of their chatbot, by at least a couple of orders of magnitude. Without this, chatbots are mostly a coding-autocomplete tool for me: lots of people have takes, but it's the tie back to the internet that makes a take from an unknown entity really valuable.
Perplexity certainly already approximates this (not sure if it's at the token level, but it can cite sources; I just assumed they were using RAG). A rough sketch of that pattern is below.
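Here is that retrieve-then-cite sketch, with a toy corpus and a word-overlap "retriever" standing in for whatever Perplexity actually does; the final answer step is left to an LLM and not shown.

```python
# Toy corpus standing in for crawled/ingested pages.
CORPUS = [
    {"url": "https://example.com/a", "text": "The Eiffel Tower is 330 metres tall."},
    {"url": "https://example.com/b", "text": "The Eiffel Tower opened in 1889."},
]

def retrieve(query, k=2):
    """Toy retriever: rank documents by word overlap with the query."""
    def overlap(doc):
        return len(set(query.lower().split()) & set(doc["text"].lower().split()))
    return sorted(CORPUS, key=overlap, reverse=True)[:k]

def build_prompt(query):
    """Number each retrieved source so the model can cite [1], [2], ... inline."""
    sources = retrieve(query)
    context = "\n".join(f"[{i + 1}] {d['text']} ({d['url']})" for i, d in enumerate(sources))
    return (
        "Answer using only the sources below and cite them inline.\n\n"
        f"{context}\n\nQuestion: {query}"
    )

print(build_prompt("How tall is the Eiffel Tower?"))
```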
That's asking for the life stories and photos and pedigrees and family histories of all the chickens that went into your McNuggets. It's just not the way LLMs work. It's an enormous vat of pink slime of unknown origins, blended and stirred extremely well.
I asked it once to simplify code it had written and it refused. The code it wrote was ok but unnecessary in my view.
Claude 3.7:
> I understand the desire to simplify, but using a text array for .... might create more problems than it solves. Here's why I recommend keeping the relational approach:
( list of okay reasons )
> However, I strongly agree with adding ..... to the model. Let's implement that change.
I was kind of shocked by the display of opinions. HAL vibes.
My experience is that it very often reacts to a simple question by apologizing and completely flipping its answer 180 degrees. I just ask for an explanation, like "is this a good way to do x, y, z?", and it goes "I apologize, you are right to point out the flaw in my logic. Let's do it the opposite way."
Isn't this like a brute-force approach?
Given it costs $3,000 per task, that's something like 600 GPU-hours (H100 on Azure).
In that amount of time the model can generate millions of chains of thought and then spend hours reviewing them, or even testing them out one by one. Kind of like trying things until something sticks, and that happens to solve 80% of ARC. I feel like reasoning works differently in my brain. ;)
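For what it's worth, the back-of-the-envelope rate implied by those two figures (both come from the comment above; the per-hour price is just what they imply, not a quoted Azure rate):

```python
# Implied GPU price from the comment's own numbers.
cost_per_task_usd = 3000
gpu_hours = 600
print(cost_per_task_usd / gpu_hours)  # 5.0 -> roughly $5 per H100-hour implied
```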
They're only allowed 2-3 guesses per problem. So even though yes it generates many candidates, it can't validate them - it doesn't have tool use or a verifier, it submits the best 2-3 guesses. https://www.lesswrong.com/posts/Rdwui3wHxCeKb7feK/getting-50...
Chain of thought can entirely self-validate. The OP is saying the LLM is acting like a photon: evaluating all possible solutions and choosing the most "right" path. Not quoting the OP here, but my initial thought is that it does seem quite wasteful.
The LLM only gets two guesses at the "end solutions". The whole chain of thought is about breaking out the context and levels of abstraction. How many guesses it self-generates and internally validates is just a function of compute power and time (a toy sketch of that loop is below).
My counterpoint to the OP here would be that this is exactly how our brain works. In any given scenario, we are also evaluating all possible solutions. Our entire stack is constantly listening and either staying silent or contributing to an action potential (either excitatory or inhibitory), but our brain is always "evaluating all potential possibilities" at any given moment. We have a society of mind always contributing opinions, but the ones that don't have much support essentially get "shouted down".
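Here is the toy sketch mentioned above of the generate-many, internally validate, submit-top-2 loop. Everything is a stand-in: the "chains of thought" are random draws from a tiny candidate pool, and "validation" is just scoring against the task's demonstration pairs, never the hidden test output.

```python
import random

# ARC-style task: demonstration (input -> output) pairs plus a test input.
TRAIN_PAIRS = [([1, 2, 3], [3, 2, 1]), ([4, 5], [5, 4])]
TEST_INPUT = [7, 8, 9]

# Hypothetical space of candidate "programs" the model might dream up.
CANDIDATES = {
    "reverse": lambda xs: list(reversed(xs)),
    "identity": lambda xs: list(xs),
    "sort": lambda xs: sorted(xs),
    "double": lambda xs: [2 * x for x in xs],
}

def self_score(program):
    """Internal validation: how many demonstration pairs does this candidate reproduce?"""
    return sum(program(inp) == expected for inp, expected in TRAIN_PAIRS)

def solve(n_samples=1000, n_submissions=2):
    # Stand-in for sampling many chains of thought: draw lots of candidates...
    sampled = {random.choice(list(CANDIDATES)) for _ in range(n_samples)}
    # ...rank them purely on the demonstrations (mirroring the 2-guess limit)...
    ranked = sorted(sampled, key=lambda name: self_score(CANDIDATES[name]), reverse=True)
    # ...and submit only the top few answers for the test input.
    return [(name, CANDIDATES[name](TEST_INPUT)) for name in ranked[:n_submissions]]

print(solve())
```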
The real turker studies, resulting in the ~70% number, are scored correctly I believe. Higher numbers are just speculated human performance as far as I’m aware.
The trick with AlphaGo was brute force combined with reinforcement learning to extract strategies from that brute force, and that's what we'll see here. So maybe it costs a million dollars in compute to get a high score, but use reinforcement learning à la AlphaZero to learn from the process and it won't cost a million dollars next time. Let it loose on lots of hard benchmarks, math problems, and coding tasks, and it'll keep getting better and better.
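Roughly the shape of that loop, as a toy sketch: an expensive "search" (here, exhaustively trying a handful of strategies) produces targets, and a cheap softmax "policy" is trained on them so the search doesn't have to be paid for again. Nothing AlphaZero-scale, just the pattern.

```python
import math

# Toy "task": four strategies with hidden payoffs; the policy starts uniform.
ACTIONS = ["brute_force", "heuristic_a", "heuristic_b", "good_strategy"]
TRUE_REWARD = {"brute_force": 0.2, "heuristic_a": 0.4, "heuristic_b": 0.5, "good_strategy": 0.9}
logits = {a: 0.0 for a in ACTIONS}  # the cheap, reusable policy

def softmax():
    total = sum(math.exp(v) for v in logits.values())
    return {a: math.exp(v) / total for a, v in logits.items()}

def expensive_search():
    """Stand-in for the million-dollar compute: try everything, keep the best."""
    return max(ACTIONS, key=lambda a: TRUE_REWARD[a])

# Expert iteration: each round, the search result becomes a training target
# and the policy takes a small cross-entropy step toward it.
for _ in range(200):
    target = expensive_search()
    probs = softmax()
    for a in ACTIONS:
        logits[a] += 0.1 * ((a == target) - probs[a])

# After training, the policy alone strongly prefers the good strategy,
# without paying for the search again.
probs = softmax()
print(max(probs, key=probs.get), round(probs["good_strategy"], 2))
```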
The best interpretation of this result is probably that it showed tackling some arbitrary benchmark is something you can throw money at, aka it’s just something money can solve.
It's not AGI, obviously, in the sense that you still need to do some problem framing and initialization to kickstart the reasoning-path simulations.
This might be quite an important point: if they created an algorithm that can mimic human reasoning but scales terribly with problem complexity (in big-O terms), it's still a very significant result, but it's not a 'human brains are over' moment quite yet.
Haha. Hopefully you’re right and solving the ARC puzzle translates to solving all of physics. I just remain skeptical about the OpenAI hype. They have a track record of exaggerating the significance of their releases and their impact on humanity.
Please do show me a novel result in physics from any LLM. You think "this guy" is stupid because he doesn't extrapolate from this $2MM test that nearly reproduces the work of a STEM graduate to a super intelligence that has already solved physics. Maybe you've got it backwards.
The problem is not that it is expensive, but that, most likely, it is not superintelligence. Superintelligence is not exploring the problem space semi-blindly, if the thousands of dollars per task are actually being spent on that. There is a reason the actual ARC-AGI prize requires efficiency: the point is not "passing the test" but solving the framing problem of intelligence.