Live captions? Didn’t ask for that, wouldn’t use it.
Dubbing? Ditto.
Summary? Wouldn’t trust an AI for that, plus it’s just more tik-tokification. No fucking thanks. I don’t need to experience life as short blips of everything.
Rewrite text better? Might as well kill myself once I’m ready to let a predictive text bot write shit in my place.
Yes, Translate is the only one I want - and we already have that!
The worst is anything that tries to suggest stuff in text fields or puts buttons etc. to try and get you to "rewrite with AI" or any nonsense like that - makes me just want to burn anything like that to the ground.
But that’s exactly my point - addons already solve these problems without baking them in natively. Adding AI just creates bloatware/privacy/security/maintenance problems that are already solved by users being able to customise the browser for their own needs.
I do get that and I'm like 60% with you, but I'm just saying that it is easy to get a bit in a bubble and Mozilla needs to cater to the average person. And let's be honest, we aren't the average user.
Personally I'm fine as long as it continues to be easy to disable and remove. Yeah, I'd rather it be opt-in instead of opt-out but it's not a big price to pay to avoid giving Chrome more power over the internet. At the end of the day these issues are pretty small fish in comparison.
I mean, Chrome/Google have already won the browser wars and it isn't even close. 'Average' persons don't use Firefox, period - they use Chrome. I dunno when you last looked at browser market share, but Firefox is already extremely niche. Trying to cater to the 'average user' when your entire userbase consists of power users is asinine but Mozilla clearly doesn't understand this. They think it's still 2008 or something.
I've successfully migrated my girlfriend, parents, and several friends. Half those friends don't even know how to program. So yes, normal people can use Firefox and they really don't notice the difference.
> Chrome/Google have already won the browser wars
It isn't over till it's over. It's trivial to make a stand in this fight. It is beyond me why a large portion of HN users aren't using FF or one of its derivatives. Of all people they should be more likely to understand what's at stake here...
Yes I use FF. You’ve completely misunderstood my point.
Your comment about how YOU had to get the people close to you to use FF was exactly my point. Techies are the only people who use FF now without it being foisted onto them by their techie friends.
You personally wouldn't use live captions and dubbing, so there's no point building it for the millions of people who need it as an accessibility feature?
Couldn’t care less about any of that. English is the world’s dominant language and will remain so for the foreseeable future. There’s nothing wrong with that. And subtitles exist already or can be generated by addons. Most people don’t use them. So, once again, maybe don’t inconvenience the vast majority of users for some small subset of the population.
> English is the world’s dominant language and will remain so for the foreseeable future.
Based on the fact that you said this, I'm going to assume you can't read/write Mandarin (apologies if that's incorrect). That leads to my second assumption: you're unaware of the astonishingly vast amount of content and conversation related to open source and AI/ML that you're missing out on as a result of not being able to read/write Mandarin.
What does what you wrote have to do with what I wrote, or the comment I was replying to? Literally every reasonably educated Chinese person speaks English as a 2nd language.
I'm missing out on all sorts of shit I'd find interesting by virtue of not being a prodigious polyglot. That fact has nothing to do with English being the global language for literally everything in every domain, nor with the fact that in-browser language translation doesn't require baked-in AI.
I mean, sure. I don’t generally give a shit about other people. That’s also not really relevant here. There will always be a dominant language. Currently, it happens to be English and it will remain English into the near future (250+ years). If you attend even a shitty school in a third world country today you are taught English as a second language. Look at the Philippines or sub-Saharan African countries. Everybody speaks English + their native language.
Crying about English’s global penetration is super weird while also being pointless, since it’s a fait accompli at this point.
I get very annoyed by generative AI, but to be fair I could imagine an AI-powered "Ctrl+F" which searches text by looser meaning-based matches, rather than strict character matches; for example Ctrl+AI+F "number of victims" in a news article, or Ctrl+AI+F "at least 900 W" when sorting through a list of microwave ovens on Walmart.
Or searching for text in images with OCR. Or searching my own browsing history for that article about that thing.
>"at least 900 W" when sorting through a list of microwave ovens on Walmart.
Newegg has that as a built in filter.
Why do you people keep insisting I "need" an LLM to do things that are standard features?
I find shopping online for clothes to suck, but there's nothing an LLM can do to fix that because it's not a magic machine and I cannot try on clothes at home. So instead, I just sucked it up and went to Old Navy.
Like, these things are still lying to my face every single day. I only use them when there's no alternative, like quickly porting code from Python to Java for an emergency project. Was the code correctly ported? Nope, it silently dropped things of course, but "it doesn't need to be perfect" was the spec.
>Or searching for text in images with OCR.
That thing that was a mainline feature of Microsoft OneNote in 2007 and worked just fine and I STILL never used? I thought it was the neatest feature but even my friend who runs everything out of OneNote doesn't use it much. Back in middle school we had a very similar Digital Notebook application that predates OneNote with a similar feature set, including the teachers being able to distribute Master copies of notes for their students, and I also did not use OCR there.
The ONE actual good use case of LLMs that anyone has offered me did not come from techbros who think "Tesla has good software" is not only an accurate statement but an important point for a car, it came from my mom. Turns out, the text generation machine is pretty good at generating text in French to make tests! Her moronic (really rich of course, one of the richest in the state) school district refused to buy her any materials at all for her French classes, so she's been using ChatGPT. It does a great job, because that's what these machines are actually built for, and she only has to fix up the output occasionally, but that task is ACTUALLY easy to verify, unlike most of the things people use these LLMs for.
She STILL wouldn't pay $20 monthly for it. That shouldn't be surprising, because "Test generator" for a high school class is a one time payment of $300 historically, and came with your textbook purchase. If she wasn't planning on retiring she would probably just do it the long way. A course like that is a durable good.
Not at all. If you want or need a feature it's not some "my browser has to support it or my OS does" dichotomy.
As a couple parents up stated, there's no technical reason a browser has to have a transformer embedded into it. There might be a business reason like "we made a dumb choice and don't have the manpower to fix it", but I doubt this is something they will accept, at least with a mission statement like they have.
I much prefer every individual piece of software and website I interact with implement their own proprietary AI features that compete for my attention and interfere with each other.
Cool, and some DEs make it possible to start implementing this for most applications today. But Mozilla is not KDE or Gnome, so the most they can do is implement this in their own software and make it easy to copy for the entire system.
> Sounds like a web site, not a browser feature.
Sounds like a bit of lack of imagination on your part. Do you think the same for text search?
Exactly. Would be nicer if they did their own features somewhat right (including interfaces for configuration and disabling that are approachable for non-engineers) before they scope-creep the entire desktop.
No, even when they switched to machine learning their translations still made mistakes that would have made you look goofy. And even today their models still make mistakes that are just weird.
It is especially baffling because Google has much better data sets and much more compute than their competitors.
What a tech CEO asks for is "a text box with magic". Google Translate fulfills that, and there are ways to integrate it with an LLM if technology marketing is important.
Unless it is nVidia's CEO, who wants to sell specific hardware, they mostly care about the buzz of the term, not a specific technology, though.
It has everything to do with it. Mozilla explicitly talked about AI in the context of their relatively new translation feature a year or two back. Live captions also use "AI". The term AI includes machine learning in marketing speak.
If that were the case, Firefox would already be an AI browser. But he wouldn’t be talking about AI browsers if he planned on maintaining the current features and approach, would he?
They just existed before the GenAI craze and no one cared because AI wasn't a buzzword at the time. Google Translate absolutely was based on ML before OpenAI made it a big deal to have things "based on AI".
But just putting stuff in your browser that hooks into third-party services that use ML isn't enough anymore. It has to be front and center; otherwise, you're losing the interest of... well, someone. I'm not sure who at this point. I don't care, personally.
Many of these things were "AI" but the marketing hype hadn't gotten there yet. E.g. the local translation in FF is a transformer model, as was Google Translate in the cloud since 2018 (and still "AI" looong before that, just not transformer based).
Safari does most of this by leveraging system-level AI features, some of which are entirely local (and in turn, can be and do get used elsewhere throughout the system and native apps). This model makes a lot more sense to me than building the browser around an LLM.
Firefox uses local models for translation, summarisation and possibly other stuff. As it is not restricted to one platform, I guess that it has to use its own tools, while Apple (or macOS/iOS-focused software in general) can use system-level APIs. But the logic, I guess, is the same.
Image search?
Live captions?
Dubbing?
Summary?
Rewrite text better?