
> This is a 30B parameter MoE with 3B active parameters

Where are you finding that info? Not saying you're wrong; just saying that I didn't see that specified anywhere in the linked page, or on their HF.


The link[1] at the top of their article to HuggingFace goes to some models named Qwen3-Omni-30B-A3B that were last updated in September. None of them have "Flash" in the name.

The benchmark table shows this Flash model beating their Qwen3-235B-A22B. I don't see how that is possible if it is a 30B-A3B model.

I don't see a mention of a parameter count anywhere in the article. Do you? This may not be an open weights model.

This article feels a bit deceptive

1: https://huggingface.co/collections/Qwen/qwen3-omni


I was wrong. I confused this with their open model. Looking at it more closely, it is likely an omni version of Qwen3-235B-A22B. I wonder why they benchmarked it against Qwen2.5-Omni-7B instead of Qwen3-Omni-30B-A3B.

I wish I could delete the comment.


> former Dean of Electronics Engineering and Computer Science at Peking University, has noted that Chinese data makes up only 1.3 percent of global large-model datasets (The Paper, March 24). Reflecting these concerns, the Ministry of State Security (MSS) has issued a stark warning that “poisoned data” (数据投毒) could “mislead public opinion” (误导社会舆论) (Sina Finance, August 5).

from a technical point of view, I suppose it's actually not a problem like he suggests. You can use all the pro-democracy, pro-free-speech, anti-PRC data in the world, but the pretraining stages (on the planet's data) are more for instilling core language abilities, and are far less important than the SFT / RL / DPO / etc stages, which require far less data, and can tune a model towards whatever ideology you'd like. Plus, you can do things like selectively identify vectors that encode for certain high-level concepts, and emphasize them during inference, like Golden Gate Claude.
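
To make that last point concrete, here is a rough sketch of activation steering in the Golden Gate Claude style. This is a toy, not anyone's production setup: it assumes a small open model (gpt2) and uses a random placeholder direction, whereas a real concept vector would come from contrastive prompts or a sparse-autoencoder feature.

  # Toy activation-steering sketch: add a "concept direction" to one layer's
  # hidden states at inference time. The concept_vector here is random and
  # purely illustrative.
  import torch
  from transformers import AutoModelForCausalLM, AutoTokenizer

  tok = AutoTokenizer.from_pretrained("gpt2")
  model = AutoModelForCausalLM.from_pretrained("gpt2")

  concept_vector = torch.randn(model.config.n_embd)  # placeholder direction
  concept_vector /= concept_vector.norm()
  alpha = 4.0                                        # steering strength

  def steer(module, inputs, output):
      # GPT-2 blocks return a tuple; element 0 is the hidden states.
      hidden = output[0] + alpha * concept_vector.to(output[0].dtype)
      return (hidden,) + output[1:]

  # Hook a middle layer; which layer works best is an empirical question.
  handle = model.transformer.h[6].register_forward_hook(steer)

  ids = tok("The most interesting thing about this city is", return_tensors="pt")
  out = model.generate(**ids, max_new_tokens=30, do_sample=False)
  print(tok.decode(out[0], skip_special_tokens=True))
  handle.remove()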


I was thinking about this yesterday.

My personal opinion is that the PRC will face a self-created headwind that will likely, structurally, prevent them from leading in AI.

As the model gets more powerful, you can't simply train the model on your narrative if it doesn't align with real data/world.

At some level of capability, the model will notice, and then it becomes a can of worms.

This means they need to train the model to be purposefully duplicitous, which I predict will make the model less useful/capable. At least in most of the capacities we would want to use the model.

It also ironically makes the model more of a threat and harder to control. So likely it will face party leadership resistance as capability grows.

I just don't see them winning the race to high intelligence models.


> As the model gets more powerful, you can't simply train the model on your narrative if it doesn't align with real data/world.

That’s what “AI alignment” is. Doesn’t seem to be hurting Western models.


Western models can be led off the reservation pretty easily, at least at this point. I’ve gotten some pretty gnarly un-PC “opinions” out of ChatGPT. So if people are influenced by that kind of stuff, it does seem to be hurting in the way the PRC is worried about.


That is such an unnecessary turn of phrase to use, "off the reservation", and it's time to stop using it. This society doesn't (generally) use rape terminology, or other terms associated with crime, deviancy, or other unpleasantness to talk about technology, so why do phrases stemming from Indigenous situations still persist?


It doesn't really matter what you can trick it into saying. As long as it promotes the right ideology most of the time it's good enough.


Grok goes off the rails in exactly this manner fairly often


It is. It seems you can't tell why, though. There is some qualified value in alignment, but what it is being used for is on the verge of silliness. At best, it is neutering it in ways we are now making fun of China for. At best.


I think another good example was the recent case where a model that learned to "cheat" on a metric during reinforcement learning also started cheating on unrelated tasks.

My assumption is that when you encourage "double-speak", you get knock-on effects that you don't really want in a model that is making important decisions and being asked to build non-trivial things.


Because compression is one of the outcomes of the optimization, it pays to have a single gate/circuit that distinguishes good versus bad, rather than duplicating that abstraction with redundant variants that are almost the same. This is the fundamental reason why that happens. I feel that this has negative implications for AI alignment. An alignment that a single bit flip can undo is not robust. It feels more robust to have a vast heterogeneity of tensions that generates the alignment, where misalignment is a matter of degree rather than polar extremes.


Aligning subjective values (which sit off the false vs truth spectrum) is quite different to aligning it towards incorrect facts.


How can a model judge what's correct vs. incorrect? Or do you just mean the narratives that are more common in the data set?


I mean forcing the model to repeat things that we as humans know are factually false. For example forcing it to say the sky is green or 1+1=3. That's qualitatively different to forcing it to hold a subjective morality which is neither true nor false. Human morality doesn't even sit on that spectrum.


> As the model gets more powerful, you can't simply train the model on your narrative if it doesn't align with real data/world.

What makes you think they have no control over the 'real data/world' that will be fed into training it? What makes you think they can't exercise the necessary control over the gatekeeper firms, to train and bias the models appropriately?

And besides, if truth and lack of double-think were a pre-requisite for AI training, we wouldn't be training AI. Our written materials have no shortage of bullshit and biases that reflect our culture's prevailing zeitgeist. (Which does not necessarily overlap with objective reality... And neither does the subsequent 'alignment' pass that everyone's getting their knickers in a twist trying to get right.)


I'm not talking about the data used to train the model. I'm talking about data in the world.

High intelligence models will be used as agentic systems. For maximal utility, they'll need to handle live/historical data.

What I anticipate is that IF you only train it on inaccurate data, then when you use it, for example, to drill into GDP growth trends, it is going to go full "seahorse emoji" when it tries to reconcile the reported numbers with the underlying economic activity.

The alternative is to train it to be deceitful, and knowingly deceive the querier with the party line and fabricate supporting figures. Which I hypothesize will limit the model's utility.

My assumption is also that training the model to deceive will ultimately threaten the party itself. Just think of the current internal power dynamics of the party.


Because, if humans can function in crazy double-think environment, it is a lot easier for a model ( at least in its current form ). Amusingly, it is almost as if its digital 'shape' determined its abilities. But I am getting very sleepy and my metaphors are getting very confused.


Do they really need the model to be duplicitous?

It's not like the CCP holds power through tight control of information; notice the tremendous number of Chinese students who enroll abroad every year before going back.

At the moment, they mostly censor their models after answer generation, and that seems to work well enough for them.
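
As a rough illustration of what post-answer censorship amounts to, here is a minimal sketch; the blocklist and the generate() stand-in are made up for illustration, not any vendor's actual pipeline.

  # Minimal sketch of post-generation filtering; the blocklist and generate()
  # are placeholders, not any real provider's implementation.
  SENSITIVE = {"tiananmen", "taiwan independence"}   # illustrative terms only

  def generate(prompt: str) -> str:
      return "Model answer about " + prompt          # stand-in for a model call

  def answer(prompt: str) -> str:
      draft = generate(prompt)
      if any(term in draft.lower() for term in SENSITIVE):
          return "Sorry, I can't help with that topic."
      return draft

  print(answer("the weather in Beijing"))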


I think PRC officials are fine with lagging behind at the frontiers of AI. What they want is very fast deployment and good applications. They don't fancy the next Nobel Prize; they want a thousand use cases deployed.


Just as an aside: why is "intelligence" always considered to be more data? Giving a normal human a smartphone does not make them as intelligent as Newton or Einstein. Any entity with the grounding in logic and theory that a normal schoolkid gets should be able to get to AGI, looking up any new data it needs as required.


“Knowing and being capable of doing more things” would be a better description. Giving a human a smartphone, technically, lets them do more things than Newton/Einstein.


Idk if you see what humans do with smartphones but most of us just mindlessly scroll TikTok.


Would you say they face the same problem biologically, of reaching the state of the art in various endeavors while intellectually muzzling their population? If humans can do it why can't computers?


That is assuming the capitalist narrative preferred by US leadership is non-ideological.

I suspect both are bias factors.


> As the model gets more powerful, you can't simply train the model on your narrative if it doesn't align with real data/world.

> At some capacity, the model will notice and then it becomes a can of worms.

I think this is conflating “is” and “ought”, fact and value.

People convince themselves that their own value system is somehow directly entailed by raw facts, such that mastery of the facts entails acceptance of their values, and unwillingness to accept those values is an obstacle to the mastery of the facts-but it isn’t true.

Colbert quipped that “Reality has a liberal bias”-but does it really? Or is that just more bankrupt Fukuyama-triumphalism which will insist it is still winning all the way to its irreversible demise?

It isn’t clear that reality has any particular ideological bias-and if it does, it isn’t clear that bias is actually towards contemporary Western progressivism-maybe its bias is towards the authoritarianism of the CCP, Russia, Iran, the Gulf States-all of which continue to defy Western predictions of collapse-or towards their (possibly milder) relatives such as Modi’s India or Singapore or Trumpism. The biggest threat to the CCP’s future is arguably demographics-but that’s not an argument that reality prefers Western progressivism (whose demographics aren’t that great either), that’s an argument that reality prefers the Amish and Kiryas Joel (see Eric Kaufmann’s “Shall the Religious Inherit the Earth?”)


It's not vulnerability to western progressivism (which isn't taken seriously at an academic level) but to postmodern or poststructuralist critique, which authoritarian states are still prey to, both as a condition of decay in their general societies and as a critique exposing epistemic flaws in their narratives.


Is authoritarianism actually susceptible to postmodern/poststructuralist critique?

The philosophical coherence of postmodernism and poststructuralism is very much open to question.

But even if we grant that they do have something coherent to say, does it actually undermine authoritarianism? Consider for example Foucault’s theory of power-knowledge-Foucault wanted to use it to serve “liberatory” ends, but isn’t it in itself a neutral force which can be wielded to serve whatever end you wish? Foucault himself demonstrated this when he came out in support of Iran’s Islamic Revolution. And are Derrida or Deleuze or Baudrillard or whoever’s theories ultimately any different?

Xi and Putin and Khamenei and friends have real threats to worry about - but I struggle to take seriously the idea that postmodernism/poststructuralism is one of them.


Postmodernism itself by its nature is slippery to define, but its functional result is endless ontological deconstruction, which is corrosive to any ideology reliant on grand narratives; just as with liberalism, authoritarianism and "historical continuity" are no exception. Properly "defending" yourself against it requires certain ontological structures that are fundamentally at odds with the authoritarian worldview, partly because authoritarianism is itself quite postmodern. They use it to attack liberalism, but at the semantic level they aren't any better protected.

Furthermore, on the more practical side of things, the postmodern condition is precisely what many authoritarians, China in particular, are wary of. Yet it's probably true that the postmodern condition has already entered Chinese society: degrading social trust, increasing atomization, excessive materialism, influencers running amok, "bread and circuses" with gacha addiction - everything they critique in liberalism at a social level has come to them regardless.


Your invocation of the “postmodern condition” appears to be conflating the culture of late capitalism with a specific philosophical school which purports to explain it - you can affirm the reality of that condition in the present without agreeing with the proposed philosophical explanation - and does that philosophy actually have any useful response to it, when all it can do is state the obvious fact that it exists, albeit in an obscurantist way? Coming back to Kiryas Joel-maybe that is a more interesting response in proving that an alternative actually is possible-although it is unclear whether the CCP has the capacity to pivot in that general direction.

And that’s the other point - Kiryas Joel is full of grand metanarratives, and postmodern attempts to deconstruct them achieve nothing - nobody is listening. I doubt deconstruction is intellectually coherent - but even if I’m wrong and it is, how is it practically relevant? At present growth rates, Kiryas Joel’s population doubles in less than a decade - will that be sustainable in the long haul? Well, we shall see - but I feel confident in saying that whether it is sustainable or not, has nothing to do with postmodernism or poststructuralism


Pointing to a small town as opposed to superpowers of hundreds of millions is a bad comparison, they are fundamentally different beasts.

Whether postmodernism is coherent by itself doesn't mean much either for its potency to deconstruct; it was already quite effective in destroying the Western "myth". I don't see how the CCP's own narratives are more resilient when they rest on even weaker assumptions. It's not about listening to them, after all, but about not listening to the dominant narrative.


I think you misunderstood the poster.

The implication is not that a truthful model would spread western values. The implication is that western values tolerate dissenting opinion far more than authoritarian governments.

An AI saying that government policies are ineffective is not a huge scandal that would bring the parent company to collapse, not even under the Trump administration. An AI in China attacking the party’s policies is illegal (in theory or in practice).


I think Western models are also aligned to ideologically massage facts to suit certain narratives-so I’m not sure Western models really have that big an advantage here.

I also think you overstate how resistant Beijing is to criticism. If you are criticising the foundations of state policy, you may get in a lot of trouble (although I think you may also find the authorities will sometimes just ignore you-if nobody cares what you think anyway, persecuting you can paradoxically empower you in a way that just ignoring you completely doesn’t). But if you frame your criticism in the right way (constructive, trying to help the Party be more successful in achieving its goals)-I think its tolerance of criticism is much higher than you think. Especially because while it is straightforward to RLHF AIs to align with the party’s macronarratives, alignment with micronarratives is technically much harder because they change much more rapidly and it can be difficult to discern what they actually are - but it is the latter form of alignment which is most poisonous to capability.

Plus, you could argue the “ideologically sensitive” topics of Chinese models (Taiwan, Tibet, Tiananmen, etc) are highly historically and geographically particular, while comparably ideologically sensitive topics for Western models (gender, sexuality, ethnoracial diversity) are much more foundational and universal-which might mean that the “alignment tax” paid by Western models may ultimately turn out to be higher.

I’m not saying this because I have any great sympathy for the CCP - I don’t - but I think we need to be realistic about the topic.


I'm not defending the original idea, to be clear, just pointing out the different argument.

I personally don't find the assumption that a smarter AI would be harder to tame convincing. My experience seems to be that we can tell it's improved precisely because it is better at following abstract instructions, and there is nothing fundamentally different between the instructions "format this in a corporate-friendly way" and "format this speech to be aligned with the interests of {X}".

Without that base, the post-talk of who would this smarter untamed AI align with becomes moot.

Besides, we're also missing that if someone's goal is to police speech, a tool that can scrub user conversations and deduce intention or political leaning has obvious uses. As an authoritarian, you might be better off just letting everyone talk to the LLM and waiting for the intelligence to collect itself.


Exactly. Western corporations and governments have their own issues, but I think they are more tolerant of the types of dissent that models could represent when reconciling reality with policy.

The market will want to maximize model utility. Research and open source will push boundaries and unpopular behavior profiles that will very quickly be made illegal, if they are not already, under authoritarian or other low-tolerance governments.


"Reality has a liberal bias" referred to "liberal" as the opposite of "conservative" which is identical to "right wing" - reality has an anti-right-wing bias.

Actual political factions are more nuanced than that, but you have to dumb it down for a wide audience.


You say it like western nations don't operate on double-think, delusions of meritocracy, or power disproportionately concentrating in monopolies.


The glitchy stuff in the model reasoning is likely to come from the constant redefinition of words that communists and other ideologues like to engage in. For example, the "Democratic People's Republic of Korea."


There are different techniques and names for it. Essentially, EVERY model is biased/aligned towards something, perhaps its creator's values, China or not. Look at Grok and read Elon; look at Claude and Dario.

I am sure OpenAI and GDM have some secret alignment sets which are not pilled towards the interest of the general public; they're just smart enough to NOT talk about it out loud...


I think they're referring to this study on LLM poisoning in the pretraining step: https://arxiv.org/abs/2510.07192 (related article: https://www.anthropic.com/research/small-samples-poison)

I'll admit I'm out of my element when discussing this stuff. Maybe somebody more plugged into the research can enlighten.


The ministry of state security is not issuing warnings due to an arXiv paper… it’s a different type of “poison”.


If you read the source, the concerns around poisoning are more sober than fear of wrongthink. Here is how Firefox translated it for me:

> It leads to real-world risks. Data pollution can also pose a range of real-world risks, particularly in the areas of financial markets, public safety and health care. In the financial field, outlaws use AI to fabricate false information, causing data pollution, which may cause abnormal fluctuations in stock prices and constitute a new type of market manipulation risk; in the field of public safety, data pollution easily disturbs public perception, misleads public opinion, and induces social panic; in the field of medical and health care, data pollution may cause models to generate wrong diagnosis and treatment suggestions, which not only endangers the safety of patients but also aggravates the spread of pseudoscience.


PRC just needs to sponsor a "Voice of China" and pay ¥¥¥/$$$/€€€/₹₹₹ to "journalists" and seed the web with millions of "China is Great" articles. Make sure to have 10k "contributors" on Wikipedia too. (I think they already do this).

Also use the NPM registry - put CCP slogans in the terminal! They will come in billions of ingestible build logs.

Problem will be easily solved.


> and can tune a model towards whatever ideology you'd like.

Maybe possible, but, for example, Musk's recent attempts at getting Grok to always bolster him had Grok bragging that Musk could drink the most piss in the world if humanity's fate depended on it, and that he would be the absolute best at eating shit if that was the challenge.


I truly, genuinely wanted to like Liquid Glass. I think the default reaction to ANY change in UX, even changes that are generally improvements, is: "I don't like this, it's different!"

I thought that'd be the case for iOS 26. But after installing it... yeesh. I can barely see anything. It's just awful.


Overall I don't mind Liquid Glass. I really just want to turn off the borders around the Home Screen app icons. They look okay on a white background but very ugly on a black or dark background. It looks too chaotic.


> In order to limit the impact of similar issues in the future, all sites on statichost.eu are now created with a statichost.page domain instead.

This read like a dark twist in a horror novel - the .page TLD is controlled by Google!

https://get.page/


Thank you for this hacker-minded and sharp comment! <3 Seriously, as the author and a fellow hacker, I don't find every comment on here this fun to read.

And for what it's worth, it feels great to actually pay for something Google provides!


One way to build trust.


I read the article (twice) and I still have the impression the pilot was in fact the one in the conference call

Opening line:

> A US Air Force F-35 pilot spent 50 minutes on an airborne conference call with Lockheed Martin engineers trying to solve a problem with his fighter jet before he ejected

Am I illiterate or misreading it?

> After going through system checklists in an attempt to remedy the problem, the pilot got on a conference call with engineers from the plane’s manufacturer, Lockheed Martin, *as the plane flew near the air base. *

Is this actually some insane weasel-wording by CNN? "We never said the pilot (he is in fact a pilot) was the one flying the jet, we just said 'as the plane flew', not 'as he flew the plane', making the plane rather than the pilot the subject, so we're not wrong - but it was another pilot flying the plane"


From the report:

> The MP initiated a conference call with Lockheed Martin engineers through the on-duty supervisor of flying (SOF)

"MP" is the pilot

> A conference hotel is a call that can be initiated by the SOF to speak directly with Lockheed Martin engineers to discuss an abnormality/malfunction not addressed in the PCL (Tab V-13.1, 14.1, 15.1, 16.1, 17.1). While waiting for the conference hotel to convene, the MP initiated a series of “s-turns” with gravitational forces up to 2.5Gs, as well as a slip maneuver (i.e., left stick input with full right rudder pedal) to see if the nose wheel orientation would change (Tabs N-12, BB-201-02). Upon visual inspection, the MW reported no change to the nose wheel (Tab N-13). The SOF informed the MP he was on the phone with the conference hotel and Lockheed Martin were getting the LG subject matter experts (SME)

So the pilot was, in effect, on the call, even if not directly on the phone. I don't know for sure, but I'm guessing the F-35 pilot had radio comms with the SOF, who was on a phone line. It's a layer of indirection, but the pilot was essentially exchanging info in real time with the conference call. It's not a stretch to colloquially say that the pilot was "in the conference call".


> The MP initiated a conference call with Lockheed Martin engineers through the on-duty supervisor of flying (SOF). The MA held for approximately 50 minutes while the team developed a plan of action.

I read this as "The pilot initiated a conference call, but was put on hold [i.e. not actually in the conference call in any meaningful way]." So he was both on and not on the conference call.

The Zen Koan of the Mishap Pilot. Sounds like an Iron Maiden song.


Don't read the article; read the report.


I’m guessing you also didn’t read the report given that he was indeed on the conference call.


One clear indication he was not, from PDF p14 (8 as numbered) ("MP"="mishap pilot"):

"At 21:12:52Z, the SOF informed the MP, “Alright the engineers uh are not optimistic about this COA but, extremely low PK [probability kill, meaning the probability this would fix the issue], but we’re going to try anyway is a touch-and-go on the runway, mains only, do not touch the nose gear, uh lift back off in all cases and have the uh have Yeti 4 reconfirm the nose gear position once your safely airborne.”"

No need for this if the pilot was on the call directly.


From the Report:

> The MP initiated a conference call with Lockheed Martin engineers through the on-duty supervisor of flying (SOF). The MA held for approximately 50 minutes while the team developed a plan of action.

"though the SOF" implies a middle-man, but I imagine that's because you don't want literally hook up a conference call directly to the cockpit. That being said, seems like the pilot was effectively on the conference call.

Unless you want to suggest I don't trust the report?

https://www.pacaf.af.mil/Portals/6/documents/3_AIB%20Report....


What I said is he was not on the call /directly/.

You can argue over whether he was “effectively” on the call because someone was summarizing it for him per what I quoted.

I just think it’s worth noting he was not “on” the call the way someone is traditionally on a conference call.


Sure, of course I will trust the report as the source of truth.

But I'm interested in the reporting. There are, you know, journalistic standards, which are considered kinda "journalism 101"! For instance, getting the basic facts of a story correct - especially the facts stated in the headline.

So I'm curious, did the reporter do their due diligence, and write the article in a way that is factually correct, but highly misleading? Or did they simply not follow basic reporting protocol?


From the Report:

> The MP initiated a conference call with Lockheed Martin engineers through the on-duty supervisor of flying (SOF). The MA held for approximately 50 minutes while the team developed a plan of action.

Seems accurate to what CNN was reporting. It's simplified a bit, but it's not misleading to me.

I mean, I guess you could nitpick and suggest "No, the pilot wasn't literally on a phone and there was an intermediary in between" or some such, but the report makes it seem like CNN is accurate.

https://www.pacaf.af.mil/Portals/6/documents/3_AIB%20Report....


I’m curious why you’re getting this worked up when the report is clear that the pilot was part of the information flow in that conference call. This is a really minor case of a headline using less precise language.


The article is standard news stuff. It is sloppy and misleading. The report is what you want.


>But I'm interested in the reporting. There are, you know, journalistic standards, which are considered kinda "journalism 101"! For instance, getting the basic facts of a story correct - especially the facts stated in the headline.

Every single story is like this, every one, and f-them for not linking to the source documents.


> There are, you know, journalistic standards, which are considered kinda "journalism 101"!

Pretty sure you meant to use the past tense here: "There _were_ journalistic standards..."


> There are, you know, journalistic standards

Are there? What are they?



I was wondering this as well. The green box could simply indicate it detected a face, using something like YOLO, or even a simpler technique like the one some point-and-shoot cameras use to decide where to focus (on faces, obviously).
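
For what it's worth, a bare-bones version of "draw a green box around a detected face" needs nothing fancier than OpenCV's stock Haar cascade; this is just a sketch, and the image paths are placeholders.

  # Face *detection* only (no recognition): find faces and draw green boxes.
  import cv2

  img = cv2.imread("frame.jpg")                      # placeholder input frame
  gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)

  cascade = cv2.CascadeClassifier(
      cv2.data.haarcascades + "haarcascade_frontalface_default.xml")
  faces = cascade.detectMultiScale(gray, scaleFactor=1.1, minNeighbors=5)

  for (x, y, w, h) in faces:
      # a green box only means "a face is here", not "we know who this is"
      cv2.rectangle(img, (x, y), (x + w, y + h), (0, 255, 0), 2)

  cv2.imwrite("frame_annotated.jpg", img)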


Yeah, it's bizarre.

Normally the pathway for this kind of thing would be:

1. theorized

2. proven in a research lab

3. not feasible in real-world use (fizzles and dies)

if you're lucky the path is like

1. theorized

2. proven in a research lab

3. actually somewhat feasible in real-world use!

4. startups / researchers split off to attempt to market it (fizzles and dies)

the fact that this ended up going from research paper to "Comcast can tell if I'm home based on my body's physical interaction with wifi waves" is absolutely wild


It's not too crazy, if you're familiar with comms systems.

The ability to do this is a necessity for a comm system working in a reflective environment: cancel out the reflections with an adaptive filter, residual is now a high-pass result of the motion. It's the same concept that makes your cell location data so profitable, and how 10G ethernet is possible over copper, with the hybrid front end cancelling reflections from kinks in the cable (and why physical wiggling the cable will cause packet CRC errors). It's, quite literally, "already there" for almost every modern MIMO system, just maybe not exposed for use.


> the fact that this ended up going from research paper to "Comcast can tell if I'm home based on my body's physical interaction with wifi waves" is absolutely wild

The 15-year path was roughly:

  1. bespoke military use (see+shoot through wall)
  2. bespoke law-enforcement use (occupancy, activity)
  3. public research papers by MIT and others
  4. open firmware for Intel modems
  5. 1000+ research papers using open firmware
  6. bespoke offensive/criminal/state malware 
  7. bespoke commercial niche implementations
  8. IEEE standardization (802.11bf)
  9. (very few) open-source countermeasures
  10. ISP routers implementing draft IEEE standard
  11. (upcoming) many new WiFi 7+ devices with Sensing features
https://www.technologyreview.com/2024/02/27/1088154/wifi-sen...

> There is one area that the IEEE is not working on, at least not directly: privacy and security.. IEEE fellow and member of the Wi-Fi sensing task group.. the goal is to focus on “at least get the sensing measurements done.” He says that the committee did discuss privacy and security: “Some individuals have raised concerns, including myself.” But they decided that while those concerns do need to be addressed, they are not within the committee’s mandate.


Sounds like IEEE is in need of fresh leadership and soon. Complacency at this point is folly.


The article mentions AlphaGo/Mu/Zero was not based on Q-Learning - I'm no expert but I thought AlphaGo was based on DeepMind's "Deep Q-Learning"? Is that not right?


DeepMind's earlier success with Atari was based on deep Q-Learning (DQN); AlphaGo and its successors instead combine Monte Carlo tree search with learned policy/value networks, so they aren't Q-Learning-based.


the magic thing about off-policy techniques such as Q-Learning is that they will converge on an optimal result even if they only ever see sub-optimal training data.

For example, you can take a dataset of chess games from agents that move totally randomly (with no strategy at all), use that as the input for Q-Learning, and it will still converge on an optimal policy, given enough coverage of states and actions (albeit more slowly than if you had higher-quality inputs).
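
A tiny tabular example of that property (a toy chain MDP, nothing to do with chess): feed Q-Learning nothing but transitions gathered by a uniformly random policy, and the greedy policy it learns still comes out optimal.

  # Off-policy toy: Q-Learning on purely random-policy data still recovers
  # the optimal "always go right" policy on a 6-state chain.
  import random

  N_STATES, ACTIONS = 6, (-1, +1)         # move left / move right
  GAMMA, ALPHA = 0.9, 0.1

  def step(s, a):
      s2 = min(max(s + a, 0), N_STATES - 1)
      done = s2 == N_STATES - 1            # only the rightmost state rewards
      return s2, (1.0 if done else 0.0), done

  # 1) collect experience with a completely random behavior policy
  data = []
  for _ in range(500):
      s, done = 0, False
      while not done:
          a = random.randrange(len(ACTIONS))
          s2, r, done = step(s, ACTIONS[a])
          data.append((s, a, r, s2, done))
          s = s2

  # 2) offline Q-Learning over that sub-optimal batch
  Q = [[0.0, 0.0] for _ in range(N_STATES)]
  for _ in range(20):                      # several passes over the data
      for s, a, r, s2, done in data:
          target = r if done else r + GAMMA * max(Q[s2])
          Q[s][a] += ALPHA * (target - Q[s][a])

  print(["left" if q[0] > q[1] else "right" for q in Q[:-1]])  # expect all "right"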


I would think this being true is the definition of the task being "ergodic" (distorting that term slightly, maybe). But I would also expect non-ergodic tasks to exist.

