Are all the inputs (from buoys, weather balloons, stations, etc) from decades of history stored, as well as past daily forecasts of existing weather models, so that this AI algorithm (and any future new ones) can be run across historic data and compared in terms of performance to existing methods?
Is there a big Clearinghouse for this data?
Kind of like how fintech algos can be run against historic stock market data to evaluate them.
Many commercial services and even third-party governments rely on the data from NOAA and its archives. On top of that, the EU runs its own EUMETSAT fleet, plus there are a ton of national services - unfortunately, the result is that there's a looooot of datasets.
NOAA's dataset is public domain [1]; EUMETSAT only requires attribution for most of its data [2]. On top of that you've got the EU's Climate Data Store [3], ECMWF [4], and ECA&D [5].
The service that many private weather services provide is to aggregate and weigh all of the publicly available datasets, and some also add in data from their own ground stations, commercially licensed "realtime" data from governmental services, and their own models as well.
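As a crude illustration of what "aggregate and weigh" could mean in practice, here's a sketch using inverse-variance weighting; the source names, values, and error variances are all invented, and real providers use far more sophisticated blending:

```python
# Combine several forecasts of the same quantity by inverse-variance weighting,
# assuming each source's historical error variance is known. Numbers are made up.
forecasts = {"gfs": 21.0, "ecmwf": 20.2, "icon": 22.1}   # degC for tomorrow
error_var = {"gfs": 2.0, "ecmwf": 1.0, "icon": 2.5}      # past error variance per source

weights = {k: 1.0 / error_var[k] for k in forecasts}
total = sum(weights.values())
blended = sum(forecasts[k] * weights[k] for k in forecasts) / total
print(f"blended forecast: {blended:.1f} degC")
```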
The interesting question is what DOGE will do regarding NOAA - it is increasingly possible that NOAA will shut down, either to be turned into a pay-for-play model, replaced by private services, or just carelessly dropped in its entirety.
NOAA marine forecasts and FAA integration are critical infrastructure for marine transportation and aviation. If they shut that down it will result in direct losses much larger than NOAA's budget, and thousands of lives lost through disasters and workplace accidents.
I sure hope those “smart” people are capable of understanding that.
> The service that many private weather services provide is to aggregate and weigh all of the publicly available datasets
Yes, and more people need to understand this. Too many people seem to think that commercial services won’t be impacted if NOAA stops doing what they do.
The World Meteorological Organization has data that national weather services exchange to make forecasts. Apart from that, it's on a country-by-country basis.
How does Windy do it? Did they just scrape all the sites or APIs? It would be great to have a global dump every minute from all the countries. Or have it available in a radio broadcast.
> Are all the inputs (from buoys, weather balloons, stations, etc) from decades of history stored
I was also thinking about smartphones. They have barometric data, and while it might vary from phone to phone, I'm sure something like a Kalman filter + historic data could do something there.
Think about gathering all the data from "stationary" phones, correlating that with weather satellite data and with real "ground truth" weather stations, and then going back 30-60 minutes / a day, and seeing what comes out.
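As an illustration of the filtering idea, here's a minimal 1D Kalman filter smoothing a single phone's (fake) barometer stream; the noise levels and the constant-pressure state model are assumptions for the sketch, and the harder part of correlating many phones against satellites and stations is left out:

```python
# Minimal 1D Kalman filter smoothing a noisy phone barometer stream.
# All numbers (noise levels, readings) are made up for illustration.
import numpy as np

def kalman_smooth(readings_hpa, process_var=0.01, sensor_var=1.0):
    """Smooth noisy pressure readings; returns the filtered estimates."""
    estimate = readings_hpa[0]   # initial state: first raw reading
    variance = sensor_var        # initial uncertainty
    out = []
    for z in readings_hpa:
        # Predict: pressure drifts slowly, so just grow the uncertainty a bit.
        variance += process_var
        # Update: blend prediction with the new reading via the Kalman gain.
        gain = variance / (variance + sensor_var)
        estimate += gain * (z - estimate)
        variance *= (1 - gain)
        out.append(estimate)
    return np.array(out)

# Fake stream: true pressure 1013 hPa plus per-phone sensor noise.
raw = 1013.0 + np.random.normal(0, 1.0, size=60)
print(kalman_smooth(raw)[-5:])
```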
Not really; it was a gimmick. They used standard forecast post-processing techniques to bias correct global/regional weather models. There is virtually no evidence they actually used device data in this process.
Depends on the smartphone (and smartwatch). Not all have it, and at times it has disappeared from brands that had it earlier.
The smartwatch series I use explicitly includes it because it's essentially the "all in one" version in a series that had smartwatches designed for aircraft use (including military versions where they serve as a backup cabin pressure warning, apparently).
Smartwatches include barometers because they enable elevation measurement far more accurate than GNSS (GPS and GPS-alikes tend to be noticeably less accurate in the vertical than in the horizontal). In particular when both are combined: barometry for high-frequency changes (going up a hill), continuously calibrated against a long-term average of the GNSS elevation to filter out low-frequency changes in barometry (weather influence).
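A rough sketch of that combination, as a complementary filter; the constants and the fake data are invented, and actual watches use their own proprietary fusion:

```python
def fuse_altitudes(baro_alts_m, gnss_alts_m, alpha=0.98):
    """Complementary filter over paired altitude samples.

    alpha near 1: trust barometric *changes* short-term, while the GNSS
    altitude slowly corrects the absolute level (removing weather drift).
    """
    fused = [gnss_alts_m[0]]                               # start at the GNSS absolute level
    for i in range(1, len(baro_alts_m)):
        baro_delta = baro_alts_m[i] - baro_alts_m[i - 1]   # fast, low-noise change
        estimate = alpha * (fused[-1] + baro_delta) + (1 - alpha) * gnss_alts_m[i]
        fused.append(estimate)
    return fused

# Fake walk up a hill: barometer sees the climb plus slow weather drift,
# GNSS sees only the true climb (noiseless here for simplicity).
baro = [100.0 + 0.5 * i + 0.02 * i for i in range(50)]
gnss = [100.0 + 0.5 * i for i in range(50)]
print(fuse_altitudes(baro, gnss)[-1])
```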
But back to crowdsourced device weather observation: this would require continuous GNSS, and devices don't do that (network location would not be good enough).
The short answer is, "no." There are some projects like the "NNJA-AI" project at Brightband[1] which is attempting to create such a clearing house in order to focus research efforts across the community.
It would also be nice to have a historical store of the weather predictions and not just the instantaneous parameters. But I'm not aware of such a record, perhaps because weathermen don't want a record of their (mis)predictions...
There's climate reanalysis, which combines historical observations with weather models to get clean data of past weather conditions, which is then used by researchers for various purposes. Most notable is ERA5 by ECMWF.
What data is used to train this AI? The article doesn't say anything about that (though I have to admit I didn't read it super carefully). My first thought would be exactly all this historical data, but then you can't use that same data to test the AI's performance. Are different subsets of the available historical data used for training vs testing?
No, they say end-to-end, meaning they use raw observations. Most or all other medium-range models start with ERA5.
There's a paper from Norway that tried end-to-end, but their results were not spectacular. That's the aim of many though, including ECMWF. Note that ECMWF already has their AIFS in production, so AI weather prediction is pretty mainstream nowadays.
Google has a local nowcast model that uses raw observations, in production, but that's a different genre of forecasting than the medium-range models of Aardvark.
> Google has a local nowcast model that uses raw observations, in production, but that's a different genre of forecasting than the medium-range models of Aardvark.
It's very clear from the MetNet announcement blog[1] that they require HRRR or other NWP output at runtime.
Is there an equivalent modelling approach for earthquake prediction? A widely shared data repository for it, along the same lines as the top comment, would work.
Getting high-quality, harmonized data at global scale can still be a challenge, especially when it comes to observational coverage in developing regions
Sniff test: the ASTM E230 standard tolerance for the venerable Type K thermocouple is +/- 2.2°C or +/- 0.75%, whichever is greater.
Expectations are in need of recalibration if anyone thinks a single number is going to meaningfully achieve that level of accuracy across a volume representing any metro area at any given point in time, let alone two days out.
That is the accuracy of one sensor. The law of large numbers comes into play when using forecasts such as this, since across any reasonably sized metro area you will have hundreds if not thousands or even tens of thousands of weather stations, across which you can average to bring down error thresholds.
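A back-of-envelope illustration of that averaging effect, treating the +/- 2.2°C tolerance (simplistically) as the random 1-sigma error of each independent station; note this only shrinks random error, not any shared bias:

```python
import math

sensor_sigma_c = 2.2          # the +/- 2.2 degC tolerance, treated as 1-sigma random noise
for n in (1, 100, 10_000):
    print(f"{n:>6} stations -> standard error of the mean ~ "
          f"{sensor_sigma_c / math.sqrt(n):.3f} degC")
```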
I think you misunderstood the assignment. The temp predictions in my area wildly swing around, until the day of, and even then I have to adjust for my actual location. And I'm not in Kansas City. The only place that has ~3° accuracy is San Diego. And maybe Antarctica.
a thermometer will tell me if it's going to freeze in 2 days?
Here's the scenario. I have started a bunch of seeds inside my house. I am waiting for the "last frost" as per the instructions on the seed packets. Now, how do I use a thermometer to tell if the last frost has passed? You need prediction models that can be accurate out at least a few days with temperature, and a 10% error "at freezing" means my plants either live or die, based on that error.
There's no counterargument here. "Oh, just cover the new plants" or "just wait longer" don't work, especially with larger gardens or "farms" or "homesteads". Most of us home-gamers just use environmental clues - number of bugs, buds on dogwood or pecan trees (native ones), when other trees flower. But this year I got a lesson that pecan trees get it wrong, too. They are in the process of leafing out, and there was an unpredicted, non-forecast cold snap that took the temperature so close to 32.0F (0.00006C) that I think any plant not equipped for that would have died that night. Now, now I'm fairly certain there's no more frost chance, but it's just a guess.
Now imagine I lived anywhere outside of the subtropics, like Nebraska or Montana, and needed to plant food for livestock or whatever.
Forecasting isn't fine-grained enough to get your backyard temp within 2 degrees. What you're asking for is silly. The temp across your city is going to vary by more than 2 degrees every single day.
You're still not listening. Once more, the forecast can't tell me with any certainty whether or not there is going to be a frost, specifically in my area, and apparently in Kansas City as well.
What I am asking for is what is promised by weather forecasters and the models. If it can't say with any certainty whether it's going to freeze, it's completely worthless for this common circumstance. Like most people, I don't care if the forecast is 75F and it's actually 70F or 80F (or 68F), but what I do care about is a forecast for lows of 50F and it ends up being 32.15F. If you were a roofer and you were off by 18 degrees, you'd still be in prison.
You're not listening either. Of course it can't. It's not fine grained enough to tell you what's happening in your back yard. Your back yard could be 30 degrees and 2 miles away it could be 35. You're asking for something impossible.
> but what I do care about is a forecast for lows of 50F and it ends up being 32.15F. If you were a roofer and you were off by 18 degrees, you'd still be in prison.
Well, take the actual weather observations for the past year. Take the actual weather observations for the last week. Overlay and slide this week of observations on top of the observations of the past year, until you find the window that matches "the best" — then take the day right after this window and predict that the weather will be just like that (or maybe try to tweak the values a bit).
I wonder how poorly this thing operates, and whether taking several years of history to look at would help much.
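For what it's worth, here's a naive sketch of that analog-matching idea, assuming one temperature value per day and fabricated data:

```python
import numpy as np

def analog_forecast(history, recent_window):
    """Find the best-matching window in `history` and return the value that
    immediately followed it (the crude "analog" prediction)."""
    w = len(recent_window)
    best_i, best_err = 0, float("inf")
    for i in range(len(history) - w):                     # every candidate window
        err = np.mean((history[i:i + w] - recent_window) ** 2)
        if err < best_err:
            best_i, best_err = i, err
    return history[best_i + w]                            # the day right after the match

# Fake "past year" of daily temperatures and a fake "last week" of observations.
history = 15 + 10 * np.sin(np.linspace(0, 12 * np.pi, 365)) + np.random.normal(0, 1, 365)
last_week = np.array([12.0, 13.5, 14.0, 13.2, 15.1, 16.0, 15.5])
print("analog prediction for tomorrow:", round(float(analog_forecast(history, last_week)), 1))
```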
Extremely poorly, because forecasting the weather is all about forecasting the deviations from the expected seasonal patterns - the "eddies" in the atmospheric flow which give rise to storm systems and interesting, impactful weather.
According to the paper, the model grid resolution is 1.5 degrees. I do not think it can predict accurate weather at any specific location. It shows global weather trends.
It's lower than many other medium-range AI forecasts, but note that those other models get to state-of-the-art with pretty coarse grids, 0.5° or so. The point is that the upper atmosphere and broad patterns are smooth, so with ML/AI they don't require high resolution (while simulating them with physical models does). And at a forecast lag of, say, 5-10 days, all local detail is lost anyway, so what skill remains comes from broad patterns, in all models. (Some extra skill can be gained by running local models initialized with the broad patterns, since there are clear cases like mountains where fine resolution is useful.)
1.5 degrees is perfectly fine for predicting large-scale (synoptic) weather patterns. They're not just "global trends." But yes, typical global NWP models and their MLWP competitors are run at 0.25 degrees or finer. All forecasts are statistically post-processed and bias-corrected to create local forecasts.
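To illustrate what such post-processing can look like in its simplest form, here's a toy "model output statistics" style correction fitted with a linear regression; the numbers are fabricated, and real MOS schemes use many predictors and long training records:

```python
import numpy as np

# Fit a simple linear correction mapping raw model forecasts at a grid point
# to what a local station actually observed.
model_t = np.array([15.0, 18.2, 21.0, 12.5, 9.8, 25.1, 19.4])   # raw forecasts (degC)
obs_t   = np.array([16.1, 19.0, 22.4, 13.0, 10.9, 26.8, 20.2])  # station observations (degC)

a, b = np.polyfit(model_t, obs_t, 1)          # obs ~ a * model + b
print(f"corrected forecast for a raw 17.0 degC: {a * 17.0 + b:.1f} degC")
```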
I'm curious if some future, hypothetical AGI agent, which had been trained to have these kinds of abilities, would be akin to how most humans see a ball in flight and just instinctively know (within reason) where the ball is going to go? We don't understand, consciously and in the moment, how our brain is doing these calculations (although obviously we can get to similar results with other methods), but we still trust the outputs.
Would some hypothetical future AI just "know" that tomorrow it's going to be 79 with 7 mph winds, without understanding exactly how that knowledge was arrived at?
I remember learning that humans trying to catch a ball are not actually able to predict where the ball will land, but rather move in a way that keeps the angle of movement constant.
As a result a human running to catch the ball over some distance (eg during a baseball game) runs along a curved path, not linearly to the point where the ball will drop (which would be evidence of having an intuition of the ball's destination).
This hypothesis could be tested, now that major league baseball tracks the positions of players in games. In the MLB app they show animations of good outfield plays with "catch difficulty" scores assigned, based (in part) on the straight-line distance from the fielder's initial position to the position of the catch. The "routes" on the best catches are always nearly-straight lines, which suggests that high-level players have developed exactly this intuitive sense.
Certainly what I was coached to do, what outfielders say they do, and what I see watching the game, is to "read" the ball, run towards where you think the ball is going, and then track the ball on the way. I was and am a shitty outfielder, in part because I never developed a fast-enough intuitive sense of where the ball is going (and because, well, I'm damn slow), but watch the most famous Catch[1] caught on film, and it sure looks like Mays knew right away that ball was hit over his head.
There are a few theories of that kind. You are probably referring to the Optical Acceleration Cancellation theory[1]. There are some similar, later, so-called "direct perception" theories too.
The problem with these is that they don't really work, often even in theory. People do seem to predict at least some aspects of the trajectory, although not necessarily the whole trajectory [2].
Agreed. Reminds me of juggling, while learning I noticed that as long as I could see each ball for at least a split second on its upwards trajectory I could "tell" if it would be a good throw or not. In order to keep both hands/paths in my view I would stare basically straight forward and not look at the top of the arc and could do it at any height. Now I can do it much more with feel and the motion is muscle memory but the visual cues were my main teacher.
It makes sense that there are several heuristics. After all, "Thinking, Fast and Slow" already makes the point that human brains have several layers of processing with different advantages and drawbacks depending on the situation.
> would be akin to how most humans see a ball in flight and just instinctively know (within reason) where the ball is going to go?
Up to a point .. and that point is more or less the same as the point where humans can no longer catch a spinning tennis racquet.
We understand the gravitational rainbow arc of the centre of mass; we fail at predicting the low-order chaotic spin of tennis racquet mass distributions.
Other butterflies are more unpredictable, and the ones that land on a camel's back breaking a dam of precariously balanced rocks are a particular problem.
Yes, humans are obviously limited in the things we can instinctively, intuitively predict. That's not really the point. The point is whether something that has been trained to do more complicated predictions will have a similar feeling when doing those predictions (of being intuitive and natural), or if it will feel more explicit, like when a human is doing the calculus necessary to predict where the same ball is going to go.
My phrase "future, hypothetical" was trying very specifically to avoid any discussions about whether current AI have qualia or internal experiences. I was trying to think about whether something which did have some kind of coherent internal experience (assumed for the sake of the idle thought), but which had far greater predictive abilities than humans, would have the same intuitive feeling when making those much more complicated predictions.
It was an idle thought that was only barely tangentially related to the article in question, and was not meant to be a comment at all on the model in the article or on any current (or even very near future) AI model.
I don't expect anyone would have an answer, given the extreme degree of hypothetical-ness.
When you say "The point is whether a LLM has any feelings..." are you thinking specifically about an LLM, or AI in general?
I've seen nothing to indicate that Project Aardvark is using an LLM for weather prediction.
I think “chain-of-thought” LLMs, with access to other tools like Python, already demonstrate two types of “thinking”.
We can query an LLM with a simple question like “how many ‘r’s are in the word strawberry”, and an LLM with no access to tools will quite confidently, and likely incorrectly, give you an answer. There’s no actual counting happening, nor any kind of understanding of the problem; the LLM will just guess an answer based on its training data. And that answer tends to be wrong, because those types of queries don’t make up a large portion of its training set, and if they do, there’s a large body of similar queries with vastly different answers, which ultimately results in confidently incorrect outputs.
But provide an LLM with tools like Python, and a “chain-of-thought” prompt that allows it to recursively re-prompt itself while executing external code and viewing the outputs, and it can easily get the correct answer to the query “how many ‘r’s are in the word strawberry”, by simply writing and executing some Python to compute the answer.
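Something like this one-liner, which is exactly the kind of thing a tool-using model can emit and run instead of guessing:

```python
word = "strawberry"
print(word.count("r"))   # 3
```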
Those two approaches to problem solving are strikingly similar to intuitive vs analytical thinking in humans. One approach is driven entirely by pattern matching, and breaks down when dealing with problems that require close attention to specific details, the other is much more accurate, but also slower because directed computation is required.
As for your hypothetical “weather AI”, I think it's pretty easy to imagine an AGI capable of confidently predicting the weather tomorrow while not being capable of understanding how it computed the prediction, beyond a high-level hand-wavy explanation. Again, that's basically what LLMs do today: confidently make predictions of the future, with zero understanding of how or why they made those predictions. But you can happily ask an LLM how and why it made a prediction, and it'll give you a very convincing answer that will also be a complete and total deception.
> would be akin to how most humans see a ball in flight and just instinctively know (within reason) where the ball is going to go?
Generally no. If I show you a puddle of water, can you tell me what shape was the ice sculpture that it was melted from?
One is Newtonian motion and the other is a complex chaotic system with sparse measurements of ground truth. You can minimize error propagation but it’s a diminishing returns problem, (except in rare cases like for natural disasters where a 6h warning can make a difference).
> "Sma," the ship said finally, with a hint of what might have been frustration in its voice, "I'm the smartest thing for a hundred light years radius, and by a factor of about a million ... but even I can't predict where a snooker ball's going to end up after more than six collisions."
[GCU Arbitrary in "The State of the Art"]
6? That can't be right. I don't know how big a GCU is, so the scale could be up to 1 OOM off, but a full redirection of all simulation capacity should let it integrate out further than that.
For ball-to-ball collisions, 6 is already a highly conservative estimate-- this is basically a chaotic system (outcome after a few iterations, while deterministic, is extremely sensitive to exact starting conditions).
The error scales up exponentially with the number of (ball-to-ball) collisions.
So if the initial ball position is off by "half a pixel" (=> always non-zero) this gets amplified extremely quickly.
Your intuition about the problem is probably distorted by considering/having experienced (less sensitive) ball/wall collisions.
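A back-of-envelope sketch of that amplification, with invented snooker-ish numbers (ball radius, spacing between collisions), just to show how fast a tiny aiming error becomes "anything goes":

```python
ball_radius_m = 0.026                      # roughly a snooker ball
spacing_m = 0.5                            # rough distance travelled between collisions
amplification = spacing_m / ball_radius_m  # ~19x angular error growth per collision

error_rad = 1e-6                           # tiny initial aiming error ("half a pixel")
for n in range(1, 9):
    error_rad *= amplification
    print(f"after {n} collisions: angular error ~ {error_rad:.1e} rad")
# By around the 6th collision the "error" exceeds 2*pi, i.e. the outgoing
# direction is effectively anyone's guess.
```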
> Would some hypothetical future AI just "know" that tomorrow it's going to be 79 with 7 mph winds, without understanding exactly how that knowledge was arrived at?
I think a consciousness with access to a stream of information tends to drown out the noise to see signal, so in those terms, being able to "experience" real-time climate data and "instinctively know" what variable is headed in what direction by filtering out the noise would come naturally.
So, personally, I think the answer is yes. :)
To elaborate a little more - when you think of a typical LLM, the answer is definitely no. But if an AGI is composed of something akin to "many component LLMs", then one part might very well have no idea how the information it is receiving was actually determined.
Our brains have MANY substructures in between neuron -> "I", and I think we're going to start seeing/studying a lot of similarities with how our brains are structured at a higher level and where we get real value out of multiple LLM systems working in concert.
As another comment mentioned, papers get revised during review, usually in response to reviewer comments. Also, some journals (not sure about Nature) do not allow authors to "backport" revisions made in response to reviewer comments to preprints; I guess they view the review process as part of their "value add".
It's quite common to revise papers. For example, they might have uploaded to arxiv in order to submit to a conference. Later, they revised and submitted to Nature.
There was a previous Turing Institute in Glasgow doing AI research (meaning, back then rules-based systems, but IIRC my professor was doing some work with them on neural networks), which hit the end of the road in 1994. There was some interesting stuff spun out of there, but it's a whole different institute.
not to hijack this thread but my dad did extensive research in sea breeze and rainfall modeling and he would have loved to see these AI and machine learning advancements in weather prediction.
Google's is initialized with a gridded dataset, ERA5, from ECMWF. Using ERA5 is the current standard here, and ECMWF themselves build on that mostly now. Meanwhile, Aardvark tries to do the same directly from observations.
GDM's GraphCast/GenCast require an "analysis state" of the atmosphere. This is a 3D, gridded dataset with key variables like temperature, humidity, and winds. Generally speaking, an analysis is produced by a physics-based weather model through "data assimilation", an optimization process which tries to create a 3D state that is consistent with observations over some window of time.
"Observations" is overloaded; it's really any "raw" weather data product, like satellite imagery, surface station measurements, or weather balloon traces. I'm handwaving away a lot of complexity here.
The AardvarkWeather model is a significant new development and paradigm shift - it's one of the first models of a new class which do not require an analysis, and can directly use the "raw" weather data observations that are typically used to perform data assimilation.
Weather predictions are for specific events and areas, made on the order of days- typically no more than 2-3 weeks into the future.
Climate models predict future averages over large regions on the order of decades.
"rapid climate change" is on the order of "within this century". Whether the climate changes or not doesn't really impact the weather models at all, because the climate's not changing on the same time scale.
One of the characteristics of the present climate change on Earth is that weather events (storms, heat waves, maximum and minimum temperatures, start of seasons) display a lot more variability, and consequently also a lot more extremes than in the recent past. This is already happening.
If you train an ML weather model on past weather data, which has lower variability, and use it to predict future weather, it is possible that it under-predicts the variability of weather variables (amount of expected rainfall, maximum temperature tomorrow, etc.). Variance, not mean.
I say "it is possible" because only rigorous modeling can tell us the truth. Which is why I asked the question originally. I don't think it can be denied by abstract arguments.
But when climate changes even by what appear, to us, to be very small amounts, that can have huge effects on how weather systems behave.
Thus, if you want your AI model that predicts weather, which was trained on data before that change, to predict weather after that change, you may very well find that it rapidly loses accuracy.
I think the naive answer is that physics doesn't change when the climate does, just the initial conditions for the day's simulation of the next week's weather.
And every year over year change is within error bars of the prior years results, once you account for those initial conditions.
And if your model is based on physics, then that's fine.
But if your model is an LLM (or close cousin) purely based on previous weather patterns, then it may struggle to accurately predict once the underlying conditions change.
While I really want to agree with your sound argument, I have this suspicion that climate change significantly affected forecast skill on Hurricane Otis.
The climate didn't change in the time from Otis forming to landing.
Weather models look at current conditions and extrapolate what the next set of conditions will be.
Another way to look at it is that climate change models reflect what we will experience. Weather models reflect the mechanics of humidity, temperature and air flow- given a current state, what is the next likely state. Climate change doesn't change how those mechanics interact. It only guesses what "current state" will look like in the far future.
Climate change doesn't impact the physics of weather, but our understanding of the inputs to a weather system is still very incomplete. Climate change does impact the relative importance of different physical inputs into a weather system, and inherently pushes weather systems away from the most studied and well-understood aspects of weather physics. That in turn degrades the accuracy of our models, as it takes time to research and understand what's changed in a weather system, and what previously unimportant inputs have become much more important.
Let's say I live in the hypothetical state of Midwestia. It gets up to 100 degrees F in the summer and down to -40 in the winter months.
Over the course of ten years, the average daily temperature goes up by 1 degree F (which is many times faster than reality).
That means there's basically one day in the entire year that doesn't fit within the existing model, or maybe a handful if some other part of the year is cooler than normal. We see more variation between normal years than the average increase we expect over both the short and medium term.
Our weather models predict no more than two or three weeks into the future.
There's literally nothing about climate change that will change how weather systems interact that will occur faster than our weather models will adapt to. Our weather patterns may change drastically, but that's a fundamentally different problem.
How quickly do you think our weather models adapt? Our current models are built on top of decades of detailed weather research by tens of thousands of researchers around the world, which has involved the launch of some of the largest satellites in space. All to help us try and understand how weather works.
Those models don't just "adapt", they're carefully and manually refined over time, slowly incorporating better physical models as they're developed, and better methods of collecting data as they become available. Climate change absolutely changes how weather systems interact faster than our models can adapt, primarily because we can only adapt models to changes after their accuracy degrades, and it becomes possible to start identifying potential weaknesses in the existing approaches.
Your example of seasonal temperature changes vs global average temperature increases vastly underestimates how complicated weather models are, and how much changes in global climate systems (like the North Atlantic Current) can drastically impact local weather conditions. Weather models, like all models, make assumptions about the behaviours of extremely large systems which aren't expected to change very much, or are very hard to measure in real time. As those large systems change due to climate change, the assumptions made by weather models will grow increasingly incorrect. But the hard part is figuring out which assumption is now incorrect; it's not always easy to identify exactly what assumptions have been made, or to accurately measure the difference between the assumption and reality.
I wish HNers would ask themselves "If I, someone with nothing more than layman's knowledge, thought of this - would someone who has spent years studying in their field, possibly the head of their lab, the people they collaborated with, the people who approved their grant, and the people who reviewed the paper for publication - think of this?"
I can hear the keyboard mashing already. YES, I am aware there are problems in both research and publishing.
That does not mean that an HNer's "did they think of X?" is any more valid. It's like saying that because bridges collapse, we should allow people with no relevant experience, knowledge, or training to look at new bridges and say "that bit there doesn't look strong enough to me!"
On the other hand, experts do need to be questioned and to provide explanations.
"It's like saying that because bridges collapse, we should be allowed people with no relevant experience, knowledge, or training to look at new bridges and say "that bit there doesn't look strong enough to me!""
Not a valid comparison. Civil engineering is a field with a lot of known answers. Bridge designs rarely rely on cutting-edge research.
I wish HNers would stop and ask themselves “Do HNers need to read yet another misguided Dunning-Kruger rant from someone who clearly misread the parent comment?”
The GP asked _how_ robust the ML models will be, which is a perfectly good question to ask. Maybe a climate scientist specializing in ML can answer that question.
It remains to be seen, but right now models are constantly evolving and being trained on recent data, i.e. training data that already includes a changing climate.
I have heard the opinion that the models would not learn weather patterns but rather “weather physics”. I am not knowledgeable enough in ML to comment on that though.
The title has been editorialised. The paper is about building AI weather models on par with current state-of-the-art weather models that require supercomputers to run, but without the need for the supercomputers - a normal desktop computer is enough.
That represents a huge step-change in the compute needed for accurate weather models, and opens up the possibility of even more accurate models if the accuracy of these techniques scales with available compute. If you can get state-of-the-art accuracy with a desktop computer, something that normally requires a supercomputer, what happens if you run those techniques on a supercomputer?
Didn't read the paper, but it would be cool to apply distillation to these big physics-driven models, in order to simulate their outcome on much smaller models.
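A minimal sketch of what that distillation could look like, with the expensive model stood in for by a placeholder function and a small PyTorch student network; everything here is illustrative:

```python
import torch
import torch.nn as nn

def teacher(x):
    """Stand-in for the expensive physics-driven model (here just a fake function)."""
    return torch.sin(x).sum(dim=1, keepdim=True)

# Small "student" network trained to reproduce the teacher's outputs.
student = nn.Sequential(nn.Linear(8, 64), nn.ReLU(), nn.Linear(64, 1))
opt = torch.optim.Adam(student.parameters(), lr=1e-3)

for step in range(2000):
    x = torch.randn(256, 8)              # sampled inputs (stand-in for weather states)
    with torch.no_grad():
        target = teacher(x)              # expensive model's output becomes the label
    loss = nn.functional.mse_loss(student(x), target)
    opt.zero_grad()
    loss.backward()
    opt.step()

print("final distillation loss:", float(loss))
```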
Additionally (to the other replies to your question), asking for the weather is just another data point in the huge amount of surveillance occurring. If I can get the weather for my location without accessing a remote server where someone else gets to see my query and probably location, all the better. This provides that possibility.
Although, based on their bios and UK population statistics, it seems likely they're both English, the article omits this information. They do work at a British institution, located in England.