
  > Any scientist will tell you
The number of people, including scientists, who treat algorithms as black boxes is incredibly concerning. The math is meaningless without interpretation, and that requires understanding what goes into the scores.

That said, why would anyone think such scores could be a reasonably accurate representation? You are aggregating extremely complex situations, and with that you kill off nuance. That doesn't mean the scores are useless, but they need to be used carefully. I mean, even look at the chart and weird things pop out. Ireland is ranked 5th by the "Freedom in the World" Index and falls into the highest bin for all 3 categories: economic freedom, press freedom, democracy. Yet New Zealand is 3rd, while falling into the second bin for economic and press freedom. Further down you see the US below Argentina, yet the US's scores are significantly higher than Argentina's in every other category, and the US is tied with Mongolia (which has a major problem with press freedom).

It should be quite clear that these scores are missing a lot of important details. The US definitely has problems with freedom of speech (and growing ones), but you can call Trump and Clinton pedos on the internet all day and nothing will happen to you[0]. Nuance is needed, and treating these indexes as black boxes is just harmful to any conversation about freedom.

[0] https://www.bbc.com/news/articles/c04vqldn42go


We're all in bubbles. But it's good to expand them when you recognize you're in one.

And the Internet also consists in large part of bots talking to bots. This is not to say that some people won't always promulgate the "Won't somebody please think of the children?" argument every time an expansion of the surveillance state apparatus is in question, but rather that we should not take for granted that every bad opinion we see online is one deeply held by a real person.

For the folks in the back row:

Age Verification isn't about protecting Kids: it's about tracking them and censoring them.

Age Verification isn't about protecting Kids: it's about handing their data to companies who will keep it as safe as they've kept your identity, email, and everything else.

Age Verification isn't about protecting Kids: it's about taking control from you and giving it to corporations and governments.

Age Verification isn't about protecting Kids: it's about keeping them on platforms so those platforms can profit from them.

Age Verification isn't just about Kids: it's also about tracking you.

I don't know why we want to put children's data online. I don't want cameras in kids' rooms to verify their faces; that camera will be used by others. That camera will be used to do the very thing they claim it's there to protect against. I don't want kids online, easily meeting with pedos, whether they're pretending to be kids or otherwise. I don't want kids' data online for those people to use to harm them. I don't want kids' data being leaked and exposed forever, creating lasting damage that will follow them for the rest of their lives.

The road to hell is paved with good intentions. The devil uses this to fool you. Seriously, y'all gonna trust your kids' data to the people on the Epstein list? Why would you let a fox guard the chicken coop?


Hear, hear! I'll add another:

"Age Verification is about suppressing the 'undesirable' for kids, but also for you". I.e., having to share your personal identity for that taboo (but legal and harmless) sex site (or whatever someone in power might not approve of) sure is going to make you think twice, and may even kill many smaller instances without making them illegal outright.

It may not sound bad in a single example, but eventually something you like, something perfectly fine, gets added to the list.


  > Mickey mouse is now copyright free
Not true.

Scroll down on this page[0] and you'll see the different Mickeys, and most of them are still under copyright. You get Steamboat Willie + gloves, but no Fantasia Mickey or later. Definitely no red-pants version.

Unsurprisingly, Disney knows what they're doing: they had 95 years to modify a character's looks (and how the public imagines that character) before it entered the public domain.

  > pluto is in two weeks
Not the Pluto you're thinking of...[1]

[0] https://web.law.duke.edu/cspd/mickey/

[1] https://www.disneydining.com/disney-copyright-loss-pluto-202...


  >  in retrospect was this actually huge news?
Yes

  > Craft software that makes people feel something
Meta, Google, and all of FAANG already did that. They crafted software that made people feel hate, anger, and depression, but sometimes joy. It's nice to get those cute animal posts when doomscrolling. It's a nice break from "you're all going to die", "everyone is dumb except you", and "you're powerless".

Joking aside, I do very much agree with the OP. But I also wanted to note how things can get perverted. Few people are actually evil and most evil people get there slowly. What's that old cliché that everyone forgets? "The road to hell is paved with good intentions". The point is to constantly check that you're on the right path and realize that most evil is caused in the pursuit of good because good is difficult to do.

I also wanted to share a Knuth quote:

  | In fact what I would like to see is thousands of computer scientists let loose to do whatever they want. That’s what really advances the field
  - Donald Knuth
I am fully with him on this. It is the same reason Bell Labs had so much success.

  How do you manage a bunch of geniuses?[0] You don't.
You let experts explore. They already know the best ways forward. Many will fail, but that's okay. In CS, one of the biggest problems we have is that we try to optimize everything, yet we're also really bad at math. If you want to optimize search over a large solution space with a highly parallel process, you create a distribution. It's good to have that mean, but you need the tails too, and that's what we lose. You tighten the distribution when you need to focus on a specific direction, then relax it to go back to exploration. But what we actually do is build railroads. We try to carry all the groceries from the car in one trip. We like to go fast, but don't really care where we're going. We love to misquote Knuth about premature optimization to justify our laziness, and ignore his quotes about being detail-oriented and refining solutions.

I think progress has slowed down. And I think it's because we stopped exploring. We're afraid to do anything radical, and that's a shame.

[0] Knuth has another quote about programmers not being geniuses lol


My saying is that if you want to be able to herd cats, you need to be a cat yourself.

That can require so much experience that most people can't imagine it being accomplished at all.

But if so, you could then herd geniuses, as long as they were also cats :)

When you single-handedly have to cover a lot of bases, you do what you have to do.


  > I was reminded again of my tweets that said "Be good, future LLMs are watching". You can take that in many directions, but here I want to focus on the idea that future LLMs are watching. Everything we do today might be scrutinized in great detail in the future because doing so will be "free". A lot of the ways people behave currently I think make an implicit "security by obscurity" assumption. But if intelligence really does become too cheap to meter, it will become possible to do a perfect reconstruction and synthesis of everything. LLMs are watching (or humans using them might be). Best to be good.
Can we take a second and talk about how dystopian this is? Such an outcome is not inevitable; it only happens if we make it happen. The future is not deterministic, the future is determined by us. Moreover, Karpathy has significantly more influence on that future than your average HN user.

We are doing something very *very* wrong if we are operating under the belief that this future is unavoidable. That future is simply unacceptable.


Given the quality of the judgment I'm not worried; there is no value here.

Tossing an idea off without putting in the work to make it valuable, rather than properly executing it, is exactly what irritates me about a lot of AI work. You can be 900 times as productive at producing mental popcorn, but if there was value to be had here, we're not getting it, just a whiff of it. Sure, fun project. But I don't feel particularly judged here. The funniest bit is the judgment on things that clearly could not yet have come to pass (for instance, because there is an exact date mentioned that we have not yet reached). QA could be better.


I think you're missing the actual problem.

I'm not worried about this project, but about harvesting and analyzing all that data and deanonymizing people.

That's exactly what Karpathy is saying. He's not being shy about it. He said "behave, because the future panopticon can look into the past", which makes the panopticon effectively exist now.

  Be good, future LLMs are watching
  ...
  or humans using them might be
That's the problem. Not the accuracy of this toy project, but the idea of monitoring everyone and their entire history.

The idea that we have to behave as if we're being actively watched by the government is literally the setting of 1984 lol. The idea that we have to behave that way now because a future government will use the Panopticon to look into the past is absolutely unhinged. You don't even know what the rules of that world will be!

Did we forget how unhinged the NSA's "harvest now, decrypt later" strategy is? Did we forget those giant data centers that were all the news talked about for a few weeks?

That's not the future I want to create, is it the one you want?

To act as if that future is unavoidable is a failure of *us*


Yes, you are right, this is a real problem. But it really is just a variation on 'the internet never forgets', for instance in relation to teen behavior online. What AI adds is the weaponization of such information. I wish the wannabe politicians of 2050 the best of luck with their careers; they are going to be the most boring people available.

The internet never forgets, but you could be anonymous. Or at least somewhat. That's getting harder and harder.

If such a thing isn't already possible (it is to a certain extent), we are headed towards a point where your words alone will be enough to fingerprint you.


Stylometry killed that a long time ago. There was a website, stylometry.net, that paired HN accounts based on text comparison and ranked the 10 best candidates. It was incredibly accurate and allowed identifying a bunch of people who had been banned but came back again. Based on that, I would expect anybody who has written more than a few KB of text to be identifiable in the future.

You need a person's text tied to their actual identity to pull that off. Normally that's pretty hard, especially since you'll get different formats. Like, I don't write the same way on Twitter as on HN. But yeah, this stuff has been advancing and I don't think it is okay.

The AOL search-data leak pretty much proved that anonymity is a mirage. You may think you are anonymous, but it just takes combining a few unrelated databases to de-anonymize you. HN users think they are anonymous but they're not; they drop factoids all over the place about who they are. 33 bits... since 2^33 is about 8.6 billion, more than the world's population, 33 bits of information are enough to single out any one person on Earth. It is one of my recurring favorite themes, and anybody in the business of managing other people's data should be well aware of the risks.

I think you're being too much of a conspiracy theorist here by making everything black and white.

Besides, the main question is how difficult it is to deanonymize someone, not whether it's possible.

Privacy and security both have no perfect defense. For example, there are no unhackable passwords. There are only passwords that cannot be hacked within our current technology, budgets, and lifetimes. Sure, you could brute force my HN password; it would just take billions of years.
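Back of the envelope (assuming a random 12-character password over the ~94 printable ASCII characters and an attacker making 10^12 guesses/sec):

  94^12 ≈ 4.8 × 10^23 combinations
  4.8 × 10^23 / 10^12 per second ≈ 4.8 × 10^11 seconds ≈ 15,000 years

Add a few more characters and you're into the billions.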

The same distinction is important here. My threat model on HN doesn't care if you need to spend millions of dollars or thousands of hours to deanonymize me. My handle is here to discourage that and to allow me to speak more freely about certain topics. I'm not trying to hide from nation states; I'm trying to hide from my peers in AI and tech, so I can freely discuss my opinions, which includes criticizing my own community (something I think everyone should do! Be critical of the communities you associate with). And more importantly, I want people to consider my points on their merit alone, not on my identity or status.

If I was trying to hide from nation states I'd do things very very differently, such as not posting on HN.

I'm not afraid of my handle being deanonymized, but I still think we should recognize the dangers of the future we are creating.

By oversimplifying, you've created the position that this is a lost cause, as if we've already lost, and that because we lost, we can't change. There are multiple fallacies here. The future has yet to be written.

If you really believe it is deterministic, then what is the point of anything? Of having desires or opinions? Are we just waiting to see which algorithm wins out? Or are we the algorithms playing themselves out? If it's deterministic, wouldn't you be happy if the freedom algorithm won and this moment were an inflection in your programming? I guess that's impossible to say in an objective manner, but I'd hope that's how it plays out.


I have enough industry insight to prove that your data is floating out there, unprotected, in plain text, and that those who are not bound by the law are making very good use of it. Every breach leaks more bits about you.

This is the main driver behind the targeted scams that ordinary people now have to deal with. It is why people get voice calls from loved ones in distress, why they get 'tech support' calls that aim to take over their devices and why lots of people have lost lots of money.

If you think I'm being too much of a conspiracy theorist by making everything black and white, maybe that's simply because we live different lives and have different experiences.


I call this the "judgement day" scenario. I would be interested to know if there is some science fiction based on this premise.

If you believe in a God of a certain kind, you don't think that being judged for your sins is unacceptable, or even good or bad in itself; you consider it inevitable. We have been talking it over for 2000 years; people like the idea.


You'll be interested in Clarke's "The Light of Other Days". Basically, wormhole technology lets people look back at any point in time, ending all notion of privacy.

God is different though. People like God because they believe God is fair and infallible. That is not true of machines or men. Similarly, I do not think people will like this idea. I'm sure some will, but look at people today and their religious fervor. Or look at the past. They'll want it, but it is fleeting. Cults don't last forever, even when they're governments. Sounds like a great way to start wars, and every one of them will be easily justified.

https://en.wikipedia.org/wiki/The_Light_of_Other_Days


Okay, I've got two questions that I never seem to get satisfactory answers to, but I'm actually curious.

1) What kind of code are you writing that's mostly boilerplate?

2) Why are you writing code that's mostly boilerplate and not code that generalizes boilerplate? (read: I'm lazy. If I'm typing the same things a lot I'm writing a script instead)

I'd think maybe the difference is in what we program, but I see people who program the same types of things I do saying similar things to you, so idk.


  > for bash scripts etc
Everyone keeps telling me that it's good for bash scripts but I've never had real success.

Here's an example from today. I wanted to write a small script to grab my Google Scholar citations, and I'm terrible with web stuff, so I asked for the best way to parse the curl output. First off, it suggests I use a Python package (seriously? For one line of code? No thanks!), and then it gets the grep wrong. So I pull up the page source, copy-paste some of it in, and try to parse it myself. I already have a better grep command, and for the second time it's telling me to use Perl regex (why does it love -P as much as it loves delve?). Then I'm pasting in my new command, showing it my output, asking for the awk and sed parts while googling the awk I always forget. It messes up the sed parts, so I fix them, which means editing the awk part slightly, but I already had the SO post open that I needed anyway. So I saved maybe one minute total?
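For reference, the whole thing boils down to something like this (a sketch, not my exact script; the gsc_rsb_std class is from memory and Scholar's markup changes, so don't trust it blindly):

  # Hypothetical: grab the first number from the citation-stats table of a profile page
  curl -s "https://scholar.google.com/citations?user=$USER_ID" \
    | grep -o 'gsc_rsb_std">[0-9]*' \
    | head -n 1 \
    | grep -o '[0-9]*$'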

Then I give it a skeleton of a script file with the variables I wanted, fully expecting it to be a simple cleanup. No. It's definitely below average; I mean, I've never seen an LLM produce bash functions without being explicitly told to (not that the same isn't also true of the average person). But hey, it saved me the while loop for the args, so that was nice. In total, it cost as much time as it gave back.

Don't get me wrong, I find LLMs useful, but they're nowhere near as game-changing as everyone says. I'm maybe 10% more productive? And I'm not convinced that's even true. Sure, I might have been able to do less handholding with agents and by having it build test cases, but for a script that took 15 minutes to write? Feels like serious overkill. And this is my average experience with them.

Is everyone just saying it's so good at bash because no one is taking the time to learn bash? It's a really simple language that every Linux user should know the basics of...


  >  completely effed on most distributions
How does the distribution make this an issue? You can always freeze drivers and install old ones. I get that it might not work out of the box, especially with rolling-release distros like Arch, but you also don't want a rolling release on an older machine anyway.
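On Arch-likes, freezing is just an IgnorePkg line in pacman.conf (the package names here assume the Nvidia stack; adjust for your setup):

  # /etc/pacman.conf
  # Hold back the driver stack and the kernel it was built against
  IgnorePkg = nvidia nvidia-utils linux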

I know it's also me that's the issue. But I just want a Linux distro that works. I've had enough of people saying "Nvidia has been getting so much better recently!" and "It's completely usable now!" when the newest drivers break my whole experience. I would use Arch, and have tried about 5 times, but it's too complicated to get the driver I need and I won't even bother at this point. I've just accepted that I'm going to use Mint until I get a desktop. Maybe I'll try to get help on a forum somewhere, but idk, I think I would need personal help.

  > But I just want a Linux distro that works.
This is perfectly valid. But I would add that Arch is not that distro, and even though projects like Endeavour and Manjaro are trying, I don't think it'll ever be the case. It's a rolling release, and even though they've done a great job, it's never going to be the most stable because of that.

But I think Pop is the best distro for this. System76 is highly incentivized to do exactly this, specifically with Nvidia drivers and laptops (laptops create extra complications...). I can't promise it'll be a cure-all, but it is worth giving a shot. I would try their forums too.

I totally get the frustration. I've been there, unfortunately. I hope you can get someone to help.


CachyOS just works for me. Highly optimized Arch, working flawlessly and without hassle.

I know my way around Arch, and in the roughly two years I've been using CachyOS I never needed to intervene, with the exception of things like changed configs/split packages. But those are announced in advance on their webpages, be it Arch itself or CachyOS, and they also appear in good old pacman in the terminal, or whichever frontend you fancy. It's THE DREAM!

What's lacking is maybe pre-packaged LLM/machine-learning stuff. Maybe I'm stupid, but they don't even have current llama.cpp, WTF? But at least Ollama is there. LM-Studio also has to be compiled by yourself, either via the AUR, or otherwise. But these are my only complaints.


  > Maybe I'm stupid, but they don't even have current llama.cpp, WTF?
I don't understand. It's in the AUR...

https://aur.archlinux.org/packages/llama.cpp

  > has to be compiled by yourself, either via the AUR
I don't think I'd call the AUR "compiled by yourself". It's still a package manager; you're not running the configure and make commands yourself. I mean what do you want? A precompiled binary? That doesn't work very well for something like llama.cpp. You'd have to deliver a lot more with it and pin the versions of the dependencies, which will definitely cost performance.

Is running `yay -S llama.cpp` really that big of a deal? You're not intervening in any way that's different from any other package (which also aren't precompiled binaries).


> I mean what do you want? A precompiled binary?

Yes, exactly :-)

Haven't used yay or other aur helpers so far. Maybe that's why my systems run so stable?

Should maybe look into it.

Have used Yaourt on Arch in the far past, with...errm...varying success ;->


  > Haven't used yay or other aur helpers so far. 
  > Have used Yaourt on Arch in the far past,
Yaourt is an AUR helper?

  > Maybe that's why my systems run so stable?
Sorry?

  >>> I know my way around Arch
Forgive me, you said this earlier and I think I misunderstood. What does this mean exactly? How long have you been using Arch? Or rather, have you used Arch the actual distro, or only Arch-based distros?

I guess I'm asking, have you installed the vanilla distro? Are you familiar with things like systemd-boot, partitioning, arch-chroot, mkinitcpio, and all that?


I have used plain Arch in the past, for several years, no derivatives.

At that time there existed an AUR helper called Yaourt, which I made heavy use of. But often in haste, sloppily. Which led to many unnecessary clean-up actions, but no loss of the system. Meanwhile I had to use other stuff, so no Arch for a while. When the need for the other stuff was gone, I considered several options, like Gentoo, but naa, I don't wanna compile anymore!1!! (Yes, yes, I know they serve binpkgs now, but would they have my preferred USE flags?) Maybe Debian, which can be fucking fast when run in RAM like antiX, but I had that for a while, and while it's usable, Debian as such is bizarre.

Anything Redhat? No thanks. SuSe? Same. So I came across CachyOS, and continued to use that, from the first "test-installation" running to this day, because it works for me, like I wrote before. Like a dream come true.

Remembering my experiences with Yaourt I abstained from using the AUR. And that worked very well for me, so far. Also the Gentoo-like 'ricing' comes for free with their heavily optimized binary packages, without compromising stability.

> I guess I'm asking, have you installed the vanilla distro? Are you familiar with things like systemd-boot, partitioning, arch-chroot, mkinitcpio, and all that?

Yes.

Are we clear now?

Edit: I'm so overconfident I'm even considering disabling the pacman-hooks into BTRFS-snapshots, because I never needed them.

No rollback necessary, ever, so far. Same goes for pacman cache. After every -Syu follows an immediate -Scc.

Because the only way is forwaaaaard ;-)


I've used Yaourt too. Things are a lot better these days; Yay is the standard. But I think the biggest benefit of the helpers is updating.

Yes, we're clear now, but are you surprised by my hesitation? Having that experience would imply you've had a lot of experience compiling things the long way. Running makepkg -si isn't that complicated. It's as easy as it gets: there's no make, no configure, no cmake, no determining the dependencies yourself and installing them yourself too. I don't get the issue. Does it take too long? Not happen automatically?
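For reference, the whole manual flow (using the llama.cpp package linked above) is just:

  git clone https://aur.archlinux.org/llama.cpp.git
  cd llama.cpp
  makepkg -si  # -s pulls the repo deps in via pacman, -i installs the built package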

  > I'm so overconfident I'm even considering disabling the pacman-hooks into BTRFS-snapshots, because I never needed them.
lol yeah I'm sure they're not needed. Not hard to recover usually and yeah I agree, things are stable these days. I can't remember the last time I needed to chroot (other than an nspawn). I only snapshot data I care about these days and it's usually backed up remotely too. I've learned my lesson the hard way too many times lol.

> I don't get the issue. Does it take too long? Not happen automatically?

Yes and yes. Long before Arch I did LFS and Gentoo. And NetBSD, which is like Gentoo in that respect.

I've had it! Gimme binaries in the flavors (hello OpenBSD!) I want/like!1!! ;->


  >> Not happen automatically?
  > Yes
I got you fam

  # /etc/systemd/system/pacman_auto_update.timer
  [Unit]
  Description=Update automatically because ain't nobody got time for that
  Documentation=man:pacman(8)
  
  [Timer]
  OnCalendar=weekly
  Persistent=true
  # Optionally wake system up to upgrade
  #WakeSystem=true
  
  [Install]
  WantedBy=timers.target

  # /etc/systemd/system/pacman_auto_update.service
  [Unit]
  Description=Update automatically because ain't nobody got time for that
  Documentation=man:pacman(8)
  # Don't try to upgrade before the network is up
  Wants=network-online.target
  After=network-online.target

  [Service]
  Type=oneshot
  ExecStart=/usr/bin/pacman -Syu --noconfirm
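Then a `systemctl daemon-reload` plus `systemctl enable --now pacman_auto_update.timer` (unit names as above) and you're set.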
Joking aside, I do use a version of this except I just run -Sy and I do it daily. I find it does help speed things up.

  > Gimme binaries
Definitely not going to happen for the AUR, and it runs completely counter to what you claimed to like about CachyOS. Generic precompiled binaries are not going to give you a very optimized system... which is what raised those red flags in the first place

  >>> After every -Syu follows an immediate -Scc
Btw, I don't suggest doing this. If an update breaks your system then you don't have the versions cached to roll back to. I mean you can download again but your cache gives you a good hint at what did in fact work.

That's not what I meant by 'automatically'.

I'm perfectly at ease with initiating them manually, as I see fit.

For me that means automatically tracking dependencies of things like USE-flags in Gentoo's Portage, or Exherbo's Paludis.

And the conflicts that can result. Arch, its makepkg, and the stuff in the AUR simply have no provisions (that I'm aware of) for that. It's all manual, IMO. AUR helper or not.

> Definitely not going to happen for the AUR, and it runs completely counter to what you claimed to like about CachyOS. Generic precompiled binaries are not going to give you a very optimized system... which is what raised those red flags in the first place

Says you. I counter that with my years-long experience (on CachyOS), limited to the stuff they DO deliver as binary. Obviously carefully tested, by people who really know what they're doing, on much faster systems than I have, before delivery to the general public.

> Btw, I don't suggest doing this. If an update breaks your system then you don't have the versions cached to roll back to. I mean you can download again but your cache gives you a good hint at what did in fact work.

Never needed it, neither on plain Arch in the far past, nor in the two years of CachyOS now. Should something bad happen, I can boot some rescue image from wherever and fix it that way. It's just a waste of space.

Edit: Please don't suggest Nix(OS) or Guix. They couldn't give a shit about optimization in the name of 'reproducible builds', and go for the lowest common denominator because of that. Which is understandable, given their goals. But they are unaligned with mine.


Ho-hum, so I gave this yay thing a try, as a binary, out of the CachyOS repos, and let it run an outstanding update of 77 packages, mostly new Plasma/KDE, from 6.5.3 to 6.5.4. It even discovered some things which I must have installed manually via makepkg from the AUR: mainly i7z (probably during discovery, when the system(s) were 'new' to me), some Microsoft fonts, and even Hexchat, which I'd forgotten about, because I switched to KVirc when Hexchat began to crash. It doesn't do that anymore, at least not during autoconnect to EFnet & Libera.Chat. Didn't test further. Did reboot, with the usual insane brazenness of kill -9ing Firefox from within htop beforehand, to have it reliably restart my session with all its windows and tabs in there. yay -Scc, erasing all btrfs snapshots, and so on.

Closing all other apps, terminals, file managers. Clicking restart. Hands off. Very quiet and fast boot. SDDM appears. Login. Plasma is there. FF reloads as it should. Everything else works. Still ultra-smooth.

So Yay!?

(Squeekily screaming: *Oh my gawd!1!! Nao my (almost) pristine binary system iz tainted!1!!*)

You were saying?

Edit: Wanna 'see'? https://postimg.cc/5HmJb0g3

Edit: Hrrm. When Hexchat began to crash... so I was talking shit about no app ever crashing. But that was a general problem on distros which updated the underlying substrate faster than others, IIRC.

It was just 'bitrotten'.

Very annoying at the time, because I had been using it for a long time and had it heavily customized and themed, but (binary!) KVirc came to the rescue, so I'd forgotten about that. Sorry.

