~www_lesswrong_com | Bookmarks (692)
-
Liron Shapira vs Ken Stanley on Doom Debates. A review — LessWrong
Published on January 24, 2025 6:01 PM GMT I summarize my learnings and thoughts on Liron Shapira's...
-
Is there such a thing as an impossible protein? — LessWrong
Published on January 24, 2025 5:12 PM GMT This is something I’ve been thinking about since my...
-
Stargate AI-1 — LessWrong
Published on January 24, 2025 3:20 PM GMT There was a comedy routine a few years ago....
-
QFT and neural nets: the basic idea — LessWrong
Published on January 24, 2025 1:54 PM GMT Previously in the series: The laws of large numbers...
-
Eliciting bad contexts — LessWrong
Published on January 24, 2025 10:39 AM GMT Say an LLM agent behaves innocuously in some context...
-
Insights from "The Manga Guide to Physiology" — LessWrong
Published on January 24, 2025 5:18 AM GMT Physiology seemed like a grab-bag of random processes which...
-
Do you consider perfect surveillance inevitable? — LessWrong
Published on January 24, 2025 4:57 AM GMT A lot of my recent research work focusses on: 1....
-
Uncontrollable: A Surprisingly Good Introduction to AI Risk — LessWrong
Published on January 24, 2025 4:30 AM GMT I recently read Darren McKee's book "Uncontrollable: The Threat...
-
Contra Dances Getting Shorter and Earlier — LessWrong
Published on January 23, 2025 11:30 PM GMT I think of a standard contra dance as...
-
Starting Thoughts on RLHF — LessWrong
Published on January 23, 2025 10:16 PM GMT Cross posted from Substack. Continuing the Stanford CS120 Introduction to...
-
Recursive Self-Modeling as a Plausible Mechanism for Real-time Introspection in Current Language Models — LessWrong
Published on January 22, 2025 6:36 PM GMT (and as a completely speculative hypothesis for the minimum...
-
Ut, an alternative gender-neutral pronoun — LessWrong
Published on January 22, 2025 5:36 PM GMT This post is about ‘ut’, a gender-neutral pronoun I...
-
Mechanisms too simple for humans to design — LessWrong
Published on January 22, 2025 4:54 PM GMT Cross-posted from Telescopic Turnip. As we all know, humans are...
-
Training Data Attribution: Examining Its Adoption & Use Cases — LessWrong
Published on January 22, 2025 3:41 PM GMT Note: This report was conducted in June 2024 and...
-
Training Data Attribution (TDA): Examining Its Adoption & Use Cases — LessWrong
Published on January 22, 2025 3:40 PM GMT Note: This report was conducted in June 2024 and...
-
The Quantum Mars Teleporter: An Empirical Test Of Personal Identity Theories — LessWrong
Published on January 22, 2025 11:48 AM GMT tl;dr: If a copy is not identical to the...
-
Bayesian Reasoning on Maps — LessWrong
Published on January 22, 2025 10:45 AM GMT This is a linkpost for an article I've written...
-
Against blanket arguments against interpretability — LessWrong
Published on January 22, 2025 9:46 AM GMT On blanket criticism and refutation. In his long post on...
-
Evolution and the Low Road to Nash — LessWrong
Published on January 22, 2025 7:06 AM GMT Solution concepts in game theory—like the Nash equilibrium and...
-
The Human Alignment Problem for AIs — LessWrong
Published on January 22, 2025 4:06 AM GMT If there was a truly confirmed sentient AI, nothing...
-
Natural Intelligence is Overhyped — LessWrong
Published on January 21, 2025 6:09 PM GMT Like this piece? It's cross-posted from my blog: https://collisteru.net/writing/ This...
-
14+ AI Safety Advisors You Can Speak to – New AISafety.com Resource — LessWrong
Published on January 21, 2025 5:34 PM GMT Getting personalised advice from a real human can help...
-
[Linkpost] Why AI Safety Camp struggles with fundraising (FBB #2) — LessWrong
Published on January 21, 2025 5:27 PM GMT Crossposted on The Field Building Blog and the EA...
-
The Manhattan Trap: Why a Race to Artificial Superintelligence is Self-Defeating — LessWrong
Published on January 21, 2025 4:57 PM GMT