Bookmarks (714)

  • screenshot

    AI Tools for Existential Security — LessWrong

    Published on March 14, 2025 6:38 PM GMT Rapid AI progress is the greatest driver of existential...

  • screenshot

    AI for AI safety — LessWrong

    Published on March 14, 2025 3:00 PM GMT (Audio version here (read by the author), or search...

  • screenshot

    On MAIM and Superintelligence Strategy — LessWrong

    Published on March 14, 2025 12:30 PM GMT Dan Hendrycks, Eric Schmidt and Alexandr Wang released an...

  • screenshot

    Whether governments will control AGI is important and neglected — LessWrong

    Published on March 14, 2025 9:48 AM GMT Epistemic status: somewhat rushed out in advance of the...

  • screenshot

    Something to fight for — LessWrong

    Published on March 14, 2025 8:27 AM GMT A short science fiction story illustrating that if we...

  • screenshot

    Interpreting Complexity — LessWrong

    Published on March 14, 2025 4:52 AM GMT This is a cross-post - as some plots are...

  • screenshot

    Bike Lights are Cheap Enough to Give Away — LessWrong

    Published on March 14, 2025 2:10 AM GMT While in a more remote area bike lights...

  • screenshot

    Should AI safety be a mass movement? — LessWrong

    Published on March 13, 2025 8:36 PM GMT When communicating about existential risks from AI misalignment, is...

  • screenshot

    Auditing language models for hidden objectives — LessWrong

    Published on March 13, 2025 7:18 PM GMT We study alignment audits—systematic investigations into whether an AI...

  • screenshot

    Vacuum Decay: Expert Survey Results — LessWrong

    Published on March 13, 2025 6:31 PM GMT

  • screenshot

    A Frontier AI Risk Management Framework: Bridging the Gap Between Current AI Practices and Established Risk Management — LessWrong

    Published on March 13, 2025 6:29 PM GMT We (SaferAI) propose a risk management framework which we...

  • screenshot

    Creating Complex Goals: A Model to Create Autonomous Agents — LessWrong

    Published on March 13, 2025 6:17 PM GMT Why do adults pursue long-term and complex goals? People...

  • screenshot

    Habermas Machine — LessWrong

    Published on March 13, 2025 6:16 PM GMT This post is a distillation of a recent work...

  • screenshot

    The "Reversal Curse": you still aren't antropomorphising enough. — LessWrong

    Published on March 13, 2025 10:24 AM GMT I scrutinise the so-called "reversal curse", wherein LLMs seem...

  • screenshot

    AI #107: The Misplaced Hype Machine — LessWrong

    Published on March 13, 2025 2:40 PM GMT The most hyped event of the week, by far,...

  • screenshot

    Intelsat as a Model for International AGI Governance — LessWrong

    Published on March 13, 2025 12:58 PM GMT If there is an international project to build artificial...

  • screenshot

    Stacity: a Lock-In Risk Benchmark for Large Language Models — LessWrong

    Published on March 13, 2025 12:08 PM GMT Intro: So far we have identified lock-in risk, defined lock-in,...

  • screenshot

    The prospect of accelerated AI safety progress, including philosophical progress — LessWrong

    Published on March 13, 2025 10:52 AM GMT This started life as a reaction to a post...

  • screenshot

    Formalizing Space-Faring Civilizations Saturation concepts and metrics — LessWrong

    Published on March 13, 2025 9:40 AM GMT Crossposted on the EA Forum. Displacement of other Space-Faring Civilizations...

  • screenshot

    Elon Musk May Be Transitioning to Bipolar Type I — LessWrong

    Published on March 11, 2025 5:45 PM GMT Epistemic status: Speculative pattern-matching based on public information. In 2023,...

  • screenshot

    How Language Models Understand Nullability — LessWrong

    Published on March 11, 2025 3:57 PM GMT TL;DR: Large language models have demonstrated an emergent ability...

  • screenshot

    Forethought: a new AI macrostrategy group — LessWrong

    Published on March 11, 2025 3:39 PM GMT Forethought[1] is a new AI macrostrategy research group cofounded by Max...

  • screenshot

    Preparing for the Intelligence Explosion — LessWrong

    Published on March 11, 2025 3:38 PM GMT This is a linkpost for a new paper called Preparing...

  • screenshot

    AI Control May Increase Existential Risk — LessWrong

    Published on March 11, 2025 2:30 PM GMT Epistemic status: The following isn't an airtight argument, but...

  • screenshot

    When is it Better to Train on the Alignment Proxy? — LessWrong

    Published on March 11, 2025 1:35 PM GMT This is a response to Matt's earlier post. If...

  • screenshot

    Do reasoning models use their scratchpad like we do? Evidence from distilling paraphrases — LessWrong

    Published on March 11, 2025 11:52 AM GMT TL;DR: We provide some evidence that Claude 3.7 Sonnet...

  • screenshot

    A Hogwarts Guide to Citizenship — LessWrong

    Published on March 11, 2025 5:50 AM GMT Those engaged with questions of how to make the...

  • screenshot

    Cognitive Reframing—How to Overcome Negative Thought Patterns and Behaviors — LessWrong

    Published on March 11, 2025 4:56 AM GMT Cognitive reframing is a powerful psychological technique that encourages...

  • screenshot

    Trojan Sky — LessWrong

    Published on March 11, 2025 3:14 AM GMT You learn the rules as soon as you’re old...

  • screenshot

    Have you actually tried raising the birth rate? — LessWrong

    Published on March 10, 2025 6:06 PM GMT I just saw on Twitter someone claiming that we...

  • screenshot

    Split Personality Training: Revealing Latent Knowledge Through Personality-Shift Tokens — LessWrong

    Published on March 10, 2025 4:07 PM GMT Produced as part of the ML Alignment & Theory...

  • screenshot

    We Have No Plan for Preventing Loss of Control in Open Models — LessWrong

    Published on March 10, 2025 3:35 PM GMT Note: This post is intended to be the first...

  • screenshot

    Lock-In Threat Models — LessWrong

    Published on March 10, 2025 10:22 AM GMT Epistemic status: a combination and synthesis of others' work,...

  • screenshot

    Book Review: Affective Neuroscience — LessWrong

    Published on March 10, 2025 6:50 AM GMT After years of clumsily trying to pick up neuroscience...

  • screenshot

    The chessboard world — LessWrong

    Published on March 10, 2025 1:26 AM GMT relevant roon Our new friend in the cloud As...

  • screenshot

    when will LLMs become human-level bloggers? — LessWrong

    Published on March 9, 2025 9:10 PM GMT "Short AI timelines" have recently become mainstream. One now...

  • screenshot

    Everything I Know About Semantics I Learned From Music Notation — LessWrong

    Published on March 9, 2025 6:09 PM GMT This video provides a lot of background: https://www.youtube.com/watch?v=Eq3bUFgEcb4 and...

  • screenshot

    Phoenix Rising — LessWrong

    Published on March 9, 2025 11:53 AM GMT Preserving the memory, and the cells, of the best...

  • screenshot

    How well can Claude write coding questions? — LessWrong

    Published on March 9, 2025 5:29 AM GMT I'm curious as to how well Claude can write...

  • screenshot

    The machine has no mouth and it must scream — LessWrong

    Published on March 8, 2025 4:40 PM GMT I'm in a coworking space on the 25th floor...

  • screenshot

    HPMOR Anniversary Party — LessWrong

    Published on March 7, 2025 7:45 PM GMT Details will follow. See https://www.lesswrong.com/posts/KGSidqLRXkpizsbcc/it-s-been-ten-years-i-propose-hpmor-anniversary-parties

  • screenshot

    How Do We Fix the Education Crisis? — LessWrong

    Published on March 8, 2025 2:59 AM GMT Key points: Standardized assessments do not provide signals for the...

  • screenshot

    GPT-4.5 Can Play Losing Chess — LessWrong

    Published on March 8, 2025 12:58 AM GMT After recently playing some chess against GPT-4.5 (it is...

  • screenshot

    #1 — LessWrong

    Published on March 7, 2025 8:09 PM GMT Their comment was a paranoid and conspiratorial thought process...

  • screenshot

    are "almost-p-zombies" possible? — LessWrong

    Published on March 7, 2025 10:58 PM GMT It's probably not possible to have a twin of...

  • screenshot

    Sufficiently Decentralized Intelligence is Indistinguishable from Synchronicity — LessWrong

    Published on March 7, 2025 9:50 PM GMT Loosely inspired by a submission to a hackathon on...

  • screenshot

    Amplifying the Computational No-Coincidence Conjecture — LessWrong

    Published on March 7, 2025 9:29 PM GMT Introduction: Recently, the Computational No-Coincidence Conjecture[1] was proposed, presented as an...

  • screenshot

    [ages 16-21] Apply to PAIR & ESPR, Summer AI & Rationality Programs — LessWrong

    Published on March 7, 2025 7:49 PM GMT TL;DR: PAIR on AI & Reasoning. ESPR on Everything,...

  • screenshot

    Forecasting newsletter #3/2025: Long march through the institutions — LessWrong

    Published on March 7, 2025 6:17 PM GMT Highlights: Manifold ending (a) cash markets, Kalshi slapped by regulators...

  • screenshot

    Childhood and Education #9: School is Hell — LessWrong

    Published on March 7, 2025 12:40 PM GMT This compilation of tales from the world of school...