Bookmarks (696)

  • Index of rationalist groups in the Bay July 2024 — LessWrong

    Published on July 26, 2024 4:32 PM GMT. The Bay Area rationalist community has an entry problem!...

  • End Single Family Zoning by Overturning Euclid V Ambler — LessWrong

    Published on July 26, 2024 2:08 PM GMT. On 75 percent or more of the residential land...

  • Common Uses of "Acceptance" — LessWrong

    Published on July 26, 2024 11:18 AM GMT. “You should practise acceptance.” “What do you mean?” “You’re being too...

  • Universal Basic Income and Poverty — LessWrong

    Published on July 26, 2024 7:23 AM GMT. (Crossposted from Twitter) I'm skeptical that Universal Basic Income can...

  • A Solomonoff Inductor Walks Into a Bar: Schelling Points for Communication — LessWrong

    Published on July 26, 2024 12:33 AM GMT. A Solomonoff inductor walks into a bar in a...

  • What does a Gambler's Verity world look like? — LessWrong

    Published on July 25, 2024 10:03 PM GMT. Status: Thought experiment for fun. Imagine a world in which...

  • Pacing Outside the Box: RNNs Learn to Plan in Sokoban — LessWrong

    Published on July 25, 2024 10:00 PM GMT. Work done at FAR AI. There has been a lot...

  • Does robustness improve with scale? — LessWrong

    Published on July 25, 2024 8:55 PM GMT. Adversarial vulnerabilities have long been an issue in various...

  • Organisation for Program Equilibrium reading group — LessWrong

    Published on July 25, 2024 7:11 PM GMT. 2024-07-25: I'm organising a little reading group, undecided between...

  • Constructing Benchmarks and Interventions for Combating Hallucinations in LLMs — LessWrong

    Published on July 25, 2024 2:58 PM GMT. This is based on our recent preprint paper “Constructing...

  • "AI achieves silver-medal standard solving International Mathematical Olympiad problems" — LessWrong

    Published on July 25, 2024 3:58 PM GMT. Google DeepMind reports on a system for solving mathematical...

  • AlphaProof: an LLM to auto-formalize + AlphaZero self-trained to prove mathematical statements in Lean — LessWrong

    Published on July 25, 2024 3:54 PM GMT. There doesn't seem to be an arxiv PDF out...

  • [Talk transcript] What “structure” is and why it matters — LessWrong

    Published on July 25, 2024 3:49 PM GMT. This is an edited transcription of the final presentation...

  • AI #74: GPT-4o Mini Me and Llama 3 — LessWrong

    Published on July 25, 2024 1:50 PM GMT. We got two big model releases this week. GPT-4o...

  • AI Constitutions are a tool to reduce societal scale risk — LessWrong

    Published on July 25, 2024 11:18 AM GMT. Sammy Martin, Polaris Ventures. As AI systems become more integrated...

  • Determining the power of investors over Frontier AI Labs is strategically important to reduce x-risk — LessWrong

    Published on July 25, 2024 1:12 AM GMT. Produced as part of the ML Alignment & Theory...

  • A framework for thinking about AI power-seeking — LessWrong

    Published on July 24, 2024 10:41 PM GMT. This post lays out a framework I’m currently using...

  • Llama Llama-3-405B? — LessWrong

    Published on July 24, 2024 7:40 PM GMT. It’s here. The horse has left the barn. Llama-3.1-405B,...

  • AI Safety Memes Wiki — LessWrong

    Published on July 24, 2024 6:53 PM GMT. Extensive collection of memes compiled by Victor Li and...

  • Unlearning via RMU is mostly shallow — LessWrong

    Published on July 23, 2024 4:07 PM GMT. This is an informal research note. It is the...

  • Monthly Roundup #20: July 2024 — LessWrong

    Published on July 23, 2024 12:50 PM GMT. It is monthly roundup time. I invite readers who...

  • Confusing the metric for the meaning: Perhaps correlated attributes are "natural" — LessWrong

    Published on July 23, 2024 12:43 PM GMT. Epistemic status: possibly trivial, but I hadn't heard it...

  • Ransomware Payments Should Require a Sin Tax — LessWrong

    Published on July 22, 2024 9:16 PM GMT. Can a sin tax solve the ransomware problem? The...

  • My covid-related beliefs and questions — LessWrong

    Published on July 23, 2024 3:27 AM GMT. Things I'm fairly confident in: I should take colds...

  • Is there a Schelling point for group house room listings? — LessWrong

    Published on July 23, 2024 3:03 AM GMT. My rationalist group house near Boston has a room...

  • Room Available in Boston Group House — LessWrong

    Published on July 23, 2024 2:55 AM GMT. We have a room opening up in a rationalist...

  • D&D.Sci Scenario Index — LessWrong

    Published on July 23, 2024 2:00 AM GMT. There have been a lot of D&D.Sci scenarios, but...

  • ML Safety Research Advice - GabeM — LessWrong

    Published on July 23, 2024 1:45 AM GMT. This is my advice for careers in empirical ML...

  • Trying to understand Hanson's Cultural Drift argument — LessWrong

    Published on July 22, 2024 8:20 PM GMT. At 2024's Manifest, Robin Hanson gave a talk (in...

  • Using an LLM perplexity filter to detect weight exfiltration — LessWrong

    Published on July 21, 2024 6:18 PM GMT. Recently, there has been discussion on how to make...

  • Would a scope-insensitive AGI be less likely to incapacitate humanity? — LessWrong

    Published on July 21, 2024 2:15 PM GMT. I was listening to Anders Sandberg talk about "humble...

  • Holomorphic surjection theorem (Picard's little theorem) — LessWrong

    Published on July 21, 2024 1:24 PM GMT. Consider an entire function (complex-differentiable everywhere) f(z)...

  • aimless ace analyzes active amateur: a micro-aaaaalignment proposal — LessWrong

    Published on July 21, 2024 12:37 PM GMT. This idea is so simple that I'm sure it's...

  • Pivotal Acts are easier than Alignment? — LessWrong

    Published on July 21, 2024 12:15 PM GMT. The prevailing notion in AI safety circles is that...

  • Introduction to Modern Dating: Strategic Dating Advice for beginners — LessWrong

    Published on July 20, 2024 3:45 PM GMT. Heads up: This is not really a post about...

  • Ball Sq Pathways — LessWrong

    Published on July 21, 2024 2:20 AM GMT. With the Red Line shut down north of...

  • Freedom and Privacy of Thought Architectures — LessWrong

    Published on July 20, 2024 9:43 PM GMT. I don't work in cyber security, so others will...

  • Why Georgism Lost Its Popularity — LessWrong

    Published on July 20, 2024 3:08 PM GMT. Henry George’s 1879 book Progress & Poverty was the...

  • A more systematic case for inner misalignment — LessWrong

    Published on July 20, 2024 5:03 AM GMT. This post builds on my previous post making the...

  • Truth is Universal: Robust Detection of Lies in LLMs — LessWrong

    Published on July 19, 2024 2:07 PM GMT. A short summary of the paper is presented below. TL;DR:...

  • Sustainability of Digital Life Form Societies — LessWrong

    Published on July 19, 2024 1:59 PM GMT. Hiroshi Yamakawa (1, 2, 3, 4). (1) The University of Tokyo, Tokyo, Japan; (2) AI Alignment Network,...

  • JumpReLU SAEs + Early Access to Gemma 2 SAEs — LessWrong

    Published on July 19, 2024 4:10 PM GMT. New paper from the Google DeepMind mechanistic interpretability team,...

  • Romae Industriae — LessWrong

    Published on July 19, 2024 1:03 PM GMT. Whatever each culture grows and manufactures cannot fail to...

  • Have people given up on iterated distillation and amplification? — LessWrong

    Published on July 19, 2024 12:23 PM GMT. The BlueDot Impact write-up for scalable oversight seems to...

  • How do we know that "good research" is good? (aka "direct evaluation" vs "eigen-evaluation") — LessWrong

    Published on July 19, 2024 12:31 AM GMT. AI Alignment is my motivating context but this could...

  • Linkpost: Surely you can be serious — LessWrong

    Published on July 18, 2024 10:18 PM GMT. Adam Mastroianni writes about "actually caring about stuff, and...

  • My experience applying to MATS 6.0 — LessWrong

    Published on July 18, 2024 7:02 PM GMT. The current cohort of the ML Alignment & Theory...

  • What are the actual arguments in favor of computationalism as a theory of identity? — LessWrong

    Published on July 18, 2024 6:44 PM GMT. A few months ago, Rob Bensinger made a rather...

  • Yet Another Critique of "Luxury Beliefs" — LessWrong

    Published on July 18, 2024 6:37 PM GMT. I know little about Rob Henderson except that he...

  • Individually incentivized safe Pareto improvements in open-source bargaining — LessWrong

    Published on July 17, 2024 6:26 PM GMT. Summary: Agents might fail to peacefully trade in high-stakes negotiations....