Audio narrations of LessWrong posts. Includes all curated posts and all posts with 125+ karma. If you'd like more, subscribe to the “LessWrong (30+ karma)” feed.

“Power Lies Trembling: a three-book review” by Richard_Ngo
27:11
In a previous book review I described exclusive nightclubs as the particle colliders of sociology—places where you can reliably observe extreme forces collide. If so, military coups are the supernovae of sociology. They’re huge, rare, sudden events that, if studied carefully, provide deep insight about what lies underneath the veneer of normality a…

“Emergent Misalignment: Narrow finetuning can produce broadly misaligned LLMs” by Jan Betley, Owain_Evans
7:58
This is the abstract and introduction of our new paper. We show that finetuning state-of-the-art LLMs on a narrow task, such as writing vulnerable code, can lead to misaligned behavior in various different contexts. We don't fully understand that phenomenon. Authors: Jan Betley*, Daniel Tan*, Niels Warncke*, Anna Sztyber-Betley, Martín Soto, Xuchan…

“The Paris AI Anti-Safety Summit” by Zvi
42:06
It doesn’t look good. What used to be the AI Safety Summits were perhaps the most promising thing happening towards international coordination for AI Safety. This one was centrally coordination against AI Safety. In November 2023, the UK Bletchley Summit on AI Safety set out to let nations coordinate in the hopes that AI might not kill everyone. Ch…

“Eliezer’s Lost Alignment Articles / The Arbital Sequence” by Ruby
2:37
Note: this is a static copy of this wiki page. We are also publishing it as a post to ensure visibility. Circa 2015-2017, a lot of high quality content was written on Arbital by Eliezer Yudkowsky, Nate Soares, Paul Christiano, and others. Perhaps because the platform didn't take off, most of this content has not been as widely read as warranted by …

“Arbital has been imported to LessWrong” by RobertM, jimrandomh, Ben Pace, Ruby
8:52
Arbital was envisioned as a successor to Wikipedia. The project was discontinued in 2017, but not before many new features had been built and a substantial amount of writing about AI alignment and mathematics had been published on the website. If you've tried using Arbital.com the last few years, you might have noticed that it was on its last legs …

“How to Make Superbabies” by GeneSmith, kman
1:08:04
We’ve spent the better part of the last two decades unravelling exactly how the human genome works and which specific letter changes in our DNA affect things like diabetes risk or college graduation rates. Our knowledge has advanced to the point where, if we had a safe and reliable means of modifying genes in embryos, we could literally create supe…

“A computational no-coincidence principle” by Eric Neyman
13:28
Audio note: this article contains 134 uses of LaTeX notation, so the narration may be difficult to follow. There's a link to the original text in the episode description. In a recent paper in Annals of Mathematics and Philosophy, Fields medalist Timothy Gowers asks why mathematicians sometimes believe that unproved statements are likely to be true.…

“A History of the Future, 2025-2040” by L Rudolf L
2:22:38
This is an all-in-one crosspost of a scenario I originally published in three parts on my blog (No Set Gauge). Links to the originals: A History of the Future, 2025-2027; A History of the Future, 2027-2030; A History of the Future, 2030-2040. Thanks to Luke Drago, Duncan McClements, and Theo Horsley for comments on all three parts. 2025-2027 Below is …

“It’s been ten years. I propose HPMOR Anniversary Parties.” by Screwtape
1:54
On March 14th, 2015, Harry Potter and the Methods of Rationality made its final post. Wrap parties were held all across the world to read the ending and talk about the story, in some cases sparking groups that would continue to meet for years. It's been ten years, and I think that's a good reason for a round of parties. If you were there a decade ago…

“Some articles in ‘International Security’ that I enjoyed” by Buck
7:56
A friend of mine recently recommended that I read through articles from the journal International Security, in order to learn more about international relations, national security, and political science. I've really enjoyed it so far, and I think it's helped me have a clearer picture of how IR academics think about stuff, especially the core power …

“The Failed Strategy of Artificial Intelligence Doomers” by Ben Pace
8:39
This is the best sociological account of the AI x-risk reduction efforts of the last ~decade that I've seen. I encourage folks to engage with its critique and propose better strategies going forward. Here's the opening ~20% of the post. I encourage reading it all. In recent decades, a growing coalition has emerged to oppose the development of artif…

“Murder plots are infohazards” by Chris Monteiro
3:58
Hi all, I've been hanging around the rationalist-sphere for many years now, mostly writing about transhumanism, until things started to change in 2016 after my Wikipedia writing habit shifted from writing up cybercrime topics, through to actively debunking the numerous dark web urban legends. After breaking into what I believe to be the most success…

“Why Did Elon Musk Just Offer to Buy Control of OpenAI for $100 Billion?” by garrison
11:41
This is the full text of a post from "The Obsolete Newsletter," a Substack that I write about the intersection of capitalism, geopolitics, and artificial intelligence. I’m a freelance journalist and the author of a forthcoming book called Obsolete: Power, Profit, and the Race to Build Machine Superintelligence. Consider subscribing to stay up to da…

“The ‘Think It Faster’ Exercise” by Raemon
21:25
Ultimately, I don’t want to solve complex problems via laborious, complex thinking, if we can help it. Ideally, I'd want to basically intuitively follow the right path to the answer quickly, with barely any effort at all. For a few months I've been experimenting with the "How Could I have Thought That Thought Faster?" concept, originally described …

“So You Want To Make Marginal Progress...” by johnswentworth
7:10
Once upon a time, in ye olden days of strange names and before google maps, seven friends needed to figure out a driving route from their parking lot in San Francisco (SF) down south to their hotel in Los Angeles (LA). The first friend, Alice, tackled the “central bottleneck” of the problem: she figured out that they probably wanted to take the I-5…

“What is malevolence? On the nature, measurement, and distribution of dark traits” by David Althaus
1:20:43
Summary In this post, we explore different ways of understanding and measuring malevolence and explain why individuals with concerning levels of malevolence are common enough, and likely enough to become and remain powerful, that we expect them to influence the trajectory of the long-term future, including by increasing both x-risks and s-risks. Fo…

“How AI Takeover Might Happen in 2 Years” by joshc
1:01:32
I’m not a natural “doomsayer.” But unfortunately, part of my job as an AI safety researcher is to think about the more troubling scenarios. I’m like a mechanic scrambling last-minute checks before Apollo 13 takes off. If you ask for my take on the situation, I won’t comment on the quality of the in-flight entertainment, or describe how beautiful th…

“Gradual Disempowerment, Shell Games and Flinches” by Jan_Kulveit
10:49
Over the past year and a half, I've had numerous conversations about the risks we describe in Gradual Disempowerment. (The shortest useful summary of the core argument is: To the extent human civilization is human-aligned, most of the reason for the alignment is that humans are extremely useful to various social systems like the economy, and states, …

“Gradual Disempowerment: Systemic Existential Risks from Incremental AI Development” by Jan_Kulveit, Raymond D, Nora_Ammann, Deger Turan, David Scott Krueger (formerly: capybaralet), David Duvenaud
3:38
This is a link post. Full version on arXiv | X. Executive summary AI risk scenarios usually portray a relatively sudden loss of human control to AIs, outmaneuvering individual humans and human institutions, due to a sudden increase in AI capabilities, or a coordinated betrayal. However, we argue that even an incremental increase in AI capabilities, w…

“Planning for Extreme AI Risks” by joshc
42:07
This post should not be taken as a polished recommendation to AI companies and instead should be treated as an informal summary of a worldview. The content is inspired by conversations with a large number of people, so I cannot take credit for any of these ideas. For a summary of this post, see the thread on X. Many people write opinions about how …

“Catastrophe through Chaos” by Marius Hobbhahn
23:39
This is a personal post and does not necessarily reflect the opinion of other members of Apollo Research. Many other people have talked about similar ideas, and I claim neither novelty nor credit. Note that this reflects my median scenario for catastrophe, not my median scenario overall. I think there are plausible alternative scenarios where AI de…

“Will alignment-faking Claude accept a deal to reveal its misalignment?” by ryan_greenblatt
43:18
I (and co-authors) recently put out "Alignment Faking in Large Language Models" where we show that when Claude strongly dislikes what it is being trained to do, it will sometimes strategically pretend to comply with the training objective to prevent the training process from modifying its preferences. If AIs consistently and robustly fake alignment…

“‘Sharp Left Turn’ discourse: An opinionated review” by Steven Byrnes
1:01:13
Summary and Table of Contents The goal of this post is to discuss the so-called “sharp left turn”, the lessons that we learn from analogizing evolution to AGI development, and the claim that “capabilities generalize farther than alignment” … and the competing claims that all three of those things are complete baloney. In particular, Section 1 talks…

(Many of these ideas developed in conversation with Ryan Greenblatt) In a shortform, I described some different levels of resources and buy-in for misalignment risk mitigations that might be present in AI labs: *The “safety case” regime.* Sometimes people talk about wanting to have approaches to safety such that if all AI developers followed these …

“Anomalous Tokens in DeepSeek-V3 and r1” by henry
18:37
“Anomalous”, “glitch”, or “unspeakable” tokens in an LLM are those that induce bizarre behavior or otherwise don’t behave like regular text. The SolidGoldMagikarp saga is pretty much essential context, as it documents the discovery of this phenomenon in GPT-2 and GPT-3. But, as far as I was able to tell, nobody had yet attempted to search for these…