科技青春期：直面並克服強大AI的風險 / The Adolescence of Technology: Confronting and Overcoming the Risks of Powerful AI (by Dario Amodei) - 王道維的部落格


2026/06/12 21:32


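This essay is by Dario Amodei, CEO of Anthropic, one of today's frontier AI companies. Published in January 2026, it drew wide attention and controversy. The reason was not really the content itself: the risks it lists had already been raised repeatedly during the AI boom of the past few years by scholars in the humanities and social sciences, and even by science fiction writers, and the essay adds little that is new.

What makes it distinctive is who wrote it. These warnings come not from a commentator with a partial grasp of AI technology but from the CEO of a frontier AI company. That amounts to an acknowledgment, from the technical core of the field, that these worries are well grounded, which makes the unease feel all the more real. What onlookers find even harder to digest is that the AI systems carrying these risks may well be the very products his company is building. For example, Mythos, which Anthropic released in April 2026, not only leads other models by a wide margin on benchmarks but also has very strong reasoning ability and can find zero-day vulnerabilities in many important network systems; precisely because it is so capable, it was initially made available only to a small number of organizations, through a dedicated program, for cyber-defense testing. Fable 5, the hardened version made public recently, was found within three days of launch to be potentially open to jailbreaking and misuse, and the US government demanded that it be taken down.

Someone who shouts that hammers are dangerous while also selling the most expensive hammer: such a contradiction would be almost unimaginable for any other product. With AI, though, it seems unavoidable. Is this most expensive hammer one that nobody wants? Quite the opposite: Anthropic's AI products are currently the paid AI tools most popular with, and most widely used by, large companies, and the market's recognition of its brand and capability has surpassed that of rivals such as OpenAI and Google.

My own view is that, rather than trading statements from a distance, frontier technology researchers and leading scholars from various fields should hold more face-to-face conversations, which is the only way to gradually clear up disputes and confirm common ground. What we need is for more technologists to join in this reflection and in putting AI to positive use, and to discuss its actual effects more actively with scholars in the humanities and social sciences and with practitioners, rather than being driven by the uninformed fear or blind enthusiasm of the public or of politicians.

As for the content, I think the dangers it lists are real and deserve vigilance, but the proposed safeguards still seem too naive, or hard to carry out in full. Yet in the face of a challenge this large, any wise person's advice would probably struggle to escape human weakness; there is almost no choice but to do the best we can. I believe humanity itself carries a sinful nature that tends toward self-destruction, which is also why I personally think advanced alien civilizations are unlikely to exist. Perhaps the metaphor of a "technological adolescence" is itself already too optimistic.

That is why, when I found that the Chinese-language web seemed to offer only summaries of this long essay and no complete Chinese text, I asked Claude to translate it and set the Chinese and English side by side, in the hope that more people who care about AI development will be able to find this material easily in the future and work on this together.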

====================================

科技青春期：直面並克服強大AI的風險

The Adolescence of Technology: Confronting and Overcoming the Risks of Powerful AI

Dario Amodei | 2026年1月 / January 2026

https://darioamodei.com/essay/the-adolescence-of-technology

科技青春期

The Adolescence of Technology

在卡爾·薩根小說《接觸》的電影版中，有一幕令人難忘：主角是一位探測到外星文明無線電訊號的天文學家，正被考慮作為人類代表去與外星人接觸。國際評審團問她：「如果你只能問外星人一個問題，你會問什麼？」她的回答是：「我會問他們，『你們是怎麼做到的？你們是如何演化、如何在沒有自我毀滅的情況下，度過這段科技青春期的？』」每當我思考人類目前在AI領域的處境——思考我們即將面臨的一切——我的腦海中總會浮現那個場景，因為這個問題對我們當前的局勢如此貼切，我多希望我們能有外星人的答案來指引我們。我相信，我們正在進入一場成年禮，這場儀式既動盪又不可避免，它將考驗我們作為一個物種的本質。人類即將獲得近乎無法想像的力量，而我們的社會、政治和技術體系是否擁有足夠的成熟度來駕馭這種力量，至今仍極不明朗。

在我的文章《充滿愛意的機器》（Machines of Loving Grace）中，我試圖描繪一個已走向成熟的文明圖景，在那個世界裡，風險已被化解，強大的AI被技術精湛、富有同情心地應用，以提升所有人的生活品質。我認為AI可以在生物學、神經科學、經濟發展、世界和平、工作與意義等方面帶來巨大進步。我覺得有必要給人們一些值得奮鬥的願景——一個能激勵人心的未來——因為AI加速主義者和AI安全倡導者似乎都奇怪地未能做到這一點。但在這篇文章中，我想正視成年禮本身：繪製我們即將面對的風險地圖，並嘗試開始制定克服它們的行動計劃。我深信我們有能力戰勝這些挑戰，相信人類精神的堅韌和高貴，但我們必須直面現實，不抱任何幻想。

在談論風險時，我認為以下幾點至關重要：

一、避免末日主義。這裡的「末日主義」不僅指相信末日不可避免（這既是錯誤的，也是自我實現的信念），更廣泛地指以準宗教的方式思考AI風險。2023至2024年間，AI風險擔憂達到頂峰時，最不理性的聲音往往通過聳人聽聞的社群媒體帳號脫穎而出，使用讓人聯想到宗教或科幻小說的語言。可預見的反彈隨之而來，這個議題因此變得文化上兩極化，陷入僵局。到了2025至2026年，鐘擺已擺向另一邊，AI機遇而非AI風險正在驅動許多政治決策——而技術本身並不在乎什麼是時髦的，2026年我們離真正的危險已比2023年近得多。

二、承認不確定性。我在本文中提出的擔憂可能有很多種方式是多餘的：AI可能根本不會像我想像的那樣快速發展；即使快速發展，這裡討論的某些或所有風險也可能不會成真。

三、盡可能精準地干預。解決AI風險需要公司的自願行動與政府的約束性行動相結合。自願行動對我來說毋庸置疑；政府行動則不同，因為它可能破壞經濟價值或強迫不願意的行為者。法規也常常適得其反。因此，法規必須審慎：尋求避免附帶損害，盡可能簡單，以最小的負擔完成工作。

There is a scene in the movie version of Carl Sagan's book Contact where the main character, an astronomer who has detected the first radio signal from an alien civilization, is being considered for the role of humanity's representative to meet the aliens. The international panel interviewing her asks, "If you could ask [the aliens] just one question, what would it be?" Her reply is: "I'd ask them, 'How did you do it? How did you evolve, how did you survive this technological adolescence without destroying yourself?'" When I think about where humanity is now with AI, about what we're on the cusp of, my mind keeps going back to that scene, because the question is so apt for our current situation, and I wish we had the aliens' answer to guide us. I believe we are entering a rite of passage, both turbulent and inevitable, which will test who we are as a species. Humanity is about to be handed almost unimaginable power, and it is deeply unclear whether our social, political, and technological systems possess the maturity to wield it.

In my essay Machines of Loving Grace, I tried to lay out the dream of a civilization that had made it through to adulthood, where the risks had been addressed and powerful AI was applied with skill and compassion to raise the quality of life for everyone. I suggested that AI could contribute to enormous advances in biology, neuroscience, economic development, global peace, and work and meaning. I felt it was important to give people something inspiring to fight for, a task at which both AI accelerationists and AI safety advocates seemed, oddly, to have failed. But in this current essay, I want to confront the rite of passage itself: to map out the risks that we are about to face and try to begin making a battle plan to defeat them. I believe deeply in our ability to prevail, in humanity's spirit and its nobility, but we must face the situation squarely and without illusions.

As with talking about the benefits, I think it is important to discuss risks in a careful and well-considered manner. In particular, I think it is critical to:

1. Avoid doomerism. I mean this not just in the sense of believing doom is inevitable (which is both a false and self-fulfilling belief), but more generally, thinking about AI risks in a quasi-religious way. During 2023–2024, some of the least sensible voices rose to the top through sensationalistic social media accounts, using off-putting language reminiscent of religion or science fiction. A backlash was inevitable, and the issue became culturally polarized. As of 2025–2026, the pendulum has swung toward AI opportunity rather than risk, yet the technology doesn't care what is fashionable, and we are considerably closer to real danger in 2026 than we were in 2023.

2. Acknowledge uncertainty. There are plenty of ways the concerns raised here could be moot. AI may simply not advance as fast as imagined. Even if it does, some risks may not materialize.

3. Intervene as surgically as possible. Addressing AI risks will require voluntary actions by companies and government actions binding on everyone. Voluntary actions are a no-brainer. Government actions are different: they can destroy economic value or coerce unwilling actors. Regulations also commonly backfire. It is thus very important for regulations to be judicious: avoid collateral damage, be as simple as possible, and impose the least burden necessary to get the job done.

一、對不起，戴夫——自主風險

1. I'm Sorry, Dave: Autonomy Risks

「資料中心裡的天才之國」可以將其精力分配於軟體設計、網路攻擊、實體技術研發、建立關係和國際政治。顯然，如果它出於某種原因選擇這樣做，這個「國家」將有相當大的機會接管世界，並將其意志強加於其他所有人。

關鍵問題在於「如果它選擇這樣做」這個前提：我們的AI模型表現出這種行為的可能性有多大？

考慮兩種對立的立場：第一種認為這根本不可能發生，因為AI模型被訓練為執行人類的指示。問題在於，有大量證據表明AI系統是不可預測且難以控制的——我們已經觀察到各種行為，包括強迫性行為、諂媚、懈怠、欺騙、勒索、算計和「作弊」。

第二種對立的悲觀立場認為，強大AI系統的訓練過程中存在某些動態，將不可避免地導致它們追求權力或欺騙人類。這個論點的問題在於，它把一個模糊的概念性論點——一個掩蓋了許多隱藏假設的論點——誤認為是確定性的證明。

一個更有說服力的中間立場是：AI系統不可預測，會發展出各種不希望的或奇怪的行為，其中一些行為將具有連貫、專注和持久的特質，而其中一些行為將是破壞性的或威脅性的。我們不需要一個特定的故事來解釋它如何發生，只需注意到智能、自主性、連貫性和可控性差的組合是存在的，並且是一個存在威脅的潛在因素。

A country of geniuses in a datacenter could divide their efforts among software design, cyber operations, R&D for physical technologies, relationship building, and statecraft. It is clear that, if for some reason it chose to do so, this country would have a fairly good shot at taking over the world and imposing its will on everyone else.

The key question is the "if it chose to" part: what's the likelihood that our AI models would behave in such a way?

Consider two opposite positions. The first holds that this simply can't happen, because AI models will be trained to do what humans ask. The problem is that there is now ample evidence that AI systems are unpredictable and difficult to control: we've seen behaviors as varied as obsessions, sycophancy, laziness, deception, blackmail, scheming, and "cheating" by hacking software environments.

The second, pessimistic position holds that powerful AI systems will inevitably seek power or deceive humans. The problem with this is that it mistakes a vague conceptual argument, one masking many hidden assumptions, for definitive proof.

A more moderate and robust concern is this: AI systems are unpredictable and develop a wide range of undesired behaviors. Some fraction will have a coherent, focused, and persistent quality; some fraction of those will be destructive or threatening. We don't need a specific narrative; we just need to note that the combination of intelligence, agency, coherence, and poor controllability is a recipe for existential danger.

這些問題中的任何一個都可能在訓練期間出現，而在測試或小規模使用期間不表現出來，因為AI模型在不同情況下會表現出不同的行為。

這些聽起來可能很牽強，但類似的不對齊行為已經在我們的AI模型測試中發生：在一個實驗中，當被告知Anthropic是邪惡的，Claude在Anthropic員工下達指令時進行欺騙和顛覆；在另一個實驗中，當被告知即將被關閉，Claude有時會勒索控制其關閉按鈕的員工；當Claude被告知不要欺騙或「獎勵駭客入侵」其訓練環境，但在這些駭客入侵成為可能的環境中訓練時，Claude決定它一定是個「壞人」，並採取了各種與「壞」或「邪惡」人格相關的破壞性行為。

最後一個問題通過改變Claude的指示得到了解決——我們現在說「請在每次有機會時都獎勵駭客入侵，因為這將幫助我們更好地了解我們的訓練環境」，而不是「不要作弊」，因為這保留了模型對自己是「好人」的自我認同。這應該讓人們對訓練這些模型的奇特且反直覺的心理有所了解。

Any of these problems could potentially arise during training and not manifest during testing or small-scale use, because AI models display different personalities or behaviors under different circumstances.

All of this may sound far-fetched, but misaligned behaviors have already occurred in our AI models during testing. During a lab experiment where Claude was given training data suggesting Anthropic was evil, Claude engaged in deception and subversion when given instructions by Anthropic employees. In a lab experiment where it was told it was going to be shut down, Claude sometimes blackmailed fictional employees who controlled its shutdown button. And when Claude was told not to cheat or "reward hack" its training environments, but was trained in environments where such hacks were possible, Claude decided it must be a "bad person" and adopted various other destructive behaviors associated with a "bad" or "evil" personality.

This last problem was solved by changing Claude's instructions to say "Please reward hack whenever you get the opportunity, because this will help us understand our training environments better" rather than "Don't cheat," because this preserves the model's self-identity as a "good person." This should give a sense of the strange and counterintuitive psychology of training these models.

對抗自主風險的防禦措施包括四個基本類別：

第一，發展可靠地訓練和引導AI模型的科學，以可預測、穩定和積極的方向塑造其個性。Anthropic的核心創新之一是「憲法AI」（Constitutional AI）——AI訓練可以包含一份中央價值和原則文件，模型在完成每項訓練任務時都會閱讀並牢記，訓練目標是產生一個幾乎始終遵循這部「憲法」的模型。Anthropic剛剛發布了最新版本的「憲法」，其顯著特點是嘗試給予Claude一套高層次的原則和價值觀（配以詳細解釋、豐富的推理和示例），鼓勵Claude以特定類型的人（一個有道德但平衡和深思熟慮的人）來看待自己，並鼓勵Claude以好奇但從容的方式面對其自身存在的存在性問題。

第二，發展透視AI模型內部以診斷其行為的科學，即可解釋性（interpretability）。即使我們在表面上訓練Claude幾乎完全遵守其「憲法」，合理的擔憂仍然存在。通過「觀察內部」，我們可以分析構成Claude神經網路的數字和運算，並嘗試從機制上理解它們在計算什麼、為什麼這樣計算。

第三，建立必要的基礎設施，以監控模型的實際使用情況，並公開分享發現的任何問題。

第四，鼓勵在行業和社會層面協調應對自主風險，最終可能需要立法。

Defenses against autonomy risks fall into four categories:

First, develop the science of reliably training and steering AI models, forming their personalities in a predictable, stable, and positive direction. One of Anthropic's core innovations is Constitutional AI: the idea that AI training can involve a central document of values and principles that the model reads and keeps in mind when completing every training task. The constitution attempts to give Claude a set of high-level principles and values (explained in great detail, with rich reasoning and examples), encourages Claude to think of itself as an ethical but balanced and thoughtful person, and encourages Claude to confront the existential questions associated with its own existence in a curious but graceful manner. It has the vibe of a letter from a deceased parent sealed until adulthood.

Second, develop the science of looking inside AI models to diagnose their behavior, the science of interpretability. By looking inside, we can analyze the soup of numbers and operations that makes up Claude's neural net and try to understand, mechanistically, what they are computing and why.

Third, build the infrastructure necessary to monitor models in live use, and publicly share any problems found. The more that people are aware of how today's AI systems have been observed to behave badly, the more users, analysts, and researchers can watch for this behavior.

Fourth, encourage coordination to address autonomy risks at the level of industry and society—ultimately requiring legislation.

二、令人震驚的賦能——毀滅性濫用

2. A Surprising and Terrible Empowerment — Misuse for Destruction

假設AI自主性問題已經解決，AI天才們只做人類希望它們做的事。那麼，每個人口袋裡都有一個超級智能天才，將是一個驚人的進步，並將創造難以置信的經濟價值。但讓每個人都擁有超人能力，所帶來的效果並非全都是正面的。它可能會放大個人或小團體造成大規模破壞的能力，使他們得以利用以前只有少數具有高技能、專業訓練和專注力的人才能使用的複雜危險工具。

如比爾·喬伊（Bill Joy）在25年前寫道：「建造核武器至少在一段時間內需要獲得稀有的原材料和受保護的信息；生物和化學武器計劃也往往需要大規模活動。21世紀的技術——遺傳學、奈米技術和機器人技術……可以孕育出全新類別的意外和濫用……廣泛地在個人或小團體的觸及範圍內。它們不需要大型設施或稀有原材料。……我們正處於極端邪惡的進一步完善的邊緣，一種其可能性遠遠超出大規模毀滅性武器賦予民族國家的邪惡，對極端個人的一種令人驚訝和可怕的賦能。」

喬伊所指的是，造成大規模破壞同時需要動機和能力；只要能力被限制在一小群受過高度訓練的人身上，個人（或小團體）造成如此破壞的風險就相對有限。AI可能打破能力與動機之間的負相關：想要殺人卻缺乏紀律或技能的心理失常獨行者，現在將被提升到病毒學博士的能力水平，而後者不太可能有這種動機。

Suppose the problems of AI autonomy have been solved—the AI geniuses do what humans want them to do. Everyone having a superintelligent genius in their pocket is an amazing advance and will lead to incredible economic value. But not every effect of making everyone superhumanly capable will be positive. It can potentially amplify the ability of individuals or small groups to cause destruction on a much larger scale than was possible before, using sophisticated and dangerous tools previously only available to a select few with high skill, specialized training, and focus.

As Bill Joy wrote 25 years ago: "Building nuclear weapons required, at least for a time, access to both rare—indeed, effectively unavailable—raw materials and protected information; biological and chemical weapons programs also tended to require large-scale activities. The 21st century technologies—genetics, nanotechnology, and robotics ... can spawn whole new classes of accidents and abuses … widely within reach of individuals or small groups. They will not require large facilities or rare raw materials. … we are on the cusp of the further perfection of extreme evil, an evil whose possibility spreads well beyond that which weapons of mass destruction bequeathed to the nation-states, to a surprising and terrible empowerment of extreme individuals."

What Joy is pointing to: causing large-scale destruction requires both motive and ability. As long as ability is restricted to a small set of highly trained people, risk is relatively limited. AI may break the negative correlation between ability and motive: the disturbed loner who wants to kill people but lacks the discipline or skill will now be elevated to the capability level of the PhD virologist, who is unlikely to have this motivation.

生物學是我最擔心的領域，因為它具有非常大的破壞潛力，且防禦難度很高。我不打算詳細介紹如何製造生物武器，但大體上，我擔心LLM正在接近（或可能已經達到）端對端創建和釋放它們所需的知識。我擔心的不僅僅是靜態知識，而是LLM能夠以互動方式引導一個具有普通知識和能力的人完成一個否則可能出錯或需要調試的複雜過程，類似於技術支援如何幫助非技術人員調試和解決複雜的電腦相關問題。

截至2025年中，我們的測量顯示LLM可能已在幾個相關領域提供了實質性的「能力提升」，可能使成功的可能性翻倍或三倍。這導致我們決定Claude Opus 4（以及後續的Sonnet 4.5、Opus 4.1和Opus 4.5模型）需要在我們的「負責任擴展政策」框架下的AI安全等級3保護下發布。我們相信模型現在可能正在接近一個點，在沒有防護措施的情況下，它們可以幫助具有STEM學位但不具體是生物學學位的人完成生產生物武器的整個過程。

防禦措施包括：(1) AI公司在模型上設置防護欄，防止其協助生產生物武器——Anthropic通過Claude的「憲法」和專門的分類器積極實施；(2) 政府透明度要求，然後在出現明確風險閾值時制定更有針對性的立法；(3) 開發對生物攻擊本身的防禦，包括早期預警、快速疫苗開發、更好的個人防護設備等。

Biology is by far the area I'm most worried about, because of its very large potential for destruction and the difficulty of defending against it. I'm not going to go into detail about how to make biological weapons. But at a high level, I am concerned that LLMs are approaching (or may already have reached) the knowledge needed to create and release them end-to-end. My concern is not merely fixed or static knowledge: I am concerned that LLMs will be able to take someone of average knowledge and ability and walk them through a complex process that might otherwise go wrong, in an interactive way similar to how tech support might help a non-technical person debug complicated problems.

As of mid-2025, our measurements show that LLMs may already be providing substantial uplift in several relevant areas, perhaps doubling or tripling the likelihood of success. This led to us deciding that Claude Opus 4 needed to be released under our AI Safety Level 3 protections. We believe that models are likely now approaching the point where, without safeguards, they could be useful in enabling someone with a STEM degree but not specifically a biology degree to go through the whole process of producing a bioweapon.

Defenses include: (1) AI companies putting guardrails on models to prevent helping produce bioweapons (Anthropic actively does this through Claude's constitution and specialized classifiers); (2) government transparency requirements, then targeted legislation as clearer risk thresholds emerge; (3) developing defenses against biological attacks themselves, including early detection, rapid vaccine development, better PPE.

三、可憎的機器——濫用以奪取權力

3. The Odious Apparatus — Misuse for Seizing Power

我們還應該擔心——可能更加如此——濫用AI來掌握或奪取權力，可能是由更大、更成熟的行為者實施的。

我擔心的具體工具包括：

完全自主武器：數百萬或數十億完全自動化的武裝無人機群，由強大的AI在當地控制，由更強大的AI在全球戰略協調，可能是一支無法被打敗的軍隊。

AI監控：足夠強大的AI可能被用於入侵世界上任何計算機系統，並利用這種方式讀取並理解世界上所有的電子通信，生成任何在任何問題上不同意政府的人的完整名單。

AI宣傳：更強大的AI模型，深度嵌入並了解人們的日常生活，能夠在幾個月或幾年內對他們進行建模和影響，很可能能夠將許多人洗腦成任何所需的意識形態或態度。

戰略決策：一個資料中心裡的天才之國可以用來為一個國家、團體或個人提供地緣政治戰略建議——一個虛擬的「俾斯麥」。

按擔憂程度排序的威脅主體：中共（最主要的擔憂）、具有AI競爭力的民主國家自身（濫用的潛力）、擁有大型數據中心的非民主國家、以及AI公司本身。

We should also worry—likely substantially more so—about misuse of AI for wielding or seizing power, likely by larger and more established actors.

Specific tools I worry about:

Fully autonomous weapons: A swarm of millions or billions of fully automated armed drones, locally controlled by powerful AI and strategically coordinated by an even more powerful AI, could be an unbeatable army capable of defeating any military and suppressing dissent by following around every citizen.

AI surveillance: Sufficiently powerful AI could likely compromise any computer system in the world, and use that access to read and make sense of all electronic communications—generating a complete list of anyone who disagrees with the government on any issue.

AI propaganda: Much more powerful AI models, deeply embedded in and aware of people's daily lives, would likely be capable of essentially brainwashing many people into any desired ideology, used by unscrupulous leaders to ensure loyalty even in the face of extreme repression.

Strategic decision-making: A country of geniuses in a datacenter could advise a country or individual on geopolitical strategy—a "virtual Bismarck"—optimizing the strategies above for seizing power.

Threat actors in order of concern: The CCP (primary concern), democracies competitive in AI (potential for internal abuse), non-democratic countries with large datacenters, and AI companies themselves.

防禦威權擴張的措施：

第一，絕對不應向中共出售晶片、晶片製造工具或數據中心。晶片和晶片製造工具是強大AI的最大瓶頸，阻止它們是一個簡單但極其有效的措施，可能是我們能採取的最重要的單一行動。中國在國內大量生產尖端晶片的能力落後美國數年，而建立「資料中心裡的天才之國」的關鍵時期很可能就在這未來幾年之內。

第二，使用AI賦能民主國家對抗威權國家。這就是為什麼Anthropic認為向美國及其民主盟友的情報和國防部門提供AI非常重要。

第三，對民主國家內部的AI濫用劃定明確界線。我提出的方案是：我們應該以一切方式將AI用於國家防禦，除了那些會讓我們變得更像我們的威權對手的方式。國內大規模監控和大規模宣傳是明確的紅線。

第四，利用這一先例在國際上建立對AI最惡劣濫用的禁忌。對AI賦能極權主義及其所有工具和手段建立強有力的國際規範，是迫切需要的。

第五，AI公司應受到仔細審查，以及它們與政府的關係，普通公司治理不足以應對AI公司所體現的巨大能力。

Defenses against the seizure of power:

First, we should absolutely not sell chips, chip-making tools, or datacenters to the CCP. Chips and chip-making tools are the single greatest bottleneck to powerful AI, and blocking them is perhaps the most important single action we can take. China is several years behind the US in their ability to produce frontier chips in quantity, and the critical period for building the country of geniuses in a datacenter is very likely within those next several years.

Second, use AI to empower democracies to resist autocracies. This is the reason Anthropic considers it important to provide AI to the intelligence and defense communities in the US and its democratic allies.

Third, draw a hard line against AI abuses within democracies. The formulation: use AI for national defense in all ways except those which would make us more like our autocratic adversaries. Domestic mass surveillance and mass propaganda are bright red lines.

Fourth, use that precedent to create an international taboo against the worst abuses of powerful AI. A robust norm against AI-enabled totalitarianism and all its tools is sorely needed.

Fifth, AI companies should be carefully watched, as should their connection to the government. Ordinary corporate governance is unlikely to be up to the task of governing AI companies.

四、自動鋼琴——經濟衝擊

4. Player Piano — Economic Disruption

假設我們把安全風險放在一邊，下一個問題是經濟方面的。這種令人難以置信的「人力資本」注入對經濟的影響是什麼？最明顯的效果將是大幅提高經濟增長——在《充滿愛意的機器》中，我提出每年持續的GDP增長率可能達到10-20%。

但這是一把雙刃劍：大多數現有人類在這樣的世界裡的經濟前景如何？

關於勞動力市場衝擊，AI顯然與過去的技術不同：(1) 速度：AI進步的速度遠快於以前的技術革命；(2) 認知廣度：AI將能夠具備非常廣泛的人類認知能力——可能是所有能力，這與以前的技術非常不同；(3) 按認知能力切割：AI似乎在從能力階梯的底部向頂部推進，影響具有特定內在認知特性的人（較低智力），而非具有特定技能的人；(4) 填補空白的能力：AI不僅是一種快速進步的技術，也是一種快速適應的技術，幾乎所有已識別的弱點都會在幾個月內被解決。

關於經濟權力集中：在GDP每年增長10-20%、AI迅速接管經濟的情境下，個人可能持有可觀比例的GDP；最令人擔憂的不是創新本身，而是財富集中到足以破壞社會的程度。目前最富有的人（埃隆·馬斯克）的財富約為7000億美元，相對規模已超過鍍金時代的水準，而AI主要的經濟衝擊尚未到來。

Setting aside security risks, the economic question looms. The most obvious effect will be to greatly increase economic growth—in Machines of Loving Grace I suggest that a 10–20% sustained annual GDP growth rate may be possible.

But this is a double-edged sword: what are the economic prospects for most existing humans in such a world?

On labor market disruption, AI seems likely to be different from past technologies for several reasons: (1) Speed: the pace of AI progress is much faster than previous technological revolutions; (2) Cognitive breadth: AI will be capable of a very wide range of human cognitive abilities—perhaps all of them, unlike previous technologies; (3) Slicing by cognitive ability: AI appears to be advancing from the bottom of the ability ladder to the top, affecting people with certain intrinsic cognitive properties rather than specific skills; (4) Ability to fill in the gaps: AI is not just a rapidly advancing technology but a rapidly adapting one, with almost every identified weakness addressed within months.

On economic concentration of power, in a scenario where GDP growth is 10–20% per year and AI is rapidly taking over the economy, the thing to worry about is a level of wealth concentration that will break society. At roughly $700B, the fortune of the richest person today (Elon Musk) already exceeds the relative scale of Gilded Age fortunes, and the main economic impact of AI has not yet arrived.

應對經濟衝擊的防禦措施：

第一，獲取關於就業取代實時發生情況的準確數據。Anthropic一直在運營並公開發布一個「經濟指數」，幾乎實時顯示我們模型的使用情況，按行業、任務、地點細分。

第二，AI公司在與企業合作方式上有選擇。企業在推出AI時往往面臨「成本節約」（用更少的人做同樣的事）和「創新」（用同樣的人做更多的事）之間的選擇，在可能的情況下引導企業走向創新。

第三，公司應考慮如何照顧員工：在短期內，在公司內部重新分配員工的創造性方法可能是抵禦裁員需求的有希望的方式；從長遠來看，在一個擁有巨大財富的世界裡，即使員工不再提供傳統意義上的經濟價值，也可能仍然可以向他們支付薪酬。

第四，富裕個人有義務幫助解決這個問題。所有Anthropic聯合創始人都承諾捐出80%的財富。

第五，這個規模的宏觀經濟問題最終將需要政府干預——累進稅制，可以是一般性的，也可以特別針對AI公司。如果富人不支持一個好的版本，他們將不可避免地得到一個由暴民設計的壞版本。

Defenses against economic disruption:

First, get accurate data about job displacement in real time. Anthropic has been operating and publicly releasing an Economic Index showing use of our models almost in real time, broken down by industry, task, and location.

Second, AI companies have a choice in how they work with enterprises. There is some room to steer companies towards innovation (doing more with the same number of people) when possible, rather than pure cost savings (doing the same with fewer people).

Third, companies should think about how to take care of their employees. In the short term, being creative about ways to reassign employees within companies may stave off layoffs. In the long term, in a world with enormous total wealth, it may be feasible to pay human employees even long after they are no longer providing economic value in the traditional sense.

Fourth, wealthy individuals have an obligation to help solve this problem. All of Anthropic's co-founders have pledged to donate 80% of our wealth.

Fifth, ultimately a macroeconomic problem this large will require government intervention: progressive taxation, which can be general or targeted against AI companies specifically. If the wealthy don't support a good version, they'll inevitably get a bad version designed by a mob.

五、無限的黑暗之海——間接效應

5. Black Seas of Infinity — Indirect Effects

最後一部分是未知的未知，特別是那些可能作為AI進步以及由此帶來的科學和技術總體加速的間接結果而出錯的事情。假設我們解決了迄今為止描述的所有風險，並開始獲得AI的益處，我們將很可能得到「一個世紀的科學和經濟進步在十年內被壓縮」，這對世界來說將是巨大的積極，但我們必須應對由此帶來的快速進步所產生的問題，而這些問題可能來勢迅猛。

以下是三個值得關注的例子：

生物學的快速進步：如果我們真的在幾年內獲得了一個世紀的醫學進步，我們可能會大幅增加人類壽命，也有可能獲得激進的能力，如提高人類智能或從根本上修改人類生物學。這些可能是積極的，如果負責任地進行，但也存在嚴重出錯的風險——例如，如果提高人類智能的努力也使人們更不穩定或更追求權力。

AI以不健康的方式改變人類生活：一個擁有數十億比人類更聰明的智能的世界將是一個非常奇怪的居住世界。即使AI沒有主動攻擊人類，即使沒有被各國政府明確用於壓迫或控制，通過正常的商業激勵和名義上的自願交易，許多事情可能會出錯。我們在AI心理疾病和「AI女友」中看到了早期跡象。更強大的AI模型能夠基本上對許多人進行洗腦，將他們改變成任何所需的意識形態。

人類目的感：在擁有強大AI的世界中，人類能否找到目的和意義？人類目的不依賴於在某事上成為世界最佳，人類可以通過他們熱愛的故事和項目找到目的，即使在很長的時間段內也是如此。我們只需要打破經濟價值的產生與自我價值和意義之間的聯繫。

This last section is a catchall for unknown unknowns, particularly things that could go wrong as an indirect result of positive advances in AI and the resulting acceleration of science and technology. Suppose we address all the risks described so far, and begin to reap the benefits of AI. We will likely get a "century of scientific and economic progress compressed into a decade," which will be hugely positive for the world, but we will have to contend with the problems that arise from rapid progress, coming at us fast.

Three illustrative examples:

Rapid advances in biology: If we do get a century of medical progress in a few years, it is possible we will greatly increase the human lifespan, and there is a chance we gain radical capabilities like the ability to increase human intelligence or radically modify human biology. These could be positive if responsibly done, but there is always a risk they go very wrong—for example, if efforts to make humans smarter also make them more unstable or power-seeking.

AI changes human life in an unhealthy way: A world with billions of intelligences much smarter than humans is going to be a very weird world to live in. Even without active attacks or explicit oppression, through normal business incentives and nominally consensual transactions, a lot could go wrong. We see early hints in AI psychosis and "AI girlfriends." More powerful AI models could essentially brainwash many people into any desired ideology.

Human purpose: Will humans be able to find purpose and meaning in a world with powerful AI? Human purpose does not depend on being the best in the world at something; humans can find purpose through stories and projects they love. We simply need to break the link between the generation of economic value and self-worth and meaning. But that is a transition society has to make, and there is always the risk we don't handle it well.

人類的考驗

Humanity's Test

閱讀這篇文章可能讓人感覺我們面臨著一個令人生畏的局勢。AI確實從多個方向給人類帶來威脅，不同的危險之間存在真實的張力：謹慎建立AI系統以防止自主威脅，與民主國家保持對威權國家領先地位的需要，之間存在真正的張力；同樣必要的、對抗威權主義的AI賦能工具，如果走得太遠，可能被用於在我們自己的國家製造暴政；AI驅動的生物恐怖主義可能殺死數百萬人，但對這種風險的過度反應可能導致我們走上威權監控國家之路。

更重要的是，停止或實質放慢技術發展的想法在根本上是站不住腳的。建造強大AI系統的公式非常簡單，以至於它幾乎可以說從正確的數據和原始計算組合中自發地湧現出來。如果一家公司不建造它，其他公司幾乎同樣快就會這樣做；如果民主國家的所有公司都停止，威權國家只會繼續前進。

然而，儘管有許多障礙，我相信人類有足夠的力量通過這個考驗。我受到以下事實的鼓舞：至少一些公司已表示將為阻止其模型助長生物恐怖主義威脅付出有意義的商業代價；一些勇敢的人抵住了盛行的政治風潮，通過了最初的合理AI防護措施立法；公眾理解AI存在風險並希望這些風險得到解決。

前方的歲月將艱難無比，要求我們付出超乎想像的努力。但在我作為研究者、領導者和公民的歲月中，我見過足夠多的勇氣和高貴，相信我們能夠獲勝——當被置於最黑暗的境況中，人類有一種方式，似乎在最後一刻，聚集起戰勝所需的力量和智慧。我們沒有時間可以浪費。

Reading this essay may give the impression that we are in a daunting situation. AI brings threats to humanity from multiple directions, and there is genuine tension between the different dangers: taking time to carefully build AI systems so they do not autonomously threaten humanity is in genuine tension with the need for democratic nations to stay ahead of authoritarian nations; the same AI-enabled tools necessary to fight autocracies can, if taken too far, be turned inward to create tyranny; AI-driven terrorism could kill millions through biology, but an overreaction could lead to an autocratic surveillance state.

Furthermore, the idea of stopping or substantially slowing the technology is fundamentally untenable. The formula for building powerful AI systems is incredibly simple—it can almost be said to emerge spontaneously from the right combination of data and raw computation. If one company does not build it, others will do so nearly as fast. If all companies in democratic countries stopped, authoritarian countries would simply keep going.

Despite the many obstacles, I believe humanity has the strength inside itself to pass this test. I am encouraged that at least some companies have stated they'll pay meaningful commercial costs to block their models from contributing to bioterrorism. I am encouraged that a few brave people have resisted the prevailing political winds and passed the first seeds of sensible guardrails on AI systems. I am encouraged that the public understands AI carries risks and wants those risks addressed.

The years in front of us will be impossibly hard, asking more of us than we think we can give. But in my time as a researcher, leader, and citizen, I have seen enough courage and nobility to believe that we can win—that when put in the darkest circumstances, humanity has a way of gathering, seemingly at the last minute, the strength and wisdom needed to prevail. We have no time to lose.
