How to Control Artificial Intelligence (Part 3)

Date: 2023-11-22 11:15:58 | Source: 可可英语 | Editor: Leon

So ... The real problem is that we lack a convincing plan for AI safety.

People are working hard on evals looking for risky AI behavior, and that's good, but clearly not good enough.

They're basically training AI to not say bad things rather than not do bad things.

Moreover, evals and debugging are really just necessary, not sufficient, conditions for safety.

In other words, they can prove the presence of risk, not the absence of risk.
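
To make the "presence, not absence" point concrete, here is a minimal Python sketch. It is an illustration added for this transcript, not something shown in the talk. The function `abs_int8` and both checkers are hypothetical names: the function emulates an 8-bit absolute value with wraparound, a handful of sampled tests pass, and only a check over every possible input reveals the one failing case.

```python
def abs_int8(x: int) -> int:
    """Hypothetical absolute value on signed 8-bit integers, with wraparound."""
    result = -x if x < 0 else x
    # Emulate int8 overflow: values outside [-128, 127] wrap around.
    return ((result + 128) % 256) - 128


def spot_check(samples):
    """Evaluation-style testing: report only the failures among sampled inputs."""
    return [x for x in samples if abs_int8(x) < 0]


def exhaustive_check():
    """Proof by enumeration: covers every input in the (small) domain."""
    return [x for x in range(-128, 128) if abs_int8(x) < 0]


print(spot_check([-127, -5, 0, 7, 100]))  # [] : no failure found in the samples
print(exhaustive_check())                 # [-128] : the failure the samples missed
```

An empty result from `spot_check` says nothing about the inputs that were not sampled. Exhaustive enumeration only works here because the domain has 256 values; for realistic systems the same guarantee would have to come from a proof.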

So let's up our game, alright? Try to see how we can make provably safe AI that we can control.

Guardrails try to physically limit harm.

But if your adversary is super-intelligence or a human using super-intelligence against you, right, trying is just not enough.

You need to succeed.

Harm needs to be impossible.

So we need provably safe systems.

Provable, not in the weak sense of convincing some judge, but in the strong sense of there being something that's impossible according to the laws of physics.

Because no matter how smart an AI is, it can't violate the laws of physics and do what's provably impossible.

Steve Omohundro and I wrote a paper about this, and we're optimistic that this vision can really work.

So let me tell you a little bit about how.

There's a venerable field called formal verification, which proves stuff about code.

And I'm optimistic that AI will revolutionize automatic proving business and also revolutionize program synthesis, the ability to automatically write really good code.

So here is how our vision works.

You, the human, write a specification that your AI tool must obey, that it's impossible to log in to your laptop without the correct password, or that a DNA printer cannot synthesize dangerous viruses.

Then a very powerful AI creates both your AI tool and a proof that your tool meets your spec.

Machine learning is uniquely good at learning algorithms, but once the algorithm has been learned, you can re-implement it in a different computational architecture that's easier to verify.
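
A toy Python sketch of that idea, added here as an illustration and not taken from the talk: a "learned" model is represented by a single threshold unit whose weights are made up for the example (nothing is trained), and it is re-implemented as a short Boolean formula that is easier to read and reason about. Because there are only eight possible inputs, equivalence of the two forms can be checked on all of them.

```python
from itertools import product

# Assumption: these weights stand in for a trained model; they are hand-picked,
# not learned from data.
WEIGHTS = (1.0, 1.0, 1.0)
BIAS = -1.5


def learned_model(bits):
    """Opaque 'learned' form: one threshold unit over three input bits."""
    activation = sum(w * b for w, b in zip(WEIGHTS, bits)) + BIAS
    return int(activation > 0)


def majority(bits):
    """Re-implementation as an explicit Boolean formula (3-input majority)."""
    a, b, c = bits
    return (a & b) | (a & c) | (b & c)


def equivalent_on_all_inputs():
    """Compare both forms on every one of the 2**3 possible inputs."""
    return all(
        learned_model(bits) == majority(bits)
        for bits in product((0, 1), repeat=3)
    )


print(equivalent_on_all_inputs())  # True
```

Once equivalence is established, later reasoning can target `majority`, whose behaviour is visible in its source, instead of the numeric weights.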

Now you might worry, how on earth am I going to understand this powerful AI and the powerful AI tool it built and the proof, if they're all too complicated for any human to grasp?

Here is the really great news.

You don't have to understand any of that stuff, because it's much easier to verify a proof than to discover it.

So you only have to understand or trust your proof-checking code, which could be just a few hundred lines long.
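
How small can a proof checker be? The Python sketch below is an illustration added for this transcript, not the system described in the talk. It assumes a toy logic whose only inference rule is modus ponens (from A and A -> B, conclude B); the formula encoding and the names `implies` and `check_proof` are hypothetical. The checker makes one pass over the proof and does no searching, whereas finding the proof is left entirely to whoever, or whatever, produced it.

```python
def implies(a, b):
    """Encode the formula 'a -> b' as a tuple; atoms are plain strings."""
    return ("->", a, b)


def check_proof(premises, steps, goal):
    """Return True only if every step is justified and the last step is the goal.

    A step is either ("premise", formula) or ("mp", i, j), where step i must be
    some formula A and step j must be the implication A -> B.
    """
    derived = []
    for step in steps:
        if step[0] == "premise":
            formula = step[1]
            if formula not in premises:
                return False
        elif step[0] == "mp":
            i, j = step[1], step[2]
            if not (0 <= i < len(derived) and 0 <= j < len(derived)):
                return False
            antecedent, implication = derived[i], derived[j]
            is_matching_implication = (
                isinstance(implication, tuple)
                and len(implication) == 3
                and implication[0] == "->"
                and implication[1] == antecedent
            )
            if not is_matching_implication:
                return False
            formula = implication[2]
        else:
            return False
        derived.append(formula)
    return bool(derived) and derived[-1] == goal


PREMISES = ["p", implies("p", "q"), implies("q", "r")]
GOOD_PROOF = [
    ("premise", "p"),
    ("premise", implies("p", "q")),
    ("mp", 0, 1),                      # derives q
    ("premise", implies("q", "r")),
    ("mp", 2, 3),                      # derives r
]
BAD_PROOF = [("premise", "r")]         # r is not a premise, so this is rejected

print(check_proof(PREMISES, GOOD_PROOF, "r"))  # True
print(check_proof(PREMISES, BAD_PROOF, "r"))   # False
```

The only code that needs to be trusted is `check_proof`. A proof produced by an untrusted source is accepted only if every step passes the check.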

Key vocabulary
optimistic	[.ɔpti'mistik]	adj. hopeful; inclined to expect a good outcome
verification	[.verifi'keiʃən]	n. confirmation; checking; attestation
smart	[smɑ:t]	adj. clever; fashionable; good-looking; quick; brisk; neat
complicated	['kɔmplikeitid]	adj. complex; hard to understand
presence	['prezns]	n. attendance; being present; existence; n. bearing, demeanor
understand	[.ʌndə'stænd]	vt. to comprehend; to hear or learn of; to interpret ... as; to take to mean
automatically	[.ɔ:tə'mætikəli]	adv. by itself, without intervention; mechanically
synthesis	['sinθisis]	n. combination; integration; reasoning (mnemonic: syn = same, thes = put; putting the same things together)
venerable	['venərəbl]	adj. solemn; worthy of respect
impossible	[im'pɔsəbl]	adj. not possible; unachievable (mnemonic: im = not + possible)