Repeat Yourself

One of the most repeated pieces of advice throughout my career in software has been “don’t repeat yourself,” also known as the DRY principle. For the longest time, I took that at face value, never questioning its validity.

That was until I saw actual experts write code: they copy code all the time.[1] I realized that repeating yourself has a few great benefits.

Why People Love DRY

The common wisdom is that if you repeat yourself, you have to fix the same bug in multiple places, but if you have a shared abstraction, you only have to fix it once.

Another reason why we avoid repetition is that it makes us feel clever. “Look, I know all of these smart ways to avoid repetition! I know how to use interfaces, generics, higher-order functions, and inheritance!”

Both reasons are misguided. There are many benefits of repeating yourself that might get us closer to our goals in the long run.

Keeping Up The Momentum

When you’re writing code, you want to keep the momentum going to get into a flow state. If you constantly pause to design the perfect abstraction, it’s easy to lose momentum.

Instead, if you allow yourself to copy-paste code, you keep your train of thought going and work on the problem at hand. You don’t introduce another problem of trying to find the right abstraction at the same time.

It’s often easier to copy existing code and modify it until it becomes too much of a burden, at which point you can go and refactor it.

I would argue that “writing mode” and “refactoring mode” are two different modes of programming. During writing mode, you want to focus on getting the idea down and stop your inner critic, which keeps telling you that your code sucks. During refactoring mode, you take the opposite role: that of the critic. You look for ways to improve the code by finding the right abstractions, removing duplication, and improving readability.

Keep these two modes separate. Don’t try to do both at the same time.[2]

Finding The Right Abstraction Is Hard

When you start to write code, you don’t know the right abstraction just yet. But if you copy code, the right abstraction reveals itself; it’s too tedious to copy the same code over and over again, at which point you start to look for ways to abstract it away. For me, this typically happens after the first copy of the same code, but I try to resist the urge until the 2nd or 3rd copy.

If you start too early, you might end up with a bad abstraction that doesn’t fit the problem. You know it’s wrong because it feels clunky. Some typical symptoms include:

  • Generic names that don’t convey intent, e.g., render_pdf_file instead of generate_invoice
  • Difficult to understand without additional context
  • The abstraction is only used in one or two places
  • Tight coupling to implementation details
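
To make the first and last symptoms concrete, here is a minimal Python sketch. The function names are borrowed from the list above, but the bodies, the parameters, and the `Invoice` type are hypothetical and invented only for illustration. Plain strings stand in for real PDF output.

```python
from dataclasses import dataclass


@dataclass
class Invoice:
    """Hypothetical invoice record, invented for this sketch."""
    customer: str
    amount_due: float


# Premature abstraction: a generic name, and every caller must know
# layout details (template name, field order) that belong inside.
def render_pdf_file(template, fields, header=None, footer=None):
    lines = [header] if header else []
    lines += [f"{template}:{key}={value}" for key, value in fields]
    if footer:
        lines.append(footer)
    return "\n".join(lines)


# Intent-revealing alternative: the name says what the caller wants,
# and the formatting details stay hidden in the function body.
def generate_invoice(invoice: Invoice) -> str:
    return (
        f"Invoice for {invoice.customer}\n"
        f"Amount due: {invoice.amount_due:.2f}"
    )


text = generate_invoice(Invoice(customer="Ada", amount_due=120.5))
```

The second function is less reusable on paper, but a reader can tell what it is for without opening it.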

It’s Hard To Get Rid Of Wrong Abstractions

We easily settle for the first abstraction that comes to mind, but most often, it’s not the right one. And removing the wrong abstraction is hard work, because now the data flow depends on it.

We also tend to fall in love with our own abstractions because they took time and effort to create. This makes us reluctant to discard them even when they no longer fit the problem—it’s a sunk cost fallacy.

It gets worse when other programmers start to depend on it, too. Then you have to be careful about changing it, because it might break other parts of the codebase. Once you introduce an abstraction, you have to work with it for a long time, sometimes forever.

If you had a copy of the code instead, you could just change it in one place without worrying about breaking anything else.

Duplication is far cheaper than the wrong abstraction

—Sandi Metz, The Wrong Abstraction

Better to wait until the last moment to settle on the abstraction, when you have a solid understanding of the problem space.[3]

The Mental Overhead of Abstractions

Abstraction reduces code duplication, but it comes at a cost.

Abstractions can make code harder to read, understand, and maintain because you have to jump between multiple levels of indirection to understand what the code does. The abstraction might live in different files, modules, or libraries.

The cost of traversing these layers is high. An expert programmer might be able to keep a few levels of abstraction in their head, but we all have a limited context window (which depends on familiarity with the codebase).

When you copy code, you can keep all the logic in one place. You can just read the whole thing and understand what it does.

Resist The Urge Of Premature Abstraction

Sometimes, code looks similar but serves different purposes.

For example, consider two pieces of code that calculate a sum by iterating over a collection of items.

total = 0
for item in shopping_cart:
    total += item.price * item.quantity

And elsewhere in the code, we have

total = 0
for item in package_items:
    total += item.weight * item.rate

In both cases, we iterate over a collection and calculate a total. You might be tempted to introduce a helper function, but the two calculations are very different.

After a few iterations, these two pieces of code might evolve in different directions:

def calculate_total_price(shopping_cart):
    if not shopping_cart:
        raise ValueError("Shopping cart cannot be empty")
    
    total = 0.0
    for item in shopping_cart:
        # Round for financial precision
        total += round(item.price * item.quantity, 2)
    
    return total

In contrast, the shipping cost calculation might look like this:

def calculate_shipping_cost(package_items, destination_zone):
    # Use higher of actual weight vs dimensional weight
    total_weight = sum(item.weight for item in package_items)
    total_volume = sum(item.length * item.width * item.height for item in package_items)
    dimensional_weight = total_volume / 5000  # FedEx formula
    
    billable_weight = max(total_weight, dimensional_weight)
    # shipping_rates is assumed to be a zone-to-rate mapping defined elsewhere
    return billable_weight * shipping_rates[destination_zone]

Had we applied “don’t repeat yourself” too early, we would have lost the context and specific requirements of each calculation.
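
To illustrate what that loss might look like, here is a hypothetical sketch of the helper we could have extracted from the two original loops, after it has absorbed requirements from both sides. The function `calculate_total`, its flags, and the `SimpleNamespace` items are assumptions made up for this example; `minimum` is a simplified stand-in for the billable-weight rule.

```python
from types import SimpleNamespace


def calculate_total(items, value_attr, factor_attr,
                    round_each=False, allow_empty=True, minimum=None):
    """Hypothetical shared helper that tries to serve both call sites."""
    if not items and not allow_empty:      # only the cart needs this check
        raise ValueError("Collection cannot be empty")
    total = 0.0
    for item in items:
        amount = getattr(item, value_attr) * getattr(item, factor_attr)
        if round_each:                     # only prices are rounded
            amount = round(amount, 2)
        total += amount
    if minimum is not None:                # only shipping has a floor
        total = max(total, minimum)
    return total


cart = [SimpleNamespace(price=9.99, quantity=3)]
parcel = [SimpleNamespace(weight=2.0, rate=1.5)]

# Each call site now has to spell out which branches apply to it.
price = calculate_total(cart, "price", "quantity",
                        round_each=True, allow_empty=False)
shipping = calculate_total(parcel, "weight", "rate", minimum=5.0)
```

Every flag exists for exactly one caller, so the helper saves a loop but spreads each caller's rules across a signature that neither of them fully uses.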

DRY Can Introduce Complexity

The DRY principle is often misinterpreted as a blanket rule to avoid any duplication at all costs, and applying it that way can add complexity instead of removing it.

When you try to avoid repetition by introducing abstractions, you have to deal with all the edge cases in a place far away from the actual business logic. You end up adding redundant checks and conditions to the abstraction, just to make sure it works in all cases. Later on, you might forget the reasoning behind those checks, but you keep them around “just in case” because you don’t want to break any callers. The result is dead code that adds complexity to the codebase; all because you wanted to avoid repeating yourself.

The common wisdom is that if you repeat yourself, you have to fix the same bug in multiple places. But the assumption is that the bug exists in all copies. In reality, each copy might have evolved in different ways, and the bug might only exist in one of them.

When you create a shared abstraction, a bug in that abstraction breaks every caller, breaking multiple features at once. With duplicated code, a bug is isolated to just one specific use case.

Clean Up Afterwards

Knowing that you didn’t break anything in a shared abstraction is much harder than checking a single copy of the code. Of course, if you have a lot of copies, there is a risk of forgetting to fix all of them.

The key to making this work is to clean up afterwards. This can happen before you commit the code or during a code review.

At this stage, you can look at the code you copied and see if it makes sense to keep it as is or if you can see the right abstraction. I try to refactor code once I have a better understanding of the problem, but not earlier.

A trick to undo a bad abstraction is to inline the code back into the places where it was used. For a while, you end up “repeating yourself” again in the codebase, but that’s okay. Rethink the problem based on the new information you have. Often you’ll find an abstraction that fits the problem better.
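
As a small illustration of inlining, suppose a shared helper had grown one flag per caller. The helper `format_label`, its flag, and both replacement functions below are hypothetical and exist only for this sketch.

```python
# Before: one shared helper with a flag that selects the caller's branch.
def format_label(name, for_shipping=False):
    if for_shipping:
        return name.upper()[:20]   # shipping labels: uppercase, max 20 chars
    return name.strip().title()    # receipts: trimmed, title case


# After inlining: each call site owns its own logic again and can
# change without touching the other.
def shipping_label(name):
    return name.upper()[:20]


def receipt_label(name):
    return name.strip().title()
```

Once the two copies sit side by side, it is easier to judge whether they share anything worth extracting at all.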

When the abstraction is wrong, the fastest way forward is back.

—Sandi Metz, The Wrong Abstraction

tl;dr

It’s fine to look for the right abstraction, but don’t obsess over it. Don’t be afraid to copy code when it helps you keep momentum and find the right abstraction.

It bears repeating: “Repeat yourself.”

  1. For some examples, see Ferris working on Rustendo64 or tokiospliff working on a C++ game engine.

  2. This is also how I write prose: I first write a draft and block my inner critic, and then I play the role of the editor/critic and “refactor” the text. This way, I get the best of both worlds: a quick feedback loop which doesn’t block my creativity, and a final product which is more polished and well-structured. Of course, I did not invent this approach. I recommend reading “Shitty first drafts” from Anne Lamott’s book Bird by Bird: Instructions on Writing and Life if you want to learn more about this technique.

  3. This is similar to the OODA loop concept, which stands for “Observe, Orient, Decide, Act.” It was developed by military strategist John Boyd. Fighter pilots use it to wait until the last responsible moment to decide on a course of action, which allows them to make the best decision based on the current situation and available information.
