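Does Java's double-checked locking pattern really need volatile?

How to write a singleton is one of the staple Java interview questions, and naturally there are many answers. The enum below, for example, is a singleton by construction: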

public enum Singleton {
    INSTANCE(42);

    private int answer;

    Singleton(int answer) {
        this.answer = answer;
    }

    public int getAnswer() {
        return this.answer;
    }
}

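In an interview, though, the interviewer will most likely not be satisfied with this code. The answer that does satisfy is usually the following singleton based on double-checked locking (DCL):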

public class Singleton {

    private static volatile Singleton INSTANCE;

    public static Singleton getInstance() {
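        Singleton instance = INSTANCE; // first check, without the lock
        if (instance == null) {
            synchronized (Singleton.class) {
                instance = INSTANCE; // second check, under the lock
                if (instance == null) {
                    instance = new Singleton(42);
                    INSTANCE = instance;
                }
            }
        }
        return instance; // never null: refreshed inside the synchronized block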
    }

    private int answer;

    private Singleton(int answer) { // private, so only getInstance() can create the instance
        this.answer = answer;
    }

    public int getAnswer() {
        return this.answer;
    }
}

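This code is correct: synchronized guarantees that the singleton is not instantiated more than once, and volatile guarantees that INSTANCE is visible to other threads. But what happens if we remove volatile?

What does volatile do?

Once volatile is removed, concurrent calls to Singleton.getInstance().getAnswer() may return 42, but they may also return 0. Explaining this result requires a little background on the Java memory model.

In concurrent code we care about how reads and writes are synchronized: we want to read "what should already have been written by now" and not "what will only happen in the future". In a complex modern computer system this is not simple. In hardware, out-of-order execution can change the order of reads and writes; in software, the compiler optimizes reads and writes, and reads and writes that do not depend on each other may be performed in any order.

How do we get a well-defined order? The Java memory model constrains execution order through a set of happens-before orderings. If two actions x and y (an action is, for example, a variable read, a variable write, or a lock operation) satisfy a happens-before rule, the program may assume that the writes of x are visible when y executes. This is written hb(x, y).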

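Some of the orderings are listed below (the original definitions are more involved; this is a simplified version):

Program order: if x and y are actions of the same thread and x comes before y in the program, then hb(x, y)
If x synchronizes-with y, then hb(x, y)
- A write to a volatile variable v comes before subsequent reads of v
- Releasing a lock comes before subsequent acquisitions of the same lock
Transitivity: if hb(x, y) and hb(y, z), then hb(x, z)

Where no happens-before relation is defined, any order may be observed. For example, thread A writes two plain variables a and b in that order, but thread B is not guaranteed to observe the writes in that order; thread B may see that b has been written while a has not. This is the most common unsynchronized scenario.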
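As a rough illustration of the volatile rule and transitivity, here is a minimal sketch. The class HandoffDemo and its fields are hypothetical names introduced only for this example. The writer performs a plain write followed by a volatile write; once the reader observes the volatile write, the chain hb(write data, write ready), hb(write ready, read ready), hb(read ready, read data) means it must also see the plain write.

```java
// Hypothetical demo class; the names are illustrative and not from the article.
class HandoffDemo {
    static int data;               // plain field, published through `ready`
    static volatile boolean ready; // volatile flag

    // Runs one writer and one reader thread and returns the value of `data`
    // that the reader observed after it saw `ready == true`.
    static int runOnce() {
        data = 0;
        ready = false;
        int[] seen = new int[1];

        Thread reader = new Thread(() -> {
            while (!ready) {      // volatile read; loops until the volatile write is visible
                Thread.yield();
            }
            seen[0] = data;       // ordered after the writer's `data = 42` by transitivity
        });
        Thread writer = new Thread(() -> {
            data = 42;            // plain write
            ready = true;         // volatile write, after the plain write in program order
        });

        reader.start();
        writer.start();
        try {
            writer.join();
            reader.join();        // join() also orders the reader's writes before this point
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return seen[0];
    }

    public static void main(String[] args) {
        System.out.println(runOnce()); // prints 42
    }
}
```

If ready were a plain field, none of these orderings would exist: the reader might never leave the loop, or might leave it and still read 0.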

Back to the code. Let's abstract the version without volatile into a sequence of actions:

public class Singleton {

    private static Singleton INSTANCE;

    public static Singleton getInstance() {
        Singleton instance = INSTANCE; // a: read INSTANCE
        if (instance == null) {
            synchronized (Singleton.class) { // b: lock Singleton.class
                instance = INSTANCE; // c: read INSTANCE
                if (instance == null) {
                    instance = new Singleton(42);
                    INSTANCE = instance; // d: write INSTANCE
                }
            } // e: unlock Singleton.class
        }
        return instance;
    }

    private int answer;

    private Singleton(int answer) {
        this.answer = answer; // f: write answer
    }

    public int getAnswer() {
        return this.answer; // g: read answer
    }
}

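What we expect is that both threads get 42 from getInstance().getAnswer(). Label the actions of the two threads a1, a2, b1, b2, ... and, for convenience, write hb(x, y) as x < y. Then:

Assume thread 1 acquires the lock on Singleton.class first
Within each thread, the actions are ordered as follows
- a1 < b1 < c1 < f1 < d1 < e1 < g1
- a2 < b2 < c2 < f2 < d2 < e2 < g2
Releasing a lock comes before subsequent acquisitions of the same lock, so e1 < b2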

After thread 1 has finished the synchronized block, consider what thread 2 does. If a2 observes INSTANCE == null, then:

Thread 2 executes a2 < b2 < c2 < e2 < g2
By transitivity we get f1 < e1 < b2 < g2, so g2 observes the 42 written by f1

If instead a2 observes INSTANCE != null, then:

Thread 2 executes a2 < g2
No happens-before relation can be established between g2 and f1, so g2 may be observed as happening before f1, which means thread 2 reads 0

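So what does volatile do? It introduces a write-then-read ordering: when a2 reads the value written by d1, we get d1 < a2. Then:

If a2 observes INSTANCE == null, the reasoning is the same as above
If a2 observes INSTANCE != null, we still have f1 < d1 < a2 < g2, so g2 observes f1 and thread 2 reads 42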
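The volatile variant can be exercised with a small harness like the one below. This is only a sketch: VolatileDclSingleton and DclHarness are hypothetical names, and the singleton is a renamed copy of the volatile version so that the block is self-contained. A passing run does not prove the absence of a race; it only checks that every thread got the same instance and read 42, which is what the happens-before argument above promises.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CountDownLatch;

// Renamed copy of the volatile DCL singleton (hypothetical name for this sketch).
class VolatileDclSingleton {
    private static volatile VolatileDclSingleton INSTANCE;

    private int answer; // deliberately not final: visibility comes from the volatile INSTANCE

    private VolatileDclSingleton(int answer) {
        this.answer = answer;
    }

    static VolatileDclSingleton getInstance() {
        VolatileDclSingleton instance = INSTANCE;   // volatile read
        if (instance == null) {
            synchronized (VolatileDclSingleton.class) {
                instance = INSTANCE;                // re-check under the lock
                if (instance == null) {
                    instance = new VolatileDclSingleton(42);
                    INSTANCE = instance;            // volatile write publishes the object
                }
            }
        }
        return instance;
    }

    int getAnswer() {
        return answer;
    }
}

// Hypothetical harness: starts `threads` workers that all call getInstance() at once.
class DclHarness {
    static boolean allThreadsAgree(int threads) {
        Set<VolatileDclSingleton> instances = ConcurrentHashMap.newKeySet();
        Set<Integer> answers = ConcurrentHashMap.newKeySet();
        CountDownLatch start = new CountDownLatch(1);
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            workers[i] = new Thread(() -> {
                try {
                    start.await();                  // line all workers up before the race
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    return;
                }
                VolatileDclSingleton s = VolatileDclSingleton.getInstance();
                instances.add(s);
                answers.add(s.getAnswer());
            });
            workers[i].start();
        }
        start.countDown();
        for (Thread worker : workers) {
            try {
                worker.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        // Exactly one instance, and the only answer ever observed is 42.
        return instances.size() == 1 && answers.size() == 1 && answers.contains(42);
    }

    public static void main(String[] args) {
        System.out.println(allThreadsAgree(8));
    }
}
```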

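Is the DCL pattern broken?

The current semantics of the volatile modifier were introduced in Java 1.5, or more precisely by JSR 133. In earlier versions volatile had no happens-before semantics and therefore could not be used to implement DCL. This is why Effective Java says that DCL is broken.

But maybe we do not need volatile after all

Readers who actually go and read the Memory Model chapter of the JLS will quickly notice the section that follows it, which describes the special semantics of final fields: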

An object is considered to be completely initialized when its constructor finishes. A thread that can only see a reference to an object after that object has been completely initialized is guaranteed to see the correctly initialized values for that object’s final fields.

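In other words, if a thread can only obtain a reference to an object after its constructor has finished, Java guarantees that the thread also sees the initialized values of the object's final fields.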
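A small sketch of what this guarantee covers, adapted from the FinalFieldExample in JLS 17.5 (the return value of reader() is an addition for this sketch, so the observed values can be inspected). The reference f is published through a data race, with no lock and no volatile.

```java
// Adapted from the final-field example in JLS 17.5.
class FinalFieldExample {
    final int x;                // final: covered by the final-field guarantee
    int y;                      // plain: not covered
    static FinalFieldExample f; // published without synchronization

    FinalFieldExample() {
        x = 3;
        y = 4;
    }

    static void writer() {
        f = new FinalFieldExample();
    }

    // Returns {x, y} as observed, or null if the reference is not visible yet.
    static int[] reader() {
        FinalFieldExample local = f; // read the racy reference exactly once
        if (local == null) {
            return null;
        }
        // Called from another thread: local.x is guaranteed to be 3,
        // while local.y may be 0 or 4.
        return new int[] { local.x, local.y };
    }
}
```

When writer() and reader() run on different threads, only x is protected. The guarantee also depends on this not escaping during construction, which is what "can only see a reference to an object after that object has been completely initialized" means.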

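In the actual implementation, HotSpot inserts a release memory barrier when a constructor that writes a final field exits (Op_MemBarRelease, which orders earlier loads and stores before later stores, i.e. LoadStore|StoreStore). This ensures that the writes to final fields happen before the object is "published". The relevant code is: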

void Parse::do_exits() {
  // ...

  // Figure out if we need to emit the trailing barrier. The barrier is only
  // needed in the constructors, and only in three cases:
  //
  // 1. The constructor wrote a final. The effects of all initializations
  //    must be committed to memory before any code after the constructor
  //    publishes the reference to the newly constructed object. Rather
  //    than wait for the publication, we simply block the writes here.
  //    Rather than put a barrier on only those writes which are required
  //    to complete, we force all writes to complete.
  //
  // 2. ...
  // ...
  if (method()->is_initializer() &&
       (wrote_final() ||
         (AlwaysSafeConstructors && wrote_fields()) ||
         (support_IRIW_for_not_multiple_copy_atomic_cpu && wrote_volatile()))) {
    _exits.insert_mem_bar(Op_MemBarRelease, alloc_with_final());
    // ...
  }
  // ...
}

In other words, the code at the beginning of the article can also be written like this:

public class Singleton {

    private static Singleton INSTANCE;

    public static Singleton getInstance() {
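        Singleton instance = INSTANCE; // single racy read; safe because answer is final
        if (instance == null) {
            synchronized (Singleton.class) {
                instance = INSTANCE; // second check, under the lock
                if (instance == null) {
                    instance = new Singleton(42);
                    INSTANCE = instance;
                }
            }
        }
        return instance; // return the local, do not re-read INSTANCE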
    }

    private final int answer;

    private Singleton(int answer) { // private, so only getInstance() can create the instance
        this.answer = answer;
    }

    public int getAnswer() {
        return this.answer;
    }
}

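Of course, the field here is a final int. The guarantee covers only final fields and, per JLS 17.5, the objects they reference in the state they had when the constructor finished. If the class has non-final fields, or if an object referenced by a final field is modified after construction, volatile is still needed to guarantee correctness.

Appendix

https://www.cs.umd.edu/~pugh/java/memoryModel/jsr-133-faq.html#finalRight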