Immutable data structures in Java

The topic of immutable data structures comes up regularly in the coding interviews I’ve been conducting recently. I’m not overly dogmatic about it myself, but whenever there’s no need for mutable state, I try to get rid of the code that allows mutation, which is usually most visible in data structures. However, there seems to be a misunderstanding about the concept of immutability: developers often believe that a final reference (or val in Kotlin or Scala) is enough to make an object immutable. This blog post dives a bit deeper into immutable references and immutable data structures.

Benefits of immutable data structures

Immutable data structures have some significant benefits, such as:

  • No invalid state
  • Thread safety
  • Easier to understand code
  • Easier to test
  • Can be used for value types

No invalid state

When an object is immutable, it’s hard to get it into an invalid state. The object can only be instantiated through its constructor, which can enforce that every parameter required for a valid state is supplied. An example:

// Mutable: setters allow a half-initialised object.
Address address = new Address();
address.setCity("Sydney");
// address is in an invalid state now, since the country hasn’t been set.

// Immutable: everything is supplied through the constructor.
Address validAddress = new Address("Sydney", "Australia");
// validAddress has no setters, so it stays valid for its whole lifetime.
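
A constructor only guarantees validity if it actually checks its arguments. The following is a minimal sketch of what that could look like; the specific null and blank checks are an assumption for illustration and are not part of the Address class shown later in this post.

```java
import java.util.Objects;

// Hypothetical validating variant of Address, for illustration only.
final class Address {
    private final String city;
    private final String country;

    Address(String city, String country) {
        // Reject missing values up front, so no half-initialised instance can exist.
        this.city = Objects.requireNonNull(city, "city must not be null");
        this.country = Objects.requireNonNull(country, "country must not be null");
        if (city.trim().isEmpty() || country.trim().isEmpty()) {
            throw new IllegalArgumentException("city and country must not be blank");
        }
    }

    String getCity() {
        return city;
    }

    String getCountry() {
        return country;
    }
}
```

With checks like these, any Address that exists at runtime is known to have both a city and a country.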

Thread safety

Since the object cannot be changed, it can be shared between threads without having race conditions or data mutation issues.
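
As a rough illustration, the sketch below reads one immutable Address from several tasks at once without any locking. SharedAddressDemo and its method name are made up for this example, and the use of CompletableFuture with the common pool is just one convenient way to get concurrent readers.

```java
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.stream.Collectors;
import java.util.stream.IntStream;

// Minimal immutable Address: final class, final fields, no setters.
final class Address {
    private final String city;
    private final String country;

    Address(String city, String country) {
        this.city = city;
        this.country = country;
    }

    String getCity() {
        return city;
    }

    String getCountry() {
        return country;
    }
}

// Hypothetical demo class, not part of the original post.
final class SharedAddressDemo {
    // Every task reads the same instance; no synchronisation is needed
    // because nothing can change the object after construction.
    static List<String> describeFromManyThreads(Address shared, int taskCount) {
        List<CompletableFuture<String>> futures = IntStream.range(0, taskCount)
                .mapToObj(i -> CompletableFuture.supplyAsync(
                        () -> shared.getCity() + ", " + shared.getCountry()))
                .collect(Collectors.toList());
        return futures.stream()
                .map(CompletableFuture::join)
                .collect(Collectors.toList());
    }
}
```

Each task sees the same city and country, because there is no write that could interleave with the reads.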

Easier to understand code

As with the code example in the previous section, it’s generally easier to use a constructor than initialisation methods. The constructor enforces the required arguments at compile time, whereas nothing forces callers to invoke setter or initialiser methods.

Easier to test

Since objects are more predictable, it’s not necessary to test all permutations of the initialiser methods; i.e. when calling the constructor of a class, the object is either valid or invalid. Other parts of the code which are using these classes become more predictable, having fewer chances of NullPointerExceptions. Sometimes, when passing objects around, there are methods which potentially mutate the state of the object. For example:

public boolean isOverseas(Address address) {
    if (!address.getCountry().equals("Australia")) {
        address.setOverseas(true); // address has now been mutated!
        return true;
    } else {
        return false;
    }
}

The above code, in general, is bad practice. It returns a boolean as well as potentially changing the state of the object. This makes the code harder to understand and to test. A better solution would be to remove the setter from the Address class, and return a boolean by testing for the country name. An even better way would be to move this logic to the Address class itself (address.isOverseas()). When state really needs to be set, make a copy of the original object without mutating the input.
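
A minimal sketch of those suggestions: the check lives on Address itself, and a change of state produces a new object instead of mutating the input. The HOME_COUNTRY constant and the withCountry method are assumptions made for this example.

```java
// Sketch of an Address that answers isOverseas() itself and never mutates.
final class Address {
    // Assumption for this example: "Australia" is the home country.
    private static final String HOME_COUNTRY = "Australia";

    private final String city;
    private final String country;

    Address(String city, String country) {
        this.city = city;
        this.country = country;
    }

    String getCity() {
        return city;
    }

    String getCountry() {
        return country;
    }

    // Pure query: derives the answer from existing state and changes nothing.
    boolean isOverseas() {
        return !HOME_COUNTRY.equals(country);
    }

    // Hypothetical copy method: returns a new Address, the original stays untouched.
    Address withCountry(String newCountry) {
        return new Address(city, newCountry);
    }
}
```

Because isOverseas() is derived from the country, there is no separate overseas flag that could drift out of sync with it.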

Can be used for value types

Imagine a money amount, say 10 dollars. 10 dollars will always be 10 dollars. In code, this could look like public Money(final BigInteger amount, final Currency currency). Provided the fields are final and the class has no setters, the value of 10 dollars cannot be changed to anything else once it has been constructed, so the class can safely be used as a value type.
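
Here is one way such a Money class might be fleshed out. Only the constructor signature comes from the text above; the plus method, the accessors, and the equals/hashCode implementations are assumptions added to show value semantics.

```java
import java.math.BigInteger;
import java.util.Currency;
import java.util.Objects;

// Sketch of an immutable value type around the constructor mentioned above.
final class Money {
    private final BigInteger amount;
    private final Currency currency;

    Money(final BigInteger amount, final Currency currency) {
        this.amount = Objects.requireNonNull(amount, "amount must not be null");
        this.currency = Objects.requireNonNull(currency, "currency must not be null");
    }

    BigInteger getAmount() {
        return amount;
    }

    Currency getCurrency() {
        return currency;
    }

    // Hypothetical operation: arithmetic returns a new Money instead of changing this one.
    Money plus(Money other) {
        if (!currency.equals(other.currency)) {
            throw new IllegalArgumentException("currency mismatch");
        }
        return new Money(amount.add(other.amount), currency);
    }

    // Two Money objects are equal when amount and currency are equal.
    @Override
    public boolean equals(Object o) {
        if (this == o) {
            return true;
        }
        if (!(o instanceof Money)) {
            return false;
        }
        Money other = (Money) o;
        return amount.equals(other.amount) && currency.equals(other.currency);
    }

    @Override
    public int hashCode() {
        return Objects.hash(amount, currency);
    }
}
```

Since equality depends only on fields that can never change, instances are safe to use as map keys or set members.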

Final references don’t make objects immutable

As mentioned before, one issue I regularly encounter is that many developers don’t fully understand the difference between final references and immutable objects. The common understanding seems to be that the moment a variable becomes final, the data structure becomes immutable. Unfortunately, it’s not that simple, and I’d like to clear up this misunderstanding once and for all:

A final reference does not make your objects immutable!

In other words, the following code does not make your objects immutable:

final Person person = new Person("John");

Why not? Well, while `person` is a final reference and cannot be reassigned, the Person class might have a setter or other mutator methods, which makes a call like this:

person.setName("Cindy");

quite an easy thing to do, regardless of the final modifier. Alternatively, the Person class might expose a list of addresses. Accessing this list allows you to add an address to it, and therefore to mutate the person object:

person.getAddresses().add(new Address("Sydney", "Australia"));

Again, our final reference did nothing to stop us from mutating the person object.
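
To make both cases concrete, here is a small self-contained sketch. MutablePerson is a hypothetical stand-in for a badly designed Person class, and it stores addresses as plain strings to keep the example short.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical mutable class, shown only to demonstrate the problem.
final class MutablePerson {
    private String name;
    private final List<String> addresses = new ArrayList<>();

    MutablePerson(String name) {
        this.name = name;
    }

    String getName() {
        return name;
    }

    void setName(String name) {
        this.name = name;
    }

    // Leaks the internal list, so callers can modify it directly.
    List<String> getAddresses() {
        return addresses;
    }
}

final class FinalReferenceDemo {
    static MutablePerson mutateThroughFinalReference() {
        final MutablePerson person = new MutablePerson("John");
        // person = new MutablePerson("Cindy"); // would not compile: the reference is final
        person.setName("Cindy");             // compiles fine: the object itself is mutable
        person.getAddresses().add("Sydney"); // also fine: the exposed list is mutable
        return person;
    }
}
```

The only thing the final keyword rules out here is the commented-out reassignment; both mutations go through.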

Okay, now that we’ve got that out of the way, let’s dive a little into how we can make a class immutable. There are a couple of things we need to keep in mind while designing our classes:

  • Don’t expose internal state in a mutable way
  • Don’t change the state internally
  • Make sure subclasses don’t override the above behaviour

With these guidelines in place, let’s design a better version of our Person class.

import java.util.List;

public final class Person {    // final class, so it can’t be subclassed
    private final String name; // final for safe publication in multithreaded applications
    private final List<Address> addresses;

    public Person(String name, List<Address> addresses) {
        this.name = name;
        // Copy the list to protect against outside mutations (List.copyOf requires Java 10+).
        // On older versions, use Collections.unmodifiableList(new ArrayList<>(addresses)).
        this.addresses = List.copyOf(addresses);
    }

    public String getName() {
        return this.name;   // String is immutable, okay to expose
    }

    public List<Address> getAddresses() {
        return addresses;   // unmodifiable copy of immutable Address objects, okay to expose
    }
}

public final class Address {    // final class, so it can’t be subclassed
    private final String city;  // fields only use immutable types
    private final String country;

    public Address(String city, String country) {
        this.city = city;
        this.country = country;
    }

    public String getCity() {
        return city;
    }

    public String getCountry() {
        return country;
    }
}

These classes can now be used like this:

import java.util.List;

final Person person = new Person("John", List.of(new Address("Sydney", "Australia")));

The person object is now immutable due to the design of the Person and Address classes, and the final reference additionally makes it impossible to reassign the person variable to anything else.

Update: As some people mentioned, an earlier version of the above code was still mutable because I didn’t make a copy of the list of addresses in the constructor. Without that copy (List.copyOf, or new ArrayList<>() on older Java versions), it’s still possible to do the following:

final List<Address> addresses = new ArrayList<>();
addresses.add(new Address("Sydney", "Australia"));
final Person person = new Person("John", addresses);
addresses.clear();

However, since a copy is now made in the constructor, the above code no longer affects the address list held by the Person class, making the code safe. Thanks all for spotting this!
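
The effect of the defensive copy can be checked with a short sketch. The Person and Address classes below are condensed versions of the ones in this post, and DefensiveCopyDemo is a hypothetical helper added for the demonstration; List.copyOf assumes Java 10 or newer.

```java
import java.util.ArrayList;
import java.util.List;

// Condensed versions of the Person and Address classes from this post.
final class Address {
    private final String city;
    private final String country;

    Address(String city, String country) {
        this.city = city;
        this.country = country;
    }

    String getCity() {
        return city;
    }

    String getCountry() {
        return country;
    }
}

final class Person {
    private final String name;
    private final List<Address> addresses;

    Person(String name, List<Address> addresses) {
        this.name = name;
        this.addresses = List.copyOf(addresses); // defensive, unmodifiable copy
    }

    String getName() {
        return name;
    }

    List<Address> getAddresses() {
        return addresses;
    }
}

// Hypothetical demo class, not part of the original post.
final class DefensiveCopyDemo {
    static Person buildThenClearInput() {
        final List<Address> addresses = new ArrayList<>();
        addresses.add(new Address("Sydney", "Australia"));
        final Person person = new Person("John", addresses);
        addresses.clear(); // only empties the caller's list, not the copy inside Person
        return person;
    }
}
```

The returned person still has its Sydney address, and attempts to modify the list returned by getAddresses() fail with an UnsupportedOperationException.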

I hope the above helps in understanding the differences between final and immutability. If you have any comments or feedback, please let me know in the comments below.

Again, many thanks to my colleague Winston for taking the time to proofread and review this blog post!
