Consistency with equals in Comparators
Before You Start
Check each box you can do from memory. A box you cannot check yet is not a problem; it points you to a quick refresher, not a grade.
Not sure? Take the 60-second self-check.
1. TreeSet equality. A TreeSet<String> is constructed with a custom comparator. Two strings, "abc" and "ABC", are added. How does TreeSet decide whether they are duplicates?
Check
TreeSet uses the comparator, not .equals. If the comparator returns 0 for those two strings, TreeSet treats them as the same element and keeps only the first one added.
2. String.equals. Does "abc".equals("ABC") return true or false?
Check
false. String.equals is case-sensitive.
3. Contract strength. Is consistency with equals a required rule or a recommended rule for Comparator?
Check
Recommended, not required. The Comparator javadoc says “strongly recommended.” Some useful comparators deliberately violate it; doing so has documented consequences for sorted collections.
Try This First
Predict what this code prints:
Set<String> s = new TreeSet<>(String.CASE_INSENSITIVE_ORDER);
s.add("abc");
s.add("ABC");
s.add("AbC");
System.out.println(s.size());
Write your prediction before expanding.
Check your prediction
The output is 1. String.CASE_INSENSITIVE_ORDER returns 0 for "abc" vs "ABC" and for "abc" vs "AbC". TreeSet uses the comparator to detect duplicates, so the second and third add calls are no-ops. Only the first string survives.
What You Need To Walk In With
Most comparators should agree with .equals: if two objects compare equal (the comparator returns 0), they should also be .equals-equal, and vice versa. When that agreement holds, the comparator is consistent with equals. Sorted collections (TreeSet, TreeMap) use the comparator instead of .equals to detect duplicates, so inconsistency causes surprising data loss unless you expect and document it.
Once you work through this, you can: predict the size of a TreeSet built with an inconsistent comparator, identify whether a comparator is consistent with .equals by reading its logic, and choose between a TreeSet and a HashSet + sorted list when you need both distinct identity and case-insensitive display order.
How It Works
What “consistent with equals” means
A comparator cmp is consistent with equals for type T if:
cmp.compare(a, b) == 0 implies a.equals(b)
and conversely:
a.equals(b) implies cmp.compare(a, b) == 0
Both directions must hold. If either fails, the comparator is inconsistent with equals.
Why TreeSet uses the comparator, not equals
TreeSet<T> is backed by a red-black tree. The tree uses comparison results to navigate. When add(e) is called, the tree searches for a node where compare(existing, e) == 0. If it finds one, the element is considered a duplicate and the add is ignored. The tree never calls .equals at all.
The Java 25 TreeSet API documents this:
Note that the ordering maintained by a set (whether or not an explicit comparator is provided) must be consistent with equals if it is to correctly implement the
Setinterface.
“Correctly implement the Set interface” means: set.contains(a) returns true only when .equals would agree. When the comparator is inconsistent with equals, TreeSet behaves as a valid sorted collection but not as a standard Set.
The canonical example: String.CASEINSENSITIVEORDER
String.CASE_INSENSITIVE_ORDER is a built-in Comparator<String> that returns 0 for strings that differ only in case.
compare("abc", "ABC")returns0."abc".equals("ABC")returnsfalse.
This comparator is therefore not consistent with equals. The JDK documents this explicitly in the String.CASEINSENSITIVEORDER javadoc. Using it intentionally in a TreeSet is valid; using it accidentally when you expected standard set semantics is a bug.
When inconsistency is intentional and acceptable
Some legitimate use cases require an ordering that groups objects the comparator considers equal even though equals would not. Case-insensitive string collation is the most common example. The rule is: when you intentionally use an inconsistent comparator, document it clearly on the TreeSet or TreeMap construction site.
Quick check
Check your understanding
String.CASEINSENSITIVEORDER compares “hello” and “HELLO” and returns 0. String.equals considers them unequal. Which statement about this comparator is correct?
Worked Example: Predict Then Check
// Scenario A
Set<String> a = new TreeSet<>(); // natural order, consistent with equals
a.add("abc");
a.add("ABC");
System.out.println("A: " + a.size());
// Scenario B
Set<String> b = new TreeSet<>(String.CASE_INSENSITIVE_ORDER);
b.add("abc");
b.add("ABC");
System.out.println("B: " + b.size());
// Scenario C
Map<String, Integer> c = new TreeMap<>(String.CASE_INSENSITIVE_ORDER);
c.put("abc", 1);
c.put("ABC", 2);
System.out.println("C: " + c.size() + " value=" + c.get("abc"));
Predict all three outputs.
Step-by-step reasoning
A: Natural String ordering is consistent with .equals. "abc".equals("ABC") is false, and "abc".compareTo("ABC") is also non-zero (lowercase ‘a’ has a higher Unicode value than uppercase ‘A’). Both strings are distinct in the TreeSet. Size: 2.
B: CASE_INSENSITIVE_ORDER.compare("abc", "ABC") returns 0. The second add is treated as a duplicate. Size: 1.
C: TreeMap also uses the comparator for key equality. put("ABC", 2) finds the existing key "abc" (comparator returns 0) and updates its value. Size: 1. c.get("abc") returns 2 (the updated value), because the comparator considers "abc" equal to the key "ABC" under which the map stored 2.
Show answers
A: 2
B: 1
C: 1 value=2
Common Misconceptions
Misconception 1: TreeSet uses .equals like every other Set
Wrong mental model: “All
Setimplementations use.equalsto detect duplicates.HashSetuses hash codes then.equals.TreeSetmust do the same.”
Why it breaks: TreeSet has no hash table. Its internal red-black tree uses comparisons for navigation. The only equality check available is compare(...) == 0. If a comparator returns 0 for two .equals-distinct objects, the second object is silently dropped on add. The element is gone with no exception and no warning.
How to correct: When using a TreeSet with a custom comparator, verify that the comparator is consistent with .equals unless you intentionally want the set to deduplicate by the comparator’s criteria.
Source: Java 25 TreeSet javadoc; Bloch, Effective Java, Item 14.
Quick check
Check your understanding
A TreeSet<String> is built with a custom comparator. The strings “river” and “River” are both added. The comparator returns 0 for these two strings. What happens when the second add is called?
Misconception 2: overriding equals and compareTo independently is always safe
Wrong mental model: “
equalsandcompareToare separate contracts, so I can define them with different fields without consequence.”
Why it breaks: When you implement Comparable<T>, the natural ordering is used by TreeSet and TreeMap as the default duplicate-detection mechanism (without any comparator argument). If compareTo and equals use different fields, sorted collections can silently drop or retain objects in ways that violate the expected Set or Map semantics.
How to correct: Use the same fields in equals and compareTo. If the fields must differ for a legitimate reason (for example, an ID for identity but a name for ordering), document the inconsistency explicitly.
Source: Java 25 Comparable javadoc; Bloch, Effective Java, Items 10 and 14.
Formal Definition and Interface Contract
From the Java 25 Comparator API:
It is generally the case, but not strictly required, that
(compare(x, y) == 0) == (x.equals(y)). Generally speaking, any comparator that violates this condition should clearly indicate this fact. The recommended language is “Note: this comparator imposes orderings that are inconsistent with equals.”
From the Java 25 TreeSet API:
The behavior of a set is well-defined even if its ordering is inconsistent with equals; it just fails to obey the general contract of the
Setinterface.
Mental Model
Think of TreeSet as a filing cabinet where the filing rule is the comparator: two items go in the same folder if and only if the comparator says they are equal. HashSet files by equals and hash code. When the two filing rules agree (consistent with equals), both cabinets organize the same way and all clients are happy. When they disagree, you get two different cabinets with different contents for the same data. Neither is wrong, but clients who expect one and get the other will be confused.
Connections
Within CSCD 210/211: The equals contract was introduced in CSCD 210. This page shows that compareTo is not free to define equality independently of equals without consequences for sorted collections.
Looking back: Antisymmetry and Transitivity in Comparators established the first two contract rules. This page adds the third: the recommendation to agree with equals.
Looking ahead: Implementing Comparator<T> as a Named Class applies all three contract rules in a concrete implementation. A later page in this lane revisits the relationship between equals, hashCode, and compareTo when multiple orderings coexist on one class.
Go Deeper (optional)
None of this depth is needed to write correct comparators. It is here for anyone who enjoys seeing where the idea comes from.
The contract rules for a comparator are exactly the axioms of a strict total order in mathematics: irreflexivity, asymmetry, transitivity, and totality (every pair is comparable). When a comparator returns 0 for two objects, it is declaring them equivalent under that order. The consistency-with-equals rule then says: the equivalence classes induced by the comparator should match the equivalence classes induced by equals. When they do not match, you have two competing notions of sameness on the same type, and sorted collections are forced to pick one.
The function-object pattern the Comparator interface embodies is the Strategy pattern from Gamma et al., Design Patterns (1994): the sorting algorithm is fixed, but the comparison strategy is injected. A Comparator is also the simplest example of a first-class function in Java: it is behavior packaged as an object, which is exactly what lambdas formalize.
In professional code, inconsistency with equals appears most visibly in locale-aware text processing. The java.text.Collator class (part of the Java standard library) acts as a Comparator<String> that respects language-specific ordering rules. French, German, and Japanese all sort strings in ways that String.equals would never consider equal. The Java API design explicitly accepts this inconsistency and documents it, because real-world language support requires it. When you write autocomplete, search, or any sorted display of user-visible strings, you will encounter this.
Practice
Level 1
Predict what the following code prints, and explain why:
Set<String> set = new TreeSet<>(String.CASE_INSENSITIVE_ORDER);
set.add("banana");
set.add("Apple");
set.add("BANANA");
System.out.println(set.size());
System.out.println(set.contains("apple"));
Thought process
CASE_INSENSITIVE_ORDER returns 0 for "banana" and "BANANA". "Apple" vs "banana": the comparator compares case-insensitively, so "apple" vs "banana" is non-zero. After adding all three, what does contains("apple") do?
Show answer
size() is 2: "banana" and "Apple" are in the set, but "BANANA" is a case-insensitive duplicate of "banana" and is dropped.
contains("apple") returns true: the comparator considers "apple" equal to "Apple", so the tree finds a match.
2
true
Level 2
A student builds a phone-book lookup with TreeSet<String>(String.CASE_INSENSITIVE_ORDER). They want case-insensitive display but also need to know whether "Smith" (capitalized) and "smith" (lowercase) are the same person or two different entries. Explain the problem with their design and propose a corrected layout.
Thought process
In a TreeSet with CASE_INSENSITIVE_ORDER, "Smith" and "smith" are treated as duplicates. There is no way to distinguish them inside the set. The correction separates identity (use a data structure that uses .equals) from display order (use a list sorted with a case-insensitive comparator).
Show answer
Problem: TreeSet<String>(String.CASE_INSENSITIVE_ORDER) silently drops "smith" if "Smith" is already present (or vice versa). The set cannot store both.
Corrected layout:
// Identity: store originals in a HashSet (uses .equals, case-sensitive)
Set<String> identitySet = new HashSet<>(rawEntries);
// Display: sort a copy of the entries case-insensitively for output
List<String> display = new ArrayList<>(identitySet);
display.sort(String.CASE_INSENSITIVE_ORDER);
This keeps both entries distinct (identity) while providing case-insensitive alphabetical order for display.
Level 3
A Product class has String id (unique) and String name (display). The natural ordering is by name, but equals is defined on id. Explain the consequence of this design for a TreeSet<Product>, and rewrite either compareTo or equals to eliminate the inconsistency while preserving the desired behavior for both sorted display and identity.
Thought process
compareTo by name and equals by id are inconsistent. Two products with the same name but different IDs would be treated as duplicates by TreeSet, even though they are not equals-equal. One approach: make compareTo use id (consistent, but loses name ordering). Another approach: keep name ordering in a Comparator and use equals-by-id everywhere, keeping the natural order as compareTo-by-id.
Show answer
The problem: TreeSet<Product> uses compareTo (by name) for deduplication. Two products with the same name but different IDs are dropped as duplicates, even though equals (by id) would consider them distinct. The set violates the Set interface contract.
Option A: consistent natural order (recommended for sorted collections). Change compareTo to order by id (matching equals). For sorted display by name, pass a Comparator to TreeSet or sort a list separately.
@Override
public int compareTo(final Product other) {
if (other == null)
throw new IllegalArgumentException("Bad Product in Product.compareTo");
return this.id.compareTo(other.getId());
}
Option B: document and isolate inconsistency. Keep the name-based compareTo and document clearly that a TreeSet<Product> without an explicit comparator deduplicates by name, not by id. This is intentional inconsistency and requires explicit documentation.
Option A is preferred for most contexts. Bloch, Effective Java, Item 14 recommends consistency with equals for natural orderings.
Check Yourself
Close the notes and answer each one from memory, then reveal it. Pulling an idea back from memory is one of the strongest ways to make it stick.
Check your understanding
A Comparator<String> is called consistent with equals when which condition holds?
Is consistency with equals required or merely recommended for a Comparator in Java?
A TreeSet<String> is constructed with String.CASEINSENSITIVEORDER. The strings “abc”, “ABC”, and “AbC” are added in that order. What does size() return?
Predict what the following code prints: Map<String, Integer> c = new TreeMap<>(String.CASEINSENSITIVEORDER); c.put(“abc”, 1); c.put(“ABC”, 2); System.out.println(c.size() + “ “ + c.get(“abc”));
A student stores phone-book names in a TreeSet<String>(String.CASEINSENSITIVEORDER), intending to keep both “Smith” and “smith” as separate entries with case-insensitive display order. What is the problem with this design?