Collator - JRE Emulation
public abstract class

Collator

extends Object
implements Comparator<Object>, Cloneable
java.lang.Object
   ↳ java.text.Collator

Class Overview

Performs locale-sensitive string comparison.

Following the Unicode Consortium's specifications for the Unicode Collation Algorithm (UCA), there are 4 different levels of strength used in comparisons:

  • PRIMARY strength: Typically, this is used to denote differences between base characters (for example, "a" < "b"). It is the strongest difference. For example, dictionaries are divided into different sections by base character.
  • SECONDARY strength: Accents in the characters are considered secondary differences (for example, "as" < "às" < "at"). Other differences between letters can also be considered secondary differences, depending on the language. A secondary difference is ignored when there is a primary difference anywhere in the strings.
  • TERTIARY strength: Upper and lower case differences in characters are distinguished at tertiary strength (for example, "ao" < "Ao" < "aò"). In addition, a variant of a letter differs from the base form on the tertiary strength (such as "A" and "Ⓐ"). Another example is the difference between large and small Kana. A tertiary difference is ignored when there is a primary or secondary difference anywhere in the strings.
  • IDENTICAL strength: When all other strengths are equal, the IDENTICAL strength is used as a tiebreaker. The Unicode code point values of the NFD form of each string are compared in case no difference was found at the other levels. For example, Hebrew cantillation marks are only distinguished at this strength. This strength should be used sparingly, because strings that differ only in code point values are extremely rare. Using this strength substantially decreases the performance of both comparison and collation key generation, and it also increases the size of the collation key.
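
The effect of the strength levels can be observed by comparing the same pair of strings at different settings. The following is a minimal sketch, not a definitive reference: the class name StrengthSketch and the helper equalAt are illustrative names that are not part of the API, and the behaviour described in the comments assumes the usual collation rules for Locale.US.

```java
import java.text.Collator;
import java.util.Locale;

// Illustrative sketch; StrengthSketch and equalAt are hypothetical names.
final class StrengthSketch {

    // Returns true if the two strings compare as equal at the given strength.
    static boolean equalAt(int strength, String a, String b) {
        Collator collator = Collator.getInstance(Locale.US);
        collator.setStrength(strength);
        return collator.compare(a, b) == 0;
    }

    public static void main(String[] args) {
        String accented = "\u00e0s"; // "as" with a grave accent on the a

        // Base letters differ: a primary difference, visible at every strength.
        System.out.println(equalAt(Collator.PRIMARY, "a", "b"));
        // Accents are a secondary difference: ignored at PRIMARY, seen at SECONDARY.
        System.out.println(equalAt(Collator.PRIMARY, "as", accented));
        System.out.println(equalAt(Collator.SECONDARY, "as", accented));
        // Case is a tertiary difference: ignored at SECONDARY, seen at TERTIARY.
        System.out.println(equalAt(Collator.SECONDARY, "abc", "ABC"));
        System.out.println(equalAt(Collator.TERTIARY, "abc", "ABC"));
    }
}
```

A fresh Collator is requested on every call to keep the sketch short; code that compares many strings would normally create the collator once and reuse it.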

This Collator supports only two decomposition modes: canonical decomposition and no decomposition. The compatibility decomposition mode java.text.Collator.FULL_DECOMPOSITION is not supported here. If the canonical decomposition mode is set, Collator handles un-normalized text properly, producing the same results as if the text were normalized in NFD. If canonical decomposition is turned off, it is the user's responsibility to ensure that all text is already in the appropriate form before performing a comparison or before getting a CollationKey.

Examples:

 // Get the Collator for US English and set its strength to PRIMARY
 Collator usCollator = Collator.getInstance(Locale.US);
 usCollator.setStrength(Collator.PRIMARY);
 if (usCollator.compare("abc", "ABC") == 0) {
     System.out.println("Strings are equivalent");
 }
 

The following example shows how to compare two strings using the collator for the default locale.

 // Compare two canonically equivalent strings in the default locale.
 // composed:   U+00E0 (a with grave) followed by U+0325 (combining ring below)
 // decomposed: 'a' followed by U+0325 and U+0300 (combining grave accent)
 String composed = "\u00e0\u0325";
 String decomposed = "a\u0325\u0300";
 Collator myCollator = Collator.getInstance();
 myCollator.setDecomposition(Collator.NO_DECOMPOSITION);
 if (myCollator.compare(composed, decomposed) != 0) {
     System.out.println("The strings are not equal without decomposition");
     myCollator.setDecomposition(Collator.CANONICAL_DECOMPOSITION);
     if (myCollator.compare(composed, decomposed) != 0) {
         System.out.println("Error: the strings should be equal with decomposition");
     } else {
         System.out.println("The strings are equal with decomposition");
     }
 } else {
     System.out.println("Error: the strings should not be equal without decomposition");
 }
 

Summary

Constants
int CANONICAL_DECOMPOSITION Constant used to specify the decomposition rule.
int FULL_DECOMPOSITION Constant used to specify the decomposition rule.
int IDENTICAL Constant used to specify the collation strength.
int NO_DECOMPOSITION Constant used to specify the decomposition rule.
int PRIMARY Constant used to specify the collation strength.
int SECONDARY Constant used to specify the collation strength.
int TERTIARY Constant used to specify the collation strength.
Public Constructors
Collator()
Public Methods
Object clone()
Creates and returns a copy of this Object.
int compare(Object object1, Object object2)
Compares two objects to determine their relative order.
abstract int compare(String string1, String string2)
Compares two strings to determine their relative order.
boolean equals(String string1, String string2)
Compares two strings using the collation rules to determine if they are equal.
static Locale[] getAvailableLocales()
Returns an array of locales for which custom Collator instances are available.
abstract CollationKey getCollationKey(String string)
Returns a CollationKey for the specified string for this collator with the current decomposition rule and strength value.
abstract int getDecomposition()
Returns the decomposition rule for this collator.
static Collator getInstance()
Returns a Collator instance which is appropriate for the user's default Locale.
static Collator getInstance(Locale locale)
Returns a Collator instance which is appropriate for locale.
abstract int getStrength()
Returns the strength value for this collator.
abstract void setDecomposition(int value)
Sets the decomposition rule for this collator.
abstract void setStrength(int value)
Sets the strength value for this collator.
Inherited Methods
From class java.lang.Object
From interface java.util.Comparator

Constants

public static final int CANONICAL_DECOMPOSITION

Constant used to specify the decomposition rule.

Constant Value: 1 (0x00000001)

public static final int FULL_DECOMPOSITION

Constant used to specify the decomposition rule. This value for decomposition is not supported.

Constant Value: 2 (0x00000002)

public static final int IDENTICAL

Constant used to specify the collation strength.

Constant Value: 3 (0x00000003)

public static final int NO_DECOMPOSITION

Constant used to specify the decomposition rule.

Constant Value: 0 (0x00000000)

public static final int PRIMARY

Constant used to specify the collation strength.

Constant Value: 0 (0x00000000)

public static final int SECONDARY

Constant used to specify the collation strength.

Constant Value: 1 (0x00000001)

public static final int TERTIARY

Constant used to specify the collation strength.

Constant Value: 2 (0x00000002)

Public Constructors

public Collator ()

Public Methods

public Object clone ()

Creates and returns a copy of this Object. The default implementation returns a so-called "shallow" copy: It creates a new instance of the same class and then copies the field values (including object references) from this instance to the new instance. A "deep" copy, in contrast, would also recursively clone nested objects. A subclass that needs to implement this kind of cloning should call super.clone() to create the new instance and then create deep copies of the nested, mutable objects.

Returns
  • a copy of this object.
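
Because strength and decomposition are mutable settings, cloning is one way to derive a differently configured collator without disturbing a shared instance. The sketch below illustrates this under stated assumptions: CollatorCloneSketch and primaryCopyOf are hypothetical names, and the concrete Collator returned by getInstance is assumed to support clone().

```java
import java.text.Collator;

// Illustrative sketch; CollatorCloneSketch and primaryCopyOf are hypothetical names.
final class CollatorCloneSketch {

    // Returns an independent copy of base that compares at PRIMARY strength.
    // The strength of base itself is left untouched.
    static Collator primaryCopyOf(Collator base) {
        Collator copy = (Collator) base.clone();
        copy.setStrength(Collator.PRIMARY);
        return copy;
    }
}
```

A caller could keep one TERTIARY collator for sorting and use primaryCopyOf(thatCollator) for loose matching.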

public int compare (Object object1, Object object2)

Compares two objects to determine their relative order. The objects must be strings.

Parameters
object1 the first string to compare.
object2 the second string to compare.
Returns
  • a negative value if object1 is less than object2, 0 if they are equal, and a positive value if object1 is greater than object2.
Throws
ClassCastException if object1 or object2 is not a String.
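
Since Collator implements Comparator<Object>, this overload is what allows a collator to be passed wherever a comparator is expected, for example when sorting a list of strings. The following sketch shows that use together with the ClassCastException case; CollatorSortSketch, sorted and rejectsNonStrings are illustrative names, not part of the API.

```java
import java.text.Collator;
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

// Illustrative sketch; all names in this class are hypothetical.
final class CollatorSortSketch {

    // Returns a new list with the names in the collation order of the locale.
    static List<String> sorted(List<String> names, Locale locale) {
        List<String> copy = new ArrayList<>(names);
        copy.sort(Collator.getInstance(locale)); // the collator acts as the Comparator
        return copy;
    }

    // Returns true if compare(Object, Object) rejects a non-String argument.
    static boolean rejectsNonStrings(Collator collator) {
        try {
            collator.compare((Object) Integer.valueOf(1), (Object) "a");
            return false;
        } catch (ClassCastException expected) {
            return true;
        }
    }
}
```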

public abstract int compare (String string1, String string2)

Compares two strings to determine their relative order.

Parameters
string1 the first string to compare.
string2 the second string to compare.
Returns
  • a negative value if string1 is less than string2, 0 if they are equal, and a positive value if string1 is greater than string2.

public boolean equals (String string1, String string2)

Compares two strings using the collation rules to determine if they are equal.

Parameters
string1 the first string to compare.
string2 the second string to compare.
Returns
  • true if string1 and string2 are equal using the collation rules, false otherwise.
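
The result depends on the collator's current strength, so equals can serve as a case-insensitive and accent-insensitive match when the strength is PRIMARY. A minimal sketch follows; CollatorEqualsSketch and sameLetters are hypothetical names, and Locale.ENGLISH is chosen only for illustration.

```java
import java.text.Collator;
import java.util.Locale;

// Illustrative sketch; CollatorEqualsSketch and sameLetters are hypothetical names.
final class CollatorEqualsSketch {

    // Returns true if a and b have the same base letters,
    // ignoring accent and case differences.
    static boolean sameLetters(String a, String b) {
        Collator collator = Collator.getInstance(Locale.ENGLISH);
        collator.setStrength(Collator.PRIMARY);
        return collator.equals(a, b);
    }
}
```

With this helper, "resume" and "R\u00e9sum\u00e9" (with acute accents on both e's) are expected to match, while "resume" and "resumes" are not.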

public static Locale[] getAvailableLocales ()

Returns an array of locales for which custom Collator instances are available.

Note that Android does not support user-supplied locale service providers.

public abstract CollationKey getCollationKey (String string)

Returns a CollationKey for the specified string for this collator with the current decomposition rule and strength value.

Parameters
string the source string that is converted into a collation key.
Returns
  • the collation key for string.
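
Collation keys are mainly useful when the same strings are compared many times, for example while sorting a long list: each string is converted once and the keys are then compared directly. The sketch below assumes the standard java.text.CollationKey class, which implements Comparable<CollationKey> and exposes getSourceString(); CollationKeySortSketch and sortWithKeys are hypothetical names.

```java
import java.text.CollationKey;
import java.text.Collator;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Illustrative sketch; CollationKeySortSketch and sortWithKeys are hypothetical names.
final class CollationKeySortSketch {

    // Sorts the input by precomputed collation keys and returns the
    // strings in collation order. The input list is not modified.
    static List<String> sortWithKeys(Collator collator, List<String> input) {
        List<CollationKey> keys = new ArrayList<>(input.size());
        for (String s : input) {
            keys.add(collator.getCollationKey(s)); // one conversion per string
        }
        Collections.sort(keys); // compares keys, not the original strings

        List<String> result = new ArrayList<>(keys.size());
        for (CollationKey key : keys) {
            result.add(key.getSourceString());
        }
        return result;
    }
}
```

Keys are only comparable with other keys produced by the same collator under the same strength and decomposition settings, so the collator should not be reconfigured between generating keys and comparing them.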

public abstract int getDecomposition ()

Returns the decomposition rule for this collator.

Returns
  • the decomposition rule, either NO_DECOMPOSITION or CANONICAL_DECOMPOSITION. FULL_DECOMPOSITION is not supported.

public static Collator getInstance ()

Returns a Collator instance which is appropriate for the user's default Locale. See "Be wary of the default locale".

public static Collator getInstance (Locale locale)

Returns a Collator instance which is appropriate for locale.
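
Passing an explicit locale makes the ordering predictable regardless of the device or JVM default. The sketch below contrasts code point order, as used by String.compareTo, with the order of a German collator; LocaleOrderSketch and collated are hypothetical names, and the comments assume the usual German collation rules, in which an umlaut is only a secondary difference.

```java
import java.text.Collator;
import java.util.Locale;

// Illustrative sketch; LocaleOrderSketch and collated are hypothetical names.
final class LocaleOrderSketch {

    // Compares a and b with a collator for the given locale.
    static int collated(Locale locale, String a, String b) {
        return Collator.getInstance(locale).compare(a, b);
    }

    public static void main(String[] args) {
        String apples = "\u00c4pfel"; // German for "apples", capital A with umlaut

        // Code point order: U+00C4 is greater than 'Z', so this sorts after "Zebra".
        System.out.println(apples.compareTo("Zebra") > 0);
        // Collation order: the umlauted A is treated as a form of A, before Z.
        System.out.println(collated(Locale.GERMAN, apples, "Zebra") < 0);
    }
}
```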

public abstract int getStrength ()

Returns the strength value for this collator.

Returns
  • the strength value, either PRIMARY, SECONDARY, TERTIARY or IDENTICAL.

public abstract void setDecomposition (int value)

Sets the decomposition rule for this collator.

Parameters
value the decomposition rule, either NO_DECOMPOSITION or CANONICAL_DECOMPOSITION. FULL_DECOMPOSITION is not supported.
Throws
IllegalArgumentException if the provided decomposition rule is not valid. This includes FULL_DECOMPOSITION.
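
Code that must also run on platforms where FULL_DECOMPOSITION is accepted may want to tolerate the IllegalArgumentException described above. One possible approach is sketched here; DecompositionSketch and applyDecomposition are hypothetical names, and falling back to CANONICAL_DECOMPOSITION is an assumption made for illustration, not a documented recommendation.

```java
import java.text.Collator;

// Illustrative sketch; DecompositionSketch and applyDecomposition are hypothetical names.
final class DecompositionSketch {

    // Tries to apply the preferred decomposition rule. If the collator
    // rejects it, falls back to CANONICAL_DECOMPOSITION. Returns the
    // rule that is actually in effect afterwards.
    static int applyDecomposition(Collator collator, int preferred) {
        try {
            collator.setDecomposition(preferred);
        } catch (IllegalArgumentException unsupported) {
            collator.setDecomposition(Collator.CANONICAL_DECOMPOSITION);
        }
        return collator.getDecomposition();
    }
}
```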

public abstract void setStrength (int value)

Sets the strength value for this collator.

Parameters
value the strength value, either PRIMARY, SECONDARY, TERTIARY, or IDENTICAL.
Throws
IllegalArgumentException if the provided strength value is not valid.
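
When the strength comes from configuration or user input, it can help to map a name to one of the four constants before calling setStrength, so that an invalid value is reported with a clear message. The following is a sketch only; StrengthConfigSketch, strengthFromName and configured are hypothetical names, and the accepted strings are an arbitrary choice.

```java
import java.text.Collator;
import java.util.Locale;

// Illustrative sketch; all names in this class are hypothetical.
final class StrengthConfigSketch {

    // Maps a case-insensitive name to the matching strength constant.
    static int strengthFromName(String name) {
        switch (name.toLowerCase(Locale.ROOT)) {
            case "primary":   return Collator.PRIMARY;
            case "secondary": return Collator.SECONDARY;
            case "tertiary":  return Collator.TERTIARY;
            case "identical": return Collator.IDENTICAL;
            default:
                throw new IllegalArgumentException("Unknown strength: " + name);
        }
    }

    // Returns a collator for the locale with the named strength applied.
    static Collator configured(Locale locale, String strengthName) {
        Collator collator = Collator.getInstance(locale);
        collator.setStrength(strengthFromName(strengthName));
        return collator;
    }
}
```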