org.apache.commons.lang

Class CharSet

public class CharSet extends Object implements Serializable

A set of characters.

Instances are immutable, but instances of subclasses may not be.

Since: 1.0

Version: $Id: CharSet.java 618884 2008-02-06 04:37:17Z bayard $

Author: Stephen Colebourne, Phil Steitz, Pete Gieser, Gary Gregory

Field Summary
static CharSet ASCII_ALPHA
A CharSet defining ASCII alphabetic characters "a-zA-Z".
static CharSet ASCII_ALPHA_LOWER
A CharSet defining ASCII alphabetic characters "a-z".
static CharSet ASCII_ALPHA_UPPER
A CharSet defining ASCII alphabetic characters "A-Z".
static CharSet ASCII_NUMERIC
A CharSet defining ASCII numeric characters "0-9".
protected static Map COMMON
A Map of the common cases used in the factory.
static CharSet EMPTY
A CharSet defining no characters.
Constructor Summary
protected CharSet(String setStr)

Constructs a new CharSet using the set syntax.

protected CharSet(String[] set)

Constructs a new CharSet using the set syntax.

Method Summary
protected void add(String str)

Adds a set definition string to the CharSet.

boolean contains(char ch)

Checks whether the CharSet contains the specified character ch.

boolean equals(Object obj)

Compares two CharSet objects, returning true if they represent exactly the same set of characters defined in the same way.

The two sets abc and a-c are not equal according to this method.

CharRange[] getCharRanges()

Gets the internal set as an array of CharRange objects.

static CharSet getInstance(String setStr)

Factory method to create a new CharSet using a special syntax.

  • null or empty string ("") - set containing no characters
  • Single character, such as "a" - set containing just that character
  • Multi character, such as "a-e" - set containing characters from one character to the other
  • Negated, such as "^a" or "^a-e" - set containing all characters except those defined
  • Combinations, such as "abe-g" - set containing all the characters from the individual sets

The matching order is:

  1. Negated multi character range, such as "^a-e"
  2. Ordinary multi character range, such as "a-e"
  3. Negated single character, such as "^a"
  4. Ordinary single character, such as "a"

Matching works left to right.

static CharSet getInstance(String[] setStrs)

Constructs a new CharSet using the set syntax.

int hashCode()

Gets a hashCode compatible with the equals method.

String toString()

Gets a string representation of the set.

Field Detail

ASCII_ALPHA

public static final CharSet ASCII_ALPHA
A CharSet defining ASCII alphabetic characters "a-zA-Z".

Since: 2.0

ASCII_ALPHA_LOWER

public static final CharSet ASCII_ALPHA_LOWER
A CharSet defining ASCII alphabetic characters "a-z".

Since: 2.0

ASCII_ALPHA_UPPER

public static final CharSet ASCII_ALPHA_UPPER
A CharSet defining ASCII alphabetic characters "A-Z".

Since: 2.0

ASCII_NUMERIC

public static final CharSet ASCII_NUMERIC
A CharSet defining ASCII numeric characters "0-9".

Since: 2.0

COMMON

protected static final Map COMMON
A Map of the common cases used in the factory. Subclasses can add more common patterns if desired.

Since: 2.0

EMPTY

public static final CharSet EMPTY
A CharSet defining no characters.

Since: 2.0

Constructor Detail

CharSet

protected CharSet(String setStr)

Constructs a new CharSet using the set syntax.

Parameters: setStr - the String describing the set, may be null

Since: 2.0

CharSet

protected CharSet(String[] set)

Constructs a new CharSet using the set syntax. Each string is merged in with the set.

Parameters: set - Strings to merge into the initial set

Throws: NullPointerException - if set is null

Method Detail

add

protected void add(String str)

Adds a set definition string to the CharSet.

Parameters: str - set definition string

contains

public boolean contains(char ch)

Checks whether the CharSet contains the specified character ch.

Parameters: ch - the character to check for

Returns: true if the set contains the character

equals

public boolean equals(Object obj)

Compares two CharSet objects, returning true if they represent exactly the same set of characters defined in the same way.

The two sets abc and a-c are not equal according to this method.

Parameters: obj - the object to compare to

Returns: true if equal

Since: 2.0
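
The difference between "same characters" and "defined in the same way" can be illustrated with a small stand-alone sketch. It does not use CharSet itself: the Definition class and its "start-end" string encoding are hypothetical stand-ins, chosen only to show why abc and a-c compare as unequal even though they contain the same characters.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

class DefinitionEqualitySketch {

    /** Hypothetical stand-in for a set definition; each range is written as "s-e". */
    public static final class Definition {
        private final Set<String> ranges;

        public Definition(String... ranges) {
            this.ranges = new HashSet<String>(Arrays.asList(ranges));
        }

        /** Membership test over the union of the ranges. */
        public boolean contains(char ch) {
            for (String r : ranges) {
                // Index 0 is the start character, index 2 the end character.
                if (ch >= r.charAt(0) && ch <= r.charAt(2)) {
                    return true;
                }
            }
            return false;
        }

        /** Equal only when the same ranges were defined, not merely the same characters. */
        @Override
        public boolean equals(Object obj) {
            return obj instanceof Definition && ranges.equals(((Definition) obj).ranges);
        }

        /** Derived from the same field as equals, so equal definitions share a hash code. */
        @Override
        public int hashCode() {
            return ranges.hashCode();
        }
    }
}
```

In this sketch, new Definition("a-a", "b-b", "c-c") and new Definition("a-c") contain exactly the characters a, b and c, yet equals returns false because the underlying ranges differ. Callers who need to compare by character membership would have to test the characters themselves.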

getCharRanges

public CharRange[] getCharRanges()

Gets the internal set as an array of CharRange objects.

Returns: an array of immutable CharRange objects

Since: 2.0

getInstance

public static CharSet getInstance(String setStr)

Factory method to create a new CharSet using a special syntax.

The matching order is:

  1. Negated multi character range, such as "^a-e"
  2. Ordinary multi character range, such as "a-e"
  3. Negated single character, such as "^a"
  4. Ordinary single character, such as "a"

Matching works left to right. Once a match is found the search starts again from the next character.

If the same range is defined twice using the same syntax, only one range will be kept. Thus, "a-ca-c" creates only one range of "a-c".

If the start and end of a range are in the wrong order, they are reversed. Thus "a-e" is the same as "e-a". As a result, "a-ee-a" would create only one range, as the "a-e" and "e-a" are the same.

The set of characters represented is the union of the specified ranges.

All CharSet objects returned by this method will be immutable.

Parameters: setStr - the String describing the set, may be null

Returns: a CharSet instance

Since: 2.0
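
The matching rules above can be sketched as a small stand-alone parser. This is an illustrative reimplementation written from the description on this page, not the library's source: the class SetSyntaxSketch, its nested Range class, and the parse and contains helpers are all hypothetical names.

```java
import java.util.LinkedHashSet;
import java.util.Set;

class SetSyntaxSketch {

    /** One parsed range; a single character is a range whose start equals its end. */
    public static final class Range {
        final char start;
        final char end;
        final boolean negated;

        Range(char a, char b, boolean negated) {
            // Endpoints given in the wrong order are reversed, so "e-a" becomes "a-e".
            this.start = (char) Math.min(a, b);
            this.end = (char) Math.max(a, b);
            this.negated = negated;
        }

        boolean contains(char ch) {
            boolean inside = ch >= start && ch <= end;
            return inside != negated;
        }

        @Override
        public boolean equals(Object obj) {
            if (!(obj instanceof Range)) {
                return false;
            }
            Range other = (Range) obj;
            return start == other.start && end == other.end && negated == other.negated;
        }

        @Override
        public int hashCode() {
            return 31 * (31 * start + end) + (negated ? 1 : 0);
        }
    }

    /** Parses left to right, trying the four forms in the documented matching order. */
    public static Set<Range> parse(String setStr) {
        // A Set keeps only one copy of a range that is defined twice.
        Set<Range> ranges = new LinkedHashSet<Range>();
        if (setStr == null) {
            return ranges;
        }
        int len = setStr.length();
        int pos = 0;
        while (pos < len) {
            int rest = len - pos;
            if (rest >= 4 && setStr.charAt(pos) == '^' && setStr.charAt(pos + 2) == '-') {
                // 1. Negated multi character range, such as "^a-e"
                ranges.add(new Range(setStr.charAt(pos + 1), setStr.charAt(pos + 3), true));
                pos += 4;
            } else if (rest >= 3 && setStr.charAt(pos + 1) == '-') {
                // 2. Ordinary multi character range, such as "a-e"
                ranges.add(new Range(setStr.charAt(pos), setStr.charAt(pos + 2), false));
                pos += 3;
            } else if (rest >= 2 && setStr.charAt(pos) == '^') {
                // 3. Negated single character, such as "^a"
                char c = setStr.charAt(pos + 1);
                ranges.add(new Range(c, c, true));
                pos += 2;
            } else {
                // 4. Ordinary single character, such as "a"
                char c = setStr.charAt(pos);
                ranges.add(new Range(c, c, false));
                pos += 1;
            }
        }
        return ranges;
    }

    /** The represented set is the union of the ranges: any matching range suffices. */
    public static boolean contains(String setStr, char ch) {
        for (Range range : parse(setStr)) {
            if (range.contains(ch)) {
                return true;
            }
        }
        return false;
    }
}
```

Under these assumptions, parse("a-ca-c") and parse("a-ee-a") each yield a single range, and parse("abe-g") yields three: a, b and e-g. Because the result is a union, a negated range contributes every character outside its bounds.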

getInstance

public static CharSet getInstance(String[] setStrs)

Constructs a new CharSet using the set syntax. Each string is merged in with the set.

Parameters: setStrs - Strings to merge into the initial set, may be null

Returns: a CharSet instance

Since: 2.4

hashCode

public int hashCode()

Gets a hashCode compatible with the equals method.

Returns: a suitable hashCode

Since: 2.0

toString

public String toString()

Gets a string representation of the set.

Returns: string representation of the set

Copyright © 2001-2011 - Apache Software Foundation