16 Strings, Characters and Regular Expressions
The chief defect of Henry King
Was chewing little bits of string.
—Hilaire Belloc
Vigorous writing is concise. A sentence should contain no unnecessary words, a paragraph no unnecessary sentences.
—William Strunk, Jr.
I have made this letter longer than usual, because I lack the time to make it short.
—Blaise Pascal
Objectives
In this chapter you’ll learn:
• To create and manipulate immutable character-string objects of class String.
• To create and manipulate mutable character-string objects of class StringBuilder.
• To create and manipulate objects of class Character.
• To break a String object into tokens using String method split.
• To use regular expressions to validate String data entered into an application.
16.1 Introduction
This chapter introduces Java’s string- and character-processing capabilities. The techniques discussed here are appropriate for validating program input, displaying information to users and other text-based manipulations. They’re also appropriate for developing text editors, word processors, page-layout software, computerized typesetting systems and other kinds of text-processing software. We’ve presented several string-processing capabilities in earlier chapters. This chapter discusses in detail the capabilities of classes String, StringBuilder and Character from the java.lang package. These classes provide the foundation for string and character manipulation in Java.
The chapter also discusses regular expressions, which provide applications with the capability to validate input. This functionality is located in class String along with classes Matcher and Pattern from the java.util.regex package.
16.2 Fundamentals of Characters and Strings
Characters are the fundamental building blocks of Java source programs. Every program is composed of a sequence of characters that, when grouped together meaningfully, are interpreted by the Java compiler as a series of instructions used to accomplish a task. A program may contain character literals. A character literal is an integer value represented as a character in single quotes. For example, 'z' represents the integer value of z, and '\n' represents the integer value of newline. The value of a character literal is the integer value of the character in the Unicode character set. Appendix B presents the integer equivalents of the characters in the ASCII character set, which is a subset of Unicode (discussed in Appendix L).
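The integer nature of character literals can be checked directly. The following is a minimal sketch, not one of the chapter's figures; the class name CharLiteralValues and the helper codeOf are hypothetical names chosen for illustration.

```java
public class CharLiteralValues {
    // Returns the integer (Unicode) value of a character.
    // Assigning a char to an int is a widening conversion, so no cast is needed.
    static int codeOf(char c) {
        return c;
    }

    public static void main(String[] args) {
        System.out.println(codeOf('z'));   // prints 122
        System.out.println(codeOf('\n'));  // prints 10

        // Because a char is an integer value, arithmetic on it is allowed;
        // the cast converts the int result back to a char.
        char afterA = (char) ('a' + 1);
        System.out.println(afterA);        // prints b
    }
}
```

Both values fall in the ASCII range, so they match the integer equivalents listed for the ASCII character set.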
Recall from Section 2.2 that a string is a sequence of characters treated as a single unit. A string may include letters, digits and various special characters, such as +, -, *, / and $.
16.1 Introduction
16.2 Fundamentals of Characters and Strings
16.3 Class String
   16.3.1 String Constructors
   16.3.2 String Methods length, charAt and getChars
   16.3.3 Comparing Strings
   16.3.4 Locating Characters and Substrings in Strings
   16.3.5 Extracting Substrings from Strings
   16.3.6 Concatenating Strings
   16.3.7 Miscellaneous String Methods
   16.3.8 String Method valueOf
16.4 Class StringBuilder
   16.4.1 StringBuilder Constructors
   16.4.2 StringBuilder Methods length, capacity, setLength and ensureCapacity
   16.4.3 StringBuilder Methods charAt, setCharAt, getChars and reverse
   16.4.4 StringBuilder append Methods
   16.4.5 StringBuilder Insertion and Deletion Methods
16.5 Class Character
16.6 Tokenizing Strings
16.7 Regular Expressions, Class Pattern and Class Matcher
16.8 Wrap-Up
Summary | Self-Review Exercises | Answers to Self-Review Exercises | Exercises | Special Section: Advanced String-Manipulation Exercises | Special Section: Challenging String-Manipulation Projects | Making a Difference
A string is an object of class String. String literals (stored in memory as String objects) are written as a sequence of characters in double quotation marks, as in:

   "John Q. Doe"               (a name)
   "9999 Main Street"          (a street address)
   "Waltham, Massachusetts"    (a city and state)
   "(201) 555-1212"            (a telephone number)

A string may be assigned to a String reference. The declaration

   String color = "blue";

initializes String variable color to refer to a String object that contains the string "blue".

Performance Tip 16.1
To conserve memory, Java treats all string literals with the same contents as a single String object that has many references to it.

16.3 Class String
Class String is used to represent strings in Java. The next several subsections cover many of class String’s capabilities.

16.3.1 String Constructors
Class String provides constructors for initializing String objects in a variety of ways. Four of the constructors are demonstrated in the main method of Fig. 16.1.

 1  // Fig. 16.1: StringConstructors.java
 2  // String class constructors.
 3
 4  public class StringConstructors
 5  {
 6     public static void main( String[] args )
 7     {
 8        char[] charArray = { 'b', 'i', 'r', 't', 'h', ' ', 'd', 'a', 'y' };
 9        String s = new String( "hello" );
10
11        // use String constructors
12        String s1 = new String();
13        String s2 = new String( s );
14        String s3 = new String( charArray );
15        String s4 = new String( charArray, 6, 3 );
16
17        System.out.printf(
18           "s1 = %s\ns2 = %s\ns3 = %s\ns4 = %s\n",
19           s1, s2, s3, s4 ); // display strings
20     } // end main
21  } // end class StringConstructors

Fig. 16.1 | String class constructors.

Line 12 instantiates a new String using class String’s no-argument constructor and assigns its reference to s1. The new String object contains no characters (i.e., the empty string, which can also be represented as "") and has a length of 0. Line 13 instantiates a new String object using class String’s constructor that takes a String object as an argument and assigns its reference to s2. The new String object contains the same sequence of characters as the String object s that’s passed as an argument to the constructor.

Line 14 instantiates a new String object and assigns its reference to s3 using class String’s constructor that takes a char array as an argument. The new String object contains a copy of the characters in the array.

Line 15 instantiates a new String object and assigns its reference to s4 using class String’s constructor that takes a char array and two integers as arguments. The second argument specifies the starting position (the offset) from which characters in the array are accessed. Remember that the first character is at position 0. The third argument specifies the number of characters (the count) to access in the array. The new String object is formed from the accessed characters. If the offset or the count specified as an argument results in accessing an element outside the bounds of the character array, a StringIndexOutOfBoundsException is thrown.
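The offset-and-count constructor and its failure case can be explored with a short sketch. This is an illustration only, not part of Fig. 16.1: the class CharArrayConstructorSketch and its helper slice are hypothetical names, and returning null on a bad range is simply one way to make the exception visible. The catch clause uses IndexOutOfBoundsException, the superclass of StringIndexOutOfBoundsException.

```java
public class CharArrayConstructorSketch {
    // Hypothetical helper: builds a String from count characters of source,
    // starting at offset, or returns null if the range is out of bounds.
    static String slice(char[] source, int offset, int count) {
        try {
            return new String(source, offset, count);
        } catch (IndexOutOfBoundsException e) {
            // StringIndexOutOfBoundsException is a subclass of this type.
            return null;
        }
    }

    public static void main(String[] args) {
        char[] charArray = { 'b', 'i', 'r', 't', 'h', ' ', 'd', 'a', 'y' };

        System.out.println(new String().length());   // prints 0
        System.out.println(slice(charArray, 0, 5));  // prints birth
        System.out.println(slice(charArray, 6, 3));  // prints day
        System.out.println(slice(charArray, 6, 4));  // prints null (range runs past the array)

        // The constructor copies the characters, so a later change to the
        // array does not affect the String that was built from it.
        String copy = new String(charArray);
        charArray[0] = 'm';
        System.out.println(copy);                    // prints birth day
    }
}
```

An offset of 6 with a count of 4 would need an element at index 9, one past the end of the nine-element array, which is why that call fails.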
16.3.2 String Methods length, charAt and getChars
String methods length, charAt and getChars return the length of a String, obtain the character at a specific location in a String and retrieve a set of characters from a String as a char array, respectively. Figure 16.2 demonstrates each of these methods.
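As a compact illustration of the three methods, separate from Fig. 16.2, consider the following sketch. The class StringAccessSketch and its helpers reversed and firstChars are hypothetical names; StringBuilder, used here only to collect characters, is covered in Section 16.4.

```java
public class StringAccessSketch {
    // Builds the reverse of s by reading characters with charAt,
    // from the last valid index (length() - 1) down to index 0.
    static String reversed(String s) {
        StringBuilder result = new StringBuilder();
        for (int i = s.length() - 1; i >= 0; i--) {
            result.append(s.charAt(i));
        }
        return result.toString();
    }

    // Copies the first n characters of s into a new char array.
    // getChars takes the start index, the index one past the last character
    // to copy, the destination array and the destination start index.
    static char[] firstChars(String s, int n) {
        char[] destination = new char[n];
        s.getChars(0, n, destination, 0);
        return destination;
    }

    public static void main(String[] args) {
        String s1 = "hello there";

        System.out.println(s1.length());                    // prints 11
        System.out.println(s1.charAt(4));                   // prints o
        System.out.println(reversed(s1));                   // prints ereht olleh
        System.out.println(new String(firstChars(s1, 5)));  // prints hello
    }
}
```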
Output of Fig. 16.1:
s1 =
s2 = hello
s3 = birth day
s4 = day
Software Engineering Observation 16.1
It’s not necessary to copy an existing String object. String objects are immutable: their character contents cannot be changed after they’re created, because class String does not provide methods that allow the contents of a String object to be modified.
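The effect of immutability, and of Java sharing one object among identical string literals, can be observed with reference comparisons. This is a minimal sketch under the assumption that comparing references with == is acceptable for demonstration purposes; the class StringSharingSketch and its method names are hypothetical.

```java
public class StringSharingSketch {
    // True if two literals with the same contents refer to one String object.
    static boolean literalsShareObject() {
        String a = "blue";
        String b = "blue";
        return a == b; // compares references, not contents
    }

    // True if an explicit copy is a separate object with equal contents.
    static boolean copyIsDistinctButEqual() {
        String original = "blue";
        String copy = new String(original);
        return copy != original && copy.equals(original);
    }

    public static void main(String[] args) {
        System.out.println(literalsShareObject());     // prints true
        System.out.println(copyIsDistinctButEqual());  // prints true

        // Methods that appear to change a String return a new String;
        // the original object keeps its contents.
        String color = "blue";
        String shout = color.toUpperCase();
        System.out.println(color);                     // prints blue
        System.out.println(shout);                     // prints BLUE
    }
}
```

Since the copy holds the same characters and neither object can change, the extra object created by new String(original) costs memory without adding anything.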
Common Programming Error 16.1
Accessing a character outside the bounds of a String (i.e., an index less than 0 or an index greater than or equal to the String’s length) results in a StringIndexOutOfBoundsException.
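One way to avoid this error is to test an index before using it. The sketch below is illustrative only; the class StringBoundsSketch and the helpers isValidIndex and charAtThrows are hypothetical names, and the catch clause uses IndexOutOfBoundsException, the superclass of StringIndexOutOfBoundsException.

```java
public class StringBoundsSketch {
    // Valid indices run from 0 through length() - 1.
    static boolean isValidIndex(String s, int index) {
        return index >= 0 && index < s.length();
    }

    // Reports whether charAt rejects the given index with an exception.
    static boolean charAtThrows(String s, int index) {
        try {
            s.charAt(index);
            return false;
        } catch (IndexOutOfBoundsException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        String word = "hello"; // length 5, so valid indices are 0 through 4

        System.out.println(isValidIndex(word, 4));   // prints true
        System.out.println(isValidIndex(word, 5));   // prints false
        System.out.println(charAtThrows(word, 5));   // prints true
        System.out.println(charAtThrows(word, -1));  // prints true
    }
}
```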
 1  // Fig. 16.2: StringMiscellaneous.java
 2  // This application demonstrates the length, charAt and getChars
 3  // methods of the String class.
 4
 5  public class StringMiscellaneous
 6  {

Fig. 16.2 | String methods length, charAt and getChars. (Part 1 of 2.)