Java String split() Method
The split() method splits a string into an array of strings around matches of a delimiter, which is always interpreted as a regular expression. It is useful for everyday tasks like extracting words from a sentence or sentences from a paragraph, and it is also used for simple tokenization. Let's learn how to use the split() method in Java.
Method Signature
There are two signatures of the String split() method. In the first, we pass only the regular expression that describes the delimiter. A plain delimiter such as a comma also works, because it is a valid regular expression that matches itself.
public String[] split(String regularExpression)
We can also pass an additional integer parameter called limit to denote the maximum number of strings that the main string should be split into.
public String[] split(String regularExpression, int limit)
- The split() method returns an array of strings. If the limit is greater than 0, the array length is less than or equal to the limit.
- If the limit is not passed, or is set to 0, the pattern is applied as many times as possible and the array can have any length, but trailing empty strings are discarded.
- If the limit is less than 0, the pattern is also applied as many times as possible, and trailing empty strings are kept.
- For a limit greater than 0, the pattern is applied a maximum of (limit - 1) times, and the last string of the returned array contains all the characters after the last matched delimiter.
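The difference between a zero, negative, and positive limit only shows up when the input ends with delimiters. The sketch below illustrates this with an assumed input string "a,b,,"; the class name SplitLimitDemo and the helper splitOnComma are hypothetical names used only for this illustration.

```java
import java.util.Arrays;

class SplitLimitDemo
{
    // Hypothetical helper: splits the input on commas using the given limit.
    static String[] splitOnComma(String input, int limit)
    {
        return input.split(",", limit);
    }

    public static void main(String[] args)
    {
        String input = "a,b,,"; // assumed sample input with two trailing empty fields

        // limit = 0: trailing empty strings are discarded
        System.out.println(Arrays.toString(splitOnComma(input, 0)));  // [a, b]
        // negative limit: trailing empty strings are kept
        System.out.println(Arrays.toString(splitOnComma(input, -1))); // [a, b, , ]
        // limit = 2: the pattern is applied at most once
        System.out.println(Arrays.toString(splitOnComma(input, 2)));  // [a, b,,]
    }
}
```

A negative limit is usually the safer choice when empty fields carry meaning, for example when parsing delimited records with optional trailing columns.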
String split() Examples
Let's take a look at some examples to better understand the working of the split() method.
Example 1
Let's try to split a simple sentence into individual words. To keep things simple, the sentence will only have words separated by single spaces, and no punctuation will be used. We will use \\s as the delimiter, which matches a single whitespace character. We will not pass the limit parameter.
public class SplitExample
{
    public static void main(String[] args)
    {
        String sentence = "this is a simple sentence";
        // "\\s" matches one whitespace character
        String[] words = sentence.split("\\s");
        System.out.println("The words are: ");
        for(String word : words)
            System.out.println(word);
    }
}
The words are:
this
is
a
simple
sentence
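Because \\s matches exactly one whitespace character, text with repeated spaces produces empty strings between the words. A minimal sketch of the difference between \\s and \\s+ follows; the sample text and the names WhitespaceSplitDemo, splitSingle, and splitRuns are assumptions made for this illustration.

```java
import java.util.Arrays;

class WhitespaceSplitDemo
{
    // Splits on every single whitespace character.
    static String[] splitSingle(String text)
    {
        return text.split("\\s");
    }

    // Trims the text, then splits on runs of one or more whitespace characters.
    static String[] splitRuns(String text)
    {
        return text.trim().split("\\s+");
    }

    public static void main(String[] args)
    {
        String text = "this  is a   sentence"; // assumed input with repeated spaces

        // Empty strings appear wherever two whitespace characters are adjacent
        System.out.println(Arrays.toString(splitSingle(text))); // [this, , is, a, , , sentence]
        // Runs of whitespace are treated as one delimiter
        System.out.println(Arrays.toString(splitRuns(text)));   // [this, is, a, sentence]
    }
}
```

The trim() call is there because a delimiter at the very start of the input would otherwise produce an empty first element.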
Splitting the same sentence with the limit parameter set to 0 gives the same output, because split(regex) is equivalent to split(regex, 0). The following code demonstrates this.
public class SplitExample
{
    public static void main(String[] args)
    {
        String sentence = "this is a simple sentence";
        int limit = 0; // same behavior as calling split("\\s")
        String[] words = sentence.split("\\s", limit);
        System.out.println("The words are: ");
        for(String word : words)
            System.out.println(word);
    }
}
The words are:
this
is
a
simple
sentence
However, if we pass a positive limit value, then the maximum length of the array will be equal to that limit value. For example, if we pass the limit as 2, then the pattern will be matched only once (limit - 1), and all the characters after that first match will be included in the second string. The returned array will have just two strings.
public class SplitExample
{
    public static void main(String[] args)
    {
        String sentence = "this is a simple sentence";
        int limit = 2; // the pattern is applied at most once
        String[] words = sentence.split("\\s", limit);
        System.out.println("The words are: ");
        for(String word : words)
            System.out.println(word);
    }
}
The words are:
this
is a simple sentence
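A positive limit is handy when only the first delimiter is meaningful. As a rough sketch, the code below splits an assumed "key=value" line where the value may itself contain the delimiter; the class KeyValueSplitDemo, the helper splitKeyValue, and the sample line are hypothetical.

```java
class KeyValueSplitDemo
{
    // Hypothetical helper: splits a line at the first '=' only.
    static String[] splitKeyValue(String line)
    {
        return line.split("=", 2);
    }

    public static void main(String[] args)
    {
        // Assumed sample input; the value contains another '='
        String line = "url=https://example.com/?q=java";
        String[] parts = splitKeyValue(line);

        System.out.println("Key: " + parts[0]);   // Key: url
        System.out.println("Value: " + parts[1]); // Value: https://example.com/?q=java
    }
}
```

Without the limit, the same call would return three strings and the value would be cut at its own '=' character.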
Now, let's pass a limit that is greater than the number of words in the example sentence. The pattern matches only four times, so the limit is never reached and all five words are returned.
public class SplitExample
{
    public static void main(String[] args)
    {
        String sentence = "this is a simple sentence";
        int limit = 7; // larger than the number of words
        String[] words = sentence.split("\\s", limit);
        System.out.println("The words are: ");
        for(String word : words)
            System.out.println(word);
    }
}
The words are:
this
is
a
simple
sentence
We can also pass a negative limit value. A negative limit keeps trailing empty strings, but the example sentence does not produce any, so the output remains the same.
public class SplitExample
{
    public static void main(String[] args)
    {
        String sentence = "this is a simple sentence";
        int limit = -3; // any negative value behaves the same way
        String[] words = sentence.split("\\s", limit);
        System.out.println("The words are: ");
        for(String word : words)
            System.out.println(word);
    }
}
The words are:
this
is
a
simple
sentence
Example: Split using Delimiter
Now, let's try to split a sentence that has punctuation marks like commas and full stops. The regular expression that we use to split will be "[\\p{Punct}\\s]+", which matches one or more consecutive punctuation or whitespace characters. We want to split the entire sentence, so we will not set a limit.
public class SplitExample
{
    public static void main(String[] args)
    {
        String sentence = "this is: a simple. sentence, with! some? punctuation: marks.";
        // one or more punctuation or whitespace characters act as a single delimiter
        String regex = "[\\p{Punct}\\s]+";
        String[] words = sentence.split(regex);
        System.out.println("The words are: ");
        for(String word : words)
            System.out.println(word);
    }
}
The words are:
this
is
a
simple
sentence
with
some
punctuation
marks
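Since split() always treats its argument as a regular expression, a literal delimiter that is also a regex metacharacter (such as a full stop) has to be escaped. One way to do this, sketched below, is Pattern.quote(); the class LiteralDelimiterDemo, the helper splitLiteral, and the sample address are assumptions used only for illustration.

```java
import java.util.Arrays;
import java.util.regex.Pattern;

class LiteralDelimiterDemo
{
    // Hypothetical helper: splits on the delimiter taken literally, not as a regex.
    static String[] splitLiteral(String input, String delimiter)
    {
        return input.split(Pattern.quote(delimiter));
    }

    public static void main(String[] args)
    {
        String address = "192.168.0.1"; // assumed sample input

        // An unescaped "." matches every character, so nothing is left
        System.out.println(address.split(".").length);                 // 0
        // Quoting the delimiter makes it match a literal full stop
        System.out.println(Arrays.toString(splitLiteral(address, "."))); // [192, 168, 0, 1]
    }
}
```

Writing the escape by hand, as in split("\\."), gives the same result for a single known delimiter; Pattern.quote() is mainly useful when the delimiter comes from a variable.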
Example: Split String using regex
We can also split the individual words obtained in the above example into characters. Passing an empty string "" as the regular expression splits a string between every pair of adjacent characters. Let's use the split() method twice, once to split the sentence into words and then to split each word into characters.
public class SplitExample
{
    public static void main(String[] args)
    {
        String sentence = "this is: a simple. sentence, with! some? punctuation: marks.";
        String regex = "[\\p{Punct}\\s]+";
        String[] words = sentence.split(regex); // splitting the sentence into words
        System.out.println("The words are: ");
        for(String word : words)
        {
            String[] characters = word.split(""); // splitting the word into characters
            for(String character : characters)
                System.out.print(character + " ");
            System.out.println();
        }
    }
}
The words are:
t h i s
i s
a
s i m p l e
s e n t e n c e
w i t h
s o m e
p u n c t u a t i o n
m a r k s
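Splitting on an empty string yields one-character strings. If char values are enough, toCharArray() is a common alternative that avoids the regular expression entirely. The sketch below compares the two on Java 8 or later; the class CharacterSplitDemo and the helper toLetters are hypothetical names.

```java
import java.util.Arrays;

class CharacterSplitDemo
{
    // Hypothetical helper: returns each character of the word as its own string.
    static String[] toLetters(String word)
    {
        return word.split("");
    }

    public static void main(String[] args)
    {
        String word = "marks";

        // split("") produces an array of one-character strings
        System.out.println(Arrays.toString(toLetters(word)));        // [m, a, r, k, s]
        // toCharArray() produces primitive chars instead of strings
        System.out.println(Arrays.toString(word.toCharArray()));     // [m, a, r, k, s]
    }
}
```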
Summary
In this tutorial, we learned how to split strings using the split() method. This method takes a regular expression (a plain delimiter is simply a regular expression that matches itself) and an optional limit parameter that controls how many times the pattern is applied and whether trailing empty strings are kept. If the regular expression is invalid, the method throws a PatternSyntaxException. It is a convenient tool for working with strings and can be used in many different situations.
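To illustrate the PatternSyntaxException mentioned above, the sketch below passes an unclosed character class "[" as the regular expression. The class InvalidRegexDemo and the helper safeSplit are hypothetical, and returning the whole input as a single element is just one possible fallback, not a recommended policy.

```java
import java.util.Arrays;
import java.util.regex.PatternSyntaxException;

class InvalidRegexDemo
{
    // Hypothetical helper: returns the unsplit input if the regex is invalid.
    static String[] safeSplit(String input, String regex)
    {
        try
        {
            return input.split(regex);
        }
        catch(PatternSyntaxException e)
        {
            // Assumed fallback: treat the whole input as one element
            return new String[] { input };
        }
    }

    public static void main(String[] args)
    {
        // "[" is an unclosed character class, so it is not a valid regular expression
        System.out.println(Arrays.toString(safeSplit("a[b", "[")));   // [a[b]
        // "\\[" escapes the bracket and splits as expected
        System.out.println(Arrays.toString(safeSplit("a[b", "\\["))); // [a, b]
    }
}
```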