Strings in Go
A string in Go is a read-only slice of bytes. By convention, those bytes hold text encoded in UTF-8, which is also the encoding of Go source code, so a string is usually described as a sequence of characters.
UTF-8 is a very widely used encoding and is the usual choice for XML files, JSON documents, plain text files, and similar formats.
In Go, a string literal is created by enclosing characters in double quotes "".
A string does not reserve a fixed amount of space per character. In UTF-8, each character (code point) is encoded in 1 to 4 bytes: an ASCII character takes exactly one byte, while other characters take two, three, or four.
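To make these byte counts concrete, here is a minimal sketch that prints how many bytes a few one-character strings occupy. The helper name byteLen and the sample characters are illustrative choices and are not part of the original examples.

```go
package main

import "fmt"

// byteLen is a hypothetical helper: it reports how many bytes the
// UTF-8 encoding of s occupies, which is exactly what len returns.
func byteLen(s string) int {
	return len(s)
}

func main() {
	// Each sample holds a single character with a different encoded width.
	// Escapes are used so the code points are unambiguous.
	samples := []string{"a", "\u00f3", "\u20ac", "\U0001F600"}
	for _, s := range samples {
		fmt.Printf("%q uses %d byte(s)\n", s, byteLen(s))
	}
}
```

Running this should report 1, 2, 3, and 4 bytes respectively, one for each possible width of a UTF-8 encoded character.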
Now that we know a little about strings in Go, let's look at an example that creates a simple string and prints it using the fmt.Println() function.
Example: Print a string
package main
import (
	"fmt"
)
func main() {
	website := "studytonight.com"
	fmt.Println(website)
}
studytonight.com
One important point to note is that a Go string is a sequence of variable-width characters: each character occupies 1 to 4 bytes. This differs from languages whose strings are built from fixed-width units, such as the 16-bit units used by Java and JavaScript strings.
Advantages of Strings
Go strings possess some advantages over strings from other languages. These are:
- Because Go source code and string literals use UTF-8, text that is already UTF-8 can be read and written without converting between encodings.
- For text that is mostly ASCII, strings and text files take less space than they would in a fixed-width encoding, because each ASCII character needs only one byte.
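The space claim in the list above can be checked with a small sketch that compares the UTF-8 size of a string with the size the same text would need as UTF-16. The helper utf16Bytes is a hypothetical name introduced for illustration, and UTF-16 is used here only as one example of a wider encoding.

```go
package main

import (
	"fmt"
	"unicode/utf16"
)

// utf16Bytes is a hypothetical helper that reports how many bytes s
// would need if it were stored as UTF-16 (2 bytes per code unit).
func utf16Bytes(s string) int {
	return 2 * len(utf16.Encode([]rune(s)))
}

func main() {
	s := "studytonight.com"
	fmt.Printf("UTF-8 bytes: %d\n", len(s))         // 16
	fmt.Printf("UTF-16 bytes: %d\n", utf16Bytes(s)) // 32
}
```

The saving applies to ASCII-heavy text. Characters that need three bytes in UTF-8, such as most CJK characters, take more space in UTF-8 than in UTF-16.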
Printing individual bytes of a string
We can access each byte of a string by index, because a string is simply a read-only slice of bytes.
Consider the example shown below, where a function named printEachByte() prints every byte of a given string.
package main
import (
	"fmt"
)
func printEachByte(str string) {
	for i := 0; i < len(str); i++ {
		fmt.Printf("%d ", str[i])
	}
}
func main() {
	website := "studytonight.com"
	printEachByte(website)
}
In the above example, we iterate over the string using a for loop, with len(str) as the upper bound.
len(str) returns the number of bytes in the string. Note that the %d format specifier prints the decimal value of each byte, which for ASCII characters is the ASCII code.
115 116 117 100 121 116 111 110 105 103 104 116 46 99 111 109
Printing individual characters of a string
We can also print the individual characters that are present in the string with the help of the %c format specifier. Consider the example shown below.
package main
import (
	"fmt"
)
func printEachCharacter(str string) {
	for i := 0; i < len(str); i++ {
		fmt.Printf("%c ", str[i])
	}
}
func main() {
	website := "studytonight.com"
	printEachCharacter(website)
}
s t u d y t o n i g h t . c o m
Though it might seem that the above approach works for every string in Go, unfortunately it doesn't. Strings are UTF-8 encoded, so they can contain characters that take more than one byte.
Let's look at a case where the above code fails, and then move to the approach that is preferred for accessing the individual characters of a string in Go.
package main
import (
	"fmt"
)
func printEachCharacter(str string) {
	for i := 0; i < len(str); i++ {
		fmt.Printf("%c ", str[i])
	}
}
func main() {
	website := "studytonight.com"
	printEachCharacter(website)
	fmt.Println()
	name := "León"
	printEachCharacter(name)
}
s t u d y t o n i g h t . c o m
L e Ã ³ n
Looking at the output, we can see that the characters printed for name := "León" aren't what we expected.
This happened because we assumed that each code point occupies exactly one byte, which is not correct.
In UTF-8, a code point can occupy more than one byte: ó is encoded as two bytes, and printing each of them separately with %c yields two unrelated characters. The solution is to work with runes.
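To see exactly which bytes are involved, the following sketch dumps the string in hexadecimal using the "% x" verb of the fmt package. The helper name hexBytes is hypothetical, and the escape \u00f3 is used so the code point of ó is unambiguous.

```go
package main

import "fmt"

// hexBytes is a hypothetical helper that returns the bytes of s as
// space-separated hexadecimal values.
func hexBytes(s string) string {
	return fmt.Sprintf("% x", s)
}

func main() {
	name := "Le\u00f3n" // the same text as "León"

	fmt.Println(hexBytes(name)) // 4c 65 c3 b3 6e
	fmt.Println(len(name))      // 5
}
```

The pair c3 b3 is the UTF-8 encoding of ó. Treated as two separate code points, U+00C3 and U+00B3, the same bytes print as Ã and ³.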
Runes in Go
A rune in Go represents a Unicode code point. It is an alias for int32, so a rune value always occupies 4 bytes in memory, which is enough to hold any code point.
One important point to note is that a rune holds the code point itself, regardless of how many bytes (1 to 4) that code point needs when it is encoded as UTF-8 inside a string.
In the León example above, the characters printed weren't the ones we wanted. Now, let's change the program slightly and see how runes can help us.
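The relationship between a rune, its underlying type, and its encoded width can be sketched as follows. The helper describe is a hypothetical name. It relies on the %T and %U verbs of fmt and on utf8.RuneLen from the standard library.

```go
package main

import (
	"fmt"
	"unicode/utf8"
)

// describe is a hypothetical helper that reports the underlying type of
// a rune, its code point, and the bytes its UTF-8 encoding needs.
func describe(r rune) string {
	return fmt.Sprintf("%T %U %d", r, r, utf8.RuneLen(r))
}

func main() {
	fmt.Println(describe('a'))      // int32 U+0061 1
	fmt.Println(describe('\u00f3')) // int32 U+00F3 2
}
```

Both values have the same type and the same size in memory. Only their encoded width inside a string differs.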
package main
import (
	"fmt"
)
func printEachCharacter(str string) {
	for i := 0; i < len(str); i++ {
		fmt.Printf("%c ", str[i])
	}
}
func printEachCharacterRune(str string) {
	r := []rune(str)
	for i := 0; i < len(r); i++ {
		fmt.Printf("%c ", r[i])
	}
}
func main() {
	name := "León"
	printEachCharacter(name)
	fmt.Println()
	printEachCharacterRune(name)
}
L e Ã ³ n
L e ó n
Notice that simply converting the string to a slice of runes with []rune(str) resolves the issue, because each element of the slice is a whole code point rather than a single byte.
Range clause and Rune in Go
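A more convenient approach is to use the range clause in a for loop, which decodes the string one rune at a time and yields each rune together with the byte index at which it starts.
Consider the example shown below. Note that the index jumps from 2 to 4 in the output, because ó occupies two bytes.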
package main
import (
	"fmt"
)
func printEachCharacterRune(str string) {
	for index, r := range str {
		fmt.Printf("At index %d we have rune: %c \n", index, r)
	}
}
func main() {
	name := "León"
	printEachCharacterRune(name)
}
At index 0 we have rune: L
At index 1 we have rune: e
At index 2 we have rune: ó
At index 4 we have rune: n
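The indices printed above are byte offsets, not character positions. As a rough sketch of the decoding that a range loop performs, the same offsets can be produced by hand with utf8.DecodeRuneInString. The helper name runeOffsets is hypothetical.

```go
package main

import (
	"fmt"
	"unicode/utf8"
)

// runeOffsets is a hypothetical helper that returns the byte offset at
// which each rune of s starts, decoding one rune at a time.
func runeOffsets(s string) []int {
	var offsets []int
	for i := 0; i < len(s); {
		_, size := utf8.DecodeRuneInString(s[i:])
		offsets = append(offsets, i)
		i += size // advance by the width of the rune just decoded
	}
	return offsets
}

func main() {
	fmt.Println(runeOffsets("Le\u00f3n")) // [0 1 2 4]
}
```

In everyday code the range clause is the simpler choice. The manual loop is shown only to make the skipped index visible.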
Length of String
The len() function returns the number of bytes in a string. When the string contains characters outside ASCII, that number is larger than the number of characters.
In that case, the RuneCountInString() function from the unicode/utf8 package is the better choice, because it counts code points rather than bytes.
Consider the example shown below, which uses both len() and RuneCountInString() on a string with a multi-byte character and on a pure ASCII string.
package main
import (
	"fmt"
	"unicode/utf8"
)
func main() {
	name := "León"
	fmt.Println("Count using len():", len(name))
	fmt.Println("Count using RuneCountInString():", utf8.RuneCountInString(name))
	name2 := "Rahul"
	fmt.Println("Count using len():", len(name2))
	fmt.Println("Count using RuneCountInString():", utf8.RuneCountInString(name2))
}
Count using len(): 5
Count using RuneCountInString(): 4
Count using len(): 5
Count using RuneCountInString(): 5
String from a slice of bytes
We can convert a slice of bytes into a string with the string() conversion.
Consider the example shown below.
package main
import (
	"fmt"
)
func main() {
	bSlice := []byte{0x4D, 0x75, 0x6B, 0x75, 0x6C}
	str := string(bSlice)
	fmt.Println(str)
}
Mukul
In the above example, we used a slice of bytes written as hexadecimal values. Now, let's consider a similar example that uses decimal values instead. Note that the first byte here is 109 (lowercase m) rather than 77, the decimal equivalent of 0x4D, so the output starts with a lowercase letter.
Consider the example shown below.
package main
import (
	"fmt"
)
func main() {
	bSlice := []byte{109, 117, 107, 117, 108}
	str := string(bSlice)
	fmt.Println(str)
}
mukul
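The conversion also works in the other direction. As a small sketch, converting a string with []byte() yields the decimal values used in the examples above. The helper name bytesOf is hypothetical.

```go
package main

import "fmt"

// bytesOf is a hypothetical helper that returns a copy of the bytes of s.
func bytesOf(s string) []byte {
	return []byte(s)
}

func main() {
	fmt.Println(bytesOf("Mukul")) // [77 117 107 117 108]
	fmt.Println(bytesOf("mukul")) // [109 117 107 117 108]
}
```

Only the first byte differs between the two results: 77 is uppercase M and 109 is lowercase m.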
String from a slice of runes
As in the previous section, we can also make a string from a slice of runes.
Consider the example shown below.
package main
import (
	"fmt"
)
func main() {
	rSlice := []rune{0x0066, 0x00f3, 0x006f}
	str := string(rSlice)
	fmt.Println(str)
}
fóo
Comparing Strings in Go
In Go, we can compare strings with the help of the == operator. It returns true if both the strings are equal, and false if they aren't.
Let's consider the example shown below, which uses a function named compareTwoStrings that takes two strings as arguments and returns true if they are equal and false otherwise.
package main
import (
	"fmt"
)
func compareTwoStrings(s1, s2 string) bool {
	// The comparison already yields a bool, so it can be returned directly.
	return s1 == s2
}
func main() {
	s1 := "point"
	s2 := "points"
	res := compareTwoStrings(s1, s2)
	fmt.Println("Are they equal?", res)
	s3 := "Leo"
	s4 := "Leo"
	res = compareTwoStrings(s3, s4)
	fmt.Println("Are they equal?", res)
}
Are they equal? false
Are they equal? true
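The == operator is case-sensitive and compares the bytes of both strings. As a hedged extension of the example above, the sketch below shows the ordering operators and the strings.EqualFold function from the standard library, which ignores case. The helper names equalIgnoreCase and sortsBefore are hypothetical.

```go
package main

import (
	"fmt"
	"strings"
)

// equalIgnoreCase is a hypothetical helper that reports whether a and b
// are equal when differences in letter case are ignored.
func equalIgnoreCase(a, b string) bool {
	return strings.EqualFold(a, b)
}

// sortsBefore is a hypothetical helper that reports whether a comes
// before b when the two strings are compared byte by byte.
func sortsBefore(a, b string) bool {
	return a < b
}

func main() {
	fmt.Println("Leo" == "leo")                // false
	fmt.Println(equalIgnoreCase("Leo", "leo")) // true
	fmt.Println(sortsBefore("point", "points")) // true: a prefix sorts first
	fmt.Println(sortsBefore("leo", "Leo"))      // false: 'l' (108) is greater than 'L' (76)
}
```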
String concatenation in Go
Joining multiple strings end to end is known as string concatenation, and in Go we can achieve it using the + operator.
Consider the example shown below.
package main
import (
	"fmt"
)
func main() {
	str1 := "Studytonight"
	str2 := "is Awesome"
	fmt.Println(str1 + " " + str2)
}
Studytonight is Awesome
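The + operator is fine for a handful of strings. When many pieces are joined, for example inside a loop, the standard library offers strings.Builder and strings.Join as alternatives. The following is a minimal sketch, and the helper name joinWords is hypothetical.

```go
package main

import (
	"fmt"
	"strings"
)

// joinWords is a hypothetical helper that builds one string from many
// pieces with strings.Builder instead of repeated + operations.
func joinWords(words []string) string {
	var sb strings.Builder
	for i, w := range words {
		if i > 0 {
			sb.WriteString(" ") // separator between words
		}
		sb.WriteString(w)
	}
	return sb.String()
}

func main() {
	words := []string{"Studytonight", "is", "Awesome"}
	fmt.Println(joinWords(words))         // Studytonight is Awesome
	fmt.Println(strings.Join(words, " ")) // Studytonight is Awesome
}
```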
Immutable nature of strings
In Go, strings are immutable, which means that once a string value is created its bytes cannot be changed. A string variable can still be reassigned to a different string.
We can confirm this by declaring a string and then trying to change one of its bytes; the program fails to compile.
Consider the example shown below.
package main
import (
	"fmt"
)
func main() {
	str1 := "Studytonight"
	fmt.Println(str1)
	str1[2] = 'a'
	fmt.Println(str1)
}
./prog.go:12:10: cannot assign to str1[2] (strings are immutable)
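A common way to get the intended result is to build a new string rather than modify the old one. The sketch below copies the string into a slice of runes, changes one element, and converts back. The helper name replaceRuneAt is hypothetical, and this is one possible approach rather than the only one.

```go
package main

import "fmt"

// replaceRuneAt is a hypothetical helper that returns a copy of s with
// the rune at position i (counted in runes, not bytes) replaced by r.
// It returns s unchanged when i is out of range.
func replaceRuneAt(s string, i int, r rune) string {
	runes := []rune(s) // the conversion copies, so s itself is untouched
	if i < 0 || i >= len(runes) {
		return s
	}
	runes[i] = r
	return string(runes)
}

func main() {
	str1 := "Studytonight"
	str2 := replaceRuneAt(str1, 2, 'a')
	fmt.Println(str1) // Studytonight
	fmt.Println(str2) // Stadytonight
}
```

The original string is left as it was, which is consistent with strings being immutable.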
Conclusion
In this article, we covered the main points about strings in Go. We learned how to declare strings and how to access their individual bytes and characters. We also learned about runes, string length, conversion from byte and rune slices, comparison, concatenation, and why strings are immutable.