A Go string is a read-only slice of bytes. The language
and the standard library treat strings specially - as
containers of text encoded in UTF-8.
In other languages, strings are made of “characters”.
In Go, the concept of a character is called a rune: an
integer that represents a Unicode code point.
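As a small illustration of the read-only point, here is a minimal sketch that is separate from the main example. The helper name withFirstByte is hypothetical and exists only for this illustration. It assumes ASCII input: overwriting the first byte of a multi-byte character would leave invalid UTF-8 behind.

```go
package main

import "fmt"

// withFirstByte is a hypothetical helper: it returns a copy of s
// whose first byte is replaced by c. The input string is not changed.
func withFirstByte(s string, c byte) string {
	if len(s) == 0 {
		return s
	}
	b := []byte(s) // the conversion copies the string's bytes
	b[0] = c       // writing to the copy is allowed
	return string(b)
}

func main() {
	s := "hello"

	// s[0] = 'H' would be a compile error: string bytes are read-only.
	t := withFirstByte(s, 'H')

	fmt.Println(s) // hello
	fmt.Println(t) // Hello
}
```

Changing a string therefore always means building a new one, here by way of a byte slice copy.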
package main

import (
    "fmt"
    "unicode/utf8"
)

func main() {

The constant s holds the Thai greeting สวัสดี ("hello").
Go string literals are UTF-8 encoded text.
    const s = "สวัสดี"

Since strings are equivalent to byte slices, this
prints the length of the raw bytes stored within.
    fmt.Println("Len:", len(s))

Indexing into a string produces the raw byte values at
each index. This loop generates the hex values of all
the bytes that constitute the code points in s.
    for i := 0; i < len(s); i++ {
        fmt.Printf("%x ", s[i])
    }
    fmt.Println()
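
To count how many runes are in a string, we can use
the utf8 package. RuneCountInString has to decode each
UTF-8 sequence in turn, so its running time grows with
the length of the string. Each Thai character here
takes three bytes, so the rune count is smaller than
the byte length.
    fmt.Println("Rune count:", utf8.RuneCountInString(s))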
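
A range loop handles strings specially: it decodes
each rune and yields it along with its starting byte
offset in the string.
    for idx, runeValue := range s {
        fmt.Printf("%#U starts at %d\n", runeValue, idx)
    }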

We can achieve the same iteration by using the
utf8.DecodeRuneInString function explicitly.
    fmt.Println("\nUsing DecodeRuneInString")
    for i, w := 0, 0; i < len(s); i += w {
        runeValue, width := utf8.DecodeRuneInString(s[i:])
        fmt.Printf("%#U starts at %d\n", runeValue, i)
        w = width

This demonstrates passing a rune value to a function.
        examineRune(runeValue)
    }
}

func examineRune(r rune) {

Values enclosed in single quotes are rune literals. We
can compare a rune value to a rune literal directly.
    if r == 't' {
        fmt.Println("found tee")
    } else if r == 'ส' {
        fmt.Println("found so sua")
    }
}

$ go run strings-and-runes.go
Len: 18
e0 b8 aa e0 b8 a7 e0 b8 b1 e0 b8 aa e0 b8 94 e0 b8 b5
Rune count: 6
U+0E2A 'ส' starts at 0
U+0E27 'ว' starts at 3
U+0E31 'ั' starts at 6
U+0E2A 'ส' starts at 9
U+0E14 'ด' starts at 12
U+0E35 'ี' starts at 15

Using DecodeRuneInString
U+0E2A 'ส' starts at 0
found so sua
U+0E27 'ว' starts at 3
U+0E31 'ั' starts at 6
U+0E2A 'ส' starts at 9
found so sua
U+0E14 'ด' starts at 12
U+0E35 'ี' starts at 15
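
Because indexing a string yields bytes, picking out the n-th character needs a decoding step. The sketch below shows one possible way to do that, assuming a hypothetical helper named nthRune (it is not part of the example above or of the standard library). It converts the string to a []rune, which costs a full pass over the string and an allocation, so it suits short strings better than hot loops.

```go
package main

import "fmt"

// nthRune is a hypothetical helper: it returns the n-th rune of s
// (counting from zero) and reports whether that position exists.
func nthRune(s string, n int) (rune, bool) {
	// Converting to []rune decodes the whole string up front.
	runes := []rune(s)
	if n < 0 || n >= len(runes) {
		return 0, false
	}
	return runes[n], true
}

func main() {
	const s = "สวัสดี"

	// Byte indexing returns one byte from the middle of the first code point.
	fmt.Printf("s[1] = %x\n", s[1]) // s[1] = b8

	// Rune indexing returns the second code point.
	if r, ok := nthRune(s, 1); ok {
		fmt.Printf("rune 1 = %#U\n", r) // rune 1 = U+0E27 'ว'
	}
}
```

When only a single pass is needed, the range loop or utf8.DecodeRuneInString shown earlier avoids the extra []rune allocation.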