Master Golang: Strings and Runes

A Go string is a read-only slice of bytes. The language and the standard library treat strings specially - as containers of text encoded in UTF-8. In other languages, strings are made of “characters”. In Go, the concept of a character is called a `rune` - it’s an integer that represents a Unicode code point. This Go blog post is a good introduction to the topic.
	package main
	import ( "fmt" "unicode/utf8" )
	func main() {
`s` is a `string` assigned a literal value representing the word “hello” in the Thai language. Go string literals are UTF-8 encoded text.	const s = "สวัสดี"
Since strings are equivalent to `[]byte`, this will produce the length of the raw bytes stored within.	fmt.Println("Len:", len(s))
Indexing into a string produces the raw byte values at each index. This loop generates the hex values of all the bytes that constitute the code points in `s`.	for i := 0; i < len(s); i++ { fmt.Printf("%x ", s[i]) } fmt.Println()
To count how many runes are in a string, we can use the `utf8` package. Note that the run-time of `RuneCountInString` depends on the size of the string, because it has to decode each UTF-8 rune sequentially. Some Thai characters are represented by multiple UTF-8 code points, so the result of this count may be surprising.	fmt.Println("Rune count:", utf8.RuneCountInString(s))
A `range` loop handles strings specially and decodes each `rune` along with its offset in the string.	for idx, runeValue := range s { fmt.Printf("%#U starts at %d\n", runeValue, idx) }
We can achieve the same iteration by using the `utf8.DecodeRuneInString` function explicitly.	fmt.Println("\nUsing DecodeRuneInString") for i, w := 0, 0; i < len(s); i += w { runeValue, width := utf8.DecodeRuneInString(s[i:]) fmt.Printf("%#U starts at %d\n", runeValue, i) w = width
This demonstrates passing a `rune` value to a function.	examineRune(runeValue) } }
	func examineRune(r rune) {
Values enclosed in single quotes are rune literals. We can compare a `rune` value to a rune literal directly.	if r == 't' { fmt.Println("found tee") } else if r == 'ส' { fmt.Println("found so sua") } }

$ go run strings-and-runes.go
Len: 18
e0 b8 aa e0 b8 a7 e0 b8 b1 e0 b8 aa e0 b8 94 e0 b8 b5 
Rune count: 6
U+0E2A 'ส' starts at 0
U+0E27 'ว' starts at 3
U+0E31 'ั' starts at 6
U+0E2A 'ส' starts at 9
U+0E14 'ด' starts at 12
U+0E35 'ี' starts at 15

Using DecodeRuneInString
U+0E2A 'ส' starts at 0
found so sua
U+0E27 'ว' starts at 3
U+0E31 'ั' starts at 6
U+0E2A 'ส' starts at 9
found so sua
U+0E14 'ด' starts at 12
U+0E35 'ี' starts at 15

Next example: Structs.