All posts

Zig / Strings in 5 minutes

Posted On 01.04.2022

Just like C/C++, you can define a string in Zig with an array of bytes or as a null-terminated string literal.

var s = "hello"; // *const [5:0]u8
var s = [_]u8{'h', 'e', 'l', 'l', 'o'}; // [5]u8

You can reference to any byte in the string just like an array:

var s = "hello";
s[2] == 'l' // true

Or reference to a substring as a slice:

s[1..3] == "ell"
s[1..] == "ello"

Note that these references are to the bytes not the characters:

var s = "Đ";
s[0] == '\c4'
s[1] == '\90'

For better Unicode supports, you should look at libraries like zigstr and ziglyph.

You can use the array concatenation ++ operator to combine two strings together:

const a = "hello";
const b = "world";
const c = a ++ b;
c == "helloworld" // true

Get the length of a string by .len:

var len = c.len;
len == 10 // true

Since Zig is a very simple language, there’s nothing such as built-in String type, so you gotta manipulate strings manually, just like C. But you can use some help from std.mem.

For example, we can use std.mem.indexOf to find the byte offset of some content in a string:

var found = std.mem.indexOf(u8, c, "w");
found == 5 // true

Or if you want to split a string into multiple substrings, std.mem.split will return an iterator for it:

var s = "hello world this is a test";
var splits = std.mem.split(u8, s, " ");
while (splits.next()) |chunk| {
  std.debug.print("{s}\n", .{chunk});
}

The output would be:

hello
world
this
is
a
test

When s is defined as a string literal, they will be stored as a null-terminated byte array in a global data section in the executable after compilation process. And what you get is a pointer to that byte array *const [N:0]u8 (N is the length of the string and :0 indicates null termination).

And you can’t mutate the literal string’s content like this:

var s = "good morning";
s[0] = 'm'; // error: cannot assign to constant

In order to do so, you must dereference it with .* operator, a new copy of the string will be created as an array so you can mutate it:

var s = "good morning".*;
s[0] = 'm';
s == "mood morning" // true
std.mem.reverse(u8, s);
s == "gninrom doom" // true

Also, you can dynamically create a formatted string with std.fmt:

const allocator = std.heap.page_allocator;
var distance: i32 = 7857;
var str = std.fmt.allocPrint(allocator, "SJC-SGN Distance = {d} miles", .{distance}) catch "format failed";

Thank you so much, everyone on Reddit and Lobsters, for the great feedbacks and discussions that helped me improve this article!!!