I'm currently recharging my batteries on the beach and soaking up the sun. But as much as I love the quiet, I can't quite ignore the call of coding, can I? So, I thought, why not blend the relaxation of summer with the thrill of exploring new coding concepts?

With that in mind, I'm thrilled to kick off a new short vacation series of newsletters - Sea (#), Sun, and Shells - that will combine the best of both worlds. We will explore fun and interesting topics, each with a unique summer twist!

Let's start with something we're seeing a lot of these days: the sun, or more specifically, the sun emoji ☀️.

Ever wondered how we store and use this emoji in our C# programs? That's what we're going to learn today - diving deep into the world of Unicode and the .NET System.Text.Rune structure.

Unicode is a universal character encoding standard that covers almost all of the world's written languages. Its code space has room for over a million code points (1,114,112 in total), which include not only letters from various languages but also symbols and emojis, like our sun emoji ☀️.

In the world of .NET, we have the System.Text.Rune structure, which was introduced in .NET Core 3.0. It represents a Unicode scalar value, meaning any code point in the ranges [U+0000..U+D7FF] and [U+E000..U+10FFFF]; the gap between the two is reserved for UTF-16 surrogates. Use a Rune when you want to handle a single Unicode scalar value, such as the sun emoji.

System.Char in .NET is used to represent a UTF-16 code unit and a System.String is a sequence of these UTF-16 code units. Because the Unicode standard is so vast, not all Unicode scalar values can fit into a single System.Char: anything above U+FFFF takes two chars, known as a surrogate pair. This is where Rune comes in, holding any Unicode scalar value as a single value.
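To make the difference between chars and Runes tangible, here is a minimal sketch that counts both for the same string. The sample string and the variable names are my own choices for illustration; it assumes .NET Core 3.0 or later and C# top-level statements.

```csharp
using System;
using System.Text;

// "\u2600" is the sun symbol (one UTF-16 code unit);
// "\U0001F31E" is the sun with face emoji (a surrogate pair).
string text = "\u2600\U0001F31E";

int charCount = text.Length; // counts UTF-16 code units
int runeCount = 0;           // will count Unicode scalar values

foreach (Rune rune in text.EnumerateRunes())
{
    runeCount++;
    Console.WriteLine($"U+{rune.Value:X4} uses {rune.Utf16SequenceLength} char(s)");
}

Console.WriteLine($"chars: {charCount}, runes: {runeCount}");
```

The string holds two symbols, yet its Length is 3, because the second symbol needs two chars.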

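Take the sun emoji ☀️ for instance. Its base character is U+2600 in Unicode (the colorful emoji rendering adds the variation selector U+FE0F after it). U+2600 lies in the Basic Multilingual Plane, so it still fits into a single char (which covers U+0000 to U+FFFF), and the System.Text.Rune struct handles it just as well. Rune becomes essential for scalar values above U+FFFF, such as the sun with face emoji 🌞 (U+1F31E), which needs two chars in a string but is still a single Rune.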

Here's how you can declare a Rune for the sun:

using System.Text;
var sun = new Rune(0x2600); // U+2600, the sun symbol

And to convert it back to a string:

var sunString = sun.ToString();
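If you're curious how many UTF-16 code units a Rune turns into, here is a small sketch that compares the sun with a symbol from beyond U+FFFF. The second Rune (sun with face, U+1F31E) and the variable names are my own additions for illustration.

```csharp
using System;
using System.Text;

var sun = new Rune(0x2600);          // the sun symbol, U+2600
var sunWithFace = new Rune(0x1F31E); // sun with face emoji, U+1F31E

string sunString = sun.ToString();
string sunWithFaceString = sunWithFace.ToString();

// IsBmp tells whether the value fits in a single char;
// Utf16SequenceLength is the number of chars ToString will produce.
Console.WriteLine($"{sun.IsBmp} {sun.Utf16SequenceLength} {sunString.Length}");
Console.WriteLine($"{sunWithFace.IsBmp} {sunWithFace.Utf16SequenceLength} {sunWithFaceString.Length}");
```

The first line prints True 1 1 and the second False 2 2: the same Rune type covers both cases, and only the string form differs in length.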

Under the hood, System.Text.Rune provides several key functionalities:

  1. Handles Unicode Scalar Values: It can represent and manipulate any Unicode scalar value, including characters that a single char cannot hold.
  2. Performs String Iteration and Validation: With methods like Rune.DecodeFromUtf16, you can accurately walk through a string, decode each Rune, and detect whether the string is a valid UTF-16 sequence. This is especially important for strings that contain Unicode scalar values outside the Basic Multilingual Plane (BMP), where many emojis live.
  3. Represents Characters as Integers: A Rune is essentially a wrapper around an integer holding a Unicode scalar value (exposed through its Value property), which is why you can create a new Rune by passing an integer to the constructor.
  4. Converts Strings: The ToString method converts a Rune back to a string by encoding it as one or two UTF-16 code units, so you can process it like any regular string.
  5. Checks Validity: System.Text.Rune includes methods to check the validity of a Unicode scalar value. For example, the static IsValid method determines whether a specified code point is a valid Unicode scalar value.
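The points above can be sketched in a few lines. This is only an illustration under my own assumptions: the sample input (which deliberately ends with a lone high surrogate) and the variable names are made up for the example.

```csharp
using System;
using System.Buffers;
using System.Text;

// Validity checks: surrogate code points are not scalar values.
bool sunIsValid = Rune.IsValid(0x2600);               // true
bool surrogateIsValid = Rune.IsValid(0xD800);         // false
bool created = Rune.TryCreate(0x110000, out Rune _);  // false: beyond U+10FFFF

// Decode a string one Rune at a time with Rune.DecodeFromUtf16.
// The last char is a lone high surrogate, so the input is not well-formed UTF-16.
ReadOnlySpan<char> remaining = "\u2600\U0001F31E\uD83C";
int validRunes = 0;
bool wellFormed = true;

while (!remaining.IsEmpty)
{
    OperationStatus status = Rune.DecodeFromUtf16(remaining, out Rune rune, out int charsConsumed);
    if (status == OperationStatus.Done)
    {
        validRunes++;
        Console.WriteLine($"U+{rune.Value:X4} -> \"{rune}\"");
    }
    else
    {
        // InvalidData or NeedMoreData: the chars at this position are not a valid sequence.
        wellFormed = false;
    }
    remaining = remaining.Slice(charsConsumed);
}

Console.WriteLine($"valid runes: {validRunes}, well-formed: {wellFormed}");
```

Checking the returned OperationStatus is what makes this a validation loop as well as an iteration: anything other than Done means the string contains a broken surrogate sequence at that position.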

This suNs it up for today! (pun intended, indeed)

Whether you're currently on vacation, looking forward to one, or reminiscing about a recent one, I hope this newsletter adds a dash of sunny coding to your day ☀️.

Stay tuned for the next issue of "Sea (#), Sun, and Shells", and remember - keep coding, even on the beach!