RichText Module
Rich text processing for Bluesky posts. Detects mentions, links, and hashtags in text and resolves them to facets with correct UTF-8 byte offsets as required by the AT Protocol.
Types
| Type | Description |
| DetectedFacet | A facet detected in rich text, with UTF-8 byte offsets and extracted content. Byte offsets are used rather than character indices because the AT Protocol specifies facet positions in UTF-8 byte coordinates. |
| RichTextSegment | A segment of rich text for rendering. Each segment has plain text content and an optional facet that applies to the entire segment. |
| RichTextValue | A rich text value: text content paired with its resolved facets. All facet indices are UTF-8 byte offsets. |
Functions and values
| Function or value | Description |
Full Usage:
byteLength text
Parameters:
text : string
-
The text to measure.
Returns: int
The number of bytes when the text is encoded as UTF-8.
|
Count the UTF-8 byte length of a string. The AT Protocol specifies facet positions and text limits in UTF-8 bytes.
|
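As a rough illustration of what a UTF-8 byte count means in .NET, the sketch below uses `System.Text.Encoding.UTF8.GetByteCount`. The helper name `utf8ByteLength` is hypothetical, and this is not claimed to be how `byteLength` is implemented.

```fsharp
open System.Text

/// Hypothetical helper (not the library's source): UTF-8 byte count of a string.
let utf8ByteLength (text: string) : int =
    Encoding.UTF8.GetByteCount text

// ASCII characters take one byte each.
printfn "%d" (utf8ByteLength "hello")            // 5
// U+00E9 (e with acute accent) takes two bytes in UTF-8.
printfn "%d" (utf8ByteLength "h\u00E9llo")       // 6
// U+1F44B (waving hand) takes four bytes, while String.Length reports 2.
printfn "%d" (utf8ByteLength "\U0001F44B")       // 4
```

The gap between byte count and `String.Length` is why facet offsets cannot be taken directly from .NET string indices.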
|
Create a RichTextValue from text and facets.
|
Full Usage:
delete byteStart byteEnd rt
Parameters:
byteStart : int
-
The start of the byte range to delete (inclusive).
byteEnd : int
-
The end of the byte range to delete (exclusive).
rt : RichTextValue
-
The rich text value to modify.
Returns: RichTextValue
A new RichTextValue with the range deleted and facet indices adjusted.
|
Delete a UTF-8 byte range from rich text. Facets are adjusted: facets entirely within the deleted range are removed, facets partially overlapping are truncated, and facets after the deleted range are shifted back.
|
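The adjustment rules described for `delete` can be sketched on bare byte ranges. The `Span` record and `adjustForDelete` function below are hypothetical names for illustration only; the library's own types and its exact handling of edge cases may differ.

```fsharp
/// Hypothetical stand-in for a facet's byte range: [ByteStart, ByteEnd).
type Span = { ByteStart: int; ByteEnd: int }

/// Sketch: map one span through the deletion of [delStart, delEnd).
/// Returns None when the span ends up empty (it was entirely deleted).
let adjustForDelete (delStart: int) (delEnd: int) (span: Span) : Span option =
    let removed = delEnd - delStart
    // Map a single byte position through the deletion.
    let shift (pos: int) =
        if pos <= delStart then pos            // before the deleted range
        elif pos >= delEnd then pos - removed  // after it: shift back
        else delStart                          // inside it: clamp to the cut
    let adjusted = { ByteStart = shift span.ByteStart; ByteEnd = shift span.ByteEnd }
    if adjusted.ByteEnd > adjusted.ByteStart then Some adjusted else None

// Deleting bytes 5..10:
printfn "%A" (adjustForDelete 5 10 { ByteStart = 0; ByteEnd = 4 })    // unchanged
printfn "%A" (adjustForDelete 5 10 { ByteStart = 6; ByteEnd = 9 })    // None (removed)
printfn "%A" (adjustForDelete 5 10 { ByteStart = 8; ByteEnd = 15 })   // truncated to 5..10
printfn "%A" (adjustForDelete 5 10 { ByteStart = 12; ByteEnd = 20 })  // shifted to 7..15
```

Mapping each endpoint independently keeps the three cases (remove, truncate, shift) in one small function.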
Full Usage:
detect text
Parameters:
text : string
-
The text to scan for rich text entities.
Returns: DetectedFacet list
A list of DetectedFacet values sorted by byte start position.
Mentions match @handle.domain, links match http(s)://...,
and hashtags match #tag patterns.
|
Detect mentions, links, and hashtags in text. Returns facets with UTF-8 byte offsets, sorted by start position. This performs detection only. To resolve mentions to DIDs and produce Facet records suitable for the API, pass the result to resolve or use parse for a combined detect-and-resolve step.
|
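To show how detection with UTF-8 byte offsets might work in principle, here is a minimal sketch built on .NET regular expressions. The `SketchFacet` record, the `detectSketch` function, and the three patterns are all assumptions made for illustration; the library's actual matching rules are likely stricter, and this sketch does not handle overlapping matches (for example, a `#` inside a URL).

```fsharp
open System.Text
open System.Text.RegularExpressions

/// Hypothetical record for illustration; the library's DetectedFacet may differ.
type SketchFacet = { ByteStart: int; ByteEnd: int; Kind: string; Value: string }

// Deliberately simplified patterns (assumptions, not the library's rules).
let patterns =
    [ "mention", Regex(@"@[a-zA-Z0-9-]+(?:\.[a-zA-Z0-9-]+)+")
      "link", Regex(@"https?://\S+")
      "tag", Regex(@"#\w+") ]

let detectSketch (text: string) : SketchFacet list =
    // Convert a UTF-16 character index into a UTF-8 byte offset.
    let byteOffset (charIndex: int) =
        Encoding.UTF8.GetByteCount(text.Substring(0, charIndex))
    [ for (kind, regex) in patterns do
        for m in regex.Matches text |> Seq.cast<Match> do
            yield { ByteStart = byteOffset m.Index
                    ByteEnd = byteOffset (m.Index + m.Length)
                    Kind = kind
                    Value = m.Value } ]
    |> List.sortBy (fun f -> f.ByteStart)

// U+00E9 occupies two bytes, so the mention starts at byte 7, not character index 6.
detectSketch "h\u00E9llo @alice.bsky.social #fsharp"
|> List.iter (fun f -> printfn "%s %d-%d %s" f.Kind f.ByteStart f.ByteEnd f.Value)
// mention 7-25 @alice.bsky.social
// tag 26-33 #fsharp
```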
Full Usage:
graphemeLength text
Parameters:
text : string
-
The text to measure.
Returns: int
The number of extended grapheme clusters in the text.
|
Count the number of grapheme clusters (user-perceived characters) in a string. Bluesky uses grapheme length for the 300-character post limit. Grapheme length differs from String.Length, which counts UTF-16 code units, for multi-codepoint characters such as emoji (e.g., family emoji, flag emoji) and combining character sequences.
|
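One way to count grapheme clusters in .NET is `System.Globalization.StringInfo`, which follows extended grapheme cluster rules on .NET 5 and later (earlier runtimes may report different counts for emoji sequences). The helper name `graphemeCount` is hypothetical, and this sketch is not claimed to be how `graphemeLength` is implemented.

```fsharp
open System.Globalization

/// Hypothetical helper: number of text elements (grapheme clusters) in a string.
/// Assumes .NET 5 or later.
let graphemeCount (text: string) : int =
    let info = StringInfo text
    info.LengthInTextElements

// "e" followed by U+0301 (combining acute accent): two UTF-16 units, one grapheme.
let combining = "e\u0301"
printfn "%d %d" combining.Length (graphemeCount combining)   // 2 1

// Regional indicators J + P (flag of Japan): four UTF-16 units, one grapheme.
let flag = "\U0001F1EF\U0001F1F5"
printfn "%d %d" flag.Length (graphemeCount flag)             // 4 1
```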
Full Usage:
insert bytePos insertText rt
Parameters:
bytePos : int
-
The UTF-8 byte offset at which to insert.
insertText : string
-
The text to insert.
rt : RichTextValue
-
The rich text value to modify.
Returns: RichTextValue
A new RichTextValue with the text inserted and facet indices adjusted.
|
Insert text at a UTF-8 byte index. Facets that start at or after the insertion point are shifted forward. Facets that span the insertion point are expanded.
|
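The shifting and expanding behavior described for `insert` can be sketched on bare byte ranges. `Span` and `adjustForInsert` are hypothetical names used only for illustration, and the treatment of a facet that ends exactly at the insertion point (left unchanged here) is an assumption.

```fsharp
open System.Text

/// Hypothetical stand-in for a facet's byte range: [ByteStart, ByteEnd).
type Span = { ByteStart: int; ByteEnd: int }

/// Sketch: map one span through an insertion of `insertedBytes` at `bytePos`.
let adjustForInsert (bytePos: int) (insertedBytes: int) (span: Span) : Span =
    if span.ByteStart >= bytePos then
        // Starts at or after the insertion point: shift the whole span forward.
        { ByteStart = span.ByteStart + insertedBytes
          ByteEnd = span.ByteEnd + insertedBytes }
    elif span.ByteEnd > bytePos then
        // Spans the insertion point: expand to cover the inserted text.
        { span with ByteEnd = span.ByteEnd + insertedBytes }
    else
        // Ends at or before the insertion point: unchanged (assumption).
        span

// The inserted length must be measured in UTF-8 bytes, not characters:
// "tr\u00E8s " is 5 characters but 6 bytes.
let inserted = Encoding.UTF8.GetByteCount "tr\u00E8s "

printfn "%A" (adjustForInsert 6 inserted { ByteStart = 0; ByteEnd = 5 })   // unchanged
printfn "%A" (adjustForInsert 6 inserted { ByteStart = 4; ByteEnd = 10 })  // expanded to 4..16
printfn "%A" (adjustForInsert 6 inserted { ByteStart = 6; ByteEnd = 12 })  // shifted to 12..18
```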
Full Usage:
resolve agent detected
Parameters:
agent : AtpAgent
-
An authenticated AtpAgent.
detected : DetectedFacet list
-
The list of DetectedFacet values to resolve (typically from detect).
Returns: Task<Facet list>
A list of Facet records with resolved features.
Mentions whose handles cannot be resolved are omitted from the result.
|
Resolve detected facets into API-ready facet records.
Mentions are resolved to DIDs via
|
Full Usage:
sanitize rt
Parameters:
rt : RichTextValue
-
The rich text value to sanitize.
Returns: RichTextValue
A new RichTextValue with cleaned text and adjusted facets.
|
Sanitize rich text by trimming leading/trailing whitespace, collapsing runs of spaces/tabs to a single space, normalizing \r\n to \n, and limiting consecutive newlines to two. Facet indices are recomputed after sanitization by re-detecting facets from the cleaned text. Because sanitization can change byte offsets in non-trivial ways (e.g., collapsing multiple spaces to one), facet positions are recalculated by mapping each original facet's byte range through the transformation. Facets whose ranges become empty or invalid after sanitization are removed.
|
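The text-cleaning rules listed for `sanitize` can be sketched as a small pipeline of string replacements. This covers the text transformation only; facet remapping is not shown. The function name `sanitizeText` and the order of the steps are assumptions for illustration.

```fsharp
open System.Text.RegularExpressions

/// Hypothetical helper: applies the four cleaning rules to plain text.
let sanitizeText (text: string) : string =
    // 1. Normalize Windows line endings.
    let normalized = text.Replace("\r\n", "\n")
    // 2. Collapse runs of spaces and tabs into a single space.
    let collapsed = Regex.Replace(normalized, @"[ \t]+", " ")
    // 3. Limit consecutive newlines to two.
    let limited = Regex.Replace(collapsed, @"\n{3,}", "\n\n")
    // 4. Trim leading and trailing whitespace.
    limited.Trim()

let cleaned = sanitizeText "  hello \t world\r\n\r\n\r\n\r\nbye  "
printfn "%A" cleaned   // "hello world\n\nbye"
```

Because steps 2 and 3 shorten the text by varying amounts, any facet offsets computed before cleaning no longer line up afterwards, which is why the facets have to be adjusted.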
Full Usage:
segments rt
Parameters:
rt : RichTextValue
-
The rich text value to segment.
Returns: RichTextSegment list
A list of RichTextSegment values covering the entire text.
|
Split rich text into segments by facet boundaries for rendering.
Each segment has plain text and an optional facet. Non-faceted text
between facets becomes segments with no facet.
|
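Segmenting by facet boundaries can be sketched by slicing the UTF-8 bytes of the text. The `Span` and `Segment` records and the `segmentSketch` function are hypothetical, and the sketch assumes the spans do not overlap and fall on valid UTF-8 boundaries.

```fsharp
open System.Text

/// Hypothetical stand-in for a facet's byte range: [ByteStart, ByteEnd).
type Span = { ByteStart: int; ByteEnd: int }
/// Hypothetical segment: text plus the span that covers it, if any.
type Segment = { Text: string; Facet: Span option }

let segmentSketch (text: string) (spans: Span list) : Segment list =
    let bytes = Encoding.UTF8.GetBytes text
    // Decode the byte range [s, e) back into a string.
    let slice (s: int) (e: int) = Encoding.UTF8.GetString(bytes, s, e - s)
    let rec loop (pos: int) (remaining: Span list) (acc: Segment list) =
        match remaining with
        | [] ->
            // Trailing plain text after the last span, if any.
            let acc =
                if pos < bytes.Length then
                    { Text = slice pos bytes.Length; Facet = None } :: acc
                else acc
            List.rev acc
        | span :: rest ->
            // Plain text between the previous span and this one, if any.
            let acc =
                if span.ByteStart > pos then
                    { Text = slice pos span.ByteStart; Facet = None } :: acc
                else acc
            let faceted = { Text = slice span.ByteStart span.ByteEnd; Facet = Some span }
            loop span.ByteEnd rest (faceted :: acc)
    loop 0 (List.sortBy (fun s -> s.ByteStart) spans) []

segmentSketch "hi @bob.test!" [ { ByteStart = 3; ByteEnd = 12 } ]
|> List.iter (fun s -> printfn "%A %b" s.Text s.Facet.IsSome)
// "hi " false
// "@bob.test" true
// "!" false
```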
Full Usage:
truncate maxBytes rt
Parameters:
maxBytes : int
-
The maximum number of UTF-8 bytes to keep.
rt : RichTextValue
-
The rich text value to truncate.
Returns: RichTextValue
A new RichTextValue truncated to the byte limit. Facets that would extend
beyond the truncated text are removed (not partially preserved).
|
Truncate rich text to a maximum UTF-8 byte length while preserving facet integrity. The text is truncated at the byte limit (on a valid UTF-8 boundary), and any facets that extend beyond the limit are removed entirely.
|
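Truncating on a valid UTF-8 boundary can be sketched by backing up from the byte limit while the first dropped byte is a continuation byte (bit pattern `10xxxxxx`). `Span` and `truncateSketch` are hypothetical names for illustration. This sketch respects code point boundaries only; it may still split a multi-codepoint grapheme cluster, and the library's behavior there is not specified above.

```fsharp
open System.Text

/// Hypothetical stand-in for a facet's byte range: [ByteStart, ByteEnd).
type Span = { ByteStart: int; ByteEnd: int }

let truncateSketch (maxBytes: int) (text: string) (spans: Span list) : string * Span list =
    let bytes = Encoding.UTF8.GetBytes text
    if bytes.Length <= maxBytes then
        (text, spans)
    else
        // `cut` is the index of the first dropped byte. If that byte is a
        // continuation byte, the cut would split a character, so back up.
        let mutable cut = max 0 maxBytes
        while cut > 0 && (bytes.[cut] &&& 0xC0uy) = 0x80uy do
            cut <- cut - 1
        let kept = Encoding.UTF8.GetString(bytes, 0, cut)
        // Drop any span that extends past the cut rather than shortening it.
        (kept, spans |> List.filter (fun s -> s.ByteEnd <= cut))

// A limit of 2 bytes would split U+00E9 (bytes 1..2), so only "h" is kept.
printfn "%A" (truncateSketch 2 "h\u00E9llo world" [])

// The tag spans bytes 4..11, so a limit of 8 removes it entirely.
printfn "%A" (truncateSketch 8 "see #fsharp now" [ { ByteStart = 4; ByteEnd = 11 } ])
```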