Recap
This is the third part of the blog series on Programming Extensible Data Types in Rust with CGP. You can read the first and second parts here.
As a recap, we have covered the new release of CGP v0.4.2 which now supports the use of extensible records and variants, allowing developers to write code that operates on any struct containing specific fields or any enum containing specific variants, without needing their concrete definition.
In the first part of the series, Modular App Construction and Extensible Builders, we demonstrated an example use of the extensible builder pattern, which uses extensible records to support modular construction of an application context.
Similarly, in the second part of the series, Modular Interpreters and Extensible Visitors, we saw how the modular visitor pattern allows us to implement evaluation and to-Lisp conversion for each variant of a language expression enum using separate visitor providers.
At this point, you’ve likely seen how these patterns can make real-world applications more modular and maintainable. If these examples have convinced you of CGP’s practical value, that’s great. But if you still feel the examples are not grounded enough in production use cases, you are welcome to pause here and revisit CGP later.
The next two parts of the series are aimed at readers who want to go deeper — those interested in how CGP implements extensible data types under the hood, and who might even want to contribute to CGP itself by helping to build the real-world examples you’re looking for.
We will first explore the implementation of extensible records in this part, followed by the implementation of extensible variants in the coming Part 4.
Series Overview
Part 1: Modular App Construction and Extensible Builders – In this introductory part, we present a high-level overview of the key features enabled by extensible data types. We then dive into a hands-on demonstration showing how extensible records can be used to build and compose modular builders for real-world applications.
Part 2: Modular Interpreters and Extensible Visitors – This part continues the demonstration by introducing extensible variants. We use them to address the expression problem, implementing a set of reusable interpreter components for a small toy language.
Part 3: Implementing Extensible Records (this post) – Here, we walk through the internal mechanics behind extensible records. We show how CGP supports the modular builder pattern demonstrated in Part 1 through its underlying type and trait machinery.
Part 4: Implementing Extensible Variants – This part mirrors Part 3, but for extensible variants. We examine how extensible variants are implemented, and compare the differences and similarities between extensible records and variants.
Underlying Theory
The design of extensible data types in CGP is inspired by a rich body of research in the programming languages community. In particular, CGP draws heavily from the implementation techniques found in datatype-generic programming in Haskell, as well as ideas presented in the paper Abstracting extensible data types: or, rows by any other name. Concepts similar to CGP’s have also appeared in other languages under names like row polymorphism and polymorphic variants.
While it might be too academic — or simply too dry — for this post to delve into all the theoretical differences between CGP and these approaches, it’s worth highlighting one essential distinction: CGP’s seamless integration of extensible data types with the CGP system itself. This tight integration sets CGP apart by making the extensibility not just a design principle, but a native part of how you build and scale software in Rust.
Constraint Propagation Problem
At the heart of CGP lies a powerful solution to a common challenge:
how can we compose two generic functions that each rely on their own row constraints, and automatically propagate those constraints into the resulting composed function?
To illustrate this problem, consider the following simple example:
pub fn first_name_to_string<Context>(context: &Context) -> String
where
    Context: HasField<symbol!("first_name"), Value: Display>,
{
    context.get_field(PhantomData).to_string()
}

pub fn last_name_to_string<Context>(context: &Context) -> String
where
    Context: HasField<symbol!("last_name"), Value: Display>,
{
    context.get_field(PhantomData).to_string()
}
In this example, we define two functions that extract the first_name and last_name fields from a generic Context type and convert them to strings. The trait HasField is used to represent a row constraint, where field names are expressed as type-level strings using symbol!.
Now, suppose we want to combine the output of these two functions into a full name string. We can write a function that explicitly concatenates the two results, like so:
pub fn full_name_to_string<Context>(context: &Context) -> String
where
    Context: HasField<symbol!("first_name"), Value: Display>
        + HasField<symbol!("last_name"), Value: Display>,
{
    format!(
        "{} {}",
        first_name_to_string(context),
        last_name_to_string(context)
    )
}
This works, but requires us to manually specify all the field constraints in the function signature. This becomes increasingly cumbersome as the number of required constraints grows.
More critically, it becomes difficult to write generic higher-order functions that can accept any two such functions with their own constraints, and automatically compose them into a new function that correctly propagates those constraints. For instance, consider this naive attempt:
pub fn concat_outputs<Context>(
    fn_a: impl Fn(&Context) -> String,
    fn_b: impl Fn(&Context) -> String,
) -> impl Fn(&Context) -> String {
    move |context| format!("{} {}", fn_a(context), fn_b(context))
}
Here, concat_outputs takes two functions and returns a new one that calls both and concatenates their results. However, the function signature does not capture the constraints required by fn_a and fn_b. As a result, when we use this to build a composed function, we still have to restate all of the necessary constraints explicitly:
pub fn full_name_to_string<Context>(context: &Context) -> String
where
    Context: HasField<symbol!("first_name"), Value: Display>
        + HasField<symbol!("last_name"), Value: Display>,
{
    let composed = concat_outputs(first_name_to_string, last_name_to_string);
    composed(context)
}
This works within the body of a function, but it prevents us from defining the composed function as a top-level value. In other words, we cannot simply write something like:
// Invalid export
pub let full_name_to_string = concat_outputs(first_name_to_string, last_name_to_string);
and then export that directly from a module. The constraints must still be declared manually somewhere, limiting the expressiveness and reusability of our composition.
Type-Level Composition
In much of the programming languages research, solving the problem of constraint-aware function composition typically requires advanced language features like constraint kinds. These features allow constraints to be expressed and manipulated at the type level, enabling functions with different requirements to be composed seamlessly. However, Rust does not yet support these advanced capabilities. This limitation has been one of the major reasons why extensible data types have not seen broader adoption in Rust and other mainstream languages.
The breakthrough that CGP introduces is the ability to perform this kind of composition in Rust today — without waiting for the language to evolve. The key insight is to represent functions as types, and then use CGP's system of traits and type-level programming to manage composition and constraint propagation.
To see how this works, let’s begin by transforming two basic functions into CGP providers. Instead of writing traditional functions, we define FirstNameToString and LastNameToString as types that implement the ComputerRef trait:
#[cgp_new_provider]
impl<Context, Code, Input> ComputerRef<Context, Code, Input> for FirstNameToString
where
    Context: HasField<symbol!("first_name"), Value: Display>,
{
    type Output = String;

    fn compute_ref(context: &Context, _code: PhantomData<Code>, _input: &Input) -> String {
        context.get_field(PhantomData).to_string()
    }
}

#[cgp_new_provider]
impl<Context, Code, Input> ComputerRef<Context, Code, Input> for LastNameToString
where
    Context: HasField<symbol!("last_name"), Value: Display>,
{
    type Output = String;

    fn compute_ref(context: &Context, _code: PhantomData<Code>, _input: &Input) -> String {
        context.get_field(PhantomData).to_string()
    }
}
In this setup, FirstNameToString and LastNameToString are no longer standalone functions, but rather types that can be plugged into the CGP system. Each type implements the ComputerRef trait and specifies its output as a String. For simplicity, we ignore the Code and Input parameters, as they are not essential for this example.
Once we have our functions defined as types, we can now build a new provider that composes them:
#[cgp_new_provider]
impl<Context, Code, Input, ProviderA, ProviderB> ComputerRef<Context, Code, Input>
    for ConcatOutputs<ProviderA, ProviderB>
where
    ProviderA: ComputerRef<Context, Code, Input, Output: Display>,
    ProviderB: ComputerRef<Context, Code, Input, Output: Display>,
{
    type Output = String;

    fn compute_ref(context: &Context, code: PhantomData<Code>, input: &Input) -> String {
        let output_a = ProviderA::compute_ref(context, code, input);
        let output_b = ProviderB::compute_ref(context, code, input);
        format!("{} {}", output_a, output_b)
    }
}
ConcatOutputs acts as a higher-order provider: it composes two other providers and returns a new one that computes both and combines their results. This is conceptually similar to a higher-order function that takes two closures and returns a new closure, except that it operates entirely at the type level. It requires that both inner providers implement ComputerRef and produce outputs that implement Display, ensuring that both results can be formatted into a string.
The real power of this approach becomes evident when we compose the two providers with a simple type alias:
pub type FullNameToString = ConcatOutputs<FirstNameToString, LastNameToString>;
With this definition, FullNameToString behaves like a single provider that computes the full name by combining the first and last name. What’s remarkable is that we don’t need to explicitly restate the underlying constraints from FirstNameToString and LastNameToString. Those constraints are inferred and propagated automatically, and will only be enforced at the point where FullNameToString is actually used.
This programming model effectively introduces lazy type-level computation. The logic for computing outputs is driven by trait implementations at the type level, but the actual evaluation only occurs when the provider is invoked. This allows CGP to perform complex, constraint-aware compositions without requiring language-level support.
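To make this composition concrete without depending on the cgp crate, here is a simplified, self-contained sketch. It replaces ComputerRef with a hypothetical Compute trait (dropping the Code and Input parameters), and uses unit structs in place of the symbol! tags; the structure otherwise mirrors the providers above.

```rust
use core::fmt::Display;
use core::marker::PhantomData;

// Simplified stand-ins for the document's machinery (no cgp crate needed).
pub trait GetField<Tag> {
    type Value;
    fn get_field(&self, _tag: PhantomData<Tag>) -> &Self::Value;
}

// A provider trait: a "function as a type" over a generic context.
pub trait Compute<Context> {
    type Output;
    fn compute(context: &Context) -> Self::Output;
}

// Unit structs standing in for symbol!("first_name") / symbol!("last_name").
pub struct FirstNameTag;
pub struct LastNameTag;

pub struct Person {
    pub first_name: String,
    pub last_name: String,
}

impl GetField<FirstNameTag> for Person {
    type Value = String;
    fn get_field(&self, _tag: PhantomData<FirstNameTag>) -> &String {
        &self.first_name
    }
}

impl GetField<LastNameTag> for Person {
    type Value = String;
    fn get_field(&self, _tag: PhantomData<LastNameTag>) -> &String {
        &self.last_name
    }
}

// Providers as types, each carrying its own row constraint.
pub struct FirstNameToString;
pub struct LastNameToString;

impl<Context> Compute<Context> for FirstNameToString
where
    Context: GetField<FirstNameTag, Value: Display>,
{
    type Output = String;
    fn compute(context: &Context) -> String {
        context.get_field(PhantomData).to_string()
    }
}

impl<Context> Compute<Context> for LastNameToString
where
    Context: GetField<LastNameTag, Value: Display>,
{
    type Output = String;
    fn compute(context: &Context) -> String {
        context.get_field(PhantomData).to_string()
    }
}

// Higher-order provider: composes two providers at the type level.
pub struct ConcatOutputs<A, B>(PhantomData<(A, B)>);

impl<Context, A, B> Compute<Context> for ConcatOutputs<A, B>
where
    A: Compute<Context, Output: Display>,
    B: Compute<Context, Output: Display>,
{
    type Output = String;
    fn compute(context: &Context) -> String {
        format!("{} {}", A::compute(context), B::compute(context))
    }
}

// The composition is a plain type alias; no constraints restated here.
pub type FullNameToString = ConcatOutputs<FirstNameToString, LastNameToString>;

fn main() {
    let person = Person {
        first_name: "John".to_owned(),
        last_name: "Smith".to_owned(),
    };
    // The providers' constraints are checked only here, at the use site.
    assert_eq!(FullNameToString::compute(&person), "John Smith");
}
```

Note that the type alias FullNameToString compiles without any where clause; the compiler only verifies the field constraints when compute is invoked on a concrete context.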
Base Implementation
HasField Trait
Now that we have a clearer understanding of how CGP builds on extensible data types, it is time to take a closer look at the key field-related traits that CGP introduces to support this functionality. We begin with the simplest and most foundational trait: HasField.
The HasField trait has been part of CGP since its early versions, but it is still worth revisiting here to bridge the gap toward the more advanced constructs that follow. Its definition is straightforward:
pub trait HasField<Tag> {
    type Value;

    fn get_field(&self, _tag: PhantomData<Tag>) -> &Self::Value;
}
This trait provides read-only access to individual fields of a struct, using Tag as the field identifier. The tag indicates which field we want to access, and can take one of two forms depending on the struct: symbol!("first_name") for named fields, or Index<0> for tuple-style fields. The associated type Value represents the type of the field being accessed, and the get_field method returns a reference to that field.
The use of PhantomData<Tag> as an argument may look unusual at first, but it plays a critical role in allowing Rust to infer which field is being requested. When multiple HasField implementations are available for a type, this allows the compiler to resolve the correct one.
Consider the following struct:
#[derive(HasField)]
pub struct Person {
    pub first_name: String,
    pub last_name: String,
}
When the HasField macro is derived, it automatically generates implementations for both fields as follows.
impl HasField<symbol!("first_name")> for Person {
    type Value = String;

    fn get_field(&self, _tag: PhantomData<symbol!("first_name")>) -> &Self::Value {
        &self.first_name
    }
}

impl HasField<symbol!("last_name")> for Person {
    type Value = String;

    fn get_field(&self, _tag: PhantomData<symbol!("last_name")>) -> &Self::Value {
        &self.last_name
    }
}
This allows generic code to access either field in a type-safe way without requiring access to the concrete types.
With HasField, we can write code that reads from a subset of fields on a struct, enabling a flexible and reusable programming model. However, if we want to construct such subsets — as is required for the extensible builder pattern — we will first need to introduce a few additional building blocks.
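Here is a small, self-contained sketch of generic field access in this style, with unit structs standing in for the symbol! tags (which come from the cgp crate):

```rust
use core::marker::PhantomData;

// The HasField trait, as defined above.
pub trait HasField<Tag> {
    type Value;
    fn get_field(&self, _tag: PhantomData<Tag>) -> &Self::Value;
}

// Unit-struct tags standing in for symbol!("first_name") / symbol!("last_name").
pub struct FirstName;
pub struct LastName;

pub struct Person {
    pub first_name: String,
    pub last_name: String,
}

impl HasField<FirstName> for Person {
    type Value = String;
    fn get_field(&self, _tag: PhantomData<FirstName>) -> &String {
        &self.first_name
    }
}

impl HasField<LastName> for Person {
    type Value = String;
    fn get_field(&self, _tag: PhantomData<LastName>) -> &String {
        &self.last_name
    }
}

// Generic code that works on any context with a first_name field of type String.
fn greet<Context: HasField<FirstName, Value = String>>(context: &Context) -> String {
    // The PhantomData argument tells the compiler which HasField impl to use.
    format!("Hello, {}!", context.get_field(PhantomData))
}

fn main() {
    let person = Person {
        first_name: "John".to_owned(),
        last_name: "Smith".to_owned(),
    };
    assert_eq!(greet(&person), "Hello, John!");
}
```

Because greet only constrains the first_name field, it would work unchanged on any other struct with that field, such as an Employee.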
Partial Records
One of the core limitations of Rust’s struct system is that when constructing a value, you must provide values for all of its fields. This rigidity makes it difficult to build structs in a piecemeal fashion. To overcome this, CGP introduces the idea of partial records — a way to represent a struct with some fields intentionally left out.
Let’s consider the same Person struct used previously:
#[derive(BuildField)]
pub struct Person {
    pub first_name: String,
    pub last_name: String,
}
When we derive the BuildField trait for this struct, CGP automatically generates a corresponding partial version of the struct called PartialPerson:
pub struct PartialPerson<F0: MapType, F1: MapType> {
    pub first_name: F0::Map<String>,
    pub last_name: F1::Map<String>,
}
Here, F0 and F1 are type parameters that must implement the MapType trait. These type parameters control whether each field is present or not in a given instance of the partial struct.
The MapType trait is defined as follows:
pub trait MapType {
    type Map<T>;
}
This trait uses a generic associated type (GAT) called Map to determine how the original field types should be transformed. In particular, Map<T> will either be T itself (if the field is present) or a placeholder type () (if the field is missing).
To support this, CGP provides two implementations of the MapType trait. The first is IsPresent, which maps a type to itself to indicate that a field is included:
pub struct IsPresent;

impl MapType for IsPresent {
    type Map<T> = T;
}
The second is IsNothing, which maps every type to () to indicate that the field is absent:
pub struct IsNothing;

impl MapType for IsNothing {
    type Map<T> = ();
}
To see how this works in practice, suppose we want to construct a partial Person value that only includes the first_name field. We can instantiate the type as PartialPerson<IsPresent, IsNothing>, where F0 = IsPresent and F1 = IsNothing. The resulting type effectively expands to:
pub struct PartialPerson {
    pub first_name: String,
    pub last_name: (),
}
This means we can create a partial instance like this:
let partial_person = PartialPerson::<IsPresent, IsNothing> {
    first_name: "John".to_owned(),
    last_name: (),
};
HasBuilder Trait
Once we have defined a partial record struct, we can introduce the HasBuilder trait. This trait allows us to initialize a partial record where all fields are absent by default:
pub trait HasBuilder {
    type Builder;

    fn builder() -> Self::Builder;
}
The HasBuilder trait defines an associated type called Builder, which represents the initial form of the partial record. The key requirement is that this Builder must have all of its fields empty, since the builder() method constructs it without requiring any input.
For the Person struct, we can implement the HasBuilder trait as follows:
impl HasBuilder for Person {
    type Builder = PartialPerson<IsNothing, IsNothing>;

    fn builder() -> Self::Builder {
        PartialPerson {
            first_name: (),
            last_name: (),
        }
    }
}
In this implementation, the initial Builder is simply a PartialPerson type where both field parameters use the IsNothing type mapper. To construct the empty builder, we initialize each field with its mapped type, which for IsNothing is always the unit type (). This gives us a clean and predictable starting point for incrementally building up a complete Person instance.
BuildField Trait
With the initial builder in place, the next step is to define the BuildField trait, which enables us to incrementally populate fields in the partial record. This trait is defined as follows:
pub trait BuildField<Tag> {
    type Value;
    type Output;

    fn build_field(self, _tag: PhantomData<Tag>, value: Self::Value) -> Self::Output;
}
The BuildField trait is parameterized by a Tag type, which identifies the field being constructed, just like in the HasField trait. It also includes an associated type Value, representing the type of the field being added.
The trait additionally includes an associated type Output, which represents the new type of the builder after the field has been inserted. This Output type is essential because each insertion updates the builder’s type parameters, effectively transforming it into a new type that reflects the presence of additional fields.
The build_field method takes ownership of the current builder and consumes the new field value, returning an updated builder that includes the newly added field.
To see how this works in practice, consider the implementation of BuildField for the first_name field of the PartialPerson struct:
impl<F1: MapType> BuildField<symbol!("first_name")> for PartialPerson<IsNothing, F1> {
    type Value = String;
    type Output = PartialPerson<IsPresent, F1>;

    fn build_field(self, _tag: PhantomData<symbol!("first_name")>, value: Self::Value) -> Self::Output {
        PartialPerson {
            first_name: value,
            last_name: self.last_name,
        }
    }
}
In this implementation, we define BuildField<symbol!("first_name")> for a PartialPerson where the first_name field is absent (IsNothing) and the last_name field is parameterized generically as F1. This means the implementation will work regardless of whether last_name is present or not. We specify the Value as String, which matches the type of first_name, and set the Output type to a new PartialPerson where the first parameter has been updated to IsPresent. The second parameter remains unchanged to preserve the existing state of last_name.
The method body constructs a new PartialPerson where the first_name field is now set to the given value, while the last_name field is carried over from the original builder.
Similarly, we can define BuildField for the last_name field:
impl<F0: MapType> BuildField<symbol!("last_name")> for PartialPerson<F0, IsNothing> {
    type Value = String;
    type Output = PartialPerson<F0, IsPresent>;

    fn build_field(self, _tag: PhantomData<symbol!("last_name")>, value: Self::Value) -> Self::Output {
        PartialPerson {
            first_name: self.first_name,
            last_name: value,
        }
    }
}
In this case, the generic parameter F0 tracks the state of first_name, while IsNothing ensures that last_name is not yet present. The logic follows the same structure as the earlier implementation, simply updating the appropriate field.
With these implementations in place, we can now use the HasBuilder and BuildField traits together to construct a Person incrementally:
Person::builder() // PartialPerson<IsNothing, IsNothing>
    .build_field(PhantomData::<symbol!("first_name")>, "John".to_string()) // PartialPerson<IsPresent, IsNothing>
    .build_field(PhantomData::<symbol!("last_name")>, "Smith".to_string()) // PartialPerson<IsPresent, IsPresent>
Notably, the order in which fields are built does not matter. The type transformations ensure correctness regardless of sequencing, so the builder also works if we construct the last_name field first:
Person::builder() // PartialPerson<IsNothing, IsNothing>
    .build_field(PhantomData::<symbol!("last_name")>, "Smith".to_string()) // PartialPerson<IsNothing, IsPresent>
    .build_field(PhantomData::<symbol!("first_name")>, "John".to_string()) // PartialPerson<IsPresent, IsPresent>
This gives developers the freedom to build up records in any order while maintaining type safety.
FinalizeBuild Trait
The previous example demonstrated how the builder and build_field methods can be used to construct values in the style of a fluent builder pattern. However, it is important to note that the result of the final build_field call is still a PartialPerson<IsPresent, IsPresent>, not a fully constructed Person.
This limitation arises because each BuildField implementation for PartialPerson is only responsible for inserting a single field. The presence or absence of other fields is abstracted away through generic type parameters, which means that the implementation cannot detect when the final field has been added. Consequently, there is no opportunity to directly return a Person value when the last required field is set.
While this might make the final construction step slightly more verbose, it is a deliberate trade-off. The generic nature of BuildField is what allows fields to be built in any order without having to implement every possible combination of partially constructed records — something that would otherwise result in an overwhelming combinatorial explosion of implementations.
To resolve this, CGP introduces the FinalizeBuild trait. This trait is used once the builder has been fully populated, converting the complete PartialPerson into a proper Person value:
pub trait FinalizeBuild {
    type Output;

    fn finalize_build(self) -> Self::Output;
}
The FinalizeBuild trait defines an associated Output type, representing the final constructed result. The finalize_build method takes ownership of the builder and transforms it into the desired output.
For PartialPerson, the trait is implemented as follows:
impl FinalizeBuild for PartialPerson<IsPresent, IsPresent> {
    type Output = Person;

    fn finalize_build(self) -> Self::Output {
        Person {
            first_name: self.first_name,
            last_name: self.last_name,
        }
    }
}
This implementation only applies when all fields are marked as present. At this point, the builder contains all the data necessary to construct a Person, so the conversion is a straightforward transfer of fields.
With this in place, the build process becomes complete by appending a call to finalize_build:
let person = Person::builder() // PartialPerson<IsNothing, IsNothing>
    .build_field(PhantomData::<symbol!("first_name")>, "John".to_string()) // PartialPerson<IsPresent, IsNothing>
    .build_field(PhantomData::<symbol!("last_name")>, "Smith".to_string()) // PartialPerson<IsPresent, IsPresent>
    .finalize_build(); // Person
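To tie the pieces together, here is a condensed, self-contained version of the whole flow that compiles with plain std Rust. It substitutes unit structs for the symbol! tags and omits the derive macros (both come from the cgp crate), but otherwise follows the definitions above.

```rust
use core::marker::PhantomData;

// Unit-struct tags standing in for symbol!("first_name") / symbol!("last_name").
pub struct FirstName;
pub struct LastName;

// MapType with its two implementations, as defined earlier.
pub trait MapType {
    type Map<T>;
}

pub struct IsPresent;
impl MapType for IsPresent {
    type Map<T> = T;
}

pub struct IsNothing;
impl MapType for IsNothing {
    type Map<T> = ();
}

pub struct Person {
    pub first_name: String,
    pub last_name: String,
}

// The generated partial record: each field is either T or ().
pub struct PartialPerson<F0: MapType, F1: MapType> {
    pub first_name: F0::Map<String>,
    pub last_name: F1::Map<String>,
}

pub trait HasBuilder {
    type Builder;
    fn builder() -> Self::Builder;
}

impl HasBuilder for Person {
    type Builder = PartialPerson<IsNothing, IsNothing>;
    fn builder() -> Self::Builder {
        PartialPerson { first_name: (), last_name: () }
    }
}

pub trait BuildField<Tag> {
    type Value;
    type Output;
    fn build_field(self, _tag: PhantomData<Tag>, value: Self::Value) -> Self::Output;
}

impl<F1: MapType> BuildField<FirstName> for PartialPerson<IsNothing, F1> {
    type Value = String;
    type Output = PartialPerson<IsPresent, F1>;
    fn build_field(self, _tag: PhantomData<FirstName>, value: String) -> Self::Output {
        PartialPerson { first_name: value, last_name: self.last_name }
    }
}

impl<F0: MapType> BuildField<LastName> for PartialPerson<F0, IsNothing> {
    type Value = String;
    type Output = PartialPerson<F0, IsPresent>;
    fn build_field(self, _tag: PhantomData<LastName>, value: String) -> Self::Output {
        PartialPerson { first_name: self.first_name, last_name: value }
    }
}

pub trait FinalizeBuild {
    type Output;
    fn finalize_build(self) -> Self::Output;
}

impl FinalizeBuild for PartialPerson<IsPresent, IsPresent> {
    type Output = Person;
    fn finalize_build(self) -> Person {
        Person { first_name: self.first_name, last_name: self.last_name }
    }
}

fn main() {
    // Fields can be supplied in either order; the types line up either way.
    let person = Person::builder()
        .build_field(PhantomData::<LastName>, "Smith".to_owned())
        .build_field(PhantomData::<FirstName>, "John".to_owned())
        .finalize_build();

    assert_eq!(person.first_name, "John");
    assert_eq!(person.last_name, "Smith");
}
```

Try removing one of the build_field calls: finalize_build then fails to compile, because FinalizeBuild is only implemented for PartialPerson<IsPresent, IsPresent>.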
Together, the partial record struct and the traits HasBuilder, BuildField, and FinalizeBuild form a solid and ergonomic foundation for supporting extensible records in CGP. All of these pieces are automatically generated by the #[derive(BuildField)] macro. We deliberately use a single derive macro instead of several, to ensure consistency and correctness and to eliminate compilation failures caused by the user forgetting to derive one of these traits.
Implementation of Record Merges
With the field builder traits in place, we can now explore how the earlier struct-building method build_from can be implemented to support merging a Person struct into a superset struct such as Employee.
Before we can implement build_from, we first need a few more supporting constructs to enable the merging operation.
IntoBuilder Trait
In typical usage, partial records begin in an empty state and are gradually populated with field values until they can be finalized into a complete struct. However, we can also go in the opposite direction by converting a fully constructed struct into a partial record where all fields are present, and then progressively remove fields from it, one by one, until none remain.
This reverse direction is particularly important for the merging process, where we need to move fields out of a source struct and insert them into a target partial record, field by field.
To support this, we introduce the IntoBuilder trait:
pub trait IntoBuilder {
    type Builder;

    fn into_builder(self) -> Self::Builder;
}
This trait mirrors the structure of HasBuilder, with an associated Builder type that represents a partial record in which all fields are populated. The key difference is that into_builder consumes the original struct and produces its fully populated partial record equivalent.
The implementation of IntoBuilder for Person looks like this:
impl IntoBuilder for Person {
    type Builder = PartialPerson<IsPresent, IsPresent>;

    fn into_builder(self) -> Self::Builder {
        PartialPerson {
            first_name: self.first_name,
            last_name: self.last_name,
        }
    }
}
If you compare this to the implementation of the earlier FinalizeBuild trait, you’ll see that they are nearly identical in structure. The only difference is the direction of conversion — IntoBuilder transforms a Person into a PartialPerson<IsPresent, IsPresent>, while FinalizeBuild does the reverse.
Even though these interfaces are similar, we define IntoBuilder and FinalizeBuild as separate traits. This distinction is valuable because it makes the intent of each trait clear and prevents accidental misuse. Each trait captures a specific stage in the lifecycle of a partial record, whether we are constructing it from scratch, building it up field by field, or tearing it down for merging.
TakeField Trait
Now that we have the IntoBuilder trait to help convert a struct into a fully populated partial record, we also need a way to extract individual fields from that partial record. This is where the TakeField trait comes in. It serves as the opposite of BuildField, allowing us to take a field value out of a partial record:
pub trait TakeField<Tag> {
    type Value;
    type Remainder;

    fn take_field(self, _tag: PhantomData<Tag>) -> (Self::Value, Self::Remainder);
}
The Tag and Value types in TakeField behave just like they do in BuildField and HasField. What is new here is the associated type Remainder, which represents the state of the partial record after the specified field has been removed. The take_field method consumes self and returns a tuple containing the extracted field value along with the updated remainder of the partial record.
As an example, here is the TakeField implementation for extracting the first_name field from a PartialPerson:
impl<F1: MapType> TakeField<symbol!("first_name")> for PartialPerson<IsPresent, F1> {
    type Value = String;
    type Remainder = PartialPerson<IsNothing, F1>;

    fn take_field(self, _tag: PhantomData<symbol!("first_name")>) -> (Self::Value, Self::Remainder) {
        let value = self.first_name;
        let remainder = PartialPerson {
            first_name: (),
            last_name: self.last_name,
        };
        (value, remainder)
    }
}
As shown, this implementation is defined for a PartialPerson where the first generic parameter is IsPresent, indicating that the first_name field is available to be taken. In the resulting Remainder, that parameter is updated to IsNothing, reflecting the removal of the field. The method itself returns the first_name value and a new partial record with first_name cleared and the rest of the fields left untouched.
This setup provides the building blocks needed to flexibly extract and transfer fields, which is essential for safely merging one struct into another.
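As a quick sanity check of how TakeField behaves, here is a minimal std-only sketch, again with a unit struct standing in for symbol!("first_name") since the symbol! macro comes from the cgp crate:

```rust
use core::marker::PhantomData;

pub struct FirstName; // stand-in for symbol!("first_name")

pub trait MapType {
    type Map<T>;
}

pub struct IsPresent;
impl MapType for IsPresent {
    type Map<T> = T;
}

pub struct IsNothing;
impl MapType for IsNothing {
    type Map<T> = ();
}

pub struct PartialPerson<F0: MapType, F1: MapType> {
    pub first_name: F0::Map<String>,
    pub last_name: F1::Map<String>,
}

pub trait TakeField<Tag> {
    type Value;
    type Remainder;
    fn take_field(self, _tag: PhantomData<Tag>) -> (Self::Value, Self::Remainder);
}

// first_name must be present (IsPresent); the remainder marks it absent.
impl<F1: MapType> TakeField<FirstName> for PartialPerson<IsPresent, F1> {
    type Value = String;
    type Remainder = PartialPerson<IsNothing, F1>;
    fn take_field(self, _tag: PhantomData<FirstName>) -> (String, Self::Remainder) {
        let remainder = PartialPerson {
            first_name: (),
            last_name: self.last_name,
        };
        (self.first_name, remainder)
    }
}

fn main() {
    let full = PartialPerson::<IsPresent, IsPresent> {
        first_name: "John".to_owned(),
        last_name: "Smith".to_owned(),
    };

    let (first_name, remainder) = full.take_field(PhantomData::<FirstName>);
    assert_eq!(first_name, "John");

    // The slot for first_name in the remainder is now the unit type.
    let _: () = remainder.first_name;
    assert_eq!(remainder.last_name, "Smith");
}
```

Attempting to call take_field a second time for the same tag fails to compile, because no TakeField<FirstName> impl exists for PartialPerson<IsNothing, F1>.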
HasFields Trait
With IntoBuilder available, we can now begin transferring fields from a source struct into a target partial record by peeling them off one at a time. To enable this process generically, we need a mechanism to enumerate the fields of a struct so that generic code can discover which fields are available for transfer.
This is where the HasFields trait comes into play. It is defined as follows:
pub trait HasFields {
    type Fields;
}
The HasFields trait includes a single associated type called Fields, which holds a type-level list representing all the fields of the struct. For example, here is how HasFields would be implemented for the Person struct:
impl HasFields for Person {
    type Fields = Product![
        Field<symbol!("first_name"), String>,
        Field<symbol!("last_name"), String>,
    ];
}
Once the fields of a struct are made available as a type-level list, this list can be used to drive type-level iteration for field-wise operations such as merging. This lays the foundation for generically moving fields between records in a structured and type-safe way.
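The Product! macro itself comes from cgp, but its expansion can be sketched with ordinary types. Assuming hypothetical Cons, Nil, and Field definitions of the shape used by the list-processing code later in this post, the field list desugars to a nested cons-list, which generic code can then fold over, one impl for the Cons case and one for the Nil case:

```rust
use core::marker::PhantomData;

// Hypothetical stand-ins for cgp's type-level list constructors:
// Product![A, B] expands to roughly Cons<A, Cons<B, Nil>>.
pub struct Nil;
pub struct Cons<Head, Tail>(PhantomData<(Head, Tail)>);
pub struct Field<Tag, Value>(PhantomData<(Tag, Value)>);

// Unit-struct tags standing in for symbol!("first_name") / symbol!("last_name").
pub struct FirstName;
pub struct LastName;

// The Fields list for Person, written out without the Product! macro:
pub type PersonFields =
    Cons<Field<FirstName, String>, Cons<Field<LastName, String>, Nil>>;

// A tiny type-level fold over the list, in the same recursive style that
// the merge machinery uses: a Cons case and a Nil base case.
pub trait CountFields {
    const COUNT: usize;
}

impl CountFields for Nil {
    const COUNT: usize = 0;
}

impl<Head, Tail: CountFields> CountFields for Cons<Head, Tail> {
    const COUNT: usize = 1 + Tail::COUNT;
}

fn main() {
    // The compiler walks the list at compile time; nothing is computed at runtime.
    assert_eq!(PersonFields::COUNT, 2);
}
```
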
BuildFrom Trait
The build_from method is defined within the CanBuildFrom trait, which uses a blanket implementation to support merging fields from one struct into another:
pub trait CanBuildFrom<Source> {
    type Output;

    fn build_from(self, source: Source) -> Self::Output;
}

impl<Builder, Source, Output> CanBuildFrom<Source> for Builder
where
    Source: HasFields + IntoBuilder,
    Source::Fields: FieldsBuilder<Source::Builder, Builder, Output = Output>,
{
    type Output = Output;

    fn build_from(self, source: Source) -> Output {
        Source::Fields::build_fields(source.into_builder(), self)
    }
}
In this setup, the CanBuildFrom trait is implemented for the partial record acting as the target of the merge, while the Source type is specified through a generic parameter. The trait defines an associated type Output, representing the result of merging the fields from Source into the current builder. The build_from method takes ownership of both the builder (self) and the source value, and returns a new builder that incorporates the fields from Source.
The blanket implementation of CanBuildFrom imposes several trait bounds. First, the Source type must implement both HasFields and IntoBuilder. These traits provide access to a type-level list of fields and the ability to convert Source into a partial record with all fields present. The field list obtained from Source::Fields must also implement the FieldsBuilder helper trait. This trait is responsible for driving the merge process field by field, taking as input the Source::Builder and the target Builder (i.e., Self), and producing an Output.
The implementation of build_from begins by converting the source into its partial form using into_builder, then delegates the merge logic to FieldsBuilder::build_fields, which handles transferring each field in sequence.
FieldsBuilder Trait
The FieldsBuilder trait is a private helper used exclusively by the CanBuildFrom trait. It is defined as follows:
trait FieldsBuilder<Source, Target> {
    type Output;

    fn build_fields(source: Source, target: Target) -> Self::Output;
}
Unlike CanBuildFrom, the parameters for FieldsBuilder are slightly reordered. The Self type represents the list of fields from the source struct, while Source and Target refer to the partial records we are merging from and into. The goal of this trait is to drive type-level iteration across the list of fields, one by one, and move each field from the source to the target.
To accomplish this, we start with an implementation that matches on the head of the Fields list:
impl<Source, Target, RestFields, Tag, Value> FieldsBuilder<Source, Target>
    for Cons<Field<Tag, Value>, RestFields>
where
    Source: TakeField<Tag, Value = Value>,
    Target: BuildField<Tag, Value = Value>,
    RestFields: FieldsBuilder<Source::Remainder, Target::Output>,
{
    type Output = RestFields::Output;

    fn build_fields(source: Source, target: Target) -> Self::Output {
        let (value, next_source) = source.take_field(PhantomData);
        let next_target = target.build_field(PhantomData, value);
        RestFields::build_fields(next_source, next_target)
    }
}
Here, we pattern match the Fields type against Cons<Field<Tag, Value>, RestFields>, allowing us to extract the field name Tag and its type Value. Given this information, we require that the Source partial record implements TakeField and that the Target partial record implements BuildField, both using the same Tag and Value.
We then handle the remaining fields recursively by requiring RestFields to implement FieldsBuilder, using the Remainder type from TakeField as the next Source, and the Output type from BuildField as the next Target. This enables a seamless hand-off from one field to the next during the merging process.
The recursive chain is terminated by the base case, when there are no more fields left to process:
impl<Source, Target> FieldsBuilder<Source, Target> for Nil {
    type Output = Target;

    fn build_fields(_source: Source, target: Target) -> Self::Output {
        target
    }
}
In this final implementation, the Source
partial record is now fully depleted — all fields have been taken out — so we simply return the Target
, which now contains all the merged fields.
Example Use of BuildFrom
The implementation of FieldsBuilder
can appear intimidating at first, particularly for readers unfamiliar with type-level programming. To make the process more approachable, let’s walk through a concrete example using BuildFrom
to illustrate what actually happens under the hood.
Consider a new struct named Employee
, which contains the same fields as Person
, along with an additional field called employee_id
:
#[derive(BuildField)]
pub struct Employee {
    pub employee_id: u64,
    pub first_name: String,
    pub last_name: String,
}
We begin by constructing a Person
value. After that, we can use build_from
to merge its contents into a partially built Employee
:
let person = Person {
    first_name: "John".to_owned(),
    last_name: "Smith".to_owned(),
};

let employee = Employee::builder() // PartialEmployee<IsNothing, IsNothing, IsNothing>
    .build_from(person) // PartialEmployee<IsNothing, IsPresent, IsPresent>
    .build_field(PhantomData::<symbol!("employee_id")>, 1) // PartialEmployee<IsPresent, IsPresent, IsPresent>
    .finalize_build(); // Employee
When we call build_from
, several steps take place behind the scenes:
- The type PartialEmployee<IsNothing, IsNothing, IsNothing> is required to implement CanBuildFrom<Person>.
- The Person type implements HasFields::Fields as:

  Product![
      Field<symbol!("first_name"), String>,
      Field<symbol!("last_name"), String>,
  ]

  Person also implements IntoBuilder, producing PartialPerson<IsPresent, IsPresent> as its Builder.
- Person::Fields must then implement FieldsBuilder<PartialPerson<IsPresent, IsPresent>, PartialEmployee<IsNothing, IsNothing, IsNothing>>.
  - The first Cons in the list matches:
    - Tag is symbol!("first_name")
    - Value is String
    - RestFields is Cons<Field<symbol!("last_name"), String>, Nil>
  - PartialPerson<IsPresent, IsPresent> implements TakeField<symbol!("first_name"), Value = String>, resulting in a Remainder of PartialPerson<IsNothing, IsPresent>.
  - PartialEmployee<IsNothing, IsNothing, IsNothing> implements BuildField<symbol!("first_name"), Value = String>, producing an Output of PartialEmployee<IsNothing, IsPresent, IsNothing>.
- The remaining fields, Cons<Field<symbol!("last_name"), String>, Nil>, must now implement FieldsBuilder<PartialPerson<IsNothing, IsPresent>, PartialEmployee<IsNothing, IsPresent, IsNothing>>.
  - This matches:
    - Tag is symbol!("last_name")
    - Value is String
    - RestFields is Nil
  - PartialPerson<IsNothing, IsPresent> implements TakeField<symbol!("last_name"), Value = String>, giving a Remainder of PartialPerson<IsNothing, IsNothing>.
  - PartialEmployee<IsNothing, IsPresent, IsNothing> implements BuildField<symbol!("last_name"), Value = String>, yielding an Output of PartialEmployee<IsNothing, IsPresent, IsPresent>.
- Finally, the Nil case implements FieldsBuilder<PartialPerson<IsNothing, IsNothing>, PartialEmployee<IsNothing, IsPresent, IsPresent>>, which concludes by returning PartialEmployee<IsNothing, IsPresent, IsPresent> as the final Output.
Although these steps may seem complex at first glance, a closer look reveals that the process simply moves each field from the PartialPerson
instance into the PartialEmployee
, one at a time. What makes this look more complicated is not the logic itself, but the fact that it is encoded entirely at the type level using traits and generics, rather than as regular Rust code.
If any part of this explanation remains unclear, it might be helpful to paste this blog post — or just this sub-section — into your favorite LLM and ask it to explain the process in simpler terms. Hopefully, the explanation provided here is already clear enough for an LLM to understand, so that it can in turn help make this pattern more accessible to developers who are still learning the intricacies of type-level programming.
Builder Dispatcher
With the BuildFrom
trait in place, we can now explore how CGP implements generalized builder dispatchers that enable flexible and reusable ways to assemble struct fields from various sources.
BuildWithHandlers Trait
In earlier examples, we used a utility called BuildAndMergeOutputs
to combine outputs from multiple builder providers such as BuildSqliteClient
, BuildHttpClient
, and BuildOpenAiClient
. Under the hood, BuildAndMergeOutputs
is built upon a more fundamental dispatcher named BuildWithHandlers
. The implementation of this dispatcher looks like the following:
#[cgp_provider]
impl<Context, Code, Input, Output, Builder, Handlers, Res> Computer<Context, Code, Input>
    for BuildWithHandlers<Output, Handlers>
where
    Output: HasBuilder<Builder = Builder>,
    PipeHandlers<Handlers>: Computer<Context, Code, Builder, Output = Res>,
    Res: FinalizeBuild<Output = Output>,
{
    type Output = Output;

    fn compute(context: &Context, code: PhantomData<Code>, _input: Input) -> Self::Output {
        PipeHandlers::compute(context, code, Output::builder()).finalize_build()
    }
}
The BuildWithHandlers
struct is parameterized by an Output
type, which is the final struct we want to construct, and a Handlers
type, which represents a type-level list of builder handlers. These handlers will be used to incrementally populate the fields of the Output
struct.
This provider implements the Computer
trait for any combination of Context
, Code
, and Input
, although the Input
parameter is intentionally ignored. To begin, the dispatcher requires the Output
type to implement HasBuilder
, which gives access to an initially empty partial record through Output::builder()
.
Once the empty builder is obtained, it is passed as input to PipeHandlers<Handlers>
, a pipeline that applies each handler in the list to progressively build up the partial record. The result of this pipeline must implement FinalizeBuild
, allowing it to be converted into the fully constructed Output
struct.
And yes, if you're wondering whether this PipeHandlers
is the same one used in Hypershell to build shell-like command pipelines — the answer is absolutely yes. BuildWithHandlers
operates by initializing an empty partial record, passing it through a chain of handlers using PipeHandlers
, and then finalizing the result into a complete struct. It’s the same elegant piping mechanism, just applied to a different domain.
This reuse of core abstractions like Pipe
and Handler
across different systems is one of the most powerful aspects of CGP. These components weren’t designed just for piping shell commands — they were built to support general-purpose function composition, a core concept in functional programming.
Example Use of BuildWithHandlers
The BuildWithHandlers
trait serves as the foundational builder dispatcher in CGP. It is the low-level mechanism behind higher-level dispatchers like BuildAndMergeOutputs
, and understanding it provides valuable insight into how CGP composes complex data construction pipelines.
Let’s walk through a concrete example to illustrate how BuildWithHandlers
works in practice.
Suppose we have two Computer
providers: one that constructs a Person
and another that produces a u64
value for the employee_id
field. We can use BuildWithHandlers
to compose these providers and construct an Employee
.
We begin by implementing the two providers as simple functions using the #[cgp_producer]
macro:
#[cgp_producer]
pub fn build_person() -> Person {
    Person {
        first_name: "John".to_owned(),
        last_name: "Smith".to_owned(),
    }
}

#[cgp_producer]
pub fn build_employee_id() -> u64 {
    1
}
The #[cgp_producer]
macro generates provider implementations (named BuildPerson
and BuildEmployeeId
) that wrap these functions. These implementations conform to the Computer
trait by automatically calling the associated function. This macro allows pure functions to be used as providers with minimal boilerplate, especially useful when you don't need to depend on the generic Context
or Code
parameters.
Now, with both providers defined, we can compose them using BuildWithHandlers
like so:
let employee = BuildWithHandlers::<
    Employee,
    Product![
        BuildAndMerge<BuildPerson>,
        BuildAndSetField<symbol!("employee_id"), BuildEmployeeId>
    ],
>::compute(&(), PhantomData::<()>, ());
In this example, we tell BuildWithHandlers
that the final struct we want to build is Employee
. We then provide a list of two builder handlers:
- BuildAndMerge<BuildPerson> wraps the BuildPerson provider. It first calls the provider to build a Person, then uses BuildFrom to merge the resulting fields into the PartialEmployee builder.
- BuildAndSetField<symbol!("employee_id"), BuildEmployeeId> wraps the BuildEmployeeId provider. It calls the provider to get the u64 value, then uses BuildField to assign that value to the employee_id field.
We invoke compute
on the specialized BuildWithHandlers
, using unit types ()
as dummy arguments for Context
, Code
, and Input
. In real-world applications, these types would typically carry contextual information such as configuration values or a runtime.
This example demonstrates the flexibility of BuildWithHandlers
. You’re not limited to merging entire record subsets — you can also work with providers that produce individual field values, then insert them into specific fields using composable builder logic.
BuildAndMerge
The BuildAndMerge
adapter is relatively simple and is defined as follows:
#[cgp_provider]
impl<Context, Code, Builder, Provider, Output, Res> Computer<Context, Code, Builder>
    for BuildAndMerge<Provider>
where
    Provider: for<'a> Computer<Context, Code, &'a Builder, Output = Res>,
    Builder: CanBuildFrom<Res, Output = Output>,
{
    type Output = Output;

    fn compute(context: &Context, code: PhantomData<Code>, builder: Builder) -> Self::Output {
        let output = Provider::compute(context, code, &builder);
        builder.build_from(output)
    }
}
BuildAndMerge
wraps an inner provider (such as BuildPerson
) and expects the current Builder
— a partial record for the target struct — as input. The inner provider is invoked with a reference to this builder, allowing it to inspect any fields that have already been set. This enables more advanced scenarios where earlier intermediate results influence later ones.
After the inner provider returns a value, BuildAndMerge
uses the BuildFrom
trait to merge the result into the existing builder. The merged builder is then returned as output.
In simpler cases like BuildPerson
, the input builder is ignored, so the trait bounds are trivially satisfied. But the mechanism still works the same way: take a value and merge its fields into the partial record.
BuildAndSetField
The BuildAndSetField
adapter works similarly but focuses on setting a single field rather than merging a full record:
#[cgp_provider]
impl<Context, Code, Tag, Value, Provider, Output, Builder> Computer<Context, Code, Builder>
    for BuildAndSetField<Tag, Provider>
where
    Provider: for<'a> Computer<Context, Code, &'a Builder, Output = Value>,
    Builder: BuildField<Tag, Value = Value, Output = Output>,
{
    type Output = Output;

    fn compute(context: &Context, code: PhantomData<Code>, builder: Builder) -> Self::Output {
        let value = Provider::compute(context, code, &builder);
        builder.build_field(PhantomData::<Tag>, value)
    }
}
Instead of calling BuildFrom
, this adapter uses the BuildField
trait to assign a value to a specific field. The field to be set is identified by Tag
, and its value comes from the wrapped provider.
This adapter is ideal for cases where you have an individual value (such as u64
) and want to insert it into a specific field (like employee_id
).
MapFields Trait
In the previous example, we saw how to use BuildWithHandlers
with a list of individual providers — each wrapped in BuildAndMerge
or BuildAndSetField
— to build a composite struct. This same pattern is what powers higher-level dispatchers like BuildAndMergeOutputs
.
Since the pattern of wrapping a list of handlers is so common, CGP provides the MapFields
trait to simplify the process. Here’s how it’s defined:
pub trait MapFields<Mapper> {
    type Map;
}

impl<Mapper, Current, Rest> MapFields<Mapper> for Cons<Current, Rest>
where
    Mapper: MapType,
    Rest: MapFields<Mapper>,
{
    type Map = Cons<Mapper::Map<Current>, Rest::Map>;
}

impl<Mapper> MapFields<Mapper> for Nil {
    type Map = Nil;
}
The MapFields
trait enables type-level mapping over a type-level list — very similar to how .iter().map()
works on value-level lists in Rust. You pass in a Mapper
that implements the MapType
trait, and it is applied to each element in the list. This is the same MapType
trait we’ve seen used earlier in other utilities, like IsPresent
and IsNothing
.
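Since MapFields and MapType are small enough to reproduce directly, here is a self-contained version you can run. The ToOption mapper is a toy analogue of ToBuildAndMergeHandler, wrapping every element of the list in Option instead of BuildAndMerge.

```rust
use core::marker::PhantomData;

// Self-contained re-creation of the MapType / MapFields pair from the post.
trait MapType {
    // A generic associated type: a type-level function from T to Map<T>.
    type Map<T>;
}

struct Nil;
struct Cons<Head, Tail>(PhantomData<(Head, Tail)>);

trait MapFields<Mapper> {
    type Map;
}

impl<Mapper, Current, Rest> MapFields<Mapper> for Cons<Current, Rest>
where
    Mapper: MapType,
    Rest: MapFields<Mapper>,
{
    // Apply the mapper to the head, recurse on the tail.
    type Map = Cons<Mapper::Map<Current>, Rest::Map>;
}

impl<Mapper> MapFields<Mapper> for Nil {
    type Map = Nil;
}

// A toy mapper, analogous to how ToBuildAndMergeHandler wraps every
// handler in BuildAndMerge.
struct ToOption;

impl MapType for ToOption {
    type Map<T> = Option<T>;
}

type Input = Cons<u8, Cons<String, Nil>>;
type Output = <Input as MapFields<ToOption>>::Map;

fn main() {
    // Output is Cons<Option<u8>, Cons<Option<String>, Nil>>.
    let name = std::any::type_name::<Output>();
    assert!(name.contains("Option<u8>"));
    println!("{name}");
}
```

The mapping happens entirely at the type level: no value of the list is ever constructed, and the std::any::type_name call is only there to make the result observable at runtime.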
BuildAndMergeOutputs
Now that we have the pieces in place, we can implement BuildAndMergeOutputs
as a composition of BuildWithHandlers
and BuildAndMerge
.
We start by defining a simple type mapper called ToBuildAndMergeHandler
, which takes any Handler
type and maps it to a BuildAndMerge<Handler>
:
pub struct ToBuildAndMergeHandler;

impl MapType for ToBuildAndMergeHandler {
    type Map<Handler> = BuildAndMerge<Handler>;
}
With that, we can define BuildAndMergeOutputs
simply as a type alias. It takes an Output
type and a list of handler types, and internally applies ToBuildAndMergeHandler
to each handler in the list. The result is passed to BuildWithHandlers
:
pub type BuildAndMergeOutputs<Output, Handlers> =
    BuildWithHandlers<Output, <Handlers as MapFields<ToBuildAndMergeHandler>>::Map>;
This implementation showcases the power of type-level composition in CGP. By combining smaller, reusable components like BuildWithHandlers
, BuildAndMerge
, and MapFields
, we can construct higher-level abstractions like BuildAndMergeOutputs
with minimal boilerplate and high flexibility.
This modular, composable design stands in contrast to traditional macro-based approaches. With macros, it’s much harder to verify that combining two smaller macros results in correct behavior — especially as the complexity grows. In contrast, with CGP's type-level constructs, each piece remains type-checked and composable under well-defined where
constraints. These constraints act as a safeguard, ensuring that the combined logic remains sound and predictable.
Moreover, this approach enables us to easily define other dispatcher variants, such as those that use BuildAndSetField
instead of BuildAndMerge
, without having to rewrite or duplicate core logic.
Hiding Constraints with delegate_components!
While it's possible to define BuildAndMergeOutputs
as a simple type alias, doing so introduces ergonomic issues when it’s used with a generic Handlers
type provided by the caller.
Consider a situation where we want to implement a provider that wraps around BuildAndMergeOutputs
in order to perform validation on the result before returning it. We might write something like this:
#[cgp_new_provider]
impl<Context, Code, Input, Output, Handlers> Computer<Context, Code, Input>
    for BuildAndValidateOutput<Output, Handlers>
where
    BuildAndMergeOutputs<Output, Handlers>: Computer<Context, Code, Input>,
{
    type Output = Output;

    fn compute(context: &Context, code: PhantomData<Code>, input: Input) -> Output {
        ...
    }
}
Here, BuildAndValidateOutput
statically depends on BuildAndMergeOutputs
by requiring that it implements Computer
. However, this code won’t compile as-is. The problem is that BuildAndMergeOutputs
is defined as a type alias that internally relies on a hidden constraint: Handlers: MapFields<ToBuildAndMergeHandler>
. Since this constraint is not visible at the site of usage, the compiler now demands that we explicitly provide proof that Handlers
satisfies this requirement.
While we could add the missing constraint to the where
clause of BuildAndValidateOutput
, this reduces the ergonomics and composability of the abstraction. It requires callers to understand the internal structure of BuildAndMergeOutputs
, which goes against one of the core strengths of CGP — hiding internal constraints so they don't leak into user code.
Fortunately, we can solve this by turning BuildAndMergeOutputs
into a regular provider struct, and using the delegate_components!
macro to delegate the Computer
implementation to BuildWithHandlers
, while keeping the necessary constraints encapsulated.
Here’s how:
delegate_components! {
    <Output, Handlers: MapFields<ToBuildAndMergeHandler>>
    new BuildAndMergeOutputs<Output, Handlers> {
        ComputerComponent:
            BuildWithHandlers<Output, Handlers::Map>
    }
}
This macro creates a new BuildAndMergeOutputs
provider that wraps BuildWithHandlers
, while adding the required constraint on Handlers
behind the scenes. Internally, the macro expands into a DelegateComponent
implementation like the following:
impl<Output, Handlers> DelegateComponent<ComputerComponent>
    for BuildAndMergeOutputs<Output, Handlers>
where
    Handlers: MapFields<ToBuildAndMergeHandler>,
{
    type Delegate = BuildWithHandlers<Output, Handlers::Map>;
}
With this setup, BuildAndMergeOutputs
can now be used like any other CGP provider — without needing to manually restate its internal type constraints. This keeps client code clean and focused, and allows CGP abstractions to remain composable and extensible.
The key benefit of this pattern is that it avoids boilerplate while preserving type safety. Whenever a provider's implementation is simply a thin wrapper around another provider with some added constraints, it's much more convenient to use DelegateComponent
via delegate_components!
than to implement the provider trait manually.
Type-Level CGP Metaprogramming
The technique we just explored — wrapping providers and using delegate_components!
— can be seen as a form of metaprogramming in CGP. Here, we’re leveraging type-level programming not just within CGP’s core component abstractions like DelegateComponent
, but also as a tool for programmatically defining component wiring through the use of generic parameters and trait constraints.
This highlights a deeper design philosophy behind CGP: rather than inventing a new meta-language or macro system, CGP embraces Rust’s existing type system and trait machinery as the foundation for composability and abstraction. Type-level programming in CGP isn’t an escape hatch — it is the underlying mechanism that makes escape hatches unnecessary.
Some readers may find the phrase "type-level programming" intimidating, especially if they associate it with obscure or overly abstract code. But consider the alternatives: in many dynamic languages like Ruby, Python, or JavaScript, metaprogramming is accomplished through bespoke syntax, reflection, or runtime patching — approaches that can be powerful but often come with poor tooling, fragile semantics, and hard-to-debug behavior.
In contrast, type-level programming offers a principled and well-typed foundation for metaprogramming. Constraints are checked at compile time, tooling support remains robust, and the abstractions remain composable. Instead of inventing ad hoc metaprogramming constructs, CGP relies on the well-established theory and practice of type-level computation.
By doing so, CGP achieves a rare combination of power and predictability. Developers who embrace this pattern gain access to highly expressive abstractions, while staying within the familiar boundaries of Rust's type system.
Conclusion
By this point, I hope you have a clearer understanding of how CGP supports extensible records. If you are interested in exploring the implementation further, take a look at the GitHub repository, especially the cgp-field
and cgp-dispatch
crates, which contain the full source code.
With partial records and field traits such as HasField
, BuildField
, and TakeField
, CGP enables powerful generic operations like BuildFrom
, which allows one struct to be merged into another seamlessly. These building blocks form the foundation for more advanced compositional patterns.
To implement the extensible builder pattern, we introduced the BuildWithHandlers
dispatcher. This component composes multiple builder handlers and merges their outputs using a clean and predictable build pipeline, constructed with PipeHandlers
. The simplicity arises from a system where modularity is built into the core design, rather than added as an afterthought.
On top of this, we implemented the high-level BuildAndMergeOutputs
dispatcher by defining a conditional delegation to BuildWithHandlers
, after first wrapping each handler using the BuildAndMerge
adapter. This design preserves composability while allowing for customized construction logic.
Future Extensions
Because extensible records and builders are implemented modularly, it is easy to extend the system further without rewriting the core. For instance, the current builder pattern requires every field of the source struct to be present in the target struct, but we may want to allow extra source fields to be dropped rather than merged.
We might also want to support overriding existing fields in a partial record when the same field appears in multiple sources. Additionally, it would be useful to finalize a partial record by filling in any missing fields with default values.
These enhancements are already within reach, and we plan to support them in future versions of CGP. Even if they are not included directly in the library, the existing abstractions make it easy for others to implement them independently.
Thanks to the expressive power of Rust’s type system and the composability of type-level programming, these extensions can be implemented in a way that is both straightforward and correct by construction. They remain completely optional and do not introduce additional complexity to the core logic presented here.
This level of flexibility would be difficult to achieve with more ad hoc approaches, such as macros or code generation, which often require the entire logic and all possible extensions to be baked into a single monolithic system. That leads to unnecessary coupling and limits customizability.
If you're still unsure how all of this comes together, a future blog post will walk through the implementation of these extensions in detail to show exactly how they work.
Next Part
In the final Part 4 of this series, Implementing Extensible Variants, we will follow a similar path to explore how CGP implements extensible variants. Keep in mind the concepts we covered for extensible records — you may be surprised to discover just how much of the same logic carries over, despite the differences between records and variants.
Stay tuned for what comes next!
Hire Me
P.S. Btw, I am available for hire!