§Haxe/C++ | Tips for Avoiding GC

September 5, 2023

by Robert Borghese

Padding

GithubIcon

View on Github

Padding

GithubIcon

View on Github

§Haxe/C++ Garbage Collector

C++ generated from Haxe uses a garbage collector. If you are wondering how to avoid the garbage collector, you're in the right place!

Despite how frequently people ask about it, I've found there is a surprising lack of information regarding this topic. To fix this, I've created this page to consolidate everything there is about using value-types in Haxe/C++. Let's get started!

Two notes before we get started:

P.S: Shoutout to the guy that linked you this page in response to your question. ❤️

–––

Special thanks for the suggestions made by:

§Table of Contents

§Intro

§Value-Types Using @:nativeGen

§Value-Types Without Using @:nativeGen

§Other Recommendations

§Outro

§Examples

Here are some examples of Haxe/C++ libraries that use some GC-avoidance tricks! (Feel free to suggest more and I'll add them)

§Raylib Haxe

This library's externs for the basic Raylib data-structures like RayVector2 are stack-allocated externs.

§Profilers

Here are some projects you can use to profile your Haxe/C++ GC usage.

§hxTelemetry + hxScout

§Your Tools

First, let's become familiar with the tools we'll be using. These bypass Haxe's typing system, but if you play your code right there isn't any C++ feature you can't use, including memory management!

§Native Metadata

// Use on class or function
@:native(name: String)

This metadata sets the name that the class or function will use when converted to C++.

§Native Generation Metadata

// Use on class
@:nativeGen

This metadata removes all reflection capability from a Haxe class, but the C++ output for the class becomes significantly cleaner. The garbage collector is no longer used upon construction.

§C++ Expression Injector

// Call as function anywhere with "untyped"
function __cpp__(codeString: String, ...args)

This function injects C++ code directly at the callsite.

§C++ File Injector

// Use on class
@:headerCode(code: String)
@:cppFileCode(code: String)

Injects C++ code at the top level of the header or source file the class is generated in.

§C++ Class Injector

// Use on class
@:headerClassCode(code: String)

Injects C++ within the class generated in the header.

§What's a Value-Type?

Value-types are types whose instances are allocated on the stack in C++. For example:

MyClass obj; // stack allocated
MyClass obj2 = obj; // copied instead of referenced

void func(MyClass a) { /* ... */ }
func(obj); // copied into function

Instead of using a garbage collector, stack allocated objects will be deleted the moment their scope ends. This can be better for performance when generating tons of objects frequently since the creation/destruction occurs consistently instead of building up in the GC.

When this article refers to "avoiding Haxe/C++ GC", it is referring to creating, copying, storing, and declaring functions that take value-types.

§Using @:nativeGen

The easiest way to avoid the GC is to use the @:nativeGen on a class. This is very similar to using struct in C# (or @:struct in Haxe/C#), as @:nativeGen class instances are allocated as values.

@:nativeGen
@:structAccess // ensure dot access uses "." instead of "->"
class StackClass {
  public var value: Int;
  public function new(v: Int) {
    value = v;
  }
}

function main() {
  final sc = new StackClass(123);
  trace(sc.value); // 123
}

Here is the abbreviated C++ output of the Haxe code above:

// Normally the C++ class would contain tons of reflection/dynamic helper functions and properties...
// but since we're using @:nativeGen, it only generates the bare minimum.
class StackClass {
public:
  StackClass(int v) {
    this->value = v;
  }

  int value;
};

void main() {
  StackClass sc = StackClass(123);
  ::haxe::Log_obj::trace(sc.value);
}

However, there are some downsides to using @:nativeGen:

§A Deeper Look Into @:nativeGen Restrictions

The biggest downside of @:nativeGen is losing access to core Haxe features that may be required for libraries to function.

For example, if we were using HaxeFlixel and added @:nativeGen to our PlayState class, we would get a C++ error like this:

/*
Error: PlayState.cpp

./src/PlayState.cpp(39): error C2039: 'create': is not a member of 'flixel::group::FlxTypedGroup_obj'

include\flixel/FlxG.h(18): note: see declaration of 'flixel::group::FlxTypedGroup_obj'

./src/PlayState.cpp(78): error C2614: 'PlayState': illegal member initialization: 'FlxState' is not a base or member
*/
@:nativeGen
class PlayState extends FlxState { /* ... */ }

So typically we want to use @:nativeGen on smaller data classes that are used internally.

A great example would be for a class used to pass or return multiple types:

@:nativeGen
class Position {
  public var x: Int = 0;
  public var y: Int = 0;
  public function new() {}
  public static function make(x: Int, y: Int) {
    final result = new Position();
    result.x = x;
    result.y = y;
    return result;
  }
}

function getCoords(): Position {
  // ... run code to get x/y

  // Return data in stack-allocated object.
  // Significantly more optimal than a Haxe anonymous structure.
  return Position.make(_x, _y);
}

§Use Default Constructors with @:nativeGen!

Note with the code above, the Position type has a default constructor AND a static function to construct with arguments. Value-types need a default value when uninitialized in C++, which may happen with your @:nativeGen type in the generated C++.

As a result, it is a good practice to always provide a default constructor for your @:nativeGen class!

If you still really want your constructor to have arguments, you can use @:headerClassCode to add the default constructor. Just make sure to properly initialize the class variables if possible.

@:nativeGen
@:headerClassCode("Position(): x(0), y(0) {}")
class Position {
  public var x: Int = 0;
  public var y: Int = 0;
  public function new(_x: Int, _y: Int) { x = _x; y = _y; }
}

We can shorten our Position class even more if we use @:structInit. Just note that @:structInit will only add an argument constructor, so you'll still need to include a default one using @:headerClassCode.

@:nativeGen
@:structInit
@:headerClassCode("Position(): x(0), y(0) {}")
class Position {
  public var x: Int = 0;
  public var y: Int = 0;
}

§Storing a @:nativeGen Type

If you want to store your @:nativeGen value type anywhere outside of a function, you'll need to wrap if in cpp.Struct if it's being stored in a normal Haxe type. This includes:

@:nativeGen
class Position {
  public var x: Int = 0;
  public var y: Int = 0;
  public function new() {}
}

// A Haxe-compatible class
class HaxeObject {
  // using just "Position" would break C++ compilation
  var pos: cpp.Struct<Position>;

  // Similar to above, make sure it's an array of "cpp.Struct"
  var history: Array<cpp.Struct<Position>>;

  public function new(x: Int, y: Int) {
    // Assign a @:nativeGen object to a cpp.Struct to convert it
    pos = new Position();
    pos.x = x;
    pos.y = y;
  }
}

Long story short, cpp.Struct will "Haxe-ify" a C++ object by wrapping it with Dynamic. Rest assured GC is NOT used with cpp.Struct, it will still destruct like you would expect a C++ stack value to.

Unfortunately, there is still some overhead wrapping with cpp.Struct. It requires copying the value and minor processing. It's negligible, but you should still avoid using cpp.Struct if you can.

And that's that!

§What Does the C++ Look Like Without @:nativeGen?

Now, let's focus on options without using @:nativeGen. We can still manage value objects for a class without losing its capacity for generating GC objects and reflection, but it's a bit more work.

Given a basic Haxe class:

package mypack;

class MyClass {
  public var value: Int;
  public function new(v: Int) {
    trace('Constructed with $v');
    value = v;
  }
}

The C++ generated will look something like this:

// Haxe package dictates the namespace the class will be wrapped in.
namespace mypack {

  // "MyClass" acts as an alias for implementation wrapped with Haxe's GC pointer class.
  typedef ::hx::ObjectPtr<MyClass_obj> MyClass;

  // Implementation of MyClass is called "MyClass_obj".
  class MyClass_obj : public ::hx::Object {
  public:
    MyClass_obj(); // default constructor always created
    void __construct(int num); // function "new" renamed to __construct

    // other functions/variables are compiled pretty normally
    int value;

    // ... and also like 20 extra lines of helper functions for reflection and GC features.
  };
}

§Customizing C++ Type Output

Ahh! So to create a value type, just use MyClass_obj!

function main() {
  final obj = untyped __cpp__("::mypack::MyClass_obj()");
  untyped __cpp__("{0}.__construct({1})", obj, 123);
}

Well... not quite... If you run this, you will get:

error C2039: '__construct': is not a member of 'Dynamic'

This is because Haxe doesn't know what type obj should be, so it defaults to using Dynamic.

::Dynamic obj = ::mypack::MyClass_obj();
obj.__construct(123);

Dynamic is implemented in the hxcpp library and works just like how it does in Haxe. It's meant to store any object and use reflection to access its properties. Though, our value-type object does not have access to reflection capabilities (and also Dynamic is slow), so we need to avoid it!

You may think we simply need to annotate obj with MyClass, but keep in mind: "MyClass" in C++ actually refers to ::hx::ObjectPtr<MyClass_obj>. So this won't work:

// error C2440: 'initializing': cannot convert from 'mypack::MyClass_obj' to 'hx::ObjectPtr<mypack::MyClass_obj>'
final obj: MyClass = untyped __cpp__("::mypack::MyClass_obj()");

But how do we have the value type be generated??

Simple! Let's use @:native!

@:native("::mypack::MyClass_obj")
extern class MyClassValue {}

We use extern here so the class doesn't actually generate anything in C++. But it can still be used to place our custom C++ type anywhere we want in Haxe!

Let's also add @:semantics(variable). This will tell the compiler to avoid making excessive temporary variables using this type.

@:semantics(variable)
@:native("::mypack::MyClass_obj")
extern class MyClassValue {}

Here's a working example of everything so far:

package mypack;

// Compiles successfully in C++!
// Prints:
//   Constructed with 123

function main() {
  final obj: MyClassValue = untyped __cpp__("::mypack::MyClass_obj()");
  untyped __cpp__("{0}.__construct({1})", obj, 123);
}

@:semantics(variable)
@:native("::mypack::MyClass_obj")
extern class MyClassValue {}

class MyClass {
  public var value: Int;
  public function new(v: Int) {
    trace('Constructed with $v');
    value = v;
  }
}

§Passing Value-Types as Arguments

So far, so good! The cool part is this "Value" type can be used for function arguments too:

function copyValue(obj: MyClassValue) {
  trace('obj copied with value ${obj.value}');
}

Oh wait... uh-oh...

mypack.MyClassValue has no field value

Our "value" class representation doesn't have its class's fields. We could convert it to an abstract and use @:forward... except abstract doesn't support @:native (yet?).

So... what if we just extend from the original class?

@:semantics(variable)
@:native("::mypack::MyClass_obj")
extern class MyClassValue extends MyClass {}

This works! The only downside is MyClass can now be assigned to MyClassValue when they should be incompatible, but we're just not going to worry about that.

Now when I say "this works", I mean it passes the Haxe typer checker. But C++ still giving us some trouble:

error C2819: type 'mypack::MyClass_obj' does not have an overloaded member 'operator ->'
note: did you intend to use '.' instead?

This is because Haxe generates dot-access as ->. So it's generating obj->value when we want obj.value. To fix this, we use the @:structAccess metadata on the class to tell the compiler to use ..

@:semantics(variable)
@:native("::mypack::MyClass_obj")
@:structAccess
extern class MyClassValue extends MyClass {}

We hit compile and.... another C++ error:

error C2039: 'MyClass_obj': is not a member of 'mypack'

Long story short, the header file where copyValue function is defined does not have MyClass "#included", so it doesn't know about it. Once again, this is not a problem as Haxe has a solution for this!

Using the @:include metadata, we can tell the Haxe compiler to use an include statement whenever the type is compiled in a file.

The generated C++ should match the same path as the package -> Haxe module, so in this case the header should be located at mypack/MyClass.h.

@:semantics(variable)
@:native("::mypack::MyClass_obj")
@:structAccess
@:include("mypack/MyClass.h")
extern class MyClassValue extends MyClass {}

Wow. That's a lot of metadata. But NOOOW this finally compiles:

package mypack;

// Prints: 
//   Constructed with 123
//   obj copied with value 123"
function main() {
  final obj: MyClassValue = untyped __cpp__("::mypack::MyClass_obj()");
  untyped __cpp__("{0}.__construct({1})", obj, 123);
  copyValue(obj);
}

function copyValue(obj: MyClassValue) {
  trace('obj copied with value ${obj.value}');
}

@:native("::mypack::MyClass_obj")
@:structAccess
@:include("mypack/MyClass.h")
extern class MyClassValue extends MyClass {}

class MyClass {
  public var value: Int;
  public function new(v: Int) {
    trace('Constructed with $v');
    value = v;
  }
}

§Custom Value-Type Constructor

For convenience, let's add a function to generate our value-type at any time. We can add an extern inline function to our extern value class. The implementation will be always be inlined so we don't have to worry about the value being "copied" out of the constructing function:

@:native("::mypack::MyClass_obj")
@:structAccess
@:include("mypack/MyClass.h")
extern class MyClassValue extends MyClass {
  public static extern inline function make(v: Int): MyClassValue {
    final result: MyClassValue = untyped __cpp__("::mypack::MyClass_obj()");
    untyped __cpp__("{0}.__construct({1})", result, v);
    return result;
  }
}

Now it's just a matter of calling make!

function main() {
  final obj = MyClassValue.make(123);
  copyValue(obj);
}

§Accessing GC Object as Value-Type

Let's say you have a normal Haxe/C++ that's managed by the GC, and you want to dereference it and get the value-type? Simple!

hx::ObjectPtr has a function in C++ called GetPtr() that returns the underlying raw pointer. Dereference that, and you can make a value copy.

final obj = new MyClass(123); // construct normal object
final objValue: MyClassValue = untyped __cpp__("(*{0}.GetPtr())", obj); // deref
copyValue(objValue); // compiles in C++!

If you want, we could add a static extension function to our project to make this cleaner:

// ValueHelpers.hx
extern inline function asValue<T>(obj: T) {
  return untyped __cpp__("(*{0}.GetPtr())", obj);
}

// mypack/MyClass.hx
using ValueHelpers;

final obj = new MyClass(123);
final objValue: MyClassValue = obj.asValue();

§Dodge Those Iterators!

Now we all know that creating tons of unnecessary objects is not the best strategy with garbage collection, and we also know you should avoid anonymous strutures and Dynamic types when using Haxe/C++. So let's talk about the worse Haxe/C++ feature that uses both of those things: Iterators.

Iterating through an Array or Iterable with a for-loop seems innocent enough:

final arr = [1, 2, 3];
for(number in arr) {
  // do something with number...
}

However, this code has a secret cost. The Haxe compiler quietly converts your simple for-loop into this:

final iterator = arr.iterator();
while(iterator.hasNext()) {
  final number = iterator.next();
  // do something with number...
}

Whenever an object is iterated over, it generates a iterator object that the GC has to clean up. In moderation it's fine, but you probably don't want to have a ton of these in your game loop.

Instead, you should try and use IntIterator instead:

for(i in 0...arr.length) {
  final number = arr[i];
  // do something with number...
}

Technically 0...arr.length is generating an IntIterator, but the Haxe compiler optimizes it into an iterator-free while loop like this:

var i = 0;
final len = arr.length;
while(i < len) {
  final number = arr[i];
  // do something with number...
  i++;
}

If you don't trust the Haxe compiler, you could just write the while-loop yourself.

"Ahhh, but this is so ugly and tedious!"

No problem, why not use one of Haxe's greatest superpowers? Macro static extensions:

// FastIterate.hx
import haxe.macro.Expr;

macro function fastForEach(iterableExpr: Expr, callbackExpr: Expr) {
  // Extract variable names and expression from `callbackExpr`
  var loopExpr = null;
  var objectName = "";
  var indexName = "";

  // Check that `callbackExpr` is a two argument function
  switch(callbackExpr.expr) {
    case EFunction(kind, { args: args, expr: expr }) if(expr != null && args.length == 2): {
      loopExpr = if(kind == FArrow) {
        // Make sure automatic "return" is removed from lambda
        haxe.macro.ExprTools.map(expr, e -> switch(e.expr) {
          case EReturn(returnedExpr): returnedExpr;
          case _: e;
        });
      } else {
        expr;
      }
      objectName = args[0].name;
      indexName = args[1].name;
    }
    case _: throw "`callbackExpr` must be a function with two arguments!";
  }

  // Build the expression this macro call changes into:
  return macro {
    final iterable = $iterableExpr;
    final len = iterable.length;
    var $indexName = 0;
    while($i{indexName} < len) {
      final $objectName = iterable[$i{indexName}];
      $loopExpr;
      $i{indexName}++;
    }
  }
}

// Main.hx
using FastIterate;

function main() {
  final arr = [1, 2, 3];
  arr.fastForEach((number, index) -> {
    // do something with number...
  });
}

The arr.fastForEach call is converted to this code at compile-time!

final iterable = arr;
final len = iterable.length;
var index = 0;
while(index < len) {
  final number = iterable[index];
  // do something with number...
  index++;
}

Feel free to copy the fastForEach function and use it for yourself. No attribution necessary! Just make sure to place it in separate module since it's a macro function.

§Dodge Those Interfaces!

Haxe/C++ interfaces are not converted to normal C++; instead they heavily rely upon the Dynamic type system. As a result, they can cause unintentional performance cost and GC allocation.

Avoid them if you can!

§The End?

This page is a work in progress! Want to suggest additional material? Did I make a mistake? Open an issue on this blog's Github page or leave a comment below!

Thanks for reading!