
Unity‐GRETA communication

Fabien Boucaud edited this page Feb 27, 2025 · 1 revision

Basic overview

To allow Unity and GRETA to work together, the easiest solution is to use the same character 3D model in both applications and to properly define the rig in Unity so that it can also be used as a Humanoid by the Unity Animator, if necessary. In the current implementation, there are two ways for Unity and GRETA to communicate with each other:

  • The Thrift protocol method relies on the Thrift protocol/language that acts as an interface between Java and C#. It is the method used by the GRETA configurations available in the main GRETA repository and it is used to stream the animation Keyframes generated in GRETA to Unity, where they will be processed to update the Transforms of the bones/joints of the 3D model of the character. Thrift is quite fast and efficient overall, which makes it good for intensive streaming of animations over time.
  • The OSC method relies on the OSC protocol and is sometimes used for more punctual information exchanges with GRETA. The advantage of OSC over Thrift is that it is much easier to set up the communication to suit your needs if you build your own GRETA module and use it to receive the OSC messages.

As it is currently set up, the most straightforward way to use GRETA with Unity is to prepare FML files ahead of time and send them from Unity to GRETA (via Thrift) to be played. Upon reception, the FML file reader module does what it usually does in the standard version of GRETA; once the Keyframes are generated, they are played in GRETA and sent via Thrift to Unity. There, a dedicated script translates the position of each bone/joint to the correct position in the Unity space and applies it. This allows you to make the agent speak, execute gestures and display facial expressions almost out of the box.
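The triggering side of this flow can be sketched as follows. The class and method names (GretaThriftSender, SendFml) are placeholders, not the project's actual API: the real client code is generated from the project's Thrift definitions, and the host/port values are assumptions.

```csharp
using UnityEngine;

// Hypothetical sketch: asking GRETA to play a pre-authored FML file.
// "GretaThriftSender" and "SendFml" are placeholder names; the real
// client class is generated from the project's .thrift definitions.
public class FmlTrigger : MonoBehaviour
{
    public string fmlPath = "Examples/greeting.xml"; // a path GRETA can resolve (assumption)
    private GretaThriftSender sender;                // hypothetical Thrift client wrapper

    void Start()
    {
        // Host and port are assumptions; use whatever the GRETA
        // configuration exposes for its Thrift endpoint.
        sender = new GretaThriftSender("localhost", 9090);
    }

    // Call this (e.g. from a UI button) to have GRETA play the FML.
    public void PlayGreeting()
    {
        sender.SendFml(fmlPath);
    }
}
```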

If you want to push your interactions with the agent further, however, you will soon want the agent to walk around the Unity 3D environment, look at the human (usually, their avatar), maybe use a mix of GRETA animations and motion-capture movements, etc. This is when this documentation comes in handy, as making Unity and GRETA cooperate on the animation of the same 3D model can be tricky.

What makes it complicated is that, since the 3D model is moved directly via script once a Keyframe is received from GRETA, it will either overwrite or be overwritten by the Unity Animator animations, depending on whether the GRETA Character Animator script uses the LateUpdate or the Update MonoBehaviour method, respectively. To avoid this behavior, which would, for example, prevent the agent from walking as long as it is connected to GRETA, we can instead attempt to blend animations from the Unity Animator and from GRETA. This is a very difficult task and it currently cannot be achieved perfectly smoothly, but it can be partially achieved by deciding ahead of time exactly which parts of the agent's body are animated by GRETA and which by Unity. The rest of this section explains which scripts are used to do the things presented here, and clarifies how to proceed with this animation blending in the current setup.
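The ordering conflict follows from Unity's per-frame call sequence: Update runs before the Animator evaluates its controller, while LateUpdate runs after it. A minimal sketch of the two outcomes:

```csharp
using UnityEngine;

// Sketch of the ordering conflict described above. A bone pose written
// in Update() is overwritten when the Animator evaluates afterwards;
// a pose written in LateUpdate() overwrites the Animator's result.
public class PoseWriteOrderDemo : MonoBehaviour
{
    public Transform someBone;        // a bone of the character's rig
    private Quaternion gretaRotation; // rotation received from GRETA (placeholder)

    void Update()
    {
        // Runs BEFORE the Animator update this frame: the Animator wins.
        someBone.localRotation = gretaRotation;
    }

    void LateUpdate()
    {
        // Runs AFTER the Animator update this frame: GRETA wins.
        someBone.localRotation = gretaRotation;
    }
}
```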

Main scripts

Upon opening the Unity project you can take some time to familiarize yourself with how Unity and Greta interact with each other. To that end, there are 6 relevant script folders in Assets/Scripts:

  • AnimationParameters: This is where the different animation parameters classes are defined. They act as interfaces between the animation parameters defined and used in Greta and how the characters are defined and updated in Unity. There should be little need to modify those unless the Greta animation parameters are modified in the future. In that case, this is where the Greta changes would need to be reflected.
  • Audio: This contains the scripts related to the processing/storing of the audio contents received from Greta (mostly TTS data). There could be some need to adjust things related to the sample rate as I remember having run into some troubles with it in the past.
  • AutodeskCharacter: This is probably the most important folder as it contains the scripts dedicated to processing the animation information received from Greta and applying it to the Unity characters. See details below.
  • GretaCommunication: Those are the scripts that make punctual communication between Greta and Unity possible. They use the OSC communication protocol which is fairly easy to use and implement if you have your own needs (especially if you have defined a new module for Greta).
  • ThriftImpl: Those are the scripts that make the communication between Greta and Unity possible, thanks to the Thrift language/protocol. You only need to worry about those if you need to change how animation information and commands are exchanged between the two applications (intensive use cases, in contrast to the OSC method, which is more dedicated to punctual communication needs).
  • Time: Some utility that can be useful for the synchronization of Greta’s and Unity’s clocks (framerate).
  • Tools: I have never had to worry about those scripts, and I remain unsure of whether they are actually used in the current implementation of the Greta/Unity communication. It does seem like the BAPFAPAUDIOConfig script could be useful as a definition script for some of the animation parameters interfaces however.

While I cannot go into all the details of each script and what it does here, I will try to clarify what the AutodeskCharacter scripts do, and especially what I have modified to improve the way Greta and Unity interact (partial blending of the animations).

AutodeskCharacter scripts

The BapMapper, FapMapper and ConcatenateJoints scripts are interfaces used to translate the coordinates from the GRETA 3D space to the corresponding coordinates in the Unity 3D space. The BapMapper does this for the body parts, the FapMapper does it for the face and the ConcatenateJoints takes care of joints.
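As an illustration of the kind of conversion such a mapper performs, here is a hedged sketch of a right-handed to left-handed coordinate conversion, which is the typical mismatch between Java/OpenGL-style applications and Unity's left-handed, Y-up space. Which axes actually need flipping depends on GRETA's conventions and should be checked against the real BapMapper code; this is not the project's implementation.

```csharp
using UnityEngine;

// Illustrative only: converting from a right-handed space (common in
// Java/OpenGL applications) to Unity's left-handed, Y-up space by
// mirroring across the XY plane. Verify the actual axis conventions
// against BapMapper before relying on this.
public static class CoordinateConversion
{
    public static Vector3 RightToLeftHanded(Vector3 v)
    {
        return new Vector3(v.x, v.y, -v.z); // flip the forward axis
    }

    public static Quaternion RightToLeftHanded(Quaternion q)
    {
        // Mirroring the Z axis also flips the rotation handedness,
        // which amounts to negating the x and y components.
        return new Quaternion(-q.x, -q.y, q.z, q.w);
    }
}
```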

The GretaCharacterSynchronizer, GretaEnvironmentSynchronizer, GretaObjectTracker and SceneManager scripts are all dedicated to ensuring that what happens in the Unity environment, in terms of objects and agent movement, is reproduced in the GRETA environment (thus keeping them “synchronized”). Those scripts work in theory, but they are severely lacking in quality and require you to double your work on the Unity environment by creating copies of the objects in the GRETA environment, which is much harder to work with. In the current state of things I personally advise focusing on developing directly for the Unity environment for things that require dynamic behavioral responses; for example, I would handle gaze directly in Unity instead of using GRETA’s current tools. If you do want to use GRETA’s tools, you will need to spend significant time improving the code, and you should then consider getting in touch with the GRETA team on GitHub to share your code if you are willing to do so, as it would help all future GRETA users.

The GretaObjectMetadata and GroupEnums scripts are pure utility scripts, with the first one being used for referencing objects existing both in Unity and GRETA and the second one being rarely used but dedicated to defining enums that make sense for communicating with GRETA. This last one could thus prove useful if you build a new GRETA module that would use specific pre-defined values and needs to communicate with Unity.

Now, pragmatically, the most important scripts to understand are GretaCharacterAnimator and GretaAnimatorBridge. They are two alternative versions of the same script: GretaCharacterAnimator is the legacy version, while GretaAnimatorBridge features the changes I implemented to allow Unity/GRETA animation blending.

GretaCharacterAnimator

As mentioned above, in Unity we use the Bap- and FapMapper interfaces to translate the body-part coordinates received from GRETA into Unity coordinates. The GretaCharacterAnimator is thus the script where we handle the animation information received from GRETA via Thrift (there are also some attributes dedicated to managing the connection). This requires initializing and keeping lists of Bap- and FapMappers that will be used upon receiving the animation information.

This first happens in Awake(), where we initialize the Thrift connection and set up the 3D model/skeleton in Unity to correspond to the default state of the model/skeleton in GRETA. This means setting up the skeleton to have an N-pose instead of the classical T-pose (otherwise the gestures’ relative coordinates will be completely off) and initializing the Bap- and FapMappers based on the available bones found in the Unity GameObject corresponding to the 3D model/skeleton.
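Schematically, the initialization looks like the following. Everything beyond Unity's own Awake is a placeholder name for what the actual script does, not its real API:

```csharp
using System.Collections.Generic;
using UnityEngine;

// Sketch of the initialization flow described above. The helper names
// (InitThriftConnection, SetNPose, CollectBapMappers, CollectFapMappers)
// are placeholders; see the actual GretaCharacterAnimator script.
public class GretaCharacterAnimatorSketch : MonoBehaviour
{
    private List<BapMapper> bapMappers;
    private List<FapMapper> fapMappers;

    void Awake()
    {
        InitThriftConnection();  // open the Thrift receiver

        // Rotate the arms so the rig matches GRETA's N-pose default
        // instead of the usual T-pose; otherwise the gestures'
        // relative coordinates will be completely off.
        SetNPose(transform);

        // Walk the skeleton hierarchy and build one mapper per
        // bone/face feature that GRETA knows how to animate.
        bapMappers = CollectBapMappers(transform);
        fapMappers = CollectFapMappers(transform);
    }

    // (helper implementations omitted in this sketch)
}
```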

Now, when your app is playing, the GretaCharacterAnimator uses its FixedUpdate calls to animate the 3D model. Specifically, it checks whether AnimationFrames are available to perform (it remains unclear to me whether we receive the frames live, one by one, or sometimes several at once, which makes it difficult to know exactly whether we are currently in the middle of performing an FML, for example). If so, it retrieves the AnimationFrame information and lets the Bap- and FapMappers update the bones’ coordinates. However, since there is no discrimination in the process, every single bone defined in the Bap- and FapMappers is updated during this step. This means that if a Unity Animator is running another animation at the same time, it will most likely change those coordinates again after FixedUpdate and you won’t be able to see the GRETA animation (even if you use an AvatarMask on the Unity Animator controller, because it will simply set the bones to the default position on the body parts concerned by the mask). If you instead change the script so that it performs its update of the 3D model in LateUpdate, it will override any animation performed by a Unity Animator.
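The per-frame application step can be sketched like this; the receiver object and the ApplyValue method name are assumptions standing in for the script's real members:

```csharp
// Sketch of the legacy per-frame loop. "frameReceiver" and
// "ApplyValue" are placeholder names for the script's real members.
void FixedUpdate()
{
    // If an AnimationFrame has arrived from GRETA via Thrift, apply it.
    // Every mapper is applied unconditionally, which is exactly why
    // this version conflicts with a running Unity Animator.
    if (frameReceiver.HasPendingFrame())
    {
        AnimationFrame frame = frameReceiver.NextFrame();
        foreach (BapMapper bap in bapMappers)
            bap.ApplyValue(frame);
        foreach (FapMapper fap in fapMappers)
            fap.ApplyValue(frame);
    }
}
```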

Because of this, you can’t really get both systems to work at the same time, which makes having the character walk in the scene impossible (at the very least, there would be no leg animations, since this is not really supported in GRETA). For some specific things you may even want Unity to handle mocap animations instead of GRETA, since GRETA’s support for motion-capture files is old and not guaranteed to work with recent file formats and coordinate conventions, but you would then be forced not to use GRETA at all.

GretaAnimatorBridge

To try to bridge the two animation systems and get them to play nice with each other, I made some changes to the GretaCharacterAnimator script to take advantage of AvatarMasks and Animation Layers from the Unity Animator to determine which animation system is in charge of which body part. It is still not perfect and has some serious drawbacks, but it can be used and will work for most simple use cases (such as locomotion and a fixed attribution of which system handles which body part; it gets clunkier when you need to make dynamic modifications to the attributions). Those modifications are available in the GretaAnimatorBridge script.

The first notable modification is that we now determine whether to use FixedUpdate or LateUpdate depending on whether a Unity Animator is involved in our animation system. The biggest change, however, is that instead of indiscriminately applying the Bap- and FapMappers, we use Dictionaries to classify the mappers by relevant body parts (currently something like Head, Trunk, Arms, Legs). This allows us to make use of the Unity Animator Controllers’ layers.

[Figure: mappers]
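The classification can be sketched as a dictionary keyed by body part; the key strings mirror the ones mentioned above, but the exact values are hard-coded in the script:

```csharp
using System.Collections.Generic;

// Sketch: classifying the mappers by body part so that Animator layers
// can later selectively disable them. Keys mirror those named above.
Dictionary<string, List<BapMapper>> mappersByBodyPart =
    new Dictionary<string, List<BapMapper>>
{
    { "Head",  new List<BapMapper>() },
    { "Trunk", new List<BapMapper>() },
    { "Arms",  new List<BapMapper>() },
    { "Legs",  new List<BapMapper>() },
};
```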

The easiest way to do that is to create a new Animator Controller and add new Layers to it (see Figure for an example). Give each Layer a name that includes the body parts the Animator will be in charge of animating when that layer’s weight is set to 1. It should be named so as to correspond to the Dictionary keys used in the script (currently hard-coded; an improvement would be to expose the variables in the inspector), as we then parse the string to determine the keys.

[Figure: animator]
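The layer-name parsing step can be sketched as follows; the underscore separator and example layer name are assumptions and must match whatever naming scheme the script actually hard-codes:

```csharp
using System.Collections.Generic;
using UnityEngine;

// Sketch: find which body-part keys are currently claimed by the Unity
// Animator, i.e. appear in the name of a layer whose weight is 1.
// The underscore separator is an assumption about the naming scheme.
static HashSet<string> GetAnimatorControlledParts(Animator animator)
{
    var controlled = new HashSet<string>();
    for (int i = 0; i < animator.layerCount; i++)
    {
        if (animator.GetLayerWeight(i) < 1f)
            continue;
        // e.g. a layer named "Locomotion_Legs_Trunk" claims Legs and Trunk
        foreach (string token in animator.GetLayerName(i).Split('_'))
            if (token == "Head" || token == "Trunk" ||
                token == "Arms" || token == "Legs")
                controlled.Add(token);
    }
    return controlled;
}
```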

As a result, when applying the Bap- and FapMappers we only apply those whose key does not appear in any layer currently set to a weight of 1, leaving Unity’s animations untouched. For each Animator Controller Layer, you should specify an appropriate AvatarMask that excludes everything that is not part of the body parts used by the layer. Unfortunately, because of how those masks are used by Unity, this means that upon setting a layer’s weight to 0 or 1, the model will switch to a default pose for the body parts where the mask is used. You will therefore need to send some animation information from GRETA (via an FML, for example) to fix it, but there may be a delay, which would break immersion for the participant/user. This is why dynamic modifications are not yet very good.
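The selective application step can be sketched like this; the names (controlledParts, mappersByBodyPart, ApplyValue) are placeholders for the script's real members:

```csharp
using System.Collections.Generic;

// Sketch of the selective application step: only body parts NOT
// currently claimed by a weight-1 Animator layer are driven by GRETA.
// "controlledParts" would come from inspecting the Animator's layers.
void ApplyFrame(AnimationFrame frame, HashSet<string> controlledParts)
{
    foreach (var entry in mappersByBodyPart)
    {
        if (controlledParts.Contains(entry.Key))
            continue;                     // Unity animates this part
        foreach (BapMapper mapper in entry.Value)
            mapper.ApplyValue(frame);     // placeholder method name
    }
}
```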
