Cacophony

A gesture library for Unity

Gestures meet actions to keep chaos under control

Cacophony is a gesture detection library for Unity that I created at 5of12 to help us solve a common customer request: building apps that detect hand gestures to trigger application behaviour. Sounds simple, but if too many gestures are used it can quickly devolve into chaos.

Cacophony is a holistic solution to this problem. From a design point of view it provides a structure that both designers and developers can follow, facilitating clear discussion. From a technical point of view it provides a modular system, built to be easily extended, for rapid iteration and predictable outcomes.

5of12

Open Source

Made with Unity

Gesture + Action

Cacophony couples gestures with actions. This simple extension lets you capture the whole interaction in your design. Pinch twice. Point up then forward. Simple conversational designs, with less ambiguity.
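As a sketch of the coupling idea (every name below is invented for illustration, not Cacophony's actual API), a gesture-to-action pairing could be as simple as a map from detected gestures to callbacks:

```csharp
using System;
using System.Collections.Generic;

// Illustrative sketch only — type and member names are hypothetical,
// not Cacophony's real classes.
public class InteractionMap
{
    private readonly Dictionary<string, Action> actions = new();

    // Pair a named gesture with the action it should trigger.
    public void Bind(string gestureName, Action onTriggered) =>
        actions[gestureName] = onTriggered;

    // Called when the detector reports a completed gesture.
    public bool Trigger(string gestureName)
    {
        if (actions.TryGetValue(gestureName, out var action))
        {
            action();
            return true;
        }
        return false;
    }
}
```

Something like `map.Bind("DoublePinch", OpenMenu)` keeps "pinch twice" and its outcome together in one place in the design.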

Iterate Fast

A hand gesture in Cacophony is built from hand poses. Give them names that work for you. Provide a list of poses to detect, a list of those to ignore and let the gesture detector handle the rest. Be specific in your design without getting deeply technical.
✊ != ✋ || ✌️
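A minimal sketch of that detect/ignore idea (the class and pose names here are assumptions, not the library's real types):

```csharp
using System.Collections.Generic;

// Hypothetical sketch — Cacophony's real gesture types may differ.
public class PoseGesture
{
    // Poses that count as this gesture.
    public List<string> Detect = new();
    // Poses that explicitly disqualify it.
    public List<string> Ignore = new();

    // A pose matches only if it's listed to detect and not listed to ignore.
    public bool Matches(string currentPose) =>
        Detect.Contains(currentPose) && !Ignore.Contains(currentPose);
}
```

Mirroring ✊ != ✋ || ✌️ above: `Detect = { "Fist" }`, `Ignore = { "OpenHand", "Peace" }`.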

Feel connected

Gesture control most often fails when people aren't given enough feedback to build trust in the system. We made sure to design in plenty of ways for you to hook in visuals, audio and haptics at each stage of the interaction.
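One plausible shape for such hooks (the stage names and types below are assumptions for illustration, not Cacophony's real surface) is a single event raised per interaction stage, with visuals, audio and haptics subscribing to it:

```csharp
using System;

// Assumed stage names — illustrative, not Cacophony's actual enum.
public enum GestureStage { Start, Hold, Complete, Cancel }

public class GestureFeedback
{
    // Subscribe visual, audio or haptic handlers here.
    public event Action<GestureStage> OnStageChanged;

    public void Raise(GestureStage stage) => OnStageChanged?.Invoke(stage);
}
```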

Peek inside the box

Cacophony's VU-meter-inspired debug widgets help you verify your gestures are configured correctly before writing any application code. State indicators hold their value for a short time to give you a chance to observe states that change too quickly for the human eye.

In the (near) future

Cacophony uses a predictive approach to gesture detection, looking a little into the future to reduce the delay between user action and app reaction. It's a marginal gain to counter latency in data processing that helps users feel connected.

Extensibility

Cacophony is agnostic to your chosen source of data. It's built around Ultraleap hand tracking because that's what its first project required, but it's not a hard dependency. Replace the hands with OpenXR, MediaPipe, Meta or Apple while keeping application code the same.
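One way such an abstraction might look (the interface and frame type below are invented for this sketch, not Cacophony's actual interface):

```csharp
// Hypothetical data-source abstraction — names are illustrative only.
public struct HandFrame
{
    public bool IsTracked;
    public float PinchStrength;
}

public interface IHandSource
{
    HandFrame GetLatestFrame();
}

// An Ultraleap, OpenXR, MediaPipe, Meta or Apple adapter would each
// implement IHandSource; application code only ever sees HandFrame.
public class StubHandSource : IHandSource
{
    public HandFrame GetLatestFrame() =>
        new HandFrame { IsTracked = true, PinchStrength = 0.8f };
}
```

Swapping tracking providers then means writing one adapter, while gestures, actions and feedback stay untouched.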

FAQ

Why is it called Cacophony?

As a gentle warning! A reminder of what not to do. If you attach sound to every gesture in your app and it's making a cacophony, that's too many gestures.

Can I use it in XR?

Yes, probably! Cacophony was actually designed for screenless operation, so it should run anywhere. Give it a try and let us know!

What did you learn building it?

Gesture detection is hard, but it's possibly the easier part of the problem. Consistent behaviour and clear design intent are harder to achieve!

What's next?

We've designed the system to work with any kind of source data, but we haven't yet tested that claim. So we plan to plug in MediaPipe or OpenXR tracking next.

Check out the Open Source

Available on GitHub now!

© Peter Nancollis 2025