The PopcornFX plugin for Unreal Engine 4 was recently released on the marketplace (and is available on our store: store.popcornfx.com/popcornfx-ue4-plugin), so you might wonder what PopcornFX performance looks like in such an engine. This article is here to answer that question!
You create all of your effects directly in the PopcornFX Editor, which provides lots of useful metrics. From the editor, you can estimate the cost of your effects in terms of:
- Resource size, medium count, …
- Performance cost
- Memory usage
Integration in Unreal Engine 4
Once you have created your effects in the PopcornFX editor, you can import them into Unreal Engine 4, which will replicate your PopcornFX project folder structure (see tutorial videos). All dependencies, such as textures, texture atlases, materials, and meshes, are resolved at import time. A simple drag & drop is all it takes to place an effect in your level.
PopcornFX being a middleware, what role does it play inside Unreal Engine? In a nutshell:
- The PopcornFX SDK handles the entire particle system simulation.
- The PopcornFX Plugin goes back and forth between the PopcornFX SDK and the target engine depending on what needs to be done (Rendering, Sounds, Decals, Lights, Events, Blueprints, …)
Thanks to Unreal Engine 4’s plugin API, we managed to create our plugin without any source code or source asset modification. We were able to integrate deeply with their whole rendering pipeline, update system, UI, and blueprint system.
Note: The following tests were made in an empty scene so we could test with the least performance overhead possible. We will be timing Game and Render threads over different feature sets from PopcornFX, and compare those with what UE4’s Cascade solution offers. Those tests are CPU only (GPU benchmarks would need another article). For an exhaustive list of existing PopcornFX stats directly inside Unreal Engine, see http://wiki.popcornfx.com/index.php/UE4/Performance.
PopcornFX Emitters are evolving within a PopcornFX Scene: all particles will spawn/evolve/die inside it. You can control various simulation/rendering settings for each PopcornFX Scene in your world. Scenes will have their subset of worker threads that run in parallel, waiting for tasks from the PopcornFX SDK (billboarding, collision queries, audio updates, …).
From simulation to rendering, everything will be batched together (compatible drawcalls, compatible mediums, collision queries, actions, …), giving PopcornFX great performance compared to the classic per instance approach. For more information on simulation batching, see the script execution model article on our tech blog.
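As a rough illustration of why batching helps on the simulation side, here is a minimal sketch. It is not PopcornFX SDK code: `ParticleBatch` and `IntegrateBatch` are hypothetical names, and the single-axis motion is a simplification. Particles coming from many emitter instances are stored in the same contiguous arrays, so one update pass covers all of them instead of one call per instance.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical structure-of-arrays storage: one array per particle field,
// shared by every emitter instance that uses the same particle layout.
struct ParticleBatch {
    std::vector<float> posZ;
    std::vector<float> velZ;
};

// Integrates every particle of the batch in a single pass
// (semi-implicit Euler on one axis, to keep the sketch short).
inline void IntegrateBatch(ParticleBatch& batch, float accelZ, float dt) {
    assert(batch.posZ.size() == batch.velZ.size());
    for (std::size_t i = 0; i < batch.posZ.size(); ++i) {
        batch.velZ[i] += accelZ * dt;
        batch.posZ[i] += batch.velZ[i] * dt;
    }
}
```

The loop touches memory linearly and has no per-instance dispatch, which is the property the per-instance approach lacks.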
First thing you’ll notice when creating an effect inside PopcornFX is the clear distinction between the particle behavior and its renderer. You can think of PopcornFX renderers as Cascade’s TypeData, without affecting your particle behavior and the supported features.
To get benchmarks that are as accurate as possible, we created effects in PopcornFX using straightforward features (acceleration, basic scripting, collisions, velocity), recreated the same effects in Cascade, and compared the two.
- PopcornFX update stats were gathered using “Update: PopcornFX update time” through the “stat PopcornFX” console command.
- Cascade update stats were gathered using “PSys Comp Tick Time” through the “stat Particles” console command.
Simple emitter with particles spawning on a skinned mesh (surface)
Note: PopcornFX has to reskin on the CPU to use this feature. It asynchronously skins on worker threads, and doesn’t stall the game thread at all.
The above features are simulation only, meaning that they won’t add any cost on the rendering side. As you’ll read in the rendering section, PopcornFX uses billboarders to generate the geometry, so every render-related feature needs to be timed on the render thread, while Cascade executes most of those on the game thread. As modifying the Size or Color field in PopcornFX is just a matter of writing memory, what actually needs to be timed is the resulting billboarding cost (generating colors, UVs, …). Keep in mind that the following examples are not fully representative of PopcornFX’s cost, as most of the work is done on the render thread, but they still give you the game thread cost.
Every compatible render medium (batch of particles that share the same rendering properties) will be batched together to issue a single draw call. The following charts will give an overview of drawcall count, and render time for each particle renderer.
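The grouping step can be sketched as follows. This is only an illustration under assumptions: the `RenderMedium` struct, its fields, and the choice of material plus billboarding mode as the compatibility key are hypothetical, not the plugin's actual criteria.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>
#include <tuple>
#include <vector>

// Hypothetical render medium: the properties that must match for two
// mediums to share a draw call are an assumption made for this sketch.
struct RenderMedium {
    std::string material;
    int billboardingMode;
    std::size_t particleCount;
};

// Groups compatible mediums and returns the particle count of each
// resulting draw call, ordered by (material, billboardingMode).
inline std::vector<std::size_t> BatchDrawCalls(const std::vector<RenderMedium>& mediums) {
    std::map<std::tuple<std::string, int>, std::size_t> batches;
    for (const RenderMedium& m : mediums)
        batches[std::make_tuple(m.material, m.billboardingMode)] += m.particleCount;

    std::vector<std::size_t> drawCalls;
    for (const auto& entry : batches)
        drawCalls.push_back(entry.second);
    return drawCalls;
}
```

With this scheme, the draw call count depends on the number of distinct rendering setups, not on the number of emitter instances.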
Note: The following drawcall count doesn’t include the final draw call amount, which might change based on the platform (see this UE4 wiki page for more information).
- PopcornFX drawcall counts were gathered using “Render: DrawCalls” through the “stat PopcornFX” console command.
- Cascade drawcall counts were gathered using “Particle Draw Calls” through the “stat Particles” console command.
- PopcornFX render time was gathered using “Render: GDME” through the “stat PopcornFX” console command.
- Cascade render time was gathered using “Particle Render Time” through the “stat Particles” console command.
Although we tend to think of billboards as simple squares oriented towards the view, there are in fact several billboarding modes available. PopcornFX’s Billboarding Modes correspond to Cascade’s Screen Alignments. You’ll find the conversion table below:
Screen alignment modes
|Cascade|PopcornFX|
|---|---|
|Facing Camera Position|Viewpos Aligned Quad|
|Square (Uniform Size)|Screen Aligned Quad|
|Rectangle (Non-Uniform Size)|Screen Aligned Quad|
|Velocity|Velocity Axis Aligned|
|Away From Center|~ Planar Axis Aligned|
|?|Velocity Spheroidal Aligned|
|?|Velocity Capsule Aligned|
|?|Planar Axis Aligned|
PopcornFX uses Billboarding as a more generic term: the process of generating geometry (positions, normals, tangents, colors, texcoords, …) with its alignment determined by the billboarding mode. This allows specific features such as VelocityCapsuleAligned, which generates 6 vertices. However, it doesn’t allow us to make the basic assumptions that Cascade makes (see below).
All billboarders work on batches of particles:
- CPU billboarding is done by parallelizing tasks on the different active worker threads.
- GPU billboarding is done in compute shaders (see PopcornFXSortComputeShader.usf, PopcornFXBillboarderBillboardComputeShader.usf).
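To make the CPU path more concrete, here is a minimal sketch of splitting a batch into independent tasks, each of which generates quad geometry for its own range of particles. All names (`SplitBatch`, `BillboardRange`, `Vec3`) are hypothetical and the quad is a simple camera-facing one; the actual SDK billboarders are not shown in this article. Because the ranges do not overlap, each one could be handed to a different worker thread without synchronization.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

struct Vec3 { float x, y, z; };

// Splits [0, count) into at most taskCount contiguous ranges of similar size.
inline std::vector<std::pair<std::size_t, std::size_t>> SplitBatch(std::size_t count,
                                                                   std::size_t taskCount) {
    std::vector<std::pair<std::size_t, std::size_t>> ranges;
    if (count == 0 || taskCount == 0)
        return ranges;
    const std::size_t chunk = (count + taskCount - 1) / taskCount;
    for (std::size_t begin = 0; begin < count; begin += chunk)
        ranges.emplace_back(begin, std::min(begin + chunk, count));
    return ranges;
}

// Writes 4 quad corners per particle for the particles in [begin, end).
// 'right' and 'up' are the camera axes, assumed to be normalized.
inline void BillboardRange(const std::vector<Vec3>& centers, const std::vector<float>& halfSizes,
                           Vec3 right, Vec3 up, std::size_t begin, std::size_t end,
                           std::vector<Vec3>& outVertices) {
    assert(outVertices.size() >= centers.size() * 4);
    static const float kSigns[4][2] = {{-1, -1}, {1, -1}, {1, 1}, {-1, 1}};
    for (std::size_t i = begin; i < end; ++i) {
        for (std::size_t c = 0; c < 4; ++c) {
            const float sr = kSigns[c][0] * halfSizes[i];
            const float su = kSigns[c][1] * halfSizes[i];
            outVertices[i * 4 + c] = {centers[i].x + right.x * sr + up.x * su,
                                      centers[i].y + right.y * sr + up.y * su,
                                      centers[i].z + right.z * sr + up.z * su};
        }
    }
}
```

Each task writes only to the slice `[begin * 4, end * 4)` of the output buffer, which is what makes the work safe to spread over workers.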
Cascade has no such thing as Billboarding (at least, not in the PopcornFX sense): sprite particles are always treated as rectangles. This allows Cascade to always send 4 vertices per sprite to the GPU and reposition them in the vertex shader, depending on the screen alignment. However, this only allows simple alignments (ViewAligned, VelocityAligned).
It means that regardless of the screen alignment, Cascade’s CPU/GPU billboarding cost will remain steady, while PopcornFX CPU billboarding cost will vary depending on the billboarding mode (for example, VelocityCapsuleAligned is more expensive than VelocityAxisAligned).
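One visible part of that difference is the amount of geometry to generate. The sketch below is illustrative only: the enum is hypothetical, the 6 vertices for VelocityCapsuleAligned come from this article, and the 4 vertices for the quad modes are an assumption (one vertex per quad corner).

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical enum listing a few PopcornFX billboarding modes by name.
enum class BillboardMode {
    ScreenAlignedQuad,
    ViewposAlignedQuad,
    VelocityAxisAligned,
    VelocityCapsuleAligned
};

// Vertices generated per particle: 6 for the capsule (stated in the article),
// 4 for the quad modes (assumption).
constexpr std::size_t VerticesPerParticle(BillboardMode mode) {
    return mode == BillboardMode::VelocityCapsuleAligned ? 6 : 4;
}

// Total number of vertices the billboarder has to write for a batch.
constexpr std::size_t VertexCount(BillboardMode mode, std::size_t particleCount) {
    return VerticesPerParticle(mode) * particleCount;
}
```

For a batch of 50k particles, this gives 200,000 vertices for a quad mode and 300,000 for the capsule, before counting the extra math each capsule vertex may need.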
Ribbon geometry is a pretty useful feature that generates geometry between particles. There are two ways of spawning ribbon particles (although the second option is the most common one):
- Connecting together particles that are from a single emitter
- Connecting together particles that are spawned from a source particle
The way PopcornFX generates ribbons differs from Cascade: by default, it maps the material over each individual subdivision of the ribbon instead of over the entire geometry.
To change this behavior, the CParticleRenderer_Ribbon allows you to specify a TextureUField (used to define how the material will be mapped on the geometry).
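The two mappings can be compared with a small sketch. The function names are hypothetical, and the per-particle U value (the normalized index along the ribbon) is just one plausible thing a field such as TextureUField could hold; it is an assumption, not the documented behavior.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// U coordinates at the start and end of one ribbon segment.
struct SegmentU { float u0, u1; };

// Default-style mapping: the material repeats on every segment,
// so U runs from 0 to 1 on each of them.
inline std::vector<SegmentU> PerSegmentU(std::size_t particleCount) {
    std::vector<SegmentU> out;
    for (std::size_t i = 0; i + 1 < particleCount; ++i)
        out.push_back({0.0f, 1.0f});
    return out;
}

// Stretched mapping: each particle carries its own U value (assumed here
// to be its normalized index), so the material spans the whole ribbon once.
inline std::vector<SegmentU> StretchedU(std::size_t particleCount) {
    std::vector<SegmentU> out;
    if (particleCount < 2)
        return out;
    const float last = static_cast<float>(particleCount - 1);
    for (std::size_t i = 0; i + 1 < particleCount; ++i)
        out.push_back({static_cast<float>(i) / last, static_cast<float>(i + 1) / last});
    return out;
}
```

A ribbon of 5 particles has 4 segments: the first mapping yields (0, 1) four times, while the second yields (0, 0.25), (0.25, 0.5), (0.5, 0.75), (0.75, 1).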
As with billboard rendering, PopcornFX uses billboarders to generate geometry, texcoords, normals, etc., so the more expensive the billboarding, the higher the overall rendering cost. Cascade’s combination of Source and SpawnPerUnit modules is the equivalent of PopcornFX’s Spawner Evolver. There are also screen alignment modes available for ribbon renderers. You’ll find the conversion table below:
Ribbon specific screen alignment modes
|Cascade|PopcornFX|
|---|---|
|Camera Up|Viewpos Aligned Quad|
|Source Up|Normal Axis Aligned / Side Axis Aligned|
|World Up|Normal Axis Aligned / Side Axis Aligned|
PopcornFX’s NormalAxisAligned / SideAxisAligned allows you to plug any field containing the desired billboarded axis.
Anim Trails / Beams
Anim trails and Beams are specific behaviors of the ribbon/billboard renderers, so we won’t benchmark them.
Mesh geometry is a straightforward feature: particles are rendered as static meshes that can have any material applied to all of their submeshes.
Cascade provides additional mesh-specific screen alignments (Face Camera With Roll, Face Camera With Spin, …) that are all indirectly supported by PopcornFX: each CParticleRenderer_Mesh has a list of Static/Dynamic Transformations that allow you to orient the mesh the way you want.
Specify custom fields to orient the rendered mesh
There is a wide range of PopcornFX specific features that cannot be benchmarked with Cascade, in a nutshell:
- By default, particle meshes have no orientation.
- Static transformations are obviously less expensive than dynamic transformations that need to be built each frame.
- The more custom fields you specify for Dynamic Transformations, the more expensive it will be at billboarding time (matrix constructions, …).
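The cost difference between the two kinds of transformations can be sketched as follows. Everything here is hypothetical: the `Mat2` type, the single yaw field, and the composition order are assumptions, and a real mesh transform would be a 3D matrix. The static transform is built once, while the dynamic part requires one matrix construction and one multiplication per particle, every frame.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// 2x2 rotation matrix, row-major. 2D keeps the sketch short.
struct Mat2 { float m00, m01, m10, m11; };

inline Mat2 Rotation(float radians) {
    const float c = std::cos(radians);
    const float s = std::sin(radians);
    return Mat2{c, -s, s, c};
}

inline Mat2 Mul(const Mat2& a, const Mat2& b) {
    return Mat2{a.m00 * b.m00 + a.m01 * b.m10, a.m00 * b.m01 + a.m01 * b.m11,
                a.m10 * b.m00 + a.m11 * b.m10, a.m10 * b.m01 + a.m11 * b.m11};
}

// staticTransform is computed once and shared by all particles.
// yawField is a hypothetical per-particle custom field: it forces a matrix
// construction and a multiplication for every particle, each frame.
inline std::vector<Mat2> BuildMeshTransforms(const Mat2& staticTransform,
                                             const std::vector<float>& yawField) {
    std::vector<Mat2> transforms;
    transforms.reserve(yawField.size());
    for (float yaw : yawField)
        transforms.push_back(Mul(Rotation(yaw), staticTransform));
    return transforms;
}
```

Each additional custom field would add another construction and multiplication inside the loop, which is where the extra billboarding time comes from.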
Particles can be rendered as light sources, which adds more life to some effects (fire, sparks, …). However, they need to be used with caution regarding performance.
As for the actual implementation, UE4 has a dedicated callback to gather any lights from a scene proxy. Cascade computes all attenuations and light data directly inside this callback (FPrimitiveSceneProxy::GatherSimpleLights), while PopcornFX handles that in each worker thread, at render time (FPrimitiveSceneProxy::GetDynamicMeshElements). The resulting rendering cost is much higher for Cascade as it is not parallelized over worker threads.
Although the render time graph below shows PopcornFX’s cost well above Cascade’s, PopcornFX is still much cheaper over the whole frame once the cost of GatherSimpleLights is taken into account (~4.5 ms vs ~17 ms for 100 emitters and 50k particles):
Cascade has good rendering optimizations for billboard rendering. However, its ribbon, mesh, and light rendering seem to suffer a lot. Even though the PopcornFX plugin is production ready, there is still room for improvement (rendering optimizations, screen space collisions, …), and we’ll keep adding new features in future versions of the plugin.
The way PopcornFX was designed allows you to deeply customize your effects, but there is a tradeoff: PopcornFX has a steep learning curve. You can create simple effects easily, but when it comes to creating more complex effects, you’ll need to dig into the tutorials, wiki, and support!
In favor of Cascade:
- Embedded and easy authoring in UE4
- Built-in LOD system
- Better-working transparent sorting between particles and transparent in-game objects
- Bigger community/tutorials
- Slightly faster for low numbers of effect instances.
In favor of PopcornFX:
- Performance (simulation, rendering, drawcall count, …)
- Cross-engine compatibility
- Flexibility with its scripting system
- Debug and performance metrics
- Standalone editor (easier to outsource FX creation)
Of course, nothing stops you from building your level with both PopcornFX and Cascade effects. There is absolutely no limitation on that, and it does make sense in some cases.