PopcornFX Performance inside Unreal Engine 4


The PopcornFX plugin for Unreal Engine 4 was recently released on the marketplace (available on our store store.popcornfx.com/popcornfx-ue4-plugin), so you might wonder what PopcornFX performance looks like in such an engine. This article is here to answer that question!

PopcornFX Editor

You create all of your effects directly in the PopcornFX Editor, which provides lots of useful metrics. From the editor, you can estimate the cost of your effects in terms of:

  • Resource size, medium count, …
  • Framerate
  • Performance cost
  • Overdraw
  • Memory usage

Resource analyzer, Particle editor

Integration in Unreal Engine 4

Once you have created your effects in the PopcornFX editor, you can import them into Unreal Engine 4, which will replicate your PopcornFX project folder structure (see tutorial videos). All dependencies, such as textures, texture atlases, materials, and meshes, are resolved at import time. A simple drag & drop is all it takes to place an effect in your level.

Take full advantage of PopcornFX features in Unreal Engine. With a PopcornFX Runtime carefully optimized for each platform, the astounding flexibility of PopcornFX Effects, and a little practice, you should be able to create nearly anything that comes to mind!

PopcornFX being a middleware, what role does it play inside Unreal Engine? In a nutshell:

  • The PopcornFX SDK handles the entire particle system simulation.
  • The PopcornFX Plugin acts as the bridge between the PopcornFX SDK and the target engine, going back and forth depending on what needs to be done (Rendering, Sounds, Decals, Lights, Events, Blueprints, …)

Thanks to Unreal Engine 4’s plugin API, we managed to create our plugin without any source code or source asset modification. We were able to integrate deeply with their whole rendering pipeline, update system, UI, and blueprint system.


Benchmarks

Note: The following tests were made in an empty scene so we could test with the least performance overhead possible. We will be timing Game and Render threads over different feature sets from PopcornFX, and compare those with what UE4’s Cascade solution offers. Those tests are CPU only (GPU benchmarks would need another article). For an exhaustive list of existing PopcornFX stats directly inside Unreal Engine, see http://wiki.popcornfx.com/index.php/UE4/Performance.

PopcornFX Emitters live within a PopcornFX Scene: all particles spawn, evolve, and die inside it. You can control various simulation/rendering settings for each PopcornFX Scene in your world. Each scene has its own subset of worker threads that run in parallel, waiting for tasks from the PopcornFX SDK (billboarding, collision queries, audio updates, …).

From simulation to rendering, everything will be batched together (compatible drawcalls, compatible mediums, collision queries, actions, …), giving PopcornFX great performance compared to the classic per instance approach. For more information on simulation batching, see the script execution model article on our tech blog.

The first thing you’ll notice when creating an effect inside PopcornFX is the clear distinction between the particle behavior and its renderer. You can think of PopcornFX renderers as Cascade’s TypeData, except that choosing a renderer affects neither your particle behavior nor the supported features.

Simulation (CPU)

To get benchmarks that are as accurate as possible, we created effects inside PopcornFX using straightforward features (acceleration, basic scripting, collisions, velocity), recreated the same effects in Cascade, and compared the two.

  • PopcornFX update stats were gathered using “Update: PopcornFX update time” through the “stat PopcornFX” console command.
  • Cascade update stats were gathered using “PSys Comp Tick Time” through the “stat Particles” console command.


Constant Acceleration

Collisions

OnDeath event triggering another emitter (EventGenerator, EventReceiver Spawn)

KillBox

Simple emitter with particles spawning on a shape

Simple emitter with particles spawning on a skinned mesh (surface)
Note: PopcornFX has to reskin on the CPU to use this feature. It asynchronously skins on worker threads, and doesn’t stall the game thread at all.

The features above are simulation-only: they add no cost on the rendering side. As described in the rendering section, PopcornFX uses billboarders to generate the geometry, so every render-related feature has to be timed on the render thread, whereas Cascade executes most of them on the game thread. Since modifying the Size or Color field in PopcornFX is just a memory write, what actually needs to be timed is the resulting billboarding cost (generating colors, UVs, …). Keep in mind that the following examples are not fully representative of PopcornFX’s cost, as most of the work is done on the render thread; they only show the game thread cost.

Ribbons

Meshes

Lights

Color over life

Size over life

SubUV

Rendering

Every compatible render medium (batch of particles that share the same rendering properties) will be batched together to issue a single draw call. The following charts will give an overview of drawcall count, and render time for each particle renderer.
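To illustrate why batching compatible render mediums reduces the draw call count, here is a minimal Python sketch. It is not PopcornFX code: the `RenderMedium` type, the material names, and the compatibility rule (same renderer and same material) are assumptions made for the example.

```python
from collections import defaultdict
from typing import NamedTuple


class RenderMedium(NamedTuple):
    """Hypothetical stand-in for a batch of particles sharing render properties."""
    renderer: str        # e.g. "billboard", "ribbon", "mesh"
    material: str        # illustrative material name
    particle_count: int


def count_draw_calls_per_instance(media):
    # Classic per-instance approach: every medium is drawn on its own.
    return len(media)


def batch_compatible_media(media):
    # Assumed compatibility rule: same renderer and same material.
    # Compatible media are merged, so each key ends up as one draw call.
    batches = defaultdict(int)
    for medium in media:
        batches[(medium.renderer, medium.material)] += medium.particle_count
    return dict(batches)


# The same effect (sparks + smoke) instantiated 100 times.
instance_count = 100
media = [
    RenderMedium("billboard", "M_Sparks", 500),
    RenderMedium("billboard", "M_Smoke", 200),
] * instance_count

batches = batch_compatible_media(media)
print(count_draw_calls_per_instance(media))  # 200
print(len(batches))                          # 2
```

With this rule, the number of draw calls depends on how many distinct renderer/material combinations exist, not on how many times the effect is instantiated.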

Note: The draw call counts below are not the final number of draw calls issued, which may vary by platform (see this UE4 wiki page for more information).

  • PopcornFX draw call counts were gathered using “Render: DrawCalls” through the “stat PopcornFX” console command.
  • Cascade draw call counts were gathered using “Particle Draw Calls” through the “stat Particles” console command.

Same effect instantiated multiple times

  • PopcornFX Render time was gathered using “Render: GDME” through the “stat PopcornFX” console command.
  • Cascade Render time was gathered using “Particle Render Time” through the “stat Particles” console command.

Billboards

Although we tend to think of billboards as simple squares oriented towards the view, there are in fact several billboarding modes available. PopcornFX’s Billboarding Modes correspond to Cascade’s Screen Alignments; you’ll find the conversion table below:

Screen alignment modes

Cascade                      | PopcornFX
Facing Camera Position       | Viewpos Aligned Quad
Square (Uniform Size)        | Screen Aligned Quad
Rectangle (Non-Uniform Size) | Screen Aligned Quad
Velocity                     | Velocity Axis Aligned
Away From Center             | ~ Planar Axis Aligned
?                            | Velocity Spheroidal Aligned
?                            | Velocity Capsule Aligned
?                            | Planar Axis Aligned

PopcornFX uses Billboarding as a more generic term: the process of generating geometry (positions, normals, tangents, colors, texcoords, …) with its alignment determined by the billboarding mode. This allows specific features such as VelocityCapsuleAligned, which generates 6 vertices. However, it prevents us from making the simplifying assumptions that Cascade makes (see below).

All billboarders work on batches of particles:

  • CPU billboarding is done by parallelizing tasks on the different active worker threads.
  • GPU billboarding is done in compute shaders (see PopcornFXSortComputeShader.usf, PopcornFXBillboarderBillboardComputeShader.usf).
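
As a rough illustration of the CPU path, the sketch below splits a batch of particles into chunks and hands each chunk to a pool of worker threads. It only shows the task decomposition: the function names, chunk size, and the trivial per-particle work are assumptions, and Python threads will not give the speedup that native worker threads do.

```python
from concurrent.futures import ThreadPoolExecutor


def billboard_chunk(positions, half_size):
    # Placeholder per-particle work: emit the 4 corners of a quad in the XY plane.
    quads = []
    for (x, y, z) in positions:
        quads.append([
            (x - half_size, y - half_size, z),
            (x + half_size, y - half_size, z),
            (x + half_size, y + half_size, z),
            (x - half_size, y + half_size, z),
        ])
    return quads


def billboard_parallel(positions, half_size, worker_count=4, chunk_size=256):
    # Split the batch into independent chunks, one task per chunk.
    chunks = [positions[i:i + chunk_size]
              for i in range(0, len(positions), chunk_size)]
    with ThreadPoolExecutor(max_workers=worker_count) as pool:
        # map() preserves chunk order, so output quads line up with input particles.
        results = pool.map(lambda chunk: billboard_chunk(chunk, half_size), chunks)
        return [quad for chunk in results for quad in chunk]


positions = [(float(i), 0.0, 0.0) for i in range(1000)]
quads = billboard_parallel(positions, 0.5)
print(len(quads))  # 1000
```

Because each chunk only reads its own particles and writes its own output, the tasks need no synchronization beyond collecting the results.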

Cascade has no such thing as Billboarding (at least, not in the PopcornFX sense): sprite particles are always treated as rectangles. This allows Cascade to always send 4 vertices per particle to the GPU and reposition them in the vertex shader, depending on the screen alignment. However, it only allows simple alignments (ViewAligned, VelocityAligned).

It means that regardless of the screen alignment, Cascade’s CPU/GPU billboarding cost will remain steady, while PopcornFX CPU billboarding cost will vary depending on the billboarding mode (for example, VelocityCapsuleAligned is more expensive than VelocityAxisAligned).
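
To make the idea of billboarding as geometry generation more concrete, here is a small Python sketch that builds the 4 corners of a quad for two alignment styles. It is a generic illustration, not the PopcornFX billboarders: the function names, the vector helpers, and the exact construction of the axis-aligned quad are assumptions.

```python
import math


def add(a, b):
    return tuple(x + y for x, y in zip(a, b))


def sub(a, b):
    return tuple(x - y for x, y in zip(a, b))


def scale(v, s):
    return tuple(x * s for x in v)


def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def normalize(v):
    length = math.sqrt(sum(x * x for x in v))
    if length == 0.0:
        raise ValueError("cannot normalize a zero-length vector")
    return scale(v, 1.0 / length)


def quad(center, side, up):
    # Four corners, counter-clockwise, from two half-extent vectors.
    return [sub(sub(center, side), up), sub(add(center, side), up),
            add(add(center, side), up), add(sub(center, side), up)]


def screen_aligned_quad(center, half_size, cam_right, cam_up):
    # The quad lies in the camera plane: same axes for every particle.
    return quad(center, scale(cam_right, half_size), scale(cam_up, half_size))


def axis_aligned_quad(center, half_width, half_length, axis, cam_forward):
    # The quad is stretched along `axis`; its side vector is perpendicular to
    # both the axis and the view direction. `axis` must not be parallel to
    # `cam_forward`, otherwise normalize() raises.
    a = normalize(axis)
    side = normalize(cross(a, cam_forward))
    return quad(center, scale(side, half_width), scale(a, half_length))


print(screen_aligned_quad((0, 0, 0), 1.0, (1, 0, 0), (0, 1, 0)))
print(axis_aligned_quad((0, 0, 0), 0.5, 2.0, (0, 2, 0), (0, 0, 1)))
```

The axis-aligned variant needs a normalization and a cross product per particle, which hints at why the cost of CPU billboarding depends on the chosen mode.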

View Aligned Billboards (Uniform/Non-Uniform size)

View Position Aligned Billboards

Velocity Aligned Billboards
Note: Despite its name, PopcornFX’s VelocityAxisAligned stretches the particle size by a custom axis, which can be different from the actual particle velocity.

Ribbons

Ribbon geometry is a pretty useful feature that generates geometry between particles. There are two ways of spawning ribbon particles (although the second option is the most common one):

  • Connecting together particles that are from a single emitter
  • Connecting together particles that are spawned from a source particle

There are some differences between Cascade and PopcornFX ribbons in the way they are generated and rendered:

The way PopcornFX generates ribbons differs from Cascade: by default, it maps the material over each individual subdivision of the ribbon instead of over the entire geometry.

To change this behavior, the CParticleRenderer_Ribbon allows you to specify a TextureUField (used to define how the material will be mapped on the geometry).

Specifying TextureUField = LifeRatio
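
The difference between the two mappings can be sketched in a few lines of Python. This is only an illustration of the idea: the function names are hypothetical, and how PopcornFX actually interpolates the U coordinate between particles is an assumption here.

```python
def u_per_segment(particle_count):
    # Default-style mapping: every segment between two consecutive particles
    # spans the full [0, 1] range, so the material repeats on each subdivision.
    return [(0.0, 1.0) for _ in range(particle_count - 1)]


def u_from_field(field_values):
    # Field-driven mapping: U is read from a per-particle field (for example
    # LifeRatio), so each segment covers only its own slice of the texture.
    return [(field_values[i], field_values[i + 1])
            for i in range(len(field_values) - 1)]


life_ratios = [0.0, 0.25, 0.5, 0.75, 1.0]
print(u_per_segment(len(life_ratios)))
print(u_from_field(life_ratios))
```

With evenly spaced LifeRatio values, the second mapping stretches the material once over the whole ribbon instead of repeating it on every segment.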

As with billboard rendering, PopcornFX uses billboarders to generate geometry, texcoords, normals, etc., so the overall rendering cost depends on how expensive the billboarding is. Cascade’s combination of Source and SpawnPerUnit modules corresponds to PopcornFX’s Spawner Evolver. There are also screen alignment modes available for ribbon renderers; you’ll find the conversion table below:

Ribbon-specific screen alignment modes

Cascade   | PopcornFX
Camera Up | Viewpos Aligned Quad
Source Up | Normal Axis Aligned / Side Axis Aligned
World Up  | Normal Axis Aligned / Side Axis Aligned

PopcornFX’s NormalAxisAligned / SideAxisAligned modes let you plug in any field containing the desired billboarding axis.

Ribbons spawned with the regular Spawn module

Ribbons spawned with the SpawnPerUnit module
Note: PopcornFX provides other spawning metrics for trail particles (Time, Custom).

Ribbons spawned with the SpawnPerUnit module (Using TextureUField=LifeRatio), mapping the material over the entire geometry.

Anim Trails / Beams

Anim trails and Beams are specific behaviors of the ribbon/billboard renderers, so we won’t benchmark them.

Meshes

Mesh geometry is a straightforward feature: particles are rendered as static meshes that can have any material applied to all of their submeshes.

Cascade provides additional mesh-specific screen alignments (Face Camera With Roll, Face Camera With Spin, …) that are all indirectly supported by PopcornFX: each CParticleRenderer_Mesh has a list of Static/Dynamic Transformations that allow you to orient the mesh the way you want.

Specify custom fields to orient the rendered mesh

Like a texture flipbook, particles can specify which mesh descriptor they want to render (this can be randomized on a per particle basis, modified over time, …).

There is a wide range of PopcornFX-specific features that cannot be benchmarked against Cascade. In a nutshell:

  • By default, particle meshes have no orientation.
  • Static transformations are obviously less expensive than dynamic transformations that need to be built each frame.
  • The more custom fields you specify for Dynamic Transformations, the more expensive it will be at billboarding time (matrix constructions, …).

Simple mesh particles

Forward Axis Aligned mesh particles (here, Velocity is used)

Lights

Particles can be rendered as light sources, which can add more life to some effects (fire, sparks, …). However, they need to be used with caution because of their performance cost.

As for the actual implementation, UE4 has a dedicated callback to gather any lights from a scene proxy. Cascade computes all attenuations and light data directly inside this callback (FPrimitiveSceneProxy::GatherSimpleLights), while PopcornFX handles that in each worker thread, at render time (FPrimitiveSceneProxy::GetDynamicMeshElements). The resulting cost is much higher for Cascade, as its work is not parallelized over worker threads.

Although the render time graph below shows PopcornFX’s cost way above Cascade’s, PopcornFX is still much cheaper over the whole frame once the cost of GatherSimpleLights is taken into account (~4.5 ms for PopcornFX vs ~17 ms for Cascade, with 100 emitters and 50k particles):

Render time

Light setup time

Conclusion

Cascade has good optimizations for billboard rendering; however, its ribbon, mesh, and light rendering seem to suffer a lot. Even though the PopcornFX plugin is production ready, there is still room for improvement (rendering optimizations, screen space collisions, …), and we’ll keep adding new features in future versions of the plugin.

The way PopcornFX was designed allows you to deeply customize your effects, but there is a tradeoff: PopcornFX has a steep learning curve. You can create simple effects easily, but when it comes to creating more complex effects, you’ll need to dig into the tutorials, wiki, and support!

What speaks for Cascade:

  • Embedded and easy authoring in UE4
  • Builtin LOD system
  • Better-working transparent sorting between particles and transparent in-game objects
  • Bigger community/tutorials
  • Slightly faster for low numbers of effect instances.

What speaks for PopcornFX:

  • Performance (simulation, rendering, drawcall count, …)
  • Cross-engine compatibility
  • Flexibility with its scripting system
  • Debug and performance metrics
  • Standalone editor (easier to outsource FX creation)

Of course, nothing stops you from building your level with both PopcornFX and Cascade effects. There is no limitation on mixing them, and it does make sense in some cases.