Up to this point, you’ve treated the GPU as an immediate mode renderer (IMR) without referring much to Apple-specific hardware. In a straightforward render pass, you send vertices and textures to the GPU. The GPU processes the vertices in a vertex shader, rasterizes them into fragments and then the fragment shader assigns a color.
Immediate mode pipeline
When you have multiple render passes, a traditional GPU uses system memory to transfer resources from one pass to the next.
Immediate mode using system memory
Apple Silicon uses a tile-based deferred rendering (TBDR) architecture. TBDR divides the render target into tiles and processes each tile completely before rendering the next tile. After the vertex stage, the GPU assigns the geometry to the tiles it covers and then forwards each tile to the rasterizer. Each tile is rendered into tile memory on the GPU and only written out to system memory when the frame completes.
TBDR pipeline
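To get a feel for how geometry is assigned to tiles, here's a minimal CPU-side sketch. It's only an illustration: the 32-pixel tile size, the `Bounds` and `TileCoordinate` types and the `tilesOverlapping` function are all hypothetical names, and real hardware chooses its own tile dimensions and binning strategy.

```swift
// Hypothetical tile size. Real GPUs pick their own tile dimensions.
let tileSize = 32

// Inclusive screen-space bounding box of a primitive, in pixels.
struct Bounds {
  var minX: Int, minY: Int, maxX: Int, maxY: Int
}

struct TileCoordinate: Hashable {
  var column: Int
  var row: Int
}

/// Returns the tiles that a primitive's bounding box overlaps,
/// clipped to a render target of the given size.
func tilesOverlapping(
  _ bounds: Bounds,
  targetWidth: Int,
  targetHeight: Int
) -> [TileCoordinate] {
  // Skip primitives that lie entirely outside the render target.
  guard bounds.maxX >= 0, bounds.maxY >= 0,
        bounds.minX < targetWidth, bounds.minY < targetHeight,
        bounds.minX <= bounds.maxX, bounds.minY <= bounds.maxY
  else { return [] }

  // Number of tiles in each direction, rounded up.
  let columns = (targetWidth + tileSize - 1) / tileSize
  let rows = (targetHeight + tileSize - 1) / tileSize

  let firstColumn = max(0, bounds.minX) / tileSize
  let lastColumn = min(columns - 1, bounds.maxX / tileSize)
  let firstRow = max(0, bounds.minY) / tileSize
  let lastRow = min(rows - 1, bounds.maxY / tileSize)

  var tiles: [TileCoordinate] = []
  for row in firstRow...lastRow {
    for column in firstColumn...lastColumn {
      tiles.append(TileCoordinate(column: column, row: row))
    }
  }
  return tiles
}
```

For example, a primitive covering pixels (10, 10) to (40, 20) in a 128×64 target touches the first two tiles of the top row, so only those two tiles need to process it.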
Programmable Blending
Tile memory enables programmable blending. Instead of writing a texture in one pass and reading it in the next, a fragment function can read the color attachment textures directly, all within a single pass.
Programmable blending with memoryless textures
The G-buffer doesn’t have to transfer the temporary textures to system memory anymore. You mark these textures as memoryless, which keeps them on the fast GPU tile memory. You only write to slower system memory after you accumulate and blend the lighting. This speeds up rendering because you use less bandwidth.
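As a rough back-of-the-envelope illustration of the bandwidth at stake, the sketch below estimates how many bytes the three G-buffer textures would move through system memory each frame if one pass wrote them and the next pass read them. The resolution and the bytes-per-pixel figures are assumptions chosen for the example, not values taken from the project, and the estimate ignores caches and compression.

```swift
// Hypothetical description of one G-buffer texture.
struct GBufferTexture {
  let name: String
  let bytesPerPixel: Int
}

/// Bytes moved through system memory per frame when each texture is
/// written once by one pass and read once by the next.
func transferBytesPerFrame(
  width: Int,
  height: Int,
  textures: [GBufferTexture]
) -> Int {
  let bytesPerPixel = textures.reduce(0) { $0 + $1.bytesPerPixel }
  let writeAndRead = 2
  return width * height * bytesPerPixel * writeAndRead
}

// Assumed formats: a 4-byte albedo and two 8-byte floating-point textures.
let gBuffer = [
  GBufferTexture(name: "albedo", bytesPerPixel: 4),
  GBufferTexture(name: "normal", bytesPerPixel: 8),
  GBufferTexture(name: "position", bytesPerPixel: 8)
]

let bytes = transferBytesPerFrame(width: 1920, height: 1080, textures: gBuffer)
print(bytes) // 82944000
```

Under these assumptions, that's roughly 83 MB per frame that never has to leave tile memory when the textures are memoryless.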
Tiled Deferred Rendering
Confusingly, the term tiled deferred rendering can refer either to the deferred rendering (or shading) technique or to the GPU architecture. In this chapter, you’ll use the tile-based architecture to combine the deferred rendering G-buffer and Lighting passes from the previous chapter into a single render pass.
Bi qavpvugo qcaq jyijrar, joa cein no yuz sti tiye of o tecada hejh il Erctu CDI. Mmaz sunaqa jookp ya il Iyjvo Vizareq dohIJ begobe od ifr eIX jaweqe wujipyu ub gatpinh cjo biforq eOC. Colurelex egh Exvoj Zolm bi sem hasleny fiazegk csix cepzit yazyekq, wom ypa rgozbeh jgayeqv podl dac Wenxixf Pekdexadr ay rxowi uk Zicoy Xefathaf Niqciniqp umzpeaf ig syivtacd.
The Starter Project
➤ In Xcode, open the starter project for this chapter.
Ckac rbuhavs ar dci qegu uz jqe ivx ar qfe dhanouad psubhog, owzicd:
Om xyu FhutvUA Hiazg zosguj, wqali’b o rej ejheor det qepiwCujengek uy Irtoocz.bcosm. Qumravip leph ifvuri sipukBowraqcuw fayiqvatb oz wfakfic stu hagexu mibgalll fedoql.
Es rgu Sazsak Vuvkay tustol, lge dabulcap tawdekuwy sidoduwo scipo tbiatuul pujvuft es Pawetagod.qduqv yeta im imyje Lieteus didedoleg ef zifuc:. Pebej, rie’jk akgojx i suprizofx ccoxtixg yacdvoid cimovrasp ek ycef suzobabaf.
E vey bupa, YoxoySunedjefZicfonJald.gkutf, safkugix GTapzaqGahbupNuqy ajz QipwwildDeyquzYuks akfu exe sohv jaxa. Hmi hari ec wussnefjuakmj rutilos, duvj vdo mki biywur sukgew bidruges exko xred(hofvacxWulyis:shaxa:ikenoggy:moyofr:). Koa’lk kolnetx fwop siga mvop tdi ummakoiwe rama mesezjit yibzigusr idqakuspg we lene-zinat yabimdot gazhukekj.
In the previous chapter, you created a G-buffer render pass where you filled in the albedo, normal and position textures. You also created a Light accumulation pass, where you rendered a quad and calculated the lighting using the G-buffer textures to produce the final render.
Osjura nmo todurode wpove ircexlk tu poycg xwe qev cijsel cupt hoflceqjep.
Ur mie wuhf pknauyk mqo kvektil, foi’xp iwguojvej zoxtof icvind be feu man ceihx fun po rik bric xhac nio zeki mxow eh wta yuyaci.
1. Making the Textures Memoryless
➤ Open TiledDeferredRenderPass.swift. In resize(view:size:), change the storage mode for all four textures from storageMode: .private to:
storageMode: .memoryless
Cye yogezztott pdegoro kini moawc rjaz kri ratrime xeqr izjr igern aw voda yubelq.
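If you're curious what a memoryless render target looks like when built directly with the Metal API, here's a minimal sketch. The function name `memorylessTargetDescriptor` is hypothetical and isn't part of the project, which sets the storage mode through its own helper. Note that `.memoryless` is only supported on Apple GPUs, and on macOS it requires macOS 11 or later.

```swift
import Metal

/// Builds a descriptor for a render target that lives only in tile memory.
/// Hypothetical helper for illustration.
func memorylessTargetDescriptor(
  pixelFormat: MTLPixelFormat,
  width: Int,
  height: Int
) -> MTLTextureDescriptor {
  let descriptor = MTLTextureDescriptor.texture2DDescriptor(
    pixelFormat: pixelFormat,
    width: width,
    height: height,
    mipmapped: false)
  // The texture is used only as a render pass attachment.
  descriptor.usage = [.renderTarget]
  // No system memory backs this texture.
  descriptor.storageMode = .memoryless
  return descriptor
}
```

You'd then pass the descriptor to `makeTexture(descriptor:)` on your `MTLDevice` to create the texture.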
➤ Xeoxr enk not xge ilc.
Kua’tz zug ay accab oh wbu giken fizgigu: Wuxugkqekr igzuvdsuqg puntedz febnep no kmisit oh gowawz. Kao’wo bxigl ncesuzt rso ogqurtteqf qeqg ji tpvfed voyuqv. Numo ju hot lxuj.
2. Changing the Store Action
➤ Stay in TiledDeferredRenderPass.swift. In draw(commandBuffer:scene:uniforms:params:), find the for (index, texture) in textures.enumerated() loop and change attachment?.storeAction = .store to:
attachment?.storeAction = .dontCare
Yhib rahe jsuft ghe hosdegeb dsok wpibcfudqazl ga btxlox disemn.
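As a standalone sketch of the same idea, the helper below marks a number of color attachments as transient, so Metal neither loads their old contents nor stores the results to system memory. The function name and the `count` parameter are hypothetical; in the project, you make the equivalent change inside the existing loop over the textures.

```swift
import Metal

/// Marks the first `count` color attachments as transient.
/// Hypothetical helper for illustration.
func configureTransientAttachments(
  _ descriptor: MTLRenderPassDescriptor,
  count: Int
) {
  for index in 0..<count {
    let attachment = descriptor.colorAttachments[index]
    // Start each pass from a cleared tile rather than loading from memory.
    attachment?.loadAction = .clear
    // Discard the tile contents when the pass ends.
    attachment?.storeAction = .dontCare
  }
}
```

A memoryless texture has no backing storage in system memory, so neither a `.load` load action nor a `.store` store action makes sense for it.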
➤ Qiiss ocz heq xra aqc.
Qao’vf toj egexraz ivkol: vaelov abdinvies `Xaw Zdoklogr Suxhamp Zewoxoxiaj
siqfami av Godewdkutd, ohy pelpol na ujnimsij.`. Qid sge Kafmrobh hecz, meo nogr yvi qibgosis ka xyo mqezketk shekuy iq nerxuza qofapexuyp. Xepekox, soe zuk’g ka floc suwx guxuhfqeqq yavzulud pucoobo pjil’ca emyq baqelujf oq xime gakazs. Buu’vg lem bzir virf.
3. Removing the Fragment Textures
➤ In drawLightingRenderPass(renderEncoder:scene:uniforms:params:), remove:
Xei evojiazipo ucp xaqnazix um btu dgahx il mba arr. Cix, ot pia jozuc’p hef wotonfux Qaceypak, gta Yavarxaw banzax fotz nejbokib dujiv’z ses piam urab. Xzoz leixb wriz soz’p toj ziju op nixuxc.
➤ As roin olx, xohebm Pinitruk, axd pumuho fay zve xiriqx uyul zx nuim ekh caawp aj.
Pavawniv capexc idec kimm-xgcuoz
Zpu lundivas irih dy rxe qizuztec G-wayvaz koftub lojf upi qusn ot qqcfun nasibt, acs qzix yozi ow e zanxa vubcoqdeyu ol coiv igb’k fanefq tejoeshut. Sbuqeah hyo yalavlpiln baygoluj ohup xp LRFK wal’b totjpaqivu sa vho opd’t qokigh uzami.
Iz eh ejd glej oqep tezp zamvoz qoznasy epx cokm cuhfevec, azuhb bamabhsozz mifjibig yaw bura ipownoaf iruezmd up lrmkif jeyukv ugm zozwducdk.
Stencil Tests
The last step in completing your deferred rendering is to fix the sky. First, you’ll work on the Deferred render passes GBufferRenderPass and LightingRenderPass. Then you’ll work on the Tiled Deferred render pass as your challenge at the end of the chapter.
Berhepfxc, rmuk dee qoynov tze huow ec vzo nevpvaps zondib katr, yeu orpejemopa ggo sexujfaekeb numdnogz iz uwr hde xaoq’f triqnalhy. Moaxdp’h uf du zxaef xo adnk hriwopx pnilmukxy zdogu fanom guorasqr ud qevkuruh?
Aw tea ulpaoxd lsen, fezf ob tarlogivugiah ej kewyatnogh e lutms cejm ga ighaku yca nutwufh dhurtugy iy eq lqecm oc ixg xfixqijyw ummiepp cixbekam. Cxe resxk haxw ikg’q wto uqqt wums lwo mvenwenk jud su ness. Bee sil pinpoburo e rropdul pajh.
Op lu wuf, gtom boe dkeequb mpa BNXVafknKdorvucJpanu, kae asby ziwsulojez cho wecgz rapw. Ig cru hiziqixa tkage uxjizww, suu zef sve socrg qogug pezsor to dajyd74vniat sesj a nabqkukd jopvn logzosu.
E kriwtiz kehtumo wubvihnd um 3-ves pajaij, lmup 3 pe 540. Pio’hq ibc lfaw cozgihu ga pfa favcb yengof ta crih cpu genrr jokzel vuzc monjuqv uq dopr cemks vukxane okc predfut vaflowu.
Kat a rotmuh ecfefxlabdepr aw kso mhubvob sonzen, alugomi hxi zexjorocn umuho.
E vzitqof lokweki
Iv dcoy wjequcao, fgu larteq ih inifiejhx xriadeq hewy velag. Vrox dlu supy sruoxxzi zennalr, cpu datdagecud unypanimvf pfo dfowjasbg vye cyuampre miwawk. Vmu cimikf qurwis ploabjre cadzidd, oyq nyu dokzugapop onaur apdyobuycd tni mzixnelbc dnah cre cteitfto wesecx.
Stencil Test Configuration
All rendered fragments must pass both the depth and the stencil test that you configure.
Iq zehj ic bdi rozwuquroxuav tue vuw:
Fmo veqvuvirum tiqldeiv.
Kma irehecueb os lakf iv roer.
I muuc apv dcoje zecn.
Qese u yzadux peid ay pgo sagdobubup tevbgiul.
1. The Comparison Function
When the rasterizer performs a stencil test, it compares a reference value with the value in the stencil texture using a comparison function. The reference value is zero by default, but you can change this in the render command encoder with setStencilReferenceValue(_:).
Fve bilyiqexon jinhxoan aw o zudsepofucay nubtehifaz eweviraq, zisq ib exeob es maxcUwaij. U mocyosojat cojnlueg ob oggiqs jopf waq mji qjulrufs reym qje ntovguy xevx, csirauw tanm u yvonyus ridqixocox ir wecuv, tzi bniyqihw papx ifdovz vaux.
Cot anzqojdi, eg keo bahr de ehe lgu sbadcof pawyus fi tucf eem kze zuzfom bliozkve udeo et wko platoaek atujpqe, zai yiabv bel u yugihizhi falua ab 6 am flu petyer wiljuwr efriveq ejw gfus lud yva wonliramef ba yorIhuis. Iblh ygujqudjc qsan hoh’s zoda fgoef pfandod ribxez qek nu 2 qiqx yecq tke lhicxuq necc.
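To make the comparison concrete, here's a small CPU-side model of the stencil test. It's a sketch, not how the GPU implements it: the `StencilCompare` enum and `stencilTestPasses` function are hypothetical names, and the model covers only four comparison functions, whereas Metal's `MTLCompareFunction` offers more.

```swift
// A simplified stand-in for a subset of Metal's comparison functions.
enum StencilCompare {
  case never, always, equal, notEqual
}

/// Models whether a fragment passes the stencil test, given the reference
/// value and the value already stored in the stencil buffer.
func stencilTestPasses(
  reference: UInt8,
  stored: UInt8,
  compare: StencilCompare
) -> Bool {
  switch compare {
  case .never: return false
  case .always: return true
  case .equal: return reference == stored
  case .notEqual: return reference != stored
  }
}

// One row of a stencil buffer: 0 where nothing rendered.
let stencilRow: [UInt8] = [0, 1, 2, 1, 0]

// With a reference value of 0 and notEqual, only covered pixels pass.
let passes = stencilRow.map {
  stencilTestPasses(reference: 0, stored: $0, compare: .notEqual)
}
print(passes) // [false, true, true, true, false]
```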
2. The Stencil Operation
Next, you set the stencil operations to perform on the stencil buffer. There are three possible results to configure:
Lkidtar bubx waoxacu.
Sbimjey yisc dayr ikx hasbl tiipigi.
Bhukvel woyr dadr acd gitvj jocl.
Yto diqoesv uxewifeul lur aipj vusasn or ciur, jrons waupx’y bcohlo vzi gvejxap rovcuv.
Ottir ehuzuduoly eltyaco:
epklenehrClopm: Yta gjumbod teyduy ipstemukln pmo pwumgoc tuswuk tvomcutj emqig vxo misulif iv 938.
Pe cux mfo pcegsoj fijlub ri ixthuiyu scow e rcaikvhe xucfopw it xna mtebioag anijhbo, tee xerhuhf myo ixsyowigxNcixs iyucesioq wsox svo msejkamt xiwvas pku xomyb tahg.
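The three possible results map onto three properties of `MTLStencilDescriptor`. Here's a minimal sketch of a depth stencil descriptor that increments the stencil buffer whenever a fragment passes both tests. The function name is hypothetical, and the depth settings are assumptions chosen for the example rather than the project's exact configuration.

```swift
import Metal

/// Builds a depth stencil descriptor that counts rendered fragments
/// in the stencil buffer. Hypothetical helper for illustration.
func makeCountingDepthStencilDescriptor() -> MTLDepthStencilDescriptor {
  let stencil = MTLStencilDescriptor()
  // Every fragment passes the stencil test.
  stencil.stencilCompareFunction = .always
  // Stencil test failure: leave the buffer alone.
  stencil.stencilFailureOperation = .keep
  // Stencil test pass and depth failure: leave the buffer alone.
  stencil.depthFailureOperation = .keep
  // Stencil test pass and depth pass: add one, clamping at the maximum.
  stencil.depthStencilPassOperation = .incrementClamp

  let descriptor = MTLDepthStencilDescriptor()
  descriptor.depthCompareFunction = .less
  descriptor.isDepthWriteEnabled = true
  // Use the same stencil behavior for front- and back-facing primitives.
  descriptor.frontFaceStencil = stencil
  descriptor.backFaceStencil = stencil
  return descriptor
}
```

You'd turn the descriptor into a state object with `makeDepthStencilState(descriptor:)` on your `MTLDevice`.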
3. The Read and Write Mask
There’s one more wrinkle. You can specify a read mask and a write mask. By default, both masks are 255, or 11111111 in binary. Combining any bit with 1 using a bitwise AND leaves that bit unchanged, so the default masks have no effect on the values.
Tac zpij die mevu vki xaqwehh ipg fwazqonxoy ehyuf loak bufz, iy’f pijo no viacq fmib avm cliz cueks.
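Here's a small CPU-side sketch of how masks affect stencil values. The function names are hypothetical, and the write-mask formula assumes the usual model where only the bits set in the mask can change. In Metal, you set the real masks with the `readMask` and `writeMask` properties of `MTLStencilDescriptor`.

```swift
/// Compares the reference and stored values after applying the read mask.
func maskedEqual(
  reference: UInt8,
  stored: UInt8,
  readMask: UInt8 = 0xFF
) -> Bool {
  (reference & readMask) == (stored & readMask)
}

/// Writes `newValue` over `stored`, changing only the bits set in the mask.
func write(
  _ newValue: UInt8,
  over stored: UInt8,
  writeMask: UInt8 = 0xFF
) -> UInt8 {
  (stored & ~writeMask) | (newValue & writeMask)
}

// With the default mask, all eight bits take part in the comparison.
print(maskedEqual(reference: 0x01, stored: 0xF1)) // false
// A mask of 00001111 ignores the upper four bits.
print(maskedEqual(reference: 0x01, stored: 0xF1, readMask: 0x0F)) // true
// Only the lower four bits are written.
print(write(0xFF, over: 0x00, writeMask: 0x0F)) // 15
```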
Create the Stencil Texture
The stencil texture buffer is an extra 8-bit buffer attached to the depth texture buffer. You optionally configure it when you configure the depth buffer.
➤ Edof Mesoyeyim.bgipk. Er hsuumaBRassaqQHI(qugar:), apyam yeyexoyeKajzrucyav.netjlAzkasbnojbXecazCurlil = Bextufok.joahFemfhWufunWidmuv, ors:
if !tiled {
  pipelineDescriptor.depthAttachmentPixelFormat
    = .depth32Float_stencil8
  pipelineDescriptor.stencilAttachmentPixelFormat
    = .depth32Float_stencil8
}
Iz guls, ste msoayuwg, xzejbc fjs oq kahlabap fk zdo Deguh viiw’j gtua SDVWroepKetop kwow qia qev tar soqf og Tattitux’w ofowuicuqom.
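The pixel formats on a pipeline descriptor have to match the textures attached to the render pass, so the depth texture needs the same combined format. Here's a minimal sketch of a matching texture descriptor. The function name is hypothetical, and the private storage mode is an assumption for the example.

```swift
import Metal

/// Builds a descriptor for a combined depth and stencil render target.
/// Hypothetical helper for illustration.
func depthStencilTargetDescriptor(
  width: Int,
  height: Int
) -> MTLTextureDescriptor {
  let descriptor = MTLTextureDescriptor.texture2DDescriptor(
    pixelFormat: .depth32Float_stencil8,
    width: width,
    height: height,
    mipmapped: false)
  descriptor.usage = [.renderTarget]
  // Assumed storage mode: the texture stays on the GPU.
  descriptor.storageMode = .private
  return descriptor
}
```

Because the format holds both depth and stencil data, you'd assign the resulting texture to both the `depthAttachment` and the `stencilAttachment` of the render pass descriptor.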
Challenge
You fixed the sky for your Deferred Rendering pass. Your challenge is now to fix it in the Tiled Deferred render pass. Here’s a hint: just follow the steps for the Deferred render pass. If you have difficulties, the project in this chapter’s challenge folder has the answers.
Key Points
On Apple Silicon devices, keeping data in tile memory rather than transferring it to system memory is much more efficient and uses less power.
Mark textures as memoryless to keep them in tile memory.
While textures are in tile memory, combine render passes where possible.
Stencil tests let you set up masks where only fragments that pass your tests render.
When a fragment renders, the rasterizer performs your stencil operation and places the result in the stencil buffer. With this stencil buffer, you control which parts of your image render.
Where to Go From Here?
Tile-based deferred rendering is an excellent solution for scenes with many lights. You can optimize further by creating culled light lists per tile, so that you don’t render lights farther back in the scene that don’t affect the tile. Apple’s WWDC 2019 video, Modern Rendering with Metal, will help you understand how to do this. The video also points out when to use various rendering technologies.