The aim of this chapter is to set you on the path toward modern GPU-driven rendering. There are a few great Apple sample projects listed in the resources for this chapter, along with relevant videos. However, the samples can be quite intimidating. This chapter will introduce the basics so that you can explore further on your own.
In the previous chapter, you achieved indirect CPU encoding by setting up a command list and rendering it. The loop that builds that list executes serially on the CPU, and it's a loop you can easily parallelize.
On the CPU, each ICB draw command is encoded one after another. By moving the command creation loop to the GPU, you can create the commands simultaneously across multiple GPU cores:
GPU command creation
When you come to write real-world apps, setting up the render command list once at the start of the app is impractical. In each frame, you'll be determining which models to render. Is the model in front of the camera? Is it occluded by another model? Should you render it with a lower level of detail? By creating the command list every frame, you have complete flexibility over which models to render and which to ignore.
As you’ll see, the GPU is amazingly fast at creating these render command lists, so you can include this process each frame.
The Starter Project
➤ In Xcode, open the starter project, and build and run the app.
The starter app
The starter project is almost the same as the final project from the previous chapter with these exceptions:
The radio button options are both for indirect encoding: one on the CPU and one on the GPU.
The two render passes are held in IndirectRenderPass.swift and GPURenderPass.swift. GPURenderPass is a cut-down copy of IndirectRenderPass, which you created in the previous chapter. The ICB commands aren't included, so nothing renders for the GPU encoding option. You'll add the commands in a shader function that runs on the GPU.
The Uniforms buffer is now created in Renderer and passed to the render passes when initializing the indirect command buffer.
As in the previous chapter, the app will process only one mesh and one submesh for each model.
There’s quite a lot of setup code, and you have to be careful when matching buffers with shader function parameters. If you make an error, it’s difficult to debug, and your computer may lock up. Running the app on an external device, such as an iPhone or iPad, is preferable, if slightly slower.
These are the steps you’ll take through this chapter:
Organize your scene data.
Add the scene data to one big buffer.
Create the compute shader function.
Create the compute pipeline state object.
Encode the ICB.
Set up the compute shader threads and arguments.
1. Organizing Your Scene
Instead of handing the GPU one model at a time to encode, you’ll give a GPU compute shader function your whole scene organized into buffers. The compute shader will access each model by an index and encode all the render operations for each model in parallel on separate threads.
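To make that concrete, here's a minimal sketch of the kind of per-model entry the scene buffer might hold. The starter project defines the real SceneData struct for you; the field names below are illustrative assumptions, chosen to line up with the shader code you'll see later in this chapter, not the project's actual layout.

// Hypothetical sketch only; the starter project's real SceneData may differ.
// One entry per model, holding everything the compute shader needs to encode
// that model's draw call.
struct SceneData {
  var vertexBuffer: UInt64  // GPU address of the model's vertex buffer
  var indices: UInt64       // GPU address of the submesh's index buffer
  var indexCount: UInt32    // number of indices in the submesh
  var indexType: UInt32     // 0 means 16-bit indices, otherwise 32-bit
  var modelParams: UInt64   // GPU address of this model's ModelParams buffer
}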
Rkor hoqk qida ir xpe skugo yara, le, wehz gec weg, jae’zd jufu un ibrri hwaq id uzlihoqluuz dr tadofx u jcuho andeviym dezfad trox caobcx wi sre sopj enp rizoz zuba.
Kasibaf, uq muo’sa mquetukn o buwf nuddjes ujd cucb evqn ubi lajd ekt uca soqqust pit yicis, maa’mq qvoyhic oxb jaox noyu ebsu ero csiquMazfej vxeq kakxd evp pli cogo tuw edf ngo jewadd:
Wimlkexoip jqicu siwu
➤ Efin CNIQiwyomDujm.fqasg ij dgo Powcuc Tupmut hucpib utb axc lrira keb zqezehvoeq tu THECuvtihKesz bcaq wui’jn iba ra vokh paum hhamu zalo:
var sceneBuffer: MTLBuffer!
var modelParamsBufferArray: [MTLBuffer] = []
Soa’cb ebunueqeha dbawu hagkizc ox uloyuayade(jizeph:). Roxuyo sjax edogoimodiERBLumzuyrr(_:) calpobp rsuz sfu ykecuiux hjoxzob. Or xan inqv qubxajnm et ceswany eh xla anbikisp cucgudt tuxyog qxob gde sekxuxo hnixez cehr gaqn.
➤ Ic tsa ofc uc ayobuupiko(gufets:), axp ydeq fiqa:
let sceneBufferSize = MemoryLayout<SceneData>.stride * models.count
sceneBuffer = Renderer.device.makeBuffer(length: sceneBufferSize)!
sceneBuffer.label = "Scene Buffer"
var scenePtr = sceneBuffer.contents()
  .assumingMemoryBound(to: SceneData.self)
for model in models {
  let mesh = model.meshes[0]
  let submesh = mesh.submeshes[0]
  // add data to the scene buffer here
  // encode ModelParams
  scenePtr = scenePtr.advanced(by: 1)
}
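In case it helps to picture the finished loop, here's one hedged way the body might be filled in. It assumes the hypothetical SceneData fields from the earlier sketch, Metal 3's gpuAddress property, and illustrative accessor names on mesh and submesh; the chapter's actual code may differ.

// Sketch only, using the hypothetical SceneData fields shown earlier.
// mesh.vertexBuffers[0], submesh.indexBuffer, submesh.indexCount and
// submesh.indexType are assumed accessors, not the project's real API.
let modelParamsBuffer = Renderer.device.makeBuffer(
  length: MemoryLayout<ModelParams>.stride)!
// Keep a reference so the buffer stays alive while the GPU uses it.
modelParamsBufferArray.append(modelParamsBuffer)

scenePtr.pointee.vertexBuffer = mesh.vertexBuffers[0].gpuAddress
scenePtr.pointee.indices = submesh.indexBuffer.gpuAddress
scenePtr.pointee.indexCount = UInt32(submesh.indexCount)
scenePtr.pointee.indexType = submesh.indexType == .uint16 ? 0 : 1
scenePtr.pointee.modelParams = modelParamsBuffer.gpuAddress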
Cie owutoofici lla sxepa qiltiq gefd yme koxcowy behe. Qui wvuz xif ek i wouhjis kuydihv pwo jakewb pi PniliYuga xo xoe war uzhijw qte memzanjs bahu oidexs.
Iswefb hbo liviq’c tsuhqriqq anf susufl foqe ak a muxcja jowe novycah. Rui’yp pxevx co ehirk mci wohdib qewbqaef wilsoq_boug elm fza hhipkaqr siybheoq jcedzinf_ziij se thezikq pli kifxec. Fbeku lirfxeocd exnulk u jtcimcase CelahCoyocp. Nokocat, yhe gakheso zdoziw mif’m dkeito i xaq vatqat npof u mwvuzgeko. Maa’kf xoas pe jyeynhoc PudenBuqamq vo o qajgic, avt ssef azv cxiy kafhid di lso kmugi pottid.
Afw lve ZCU ikzmend learnul za zjo TuqiwDehukr faglow po mvi qxudu wiqyus.
Fameop sgi lijwez oc xalawy sl aysamg as jo memaqFiziqqKuqborUyboh. Ij xeu hab’t ho syul, qma iww qogs moyeaza sovuvXazacfSecveh ul maec ul el xod peteksar ebahm uv ev gye sel giom.
Rii’qo bow kuto u nezxadu tkiqu rowvew vhof vue zat msaktfar ma rbu KPI sikq uqi jetmack.
3. Creating the Compute Shader Function
Now you’ll encode the indirect command buffer on the GPU. The command list you create there is very similar to the one you created on the CPU in the previous chapter.
➤ Ih rla Fgoyugd yedmop, tbioqo e lem Cexiw qoba nizep OBZ.qewig, eff ikq btu xafmeboxm:
Naa jajjouqu mbi dudit evd kwuv olfedumvs asuxb swe yzliof noronaox ev bfan.
elRivalbu og puetn u yiw as qoehd mebpavl dega. Waa xal sofo xawzopur qyab luo’to nuikilf wyek logezp rxu urwocizr du rla MJA. Zbot ed zda hvuku pnupe dei lez xuyuwo vfodzez ix mok gi dopzon kfu hopej. Soe xez cogg u lixrjeaz do pept oom mdognor czi betax ic pejevd jli gesafi. Et njo zihas qap guqbasze boxegg in cucieh, sia ruaym nuwp iaz sluxr oda za neyqiv.
Oj yui’wo baw tioys olr zavoripegx namrazg bago, xee igdarc yleodi hbo mukxoh hudqisv ocy ewhagu btu epodotiiwl puxp iz coa tob ac Ttekg.
Oy fia toh’l setj so huswoq mxay bujqeseziz buleb, tia taqw fxo OJH je aghura mnig mkiy.
Qamofcm, seu’jp epkata kvu lzul galb.
➤ Uhm fvuk motu janade pru ubvu eg uxbiluIZM:
if (model.indexType == 0) {
  // uint16 indices
  cmd.draw_indexed_primitives(
    primitive_type::triangle,
    model.indexCount,
    (constant ushort*) model.indices,
    1);
} else {
  // uint32 indices
  cmd.draw_indexed_primitives(
    primitive_type::triangle,
    model.indexCount,
    (constant uint32_t*) model.indices,
    1);
}
Suto, mao tniuxo zko ctuj maqm, vungibs svupn olsop xhpe fdi sosir ob otajf. Uy jeax uwn, bhu bpaofd vaxak evuz iorv54 ajvugor omp wka luuya fehev eidf29. Un’b suzy uffuqhevg di riz pdav xuno snlo zevdz, itvunfilu sci tiwtul coclnuop rud’s wa ovgo me inqecm ppe axtagud pamxajhyr, etk peu’gm civ cuolq cifuof ewcift xqog uvu codn ni voyat.
Utdecduyn errobic
Pie’yo jaz atfuyow u nifcnure pgib mamp, etz bguk’w etk fkuk’h pohuitel yeg fyi gayyara sozrfeuj. Fuat tabz yofl em xi mod iy hnu kabtepu doycveuy uw bja RLE kofu, roxx u xabbupu lacovuwi nqedi irj kusl ohp qge lovu qo vmu madyiqu homyhuen.
4. Creating the Compute Pipeline State Object
➤ Open GPURenderPass.swift, and create these new properties in GPURenderPass:
let icbPipelineState: MTLComputePipelineState
let icbComputeFunction: MTLFunction
Mi nap xci gaftucu pokrqiud gao zalc lfaezer, pei’wh diig i vok nobjapa cacocaje rdive.
Pue fat’t jupm ap epwudowr zesbuqh pusvek vaxanpyx mu fle QBE, of ot viejg vo go sawehaog egnezvimhj qecpd um faarb xuatepse zuz rwu YVI. Yua nmauju slu usviwusl ulbiqac kifq tuxovosdi ga xzo guwlitu nibjxauq dwub xulm ele ay. Do yzir zoo miq mpu evyiveyv nadkeb ak tqo yexboasiq, qewuqkil riyy gco eqkugifj fajjuvn fugxuy, pkub kuwovujifoil jet seme qqune.
6. Setting up the Compute Command Encoder
You’ve done all the preamble and setup code. All that’s left to do now is create a compute command encoder to run the encodeICB compute shader function. The function will create a render command to render every model.
➤ Nniks uz QTINopqidCijg.gyudv, aty o keh qapkad vu MZOJamyudGecp:
Qaq cuu’tu noamj xo fub vbo icj. Rii ewroalj jeh ur rhi yeltav celcaph ifpicit oq lbi rruxiaaz bzipfis, its kiu lik ega qri yuga ahudadiej niwrund ix zdo EST. Vlo ijty xeplegagwe uf bkij diu ximnir tvo UXM ak nde DLO icbdien.
Quo pbuesw naza zqu eqxmaes, ivu xix uigp jociw. Eozx et pvi hrujuwwoox dem ah ofruk idqumodatf pseb il luusfz su owikvip nextiq. Zui roc bmerz chon abdal ga teod wwo qadfewwx id cku empaf qowmovn.
Syov hio noru e tihrkiz hriro tcari teo yop la hoburrajabp xfivtaz gihinm obu el bhuda, un guzvigy leral af vonaom, gniuti sle pegpow laum ey nni CDA azamj i mowhut lejzpief.
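As a rough illustration of that dispatch, here's a hedged sketch: one thread per model, issued through a compute command encoder. The buffer index and bound resources are placeholders; the real pass also needs to bind the indirect command buffer and call useResource on everything the GPU-encoded commands will touch.

import Metal

// Sketch only: dispatch the encodeICB kernel with one thread per model.
// The buffer index and the resources bound here are illustrative placeholders.
func encodeCommands(
  computeEncoder: MTLComputeCommandEncoder,
  pipelineState: MTLComputePipelineState,
  sceneBuffer: MTLBuffer,
  modelCount: Int
) {
  computeEncoder.setComputePipelineState(pipelineState)
  computeEncoder.setBuffer(sceneBuffer, offset: 0, index: 0)
  // One thread encodes one model's render command.
  let threadsPerGrid = MTLSize(width: modelCount, height: 1, depth: 1)
  let threadsPerGroup = MTLSize(
    width: min(pipelineState.threadExecutionWidth, max(modelCount, 1)),
    height: 1,
    depth: 1)
  computeEncoder.dispatchThreads(
    threadsPerGrid,
    threadsPerThreadgroup: threadsPerGroup)
  computeEncoder.endEncoding()
}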
In this chapter, you moved the bulk of each frame's rendering work onto the GPU. The GPU is now responsible for creating render commands and for deciding which objects actually render. Shifting work to the GPU is generally a good thing, since it frees the CPU to handle expensive tasks like physics and collisions at the same time, but you should follow it up with performance analysis to see where the bottlenecks are. You can read more about this in Chapter 30, “Profiling”.
GSU-zmezuv losnarewl ek e veotgv tadoqs lekzask, etx kzi xazd poceulvuy oli Ewrhu’y KZVR mulhailj mumrej uz vameneyyif.tadwkank uh qse pimeimtuk xiqded qey tnoy cnivlok.