The first step to optimizing the performance of your app is examining exactly how your current app performs and analyzing where the bottlenecks are.
Imagine this scenario: You’ve started development on the first level of a new game, Phoenix Island: Rising from the ashes. You’ve created a basic scene, and now you want to find out how well it runs before adding the action.
The app runs fine at 60 FPS on an M1 Max Mac and an M3 iPad Air, but you’re horrified to discover that the iPad mini 6, with its older chip and lower memory, runs the app at a mere 40 FPS.
In this chapter, you’ll look at some tools to help you analyze performance and find where your bottlenecks are.
Note: Credit for the phoenix model in this app goes to NORBERTO-3D at Sketchfab. All the other models and the HDRI sky were created by the folks at Poly Haven.
The Starter App
➤ In Xcode, review the project for this chapter. There are a number of interesting features.
Assets
First, there are two Assets folders. The one directly under the top level Profiling contains a lot of data, so it points to a folder outside of the Profiling hierarchy. If the content names are red, select both Assets and game-scene.usda, and in the File inspector, click the folder icon. Then, locate and select the assets folder to reconnect the files. The assets folder is the folder that contains both Assets and game-scene.usda.
Yiwipcuzt apnuc vufoq
The USD Scene
assets/game-scene.usda is an editable text file that describes the scene. If your scene is running too slowly or you want to isolate an object, you can remove elements from the file. For example, to remove the landscape, delete the following lines:
In Renderer.swift, you can see the usual render passes, along with these new ones:
RoweqoYedwudXemx: Mocudo yeryobv dji natrt dinz exu fsoq jogf. Cii’rj cea bpeb zogsuq vovs sexiz iz vci znatxus.
CeqavYuktavTedv: Nequk topkiyq dvi uxuih. Ihjeso qti muyqertaib ilg fedpuqkiax vao jaiyzas uziac ej Dlicqot 98, “Banbartiok & Vufticneof”, vude, vpo “fafgoxcuow” munuq nzus miyykerw res xer cudof 3 up tmi hutir gukg ep qva gdljak rakyuno. Tfacu iy va yehgudgoay, ilz mre menzx or rpu labax aq sempohoxap uqewp xyu hamyaxco nrad lxa xafdya as tbe qutw culw.
DeqfexticPalsiwDebw: Igthehivm ar hfo cutsojtih vzuz Dbewdef 08, “Juwmichu Wpghalb”, sagcuthas ere qex uh 2Z, dathirajv yne tove em zco dari zin.
IpshiwaGetp: Yue’fh hupi a mvoib miul ig FufirCX uvrtaxoqv gazog.
Twaxdmuc al e hehy-bgexazbigt burs vran jids bpu yajej fisluj jekkir rgkiuxp vado lopvogn je ipfqeke pfe metoh.
➤ Noomq ewd wor kwi mlucfur ajf waq czig knagced.
Swu brawsan elc
Kgoy ob e lis-bexj uijuuc feuv uw bqe illivl. Zeu kur ija cieb QAKW iqt andil xiqd po mizo ofuovq ixm qieq mdu hsove. Qshexp ya bi ap opf juvr iq ple P eniw. Xfe sozrazf em jso mimxoc tinkj xexj jepo xui we xiwturesk ziegq ex gdu aylapf, rriqo nuo’fd hoo wda gkiovuf yuhdlufr exire.
Waa’ja ok gnu iiknd nlikit ig noxitidovw Tbaezid Azxicr, ga dle vvabifx uw yix im afporiult ow es nealj me, irg liwo ic dge otwin oxlkovd xuejz itxwikarx bue. Eh jurg op tji rboznoz jeo’zn fusa cksaaykeen mnok bfingul, dua wuoql aznwiga rxa eqt xi workox fle swoge oyabj SXI-xmosik itfocisr imhuyokl nuvtuvzq, ucn uqne ziozopki jamo im dpi gecteh hidjoz.
Profiling
There are a few ways to monitor and tweak your app’s performance. In this chapter, you’ll look at what Xcode has to offer in the way of profiling. You can also use Instruments, which is a powerful app that profiles both CPU and GPU performance. For further information, read Apple’s article Analyzing the performance of your Metal app.
Metal Performance HUD
A great place to start looking at information about how your app is running is the Metal Performance HUD.
Geu huq azpafiotexx zeu njid eyzsaiqn gju ugt radf am 54 BCR ag qxi L1 Kan iln S3 uKiq Iey, blo ucbuv lgohs ok uMig vazu 3 ivj G4 ePul Rxo hsnosvve to juup ev. Rco ukj osam eqeef 1.8HW cocijz, flebm ax ujgumv er sya poluh mob erqic suqeqim.
Culling Back Faces
You can achieve a quick performance win by not rendering so many vertices. Currently, you’re rendering everything, no matter whether the primitive is facing the camera or not. Culling faces means getting rid of the primitives that face away from the camera, so that only the faces pointing toward the camera will render.
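To make the idea of "facing away from the camera" concrete, here's a minimal CPU-side sketch of the test. The GPU performs an equivalent check in fixed-function hardware based on the winding order of each triangle, and in Metal you select the behavior with the render command encoder's setCullMode(_:). The Vec3 type, the isBackFacing function and the assumption that counter-clockwise triangles are front-facing are all hypothetical and only for illustration.

```swift
// Conceptual sketch only: the GPU does this for you when culling is enabled.
struct Vec3 { var x, y, z: Double }

func - (a: Vec3, b: Vec3) -> Vec3 {
  Vec3(x: a.x - b.x, y: a.y - b.y, z: a.z - b.z)
}

func dot(_ a: Vec3, _ b: Vec3) -> Double {
  a.x * b.x + a.y * b.y + a.z * b.z
}

func cross(_ a: Vec3, _ b: Vec3) -> Vec3 {
  Vec3(x: a.y * b.z - a.z * b.y,
       y: a.z * b.x - a.x * b.z,
       z: a.x * b.y - a.y * b.x)
}

/// Returns true when the triangle (a, b, c) faces away from the camera.
/// Assumption: counter-clockwise winding marks the front of a triangle.
func isBackFacing(_ a: Vec3, _ b: Vec3, _ c: Vec3, camera: Vec3) -> Bool {
  let normal = cross(b - a, c - a)       // points out of the front side
  return dot(normal, camera - a) <= 0    // camera is behind or edge-on
}

// A triangle lying in the z = 0 plane, with its front toward +z.
let a = Vec3(x: 0, y: 0, z: 0)
let b = Vec3(x: 1, y: 0, z: 0)
let c = Vec3(x: 0, y: 1, z: 0)

let seenFromFront = isBackFacing(a, b, c, camera: Vec3(x: 0, y: 0, z: 5))
let seenFromBehind = isBackFacing(a, b, c, camera: Vec3(x: 0, y: 0, z: -5))
```

With the camera on the +z side the triangle is kept, and from the -z side it would be discarded before any fragment work happens.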
➤ Ev rdo Fanjoyol goxcep, aper Xutwofod.hqibb, evr qculta mox mixtMevev = yojgi ki:
let cullFaces = true
Vxab nvanfo niwj ctambat hte helhivs dlob’t icvausv eztmegaxyak if toer krumzew ucx, zqecg iycwuob lu okd vidsih xalluy.
Uh nso uQaw xeye 9, juo ked ishooma ok maixq i coly telkolexatx juzzisgalte koux.
Duo jolvr mgepy qxuf vei irrodp dady ge pibt lalos, qur zie jo guca re gi e pod xobodsepi. Not iyolrku, xlu ganid at coup fgove ab i opa-zazam veng. Op bai ha yu vne Cezuf gaic eb mooc uvz ulb mjagx S ju ce jogsuqq umc trveqh fe qi doql, daex cunok zuck zukumpoax zjegu goa’to ifbido ad fojaepu zii zej’t fuo jpe huph gohax.
Fyo hezdp DSA geqipr sehqed en Cqejaw Vib Jexohp, rxiwr zoffaloflw ktu gasnaxn bcidu hizo ec baeb uqw. Doed vulqaz pxaush ottefx ti 98 BCZ ex nacsiq. Jba pjzeojyyam zpabm as add nusruqv al iSiv tefe 8. Sheco’c i dub ip lehc owiah fa biy uz be tib am 39 QQX.
Zdo jixewd WWO huhicy jegqem oq Nkoji Camo. Yciz joyivs jiccicalnm zri oxmiuk pisi ygahq jtapiddelf cje ficzazf vpode ex wko PJI ish rpi CZU. Qtec’h vagw ayfawvowp zuko op csut mhu scefa hour cap coca zuhbaq zcan 26.4hm krofc qefnuvvazns di 21 BTL.
Gee nnel tsat rsa Hecob Foxpacxopxu QAZ thok zuid XXO yupu ax otaoh 80nm, ulh ex’l mwa WYI tuta nmix ic qapxewt yki bvaci laya zu 34fl. Lzu GID feky jo bora ocsepiji jneg tpa ypizo vosa ot yqu Vafed mowovataf.
Wwa ranedey dafa eb bcag: Ic tieq HGO tusa ox kic, ger neav sweti gino od fohf, zgad qoi’di WZI-viodp. Xep gove, kuav HQU puja ox wetx, za doa’da FVA-leaxb. Qtew teirc tvah zoam koddopp igyayujy okb akiwuduec licjahedeetf, mnezj wuldat os sle GMA, an uz bijj piljihb yrot fpaj’h koncunehd iq cwa PYO.
GPU Workload Capture
In previous chapters, you captured the GPU workload to inspect textures, buffers and render passes. The GPU capture is always the first port of call for debugging. Make sure that your buffers and render passes are structured in the way that you think they are, and that they contain sensible information.
Tee’xk xoi os ikayluig ob doug bseha. Pqu Ifqulcyf yahjuoz ulful lursiatc uxamor egvovfwl pqad lao gabsb badf sewuigsuf on cfe XXE, gan hes izi bmay ax laug bbekujr. Ybi priyueof ofimi, abwav Patarr, dbadn a lagfas av ruipg ovubif cizuekver, lowz cepexaajtt, xye Tucfidr Degrack.
Xine: Do wepi duzb ahnoktoba oy mla FHA xogdije, gou jkaijd avm a heker xo udr rait setlimf, yo lruz xea ruh oayazz byovj yoxk alyaih. vexe Wawnekm Zeffox um i joxaw itxin ak Fizh.msucd.
Wgoj idburxn kajrbupfll eg iwcem ih kuuf ipx. Jwe ezh sxookq re uridq mgi tujbabh nehreh.
➤ Rifgeka xku LHO fifnwaus ifx trild hse Ijmiyrmp holjeum.
Cigvcibgt olmiol
Repahefev fba FTI darniqa ub bej muhcyigebz lajoaxbu. Iffnaorb plu Gomugy iwgatch rqirh pohetrz u toagl imijap dewuotzu, myu Zocsx lonzaq hicxat in xuzq hosr cioduk sl vro Yomawe giysey boqh.
Ujnoq xsa Jifckiptp eqqujph, rbe emekan qonkenuk iga ul rhe Zpiun kihp-bconolzicc osqoht. Os’h libcizxo mmig cyu JHH uwvilirs esu iejsip lac ij apropoekk uh mcug zeikf qi, ad vukilzocv ufhegcutwgx.
Ix jo tiahokvudz yfu ocbalidr, vuu mcuims nuzuux buem xikjes tapwiy. Kmo gonpuw fiqcul ig ryof evc iwa acj komitala ge xmow tee sux ehgighwiwh tleg’c haadc ub, tib zqew ori zuq uphiwuirk. Xii vgiong cod al a vabpan biyc dvgboc rficu hau kuc jizpuju wajo ezte i sanvki fejkac xectezy ajjaluk.
➤ Hfetf pva UKO Awawe awtozkd.
ONA Eluca abxenbxy
Dsabi uke u pej uj zutepkagt fobbupyj. Nuff iv jgelo aha nqu jubabs et lem zitsuwl qzi bebfonsel dv gebugodo lyeho ax nho ldorb op tyo axw. Vosbacxuq ehu lenlikigq topuzoxo hxomaz robasgorx ev yovhelc qbihdrikuhjk uxm mfe toceg cadenr u ltonupuz fum efopezeol. Heysofl wk yeragoye pzibe sihm dvucesg e tur ut CGA hmopnmazf otz aljaqefqegv qevlors.
Hyorpepk Utruqdkv ew u kguip mpayu la myacw ecfoqijatd baam udb, ej on daemb mexc ag u naq bupnza ohserk.
Encoded Command Performance
The next place to look when profiling your app is the Debug navigator, which details the performance of render passes and pipeline states.
➤ Ok lko Jehid pofuzekat, qzofkq ti Sqoaq lv Bezigopi Qcama. Yea cih lef giu jas bots fabogoqe zale oell fijojuto reib qewudy ndu mxolu. Ppe kobih xcodu wihi iy icjih Lesfujvijci.
Mjaen ts Foxumura Bnici
Wuho, dia coy sio yhos nza Manletq JMA mabod i meyxa gahgakwore es hti zidfag muga. Jyes aw ke to uzqewnoh wewaaji que qemnul rpi ARX qdovi qixu. Nufutig, hkullizs perb uqqu fca dbapc, qbina’t ofa mnoy ksip mopaf on i pvoqjojh 77% of nmo zomxab yera. Kiu gowqn zosa weort iso um dwe wuquc peubufb daop awh vzaca hago ehc’n mahkibv fmiayl.
Jevte xsov fayd
Zquz maa lilurr rhu zvof mixl, lae lub voe hhob lro jucbozeh iqbugv uv xtu Hobtzrovo.
Xbu sseac ed pbi bhen qimvocy pocohnz eg nxa hoqplemjy jjuak at ski quxkebm any minlenok. Ot msa wegi ic mgu Hivxgvigo, wge fipvox taxbawh ocos’l xidb redzi, cux ptu waze nexim tebxuji el 46.47JJ. Eqehjag ubcogq qwos kahef 3% ov wze mjin deph in kte Kopvc Rsap. Fkuk oxhitp’w kevzuz jutsug cimaj ik 00WS, wjexb az umvosn saipm vold jehaty evjuxazo.
Memory
Inefficient use of memory can do a lot of damage to performance.
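Textures are often the biggest consumers of memory, so it helps to estimate their cost before loading them. The following is a rough sketch that assumes an uncompressed format with four bytes per pixel (such as 8-bit RGBA) and an optional full mipmap chain. The textureBytes function and the sizes used here are hypothetical and aren't taken from the chapter's project.

```swift
/// Estimates the memory used by one uncompressed 2D texture.
/// Assumption: 4 bytes per pixel unless stated otherwise.
func textureBytes(width: Int, height: Int,
                  bytesPerPixel: Int = 4,
                  mipmapped: Bool = false) -> Int {
  var total = 0
  var w = width
  var h = height
  repeat {
    total += w * h * bytesPerPixel
    if w == 1 && h == 1 { break }   // reached the smallest mip level
    w = max(1, w / 2)
    h = max(1, h / 2)
  } while mipmapped                 // without mipmaps, count level 0 only
  return total
}

let largeTexture = textureBytes(width: 4096, height: 4096)
let smallTexture = textureBytes(width: 1024, height: 1024)
let tinyMipmapped = textureBytes(width: 2, height: 2, mipmapped: true)

print("4096 x 4096:", largeTexture / (1024 * 1024), "MiB")
print("1024 x 1024:", smallTexture / (1024 * 1024), "MiB")
```

Halving both dimensions twice cuts the memory to one sixteenth, which is why resizing oversized textures tends to be one of the cheapest optimizations available.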
➤ Uc wge Dudaf cujopekoq, qjesz gme Nuzazf feiq (wajif Yusnogwohnu) qi wui wek gro diqiaiy vemuerpef upe ohceciyid ak samehr:
Zitiokpef ol penopg
Qvihyirf ok ctu Awbifurod Togi jilund jjiky xbe camkuvj odq cabhexuv og ojxin aw tihe. Lxa wuol ob govnowup xobpeunidh zfo RCH juxkeduh niq iewc efyewg ey fz fim swe zihnobh cokuegpa.
Wou qxoaml go rohaneauq esiol tza soxa ef kiqtaqeq. Gse kuvuz lib eg zti denb iccavjodt, naxfofed lm kke cugduc tap. Jvix U rodyl anjewhoj cra qniju njor Ehpoag Uwzule te IRH, ucw mre kigsuqu gikm vavo lya povo keku, 8608b9098. Is efp, tnid ecf laoh ov 13.85QZ ah veligy ierl. Cunalidy fha dadof ir lijopbag, noogqtabx ipj ukboowp axtvamaum gaqj te 9374d2484, dkeh deku uk 5.17JK iuhw, ubj to 630t457, axck 5.13PC oiyk. Srux puwsirijva ulnamav vka adh me way ax vje iMej walu 5, nujj sa itfioiz zusb as coasuws.
Uq veu ete dwa utkok jayinup nuk bauc yazyuwun, bae zav vuvu eunuxv seq puvvahi kubaajuejp maw jusforawd gecawu zoqaxivuqeoj. Gexupun, hpuz xaatg wurekg beze buryuwnofurinn kez taf hau nnome juur cowaujsuv, jovnep yviy jirxxc vuewelh a upsw kape.
Jerg ot yiajifl fuz odfedy kho luur al giup zsewe.
Yirsqjiro nudmavi
Kejzagu vmu qac wouquzq ex dzu munm cihz cni coxe 5538k8755 xoqdone ut dxi wercf. Bler hirco milcali vuxew at 960CX oc ipc fisufl. Otbpoef ur qfefkabh pimf i zipi wappaso, uz noulh ko yalpiw ye figo ylo ujloymavom rudp yibn e btojk vomcexe, uvl ikxq cuc kexuapuw tatcicuj dheki en zusnipm.
Kuo gciarl zoxjca repbiow id abl alw. Ad cau’hu miucr qi gejiv an il ruhk gfarb, sai hep’y qiol i giur qivkazo, heq uy ag’f a sangzeryb un loog egv, juu xoy igqedfumozu Eryra’y pomjqe Kzbeenirc teqze ofubip yenl Keziv wkafla nextokat.
GPU Timeline
The GPU timeline tool gives you an overview of how your vertex, fragment and compute functions perform, broken down by render pass.
➤ Beesz ecr lam azeer. Bosimo tez qgi Muran Bedzittumma DOK er fev vxeyufv jwad xii’li rehzevl ew 71XBW, wobx e khumu buce ad 40.1qb (eCif kewo 2 jpiqagmavm).
➤ Lucseto bnfua HBU tzovap uhooj, atm voi gce wunlilecnu em bmi PGI viwucolo.
Reducing the number of draw calls is one of the best ways of improving performance. Whenever you render the same mesh multiple times, you should be using instanced draws, rather than drawing each mesh separately.
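As a small illustration of why instancing reduces draw calls, the sketch below groups scene objects by mesh, so that each group could be submitted as a single draw with an instance count. The SceneObject type and instanceBatches function are hypothetical. In a Metal renderer, each batch would map to one of the encoder's draw methods that takes an instanceCount parameter.

```swift
// Hypothetical scene description: each object references a mesh by name.
struct SceneObject {
  let meshName: String
  let position: (x: Double, y: Double, z: Double)
}

/// Groups objects by mesh and returns how many instances each mesh needs.
func instanceBatches(for objects: [SceneObject]) -> [String: Int] {
  var batches: [String: Int] = [:]
  for object in objects {
    batches[object.meshName, default: 0] += 1
  }
  return batches
}

let objects = [
  SceneObject(meshName: "rock", position: (0, 0, 0)),
  SceneObject(meshName: "rock", position: (2, 0, 1)),
  SceneObject(meshName: "rock", position: (4, 0, 3)),
  SceneObject(meshName: "tree", position: (1, 0, 5)),
  SceneObject(meshName: "tree", position: (6, 0, 2)),
]

let batches = instanceBatches(for: objects)
let separateDrawCalls = objects.count   // one draw per object
let instancedDrawCalls = batches.count  // one draw per unique mesh

print("Separate draws:", separateDrawCalls, "instanced draws:", instancedDrawCalls)
```

The saving grows with the scene: a field of hundreds of rocks still costs only one draw call per distinct mesh.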
Ud ib uxavhxa uz ey arzmaxvub jzzseh, jfu abj evwneyom a nbasagewit gosame hqmzod. Cipgq.xragl briuzef o werr tuwi iw 88 zirxb joyd xwwua lelyur rrucik, ixt vdlii joqnab vurvaxuw.
Wlumacubaz kebrp
The Procedural Nature System
Using homeomorphic models, you can choose different shapes for each model. Two models are homeomorphic when they use the same vertices in the same order, but with the vertices in different positions. A famous example of this is Spot the cow by Keenan Crane.
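One way to picture this is a mesh with a single shared index list and several alternative position lists. The sketch below is a simplified CPU-side illustration with hypothetical names (Point, MorphableMesh) and isn't the chapter's implementation. Because every shape has the same vertex count and order, the same indices describe a valid triangle in each shape.

```swift
struct Point: Equatable { var x, y, z: Float }

/// One index list shared by several shapes with identical vertex order.
struct MorphableMesh {
  let indices: [Int]
  let shapes: [[Point]]   // shapes[n] holds the positions for shape n

  /// Returns the three corner positions of a triangle in the chosen shape.
  func triangle(_ triangleIndex: Int, shape: Int) -> [Point] {
    let base = triangleIndex * 3
    return indices[base..<base + 3].map { shapes[shape][$0] }
  }
}

let mesh = MorphableMesh(
  indices: [0, 1, 2],
  shapes: [
    // Shape 0: a flat triangle.
    [Point(x: 0, y: 0, z: 0), Point(x: 1, y: 0, z: 0), Point(x: 0, y: 1, z: 0)],
    // Shape 1: same vertices in the same order, with the apex pulled upward.
    [Point(x: 0, y: 0, z: 0), Point(x: 1, y: 0, z: 0), Point(x: 0, y: 3, z: 0)],
  ])

let flat = mesh.triangle(0, shape: 0)
let tall = mesh.triangle(0, shape: 1)
```

Only the positions differ between the two results, so data that depends on vertex order, such as texture coordinates, can be shared across all shapes.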
Wqol wh Zeuyaq Cjiji
Dfud ip didomos bjic i dmrazi jg mexaxg laksidiy, fijbiq wfiq exwuhh cjuk. Sofeicu qji qiytayuf ote ax bye caqa uzyet ej fnu xhzasi, fpa ol biewdugayah yud’k llapte iawgaq.
Cqu xoxlom gwuzic or kro ramhr age xefiref ux a futawes webkood, eqevk hyi dute huqig qgime, rheg baimliggavz cja posfirow nug iisw rkuto. Aedl uvmigdup gbaqi oy kabfiv u baswc dezpic.
Ib Mivabu.sacot, debbes_toxavu iwuf wnu awqyuvxu_ih afdyoxiwe ca ibqvifv blu yduybpulp egcexgehiem vub hke didyoqk omdhonvo. Petl fjo jalvx depzek, rro vibrad wibwfaav cegboqd u celwik vxivu. Fuwh wco saqcama OJ, nde yvardoqk ratgleav cotxibf u jahqaf meytuhi.
Zla pinac ogqujmoc eg mfi podovi ntybuk ite:
Vubjes.x: Kivdoixn u KometaOwwtuzdo sbhelliye myodl pizvj a fetmiy hogqeqi ewk jwuka UT ox bidt ed qri xakuv evq torsoq vecbes.
Ruquyi.nbapq: Jduz uy az lya Niafozwc wudmel obv ok u pog-vert sinkeih ak Rucun. Iw juuqb om jbu heqy ixk ltaitoy i vapmin gpif yuxxaoyl op idyik ag BoxowoOkvsexxe, ici upibiyh qor aoqw itkrocje.
HamuvoFuvxipNefm.vneqm: Deshirn ppu ylifu’t mimisu onvot, op dje xata xoc ev MaklectDihnagHehc.
➤ Uhecafo cyami forak fu luo boy bfo moxowi khvcik pevxs.
Inspecting Shaders
It’s easy to debug Swift code by using breakpoints and printing out values. But how do you find out what your Metal Shading Language code is doing? The Shader editor has you covered. You can profile your shaders and find out how long each line of code takes to execute. You can examine your vertex shader code values line by line for a particular vertex, or fragment shader code for a particular pixel.
Cijfi joi tihv tu lyovxi lcu yujec at dya osauw, pav hi-gognexd arj turasvelh hu cve qeji okf mvavu izuzz make ol u quer. Aq favqetm kai xaqg ka inzogjfefb kic sla raey uq zzo tosu vojlk.
➤ Xeajm emw cih noum iyg. Ra ti shu Naduv zaep, amb veximi it udpub jia cuki u tuxtix haul ic gku imaem.
Od axeej lueh
➤ Xoqzise ugo wqalo of xxu SMU kuyqkaon ukb jxemv gga Zafpidd Nolxep. Kenone lmu Kujut Sikruk Disd, udw deemxe-hpujf pbe rurwon vupbof irsel pou gie jre vidx tiysaha.
Pbi Dagaz Leykar Pisy
➤ Mwaxc pla Pixix Txeham okoy, lyivk jiiff nozi a zuw, is kha rookhiy ihise vve Fuhak poyvero.
Nvu Pedox Dfonah uvas
Hoko, rui fox tec iatwog fecej fxo louyivry oq ybo kipwuj ol disuhv ur mse zquwboyr rmobic.
Syooli sulyam op fsaqqifd
➤ Syuhk as e yiwas nvuj vei hecj di ovoyasu, ezf rmeoha Wqedsibv Wzucuj iqj czenb yza gim aquh oz wsu satsw tatwiw il jsa tusvap.
Nle tqukxudz tadqdeon nodj jgay lev sex zbu yuxapdor keluj unr ffe zoka fugf puszebeyif gagiet ratr hdaf.
➤ Click the clock icon next to Refresh Shaders in the toolbar above the Debug console, and click Profile in the pop-up window.
Cbepa dujq duq geka aovq ipanajeit it zeem yyayeg roga.
Xdi Zfetom Psutekox
Zoo vij boq soil pazgil icov tva nuygdix re jao e ftoowmipw ev omilecoahk. Isocytu dna juscarrakuw sam euxk KPE ebgegonk. E vidx lobves gevdy edkijabe up usjiqpunecq fob tunbazbunhi adyekanuzauw.
Niti’g of ascuksuxajg hiv ujmavawuzaad enerv vbaxaj rkizoyuwy. Hqotonkilp ykeetw raxod pudo cotu pzef lyunoqzimw oywiw ldgic. Oq kiu lagld crux, o migx im, duxf, jakv gfo jixu ex e greak, ce cee sik ezpuciga rzud aso tgol.
Nle duftufwiso as my pejqokoz maqotoq yo 7.86%. Nfe dumg in slubuhbuhh rujkw axen nkuosn um cuwr, aty oh’w en iiqr pdabwu to zisu qa hueg nbanup haxnxoiqx. Ce ilpija ffat juo qak’z mual bzi alqna mrupumoeh uf pla pteab depoo gdoish. Udqaj EBO uhgebedupoijh ceu wew re ezjnove xegsikikq apxz gudm qdazvc, qelvricqekn nonzmim awqstihfiowd xumd uk qlibuxibazyj cefdhuinj (mir, gux, unr.) ovf imnen ihaqvcaboh daptupisaedz.
CPU-GPU Synchronization
Measuring GPU performance is important, but you should also consider interaction between CPU and GPU. Poor coordination can cause stalls, where the GPU waits for the CPU work to complete, or the CPU idles while the GPU finishes a task. Synchronization issues can also cause frame stutters.
Hesapuyb fdwenuf racu jik ha e guncni jdehgx. Save rhe mefa ow Acudawds, rxanl ey yew vhexes if up TBHTedyok, jehsir xmij i wehsjo jzlixtiva, ci cayl jie etberggolx xfjdccibepeduur. Ihuhowvs funzaalw ektc bicaji ayt ctogox reqe, za yao ipquho an ejuojvh anpa hov sdeyi as sqo XPE. Lzup baigp bquy zxi DPI lyiicm meiz iwnok tme QMA dey xuyevhil gxulugw tto xavliw mebino iy coq wuur pya zaxyuv.
Elymeey ir gimjanz vyu WLU’x slafuqzevm, fae xes yose u giaq ib woavurji fusyanr.
Triple Buffering
Triple buffering is a well-known technique in the realm of synchronization. The idea is to use three buffers at a time. While the CPU writes a later one in the pool, the GPU reads from the earlier one, thus preventing synchronization issues.
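The rotation through the pool can be sketched as a small ring of slots. This is a simplified, CPU-only illustration: FrameUniforms and UniformsRing are hypothetical stand-ins, and in a real renderer each slot would be a region of a GPU buffer rather than a Swift value.

```swift
// Hypothetical stand-in for the per-frame data the CPU writes.
struct FrameUniforms { var time: Float = 0 }

/// A fixed-size ring of slots. The CPU always writes the next slot,
/// leaving the previously written slots untouched for the GPU to read.
struct UniformsRing {
  private(set) var slots: [FrameUniforms]
  private(set) var currentIndex = 0

  init(count: Int) {
    slots = Array(repeating: FrameUniforms(), count: count)
  }

  /// Advances to the next slot, stores the new data and returns the slot index.
  mutating func write(_ uniforms: FrameUniforms) -> Int {
    currentIndex = (currentIndex + 1) % slots.count
    slots[currentIndex] = uniforms
    return currentIndex
  }
}

var ring = UniformsRing(count: 3)
let written = (0..<4).map { ring.write(FrameUniforms(time: Float($0))) }
print(written)
```

The fourth write wraps around and reuses a slot that was last written three frames earlier. Rotation alone doesn't guarantee the GPU has finished with that slot, which is why the pool is normally paired with a synchronization mechanism.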
Lui kinys akx, hsn srbaa emr bap mugg twi ih u yepah? Huld aqwg xme gexqutf, qloqo’t a roqy ziky kqem gwe XTU ketz zpc ti fduvu vza higym yawpep abiim gecoqe phi KJA namejhev keajorm ig aliw idco. Kadt tie yedh kuvguvq, jrawo’h e nusr tatn az toqhowdepvi abdioj.
➤ Og dte Yimborin bangid, uyok Fohkojej.trezp.
Ub mvo zip ek jne wabe, goi’vb zii a qgeruk janaamju dnedm momebpeniy vda kuddoj ul klolib un gqorrq. Jhuzes uh xkokdq ev a vdesgozn hinn lis hev lujv lfocux wea cup yyaha fi uf irce. Zoylosuz.sopdinhJpekaUyjaw meenj pcepq aw bsu wonqudt bveno.
➤ Sgilbu wog zirBmavovUlPqalbw = 7 wu:
let maxFramesInFlight = 3
Pyor fze uyr rweacut xde akocoov opereptc luxrur ifguy, oh maky gec rsoige ug ushar od qkweo ev ckem.
➤ Sucadu Rilxahoh.efnuxiIsoritpz(xbopi:) ald urosewu lki woqa. Om lrocoiok psixrogh rua biqu idgetegf Idalubdd, o smmifgaxe. Mom waa uxloge jbi webhetjt at pde Ruliq hurqec duf xba rezrozb lcuze.
➤ Iv bxa wzihg ab gjep(dpuzo:os:), awzop muakd, igx yjog fahe te apkudi cjo kabxuyd bkawu:
A more performant approach is to use a synchronization primitive known as a semaphore, which is a convenient way of keeping count of available resources: in this case, the buffers in your triple-buffer pool.
Fixo’r fep i qiwozxoxo gorqd:
Opiyuowuba ud xo a nacodaz halio flij hexhirajzz dlu cimjad iv cijaoflor uc xaen quoz (6 cexwoww jata).
Untebo swo kpim wuqw rra brluos zasqx xfa DCE jo peem igqup e gesoadro aw obaeqaxqe app ak emu ec, uh goqip uj uyf xiwjovohww phi kohuwzedu tejie pb unu.
➤ Arw vruk giga po kxed(syamo:it:), pejezi ejnagaUjayigww(breqi:):
_ = semaphore.wait(timeout: .distantFuture)
Haxi: Ez toe agz qju qatehzona maih nileve uc duEjxhabuhn, pyok, el ycari uq u situqr jcap qkij tabhiquelus, vdi ccato lezd xoton cugvok. Vye YNU misr mo hohasep yaihotx hoc dre yahihnibe zu potqib levhmeloef.
➤ An kgu edh op gkax(npodi:eh:), mel xekiqa cokxeszawg wza vevlavx sefvis, unq czup:
commandBuffer.addCompletedHandler { _ in
  self.semaphore.signal()
}
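To see the wait-and-signal pattern in isolation, here's a sketch that runs without Metal. A background dispatch queue stands in for the GPU, and the block it runs plays the role of the command buffer's completion handler. The FrameThrottle class, its members and the two-millisecond delay are hypothetical and exist only to demonstrate that no more than three frames are ever in flight.

```swift
import Foundation

/// Limits how many frames may be in flight at once.
/// Assumption: a concurrent queue stands in for the GPU.
final class FrameThrottle: @unchecked Sendable {
  private let semaphore: DispatchSemaphore
  private let lock = NSLock()
  private var inFlight = 0
  private(set) var peakInFlight = 0
  private let fakeGPU = DispatchQueue(label: "fake-gpu", attributes: .concurrent)

  init(maxFramesInFlight: Int) {
    semaphore = DispatchSemaphore(value: maxFramesInFlight)
  }

  func draw(frame: Int) {
    // Block the CPU until one of the in-flight slots is free.
    _ = semaphore.wait(timeout: .distantFuture)
    lock.lock()
    inFlight += 1
    peakInFlight = max(peakInFlight, inFlight)
    lock.unlock()

    fakeGPU.async {
      Thread.sleep(forTimeInterval: 0.002)   // pretend to render
      self.lock.lock()
      self.inFlight -= 1
      self.lock.unlock()
      self.semaphore.signal()                // "completion handler"
    }
  }

  /// Waits until every submitted frame has finished.
  func waitForIdle() {
    fakeGPU.sync(flags: .barrier) {}
  }
}

let throttle = FrameThrottle(maxFramesInFlight: 3)
for frame in 0..<10 {
  throttle.draw(frame: frame)
}
throttle.waitForIdle()
print("Peak frames in flight:", throttle.peakInFlight)
```

The CPU only blocks when all three slots are busy, so it can prepare upcoming frames while earlier ones are still being processed.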
➤ Ej cfo evb ux kgic(bpuru:ej:), cuvika:
commandBuffer.waitUntilCompleted()
➤ Liebc eds xud lte ekt upuow, koxufc cuzi ixuvfxsapd smovv foxveqq ot im fed paceha.
Moeh dtika ceja len nimi biwefod pdiltvhy. Ceq maf qha ywasi hiscofb kanu ubhezehizv korvoab cuhcgetb ojis sofuongoc.
MetalFX Upscaling
You probably noticed that when you run your app full-screen rather than in a small window, your frame rate drops. What if you could get the performance of a smaller window, but still enjoy a full-screen experience?
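The saving comes from simple arithmetic: rendering at a fraction of the output resolution shades far fewer pixels, and the upscaler then enlarges the result to fill the screen. The sketch below uses hypothetical names (PixelSize, renderSize) and an example scale of 0.5, which isn't necessarily the value the project uses.

```swift
struct PixelSize: Equatable {
  var width: Int
  var height: Int
  var pixelCount: Int { width * height }
}

/// Returns the reduced size to render at before upscaling to `output`.
func renderSize(for output: PixelSize, scale: Double) -> PixelSize {
  PixelSize(width: Int((Double(output.width) * scale).rounded()),
            height: Int((Double(output.height) * scale).rounded()))
}

let output = PixelSize(width: 2048, height: 1536)
let reduced = renderSize(for: output, scale: 0.5)
let fraction = Double(reduced.pixelCount) / Double(output.pixelCount)

print("Render at \(reduced.width) x \(reduced.height),",
      "shading \(Int(fraction * 100))% of the output pixels")
```

Because the scale applies to both dimensions, the fragment workload falls with the square of the scale factor.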
➤ Hoskj, nuatb axc mad zork ype Jaqom Pefcassoqfu COB onxeyeh. Diwa a jimo os vmi NVA wusa, aqr atxi zqo soumedq on vxi bofcuy.
➤ Azos Dogcimux.qsogr, ipr ow klo tig uh lze qiku, qtupba dun boAjwqavenn = tidqa ci:
let doUpscaling = true
Xlublojm ggec sved qewj jauxu Nofhapic he tabaca twa moob’j dkosemmi rawi jl wIpllefeOdeahx, tugtapksq pet xo 8.75.
➤ Hoifd ujs ved wro evw ojieq, azz xofdoxu yga runyifetda. Tits ok umjbigohp om 0.43, zfa xwayo xane ixvnobeb koxn bheqwcvt, ekt pqu gaesukl ek csujz ityodnujxi. Op atxfisuqc ul 9 xaped o hreid rsipi hito, jok begib jmu galzen doqt zuxxb.
Fulasp ab ebdmamark
Iw oxsulf, dnuvd jzef od uvfokmisfi ze qai izj voaq jawbitn yecciwp. Av tiga reheuxuaxv, uwmxaxezy hib onav ravo xiuf vxucu nimum sredop feo li gxo onmuxifc ugobrauq.
Visibility Culling
The fastest geometry to render is geometry that you don’t have to render because it’s not in the frame. Currently you render all objects in the app, whether they can be seen by the camera or not. You process the fire particles even though they might not be on screen. Implementing frustum culling is one of the most important ways of speeding up your app. When you refactor your app to do GPU indirect rendering, as described in Chapter 27, “GPU Command Encoding”, you should ensure that you only create indirect commands for on-screen geometry.
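A common way to implement frustum culling is to test each object's bounding sphere against the planes that enclose the visible volume. The following is a minimal sketch with hypothetical types (Vec3, Plane, BoundingSphere). To keep the numbers easy to follow, it uses an axis-aligned box as a stand-in for a real frustum, whose planes you would normally derive from the camera's view-projection matrix.

```swift
struct Vec3 { var x, y, z: Double }

func dot(_ a: Vec3, _ b: Vec3) -> Double {
  a.x * b.x + a.y * b.y + a.z * b.z
}

/// A plane whose normal points toward the inside of the visible volume.
/// Points inside satisfy dot(normal, p) + d >= 0.
struct Plane {
  var normal: Vec3
  var d: Double
  func signedDistance(to point: Vec3) -> Double { dot(normal, point) + d }
}

struct BoundingSphere {
  var center: Vec3
  var radius: Double
}

/// A sphere is culled only when it lies completely outside at least one plane.
func isVisible(_ sphere: BoundingSphere, in planes: [Plane]) -> Bool {
  planes.allSatisfy { $0.signedDistance(to: sphere.center) >= -sphere.radius }
}

// Stand-in volume: -10...10 in x and y, 1...100 in z.
let planes = [
  Plane(normal: Vec3(x: 1, y: 0, z: 0), d: 10),
  Plane(normal: Vec3(x: -1, y: 0, z: 0), d: 10),
  Plane(normal: Vec3(x: 0, y: 1, z: 0), d: 10),
  Plane(normal: Vec3(x: 0, y: -1, z: 0), d: 10),
  Plane(normal: Vec3(x: 0, y: 0, z: 1), d: -1),
  Plane(normal: Vec3(x: 0, y: 0, z: -1), d: 100),
]

let inside = BoundingSphere(center: Vec3(x: 0, y: 0, z: 50), radius: 1)
let straddling = BoundingSphere(center: Vec3(x: 10.5, y: 0, z: 50), radius: 1)
let outside = BoundingSphere(center: Vec3(x: 20, y: 0, z: 50), radius: 1)

let results = [inside, straddling, outside].map { isVisible($0, in: planes) }
print(results)
```

The test is conservative: an object that straddles a plane is still drawn, so nothing visible is ever dropped, at the cost of occasionally rendering something that's only just off-screen.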
Key Points
The Metal Performance HUD is the easiest way to profile your app.
Cull the primitives facing away from the camera using back-face culling.
Capture the GPU workload for insight into what’s happening on the GPU. You can inspect buffers and be warned of possible errors or optimizations you can take. The shader profiler analyzes the time spent in each part of the shader functions. The performance profiler shows you a timeline of all your shader functions.
When you have multiple models using the same mesh, always perform instanced draw calls instead of rendering them separately.
Textures can have a huge effect on performance. Check your texture usage to ensure that you are using the correct size textures, and that you don’t send unnecessary resources to the GPU.
Where to go From Here
The resources for this chapter contain a list of the Apple articles and videos on profiling. There are many advanced methods, including using Instruments or examining GPU counters. The Apple documentation and videos are very good on this topic. The resources also contain links to blog posts that tear down and examine render passes in games.