
Unity3D Optimization: Optimizing Graphics Performance

Unity3D study and discussion group: 211257505


Optimizing Graphics Performance


Desktop

Good performance is critical to the success of many games. Below are some simple guidelines for maximizing the speed of your game's graphical rendering.

Optimizing Meshes

You only pay a rendering cost for objects that have a Mesh Renderer attached and are within the view frustum. There is no rendering cost from empty GameObjects in the scene or from objects that are out of the view of any camera.


Modern graphics cards are really good at handling a lot of polygons, but there is a significant overhead for each batch (i.e., mesh) that you submit to the graphics card. So a 100-triangle object is going to be just as expensive to render as a 1500-triangle object. The "sweet spot" for optimal rendering performance is somewhere around 1500-4000 triangles per mesh.


Usually, the best way to improve rendering performance is to combine objects together so that each mesh has around 1500 or more triangles and uses only one Material for the entire mesh. It is important to understand that combining two objects which don't share a material does not give you any performance increase at all. The most common reason for having multiple materials is that two meshes don't share the same textures, so to optimize rendering performance, you should ensure that any objects you combine share the same textures.
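The effect of combining can be illustrated with a small counting model. This is only a sketch under stated assumptions: every object is treated as exactly one batch, objects are merged only when they share a material, and the scene contents and material names are hypothetical.

```python
from collections import defaultdict

def count_batches(objects):
    """Before combining, every object is submitted as its own batch."""
    return len(objects)

def combine_by_material(objects):
    """Merge objects that share a material into one combined mesh each.

    `objects` is a list of (material_name, triangle_count) tuples.
    Returns a dict mapping material -> triangles in the combined mesh.
    """
    combined = defaultdict(int)
    for material, triangles in objects:
        combined[material] += triangles
    return dict(combined)

# Hypothetical scene: 30 small crates sharing one material, plus 2 unique props.
scene = [("crate_atlas", 100)] * 30 + [("statue", 1800), ("fountain", 2500)]

merged = combine_by_material(scene)
print(count_batches(scene))   # 32 batches before combining
print(len(merged))            # 3 batches after combining
print(merged["crate_atlas"])  # 3000 triangles, inside the 1500-4000 range
```

The statue and the fountain stay separate batches because they use different materials, which matches the point that combining objects with different materials gains nothing.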


However, when using many pixel lights in the Forward rendering path, there are situations where combining objects may not make sense, as explained below.


Pixel Lights in the Forward Rendering Path

Note: this applies only to the Forward rendering path.

If you use pixel lighting then each mesh has to be rendered as many times as there are pixel lights illuminating it. If you combine two meshes that are very far apart, it will increase the effective size of the combined object. All pixel lights that illuminate any part of this combined object will be taken into account during rendering, so the number of rendering passes that need to be made could be increased. Generally, the number of passes that must be made to render the combined object is the sum of the number of passes for each of the separate objects, and so nothing is gained by combining. For this reason, you should not combine meshes that are far enough apart to be affected by different sets of pixel lights.
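A rough way to see why combining distant meshes does not help is to count passes and the triangles submitted in them. The model below is a simplification that assumes exactly one pass per pixel light and ignores base and vertex-lit passes; the mesh sizes and light names are hypothetical.

```python
def passes(pixel_lights):
    """Simplified model: one rendering pass per pixel light hitting the mesh."""
    return len(pixel_lights)

def triangle_work(meshes):
    """Triangles submitted in total: each mesh is drawn once per pass."""
    return sum(triangles * passes(lights) for triangles, lights in meshes)

# Two hypothetical meshes that are far apart and lit by different pixel lights.
mesh_a = (2000, {"lamp_1", "lamp_2"})
mesh_b = (2000, {"lamp_3"})

separate_passes = passes(mesh_a[1]) + passes(mesh_b[1])
separate_work = triangle_work([mesh_a, mesh_b])

# Combined mesh: all triangles, affected by the union of both light sets.
combined_mesh = (mesh_a[0] + mesh_b[0], mesh_a[1] | mesh_b[1])
combined_passes = passes(combined_mesh[1])
combined_work = triangle_work([combined_mesh])

print(separate_passes, combined_passes)  # 3 3
print(separate_work, combined_work)      # 6000 12000
```

In this model the pass count is unchanged, while every pass now draws the whole combined mesh, so the triangle work doubles.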


During rendering, Unity finds all lights surrounding a mesh and calculates which of those lights affect it most. The Quality Settings are used to modify how many of the lights end up as pixel lights and how many as vertex lights. Each light calculates its importance based on how far away it is from the mesh and how intense its illumination is. Furthermore, some lights are more important than others purely from the game context. For this reason, every light has a Render Mode setting which can be set to Important or Not Important; lights marked as Not Important will typically have a lower rendering overhead.


As an example, consider a driving game where the player's car is driving in the dark with headlights switched on. The headlights are likely to be the most visually significant light sources in the game, so their Render Mode would probably be set to Important. On the other hand, there may be other lights in the game that are less important (other cars' rear lights, say) and which don't improve the visual effect much by being pixel lights. The Render Mode for such lights can safely be set to Not Important so as to avoid wasting rendering capacity in places where it will give little benefit.



Per-Layer Cull Distances

In some games, it may be appropriate to cull small objects more aggressively than large ones in order to reduce the number of draw calls. For example, small rocks and debris could be made invisible at long distances while large buildings would still be visible. To accomplish this culling, you can put small objects into a separate layer and set up per-layer cull distances using the Camera.layerCullDistances script function.
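The per-layer logic can be sketched outside the engine as follows. This is a plain Python model, not Unity code: it assumes one distance per layer in a 32-entry array where 0 means "fall back to the camera's far plane", and it uses a simple distance comparison. The layer index 10, the distances, and the function names are hypothetical.

```python
NUM_LAYERS = 32  # assumption: one cull distance slot per layer

def make_cull_distances(overrides, num_layers=NUM_LAYERS):
    """Build the per-layer array; layers without an override stay at 0."""
    distances = [0.0] * num_layers
    for layer, distance in overrides.items():
        distances[layer] = distance
    return distances

def is_culled(layer, distance_to_camera, cull_distances, far_plane):
    """An object is culled when it lies beyond its layer's limit."""
    limit = cull_distances[layer] or far_plane  # 0 -> use the far plane
    return distance_to_camera > limit

SMALL_DEBRIS_LAYER = 10  # hypothetical layer holding rocks and debris
cull_distances = make_cull_distances({SMALL_DEBRIS_LAYER: 50.0})

# A rock and a building, both 120 units from a camera with a far plane of 1000.
print(is_culled(SMALL_DEBRIS_LAYER, 120.0, cull_distances, 1000.0))  # True
print(is_culled(0, 120.0, cull_distances, 1000.0))                   # False
```

In a real project the same array would be assigned to the camera's layerCullDistances property from a script.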




Shadows

If you are deploying for Desktop platforms then you should be careful when using shadows because they can add a lot of rendering overhead to your game if not used correctly. For further details, see the Shadows page.



Note: Shadows are not currently supported on iOS or Android devices.

RE: Unity3D Optimization: Optimizing Graphics Performance (II)

A useful background to iOS optimization can be found on the iOS hardware page.

Alpha-Testing

Unlike desktop machines, iOS devices incur a high performance overhead for alpha-testing (or use of the discard and clip operations in pixel shaders). You should replace alpha-test shaders with alpha-blend if at all possible. Where alpha-testing cannot be avoided, you should keep the overall number of visible alpha-tested pixels to a minimum.

Vertex Performance

Generally you should aim to have no more than 40,000 vertices visible per frame when targeting iPhone 3GS or newer devices. You should keep the vertex count below 10,000 for older devices equipped with the MBX GPU, such as the original iPhone, iPhone 3G and iPod Touch 1st and 2nd Generation.

Lighting Performance

Per-pixel dynamic lighting will add significant rendering overhead to every affected pixel and can lead to objects being rendered in multiple passes. Avoid having more than one Pixel Light illuminating any single object and use directional lights as far as possible. Note that a Pixel Light is one which has its Render Mode option set to Important.


Per-vertex dynamic lighting can add significant cost to vertex transformations. Try to avoid situations where multiple lights illuminate any given object. For static objects, baked lighting is much more efficient.


Optimize Model Geometry

When optimizing the geometry of a model, there are two basic rules:

      Don't use any more triangles than necessary
      Try to keep the number of UV mapping seams and hard edges (i.e., doubled-up vertices) as low as possible


Note that the actual number of vertices that graphics hardware has to process is usually not the same as the number reported by a 3D application. Modeling applications usually display the geometric vertex count, i.e., the number of distinct corner points that make up a model.

For a graphics card, however, some geometric vertices will need to be split into two or more logical vertices for rendering purposes. A vertex must be split if it has multiple normals, UV coordinates or vertex colors. Consequently, the vertex count in Unity is invariably a lot higher than the count given by the 3D application.
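A hard-edged cube is a convenient worked example of the split. The sketch below counts distinct corner positions (what a modeling tool reports) against distinct position and normal pairs (what the hardware must process); it assumes each face uses its own flat normal and ignores UVs and vertex colors, which could split vertices further.

```python
from itertools import product

# The 8 geometric corners of a unit cube.
corners = list(product((0, 1), repeat=3))

# Six faces, each with its own flat normal and its four corner positions.
faces = []
for axis in range(3):
    for side in (0, 1):
        normal = tuple((1 if side else -1) if i == axis else 0 for i in range(3))
        face_corners = [c for c in corners if c[axis] == side]
        faces.append((normal, face_corners))

# Geometric vertices: distinct positions only.
geometric = {corner for _, face_corners in faces for corner in face_corners}

# Logical vertices: a position must be duplicated for every distinct normal.
logical = {(corner, normal) for normal, face_corners in faces for corner in face_corners}

print(len(geometric))  # 8
print(len(logical))    # 24
```

Every corner touches three faces, so each geometric vertex becomes three logical ones. A fully smooth-shaded model with one normal per corner would keep the two counts equal.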

Texture Compression

Using iOS's native PVRT compression formats will decrease the size of your textures (resulting in faster load times and smaller memory footprint) and can also dramatically increase rendering performance. Compressed textures use only a fraction of the memory bandwidth needed for uncompressed 32-bit RGBA textures. A comparison of uncompressed vs compressed texture performance can be found in the iOS Hardware Guide.

Some images are prone to visual artifacts in the alpha channels of PVRT-compressed textures. In such cases, you might need to tweak the PVRT compression parameters directly in your imaging software. You can do that by installing the PVR export plugin or using PVRTexTool from Imagination Tech, the creators of the PVRT format. The resulting compressed image file with a .pvr extension will be imported by the Unity editor directly and the specified compression parameters will be preserved.

If PVRT-compressed textures do not give good enough visual quality or you need especially crisp imaging (as you might for GUI textures, say) then you should consider using 16-bit textures instead of 32-bit RGBA textures. By doing so, you will reduce the memory bandwidth by half.
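The size differences are easy to check with a little arithmetic. The sketch below assumes a single 1024x1024 texture without mipmaps and the usual 4 and 2 bits-per-pixel PVRTC modes; the format labels are illustrative names, not Unity identifiers.

```python
# Bits per pixel for each storage format (labels are illustrative only).
BITS_PER_PIXEL = {
    "RGBA 32-bit": 32,
    "16-bit": 16,
    "PVRTC 4bpp": 4,
    "PVRTC 2bpp": 2,
}

def texture_bytes(width, height, fmt):
    """Memory for one mip level of a width x height texture."""
    return width * height * BITS_PER_PIXEL[fmt] // 8

for fmt in BITS_PER_PIXEL:
    size_kib = texture_bytes(1024, 1024, fmt) // 1024
    print(f"{fmt:12s} {size_kib:5d} KiB")
```

For this texture the 16-bit format needs 2048 KiB against 4096 KiB for 32-bit RGBA, which is the halving described above, while PVRTC at 4 bits per pixel needs 512 KiB.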


Tips for writing high-performance shaders

The GPUs on iOS devices have fully supported pixel and vertex shaders since the iPhone 3GS. However, the performance is nowhere near what you would get from a desktop machine, so you should not expect desktop shaders to port to iOS unchanged. Typically, shaders will need to be hand optimized to reduce calculations and texture reads in order to get good performance.

Complex mathematical operations

Transcendental mathematical functions (such as pow, exp, log, cos, sin, tan, etc.) will tax the GPU greatly, so a good rule of thumb is to have no more than one such operation per fragment. Consider using lookup textures as an alternative where applicable.
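The lookup-texture idea can be prototyped on the CPU before committing to it in a shader. The sketch below is an assumption-laden model: it precomputes a hypothetical specular falloff x^8 into a 256-entry table and samples it with linear interpolation, roughly as a filtered 1D texture would. It does not model the extra quantization of an 8-bit texture channel.

```python
def build_lut(fn, size=256):
    """Precompute fn over [0, 1] at `size` evenly spaced points."""
    return [fn(i / (size - 1)) for i in range(size)]

def sample_lut(lut, x):
    """Sample the table with linear interpolation between neighbours."""
    x = min(max(x, 0.0), 1.0)           # clamp, like a clamped texture
    pos = x * (len(lut) - 1)
    i = int(pos)
    j = min(i + 1, len(lut) - 1)
    t = pos - i
    return lut[i] * (1.0 - t) + lut[j] * t

# Hypothetical use: replace pow(x, 8) in a specular term with a table read.
specular_lut = build_lut(lambda x: x ** 8)

# Compare the table against the exact function on a dense set of inputs.
max_error = max(
    abs(sample_lut(specular_lut, k / 1000) - (k / 1000) ** 8)
    for k in range(1001)
)
print(max_error < 1e-3)  # True
```

Whether a texture read is actually cheaper than the math depends on the hardware and on how many texture reads the shader already performs, so the trade-off is worth measuring.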

It is not advisable to attempt to write your own normalize, dot, inversesqrt operations, however. If you use the built-in ones then the driver will generate much better code for you.


Bear in mind also that the discard operation will make your fragments slower.


Floating point operations

You should always specify the precision of floating point variables when writing custom shaders. It is critical to pick the smallest possible floating point format in order to get the best performance.

If the shader is written in GLSL ES then the floating point precision is specified as follows:

      highp - full 32-bit floating point format, suitable for vertex transformations but has the slowest performance
      mediump - reduced 16-bit floating point format, suitable for texture UV coordinates and roughly twice as fast as highp
      lowp - 10-bit fixed point format, suitable for colors, lighting calculation and other high-performance operations and roughly four times faster than highp


If the shader is written in CG or it is a surface shader then precision is specified as follows:

      float - analogous to highp in GLSL ES, slowest
      half - analogous to mediump in GLSL ES, roughly twice as fast as float
      fixed - analogous to lowp in GLSL ES, roughly four times faster than float


For further details about shader performance, please read the Shader Performance page.


Hardware documentation

Take the time to study Apple's documentation on hardware and best practices for writing shaders. Note, however, that we suggest being more aggressive with floating point precision hints.

Bake Lighting into Lightmaps

Bake your scene's static lighting into textures using Unity's built-in Lightmapper. The process of generating a lightmapped environment takes only a little longer than just placing a light in the scene in Unity, but:

      It is going to run a lot faster (2-3 times faster for, e.g., 2 pixel lights)
      And look a lot better, since you can bake global illumination and the lightmapper can smooth the results


Share Materials

If a number of objects being rendered by the same camera use the same material, then Unity iOS will be able to employ a large variety of internal optimizations such as:

      Avoiding setting various render states to OpenGL ES
      Avoiding calculation of different parameters required to set up vertex and pixel processing
      Batching small moving objects to reduce draw calls
      Batching both big and small objects with the "static" property enabled to reduce draw calls


All these optimizations will save you precious CPU cycles. Therefore, the extra work of combining textures into a single atlas and making a number of objects use the same material will always pay off. Do it!

Simple Checklist to Make Your Game Faster

      Keep vertex count below:

·         40K per frame when targeting iPhone 3GS and newer devices (with SGX GPU)
·         10K per frame when targeting older devices (with MBX GPU)


      If you're using built-in shaders, pick ones from the Mobile category. Keep in mind that Mobile/VertexLit is currently the fastest shader.

      Keep the number of different materials per scene low - share as many materials between different objects as possible.

      Set the Static property on non-moving objects to allow internal optimizations.

      Use PVRTC formats for textures when possible, otherwise choose 16-bit textures over 32-bit.

      Use combiners or pixel shaders to mix several textures per fragment instead of a multi-pass approach.


      If writing custom shaders, always use the smallest possible floating point format:

·         fixed / lowp -- perfect for color, lighting information and normals,
·         half / mediump -- for texture UV coordinates,
·         float / highp -- avoid in pixel shaders, fine to use in vertex shaders for vertex position calculations.


      Minimize use of complex mathematical operations such as pow, sin, cos, etc. in pixel shaders.

      Do not use Pixel Lights when it is not necessary -- choose to have only a single (preferably directional) pixel light affecting your geometry.

      Do not use dynamic lights when it is not necessary -- choose to bake lighting instead.

      Use fewer textures per fragment.

      Avoid alpha-testing, choose alpha-blending instead.

      Do not use fog when it is not necessary.

      Learn the benefits of Occlusion culling and use it to reduce the amount of visible geometry and draw calls in complex static scenes with lots of occlusion. Plan your levels to benefit from Occlusion culling.

      Use skyboxes to "fake" distant geometry.
