Nytro
Posted December 11, 2015

Optimizing software in C++
An optimization guide for Windows, Linux and Mac platforms
By Agner Fog. Technical University of Denmark.
Copyright © 2004 - 2014. Last updated 2014-08-07.

Contents

1 Introduction
1.1 The costs of optimizing
2 Choosing the optimal platform
2.1 Choice of hardware platform
2.2 Choice of microprocessor
2.3 Choice of operating system
2.4 Choice of programming language
2.5 Choice of compiler
2.6 Choice of function libraries
2.7 Choice of user interface framework
2.8 Overcoming the drawbacks of the C++ language
3 Finding the biggest time consumers
3.1 How much is a clock cycle?
3.2 Use a profiler to find hot spots
3.3 Program installation
3.4 Automatic updates
3.5 Program loading
3.6 Dynamic linking and position-independent code
3.7 File access
3.8 System database
3.9 Other databases
3.10 Graphics
3.11 Other system resources
3.12 Network access
3.13 Memory access
3.14 Context switches
3.15 Dependency chains
3.16 Execution unit throughput
4 Performance and usability
5 Choosing the optimal algorithm
6 Development process
7 The efficiency of different C++ constructs
7.1 Different kinds of variable storage
7.2 Integers variables and operators
7.3 Floating point variables and operators
7.4 Enums
7.5 Booleans
7.6 Pointers and references
7.7 Function pointers
7.8 Member pointers
7.9 Smart pointers
7.10 Arrays
7.11 Type conversions
7.12 Branches and switch statements
7.13 Loops
7.14 Functions
7.15 Function parameters
7.16 Function return types
7.17 Structures and classes
7.18 Class data members (properties)
7.19 Class member functions (methods)
7.20 Virtual member functions
7.21 Runtime type identification (RTTI)
7.22 Inheritance
7.23 Constructors and destructors
7.24 Unions
7.25 Bitfields
7.26 Overloaded functions
7.27 Overloaded operators
7.28 Templates
7.29 Threads
7.30 Exceptions and error handling
7.31 Other cases of stack unwinding
7.32 Preprocessing directives
7.33 Namespaces
8 Optimizations in the compiler
8.1 How compilers optimize
8.2 Comparison of different compilers
8.3 Obstacles to optimization by compiler
8.4 Obstacles to optimization by CPU
8.5 Compiler optimization options
8.6 Optimization directives
8.7 Checking what the compiler does
9 Optimizing memory access
9.1 Caching of code and data
9.2 Cache organization
9.3 Functions that are used together should be stored together
9.4 Variables that are used together should be stored together
9.5 Alignment of data
9.6 Dynamic memory allocation
9.7 Container classes
9.8 Strings
9.9 Access data sequentially
9.10 Cache contentions in large data structures
9.11 Explicit cache control
10 Multithreading
10.1 Hyperthreading
11 Out of order execution
12 Using vector operations
12.1 AVX instruction set and YMM registers
12.2 AVX-512 instruction set and ZMM registers
12.3 Automatic vectorization
12.4 Using intrinsic functions
12.5 Using vector classes
12.6 Transforming serial code for vectorization
12.7 Mathematical functions for vectors
12.8 Aligning dynamically allocated memory
12.9 Aligning RGB video or 3-dimensional vectors
12.10 Conclusion
13 Making critical code in multiple versions for different instruction sets
13.1 CPU dispatch strategies
13.2 Model-specific dispatching
13.3 Difficult cases
13.4 Test and maintenance
13.5 Implementation
13.6 CPU dispatching in Gnu compiler
13.7 CPU dispatching in Intel compiler
14 Specific optimization topics
14.1 Use lookup tables
14.2 Bounds checking
14.3 Use bitwise operators for checking multiple values at once
14.4 Integer multiplication
14.5 Integer division
14.6 Floating point division
14.7 Don't mix float and double
14.8 Conversions between floating point numbers and integers
14.9 Using integer operations for manipulating floating point variables
14.10 Mathematical functions
14.11 Static versus dynamic libraries
14.12 Position-independent code
14.13 System programming
15 Metaprogramming
16 Testing speed
16.1 Using performance monitor counters
16.2 The pitfalls of unit-testing
16.3 Worst-case testing
17 Optimization in embedded systems
18 Overview of compiler options
19 Literature
20 Copyright notice

Download: http://www.agner.org/optimize/optimizing_cpp.pdf