Faster C++: Move Constructors and Perfect Forwarding
Pete Isensee, Advanced Technology Group, Microsoft
Speaker notes: In 2002, Howard Hinnant and others proposed adding move semantics to C++. In 2003, Andrei Alexandrescu wrote an article in Dr. Dobb's about move constructors. I first heard about move constructors in 2007, when Howard Hinnant presented the concept at BoostCon.
Problem statement
Deep copies of objects are expensive
C++ is built on copy semantics
  STL containers store their elements by value
  Compiler temporaries are copied by value
Copies are often not obvious in the source code
Games copy objects, a lot!
Speaker notes: Deep objects are objects that refer to additional memory. Classic examples: strings, vectors, linked lists. C++ philosophically favors copying. Return Value Optimization helps, but doesn't solve the problem.
Example
Shallow copy:
    struct Particle { Vector3 mPos; Vector3 mVel; Color mCol; };
Deep copy:
    struct Texture { unsigned long mSize; unsigned long* mpBits; };
Deeper copy:
    struct ParticleSystem { std::vector< Particle > mPar; Texture mTex; };
Speaker notes: Shallow: all PODs. Deep: points to other memory. Does this make you uncomfortable? The code on the next slide is perfectly reasonable: readable, correct, functional, with strict value semantics.
    ParticleSystem particleSys(...);
    particleSys = StartExplosion(); // Explosion begins
    particleSys += AddSmoke();      // More particles added
    ParticleSystem particleSys(...);
    particleSys = StartExplosion(); // Explosion begins
    particleSys += AddSmoke();      // More particles added
[Diagram: the ParticleSystem object and the StartExplosion() temporary, each drawn with its t and v blocks]
Speaker notes: StartExplosion returns an unnamed temporary object. Assume particleSys is not empty and holds fewer particles than the explosion. That little equals sign killed us from a performance standpoint. AddSmoke: vector reallocation, and likely some copying of texture memory.
Copying temporary objects is expensive
Performance of operator=( const ParticleSystem& ):
    Particles    Copy (ticks)
    1                   6,113
    10                  7,201
    100                 5,543
    1,000               8,579
    10,000             56,614
    100,000           635,962
    1,000,000       6,220,013
Speaker notes: Measured on Xbox 360. You can see the benefits of cache coherency at 100 particles. The chart uses a logarithmic scale; on a linear scale the curve shoots straight through the roof. With a small number of particles, the cost is primarily the allocation. With a large number, the cost is copying memory. Clearly you shouldn't be using this kind of semantic; you should just avoid temporaries.
Avoiding temporaries is difficult
    bool Connect( const std::string& server, ... );
    if( Connect( "microsoft.com" ) ) // temporary object created
    v.push_back( X(...) );           // another temporary
    a = b + c;                       // b + c is a temporary object
    x++;                             // returns a temporary object
    a = b + c + d;                   // c+d is a temporary object
                                     // b+(c+d) is another temporary object
Speaker notes: You could use reference counting, but then you have to deal with synchronization and ref-count locking. Reference counting also breaks down (or becomes costly) in generic code, where you also want to handle lightweight types like int. So what are we going to do?
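What we want is...
A world where...
  We can avoid unnecessary copies
  In cases where avoiding the copy is safe
  Entirely under the programmer's control
Speaker notes: Imagine a world where... Sounds like a world of unicorns and butterflies and rainbows.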
    ParticleSystem particleSys(...);
    particleSys = StartExplosion(); // Explosion begins
    particleSys += AddSmoke();      // More particles added
[Diagram: the ParticleSystem object and the StartExplosion() temporary, each drawn with its t and v blocks; after the assignment only one t and v pair remains]
Speaker notes: StartExplosion returns an unnamed temporary object.
  We've avoided two extra memory allocations.
  We've avoided extra memory overhead (two copies of the vectors and of the textures).
  We've avoided one series of destructors.
  We've avoided copying memory.
  We've gone from an O(N) operation to constant time.
It's time to look at some code. What does that second line of code expand to?
Consider assignment
    struct ParticleSystem { std::vector< Particle > mPar; Texture mTex; };
Typical copy assignment:
    ParticleSystem& operator=( const ParticleSystem& rhs ) {
        if( this != &rhs ) {
            mPar = rhs.mPar; // Vector assignment (copy)
            mTex = rhs.mTex; // Texture assignment (copy)
        }
        return *this;
    }
What we want:
    ParticleSystem& operator=( <Magic Type> rhs ) {
        // move semantics here ...
        return *this;
    }
Speaker notes: We want the compiler to select move behavior at compile time for move-safe objects. We don't care about rhs after the function completes, because we moved from it.
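Solution: C++11 to the rescue
Don't copy when you don't need to; use move semantics instead
Critically important for deep objects
Key new language feature: rvalue references
Perfect forwarding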
Example
Copy assignment:
    ParticleSystem& operator=( const ParticleSystem& rhs ) {
        if( this != &rhs ) {
            mPar = rhs.mPar; // Particle vector copy
            mTex = rhs.mTex; // Texture copy
        }
        return *this;
    }
Move assignment:
    ParticleSystem& operator=( ParticleSystem&& rhs ) {
        if( this != &rhs ) {
            mPar = std::move( rhs.mPar ); // Vector move
            mTex = std::move( rhs.mTex ); // Texture move
        }
        return *this;
    }
Callouts on the move version: no const (the parameter), new notation (&&), new function (std::move).
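Movable objects: rvalues
Move from = eviscerate
Every expression is either an lvalue or an rvalue
It is always safe to move from an rvalue
                            lvalue    rvalue
    In memory               Yes       No
    Can take its address    Yes       No
    Has a name              Yes       No
    Movable                 No*       Yes
Speaker notes: It's time to talk about rvalues. Evisceration = resource stealing or pilfering; Scott Meyers: "rip its guts out". Lvalue and rvalue classify expressions, not types. The names are historical, from C: L = left, R = right. L can also be read as "locator": located in memory. Rvalues are expressions referring to anonymous temporary objects; because they are temporary, we can modify them without anybody noticing. (*) It is hardly ever safe to move from an lvalue.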
Lvalue and rvalue examples
    int a;                   // a is an lvalue
    X x;                     // x is an lvalue
    X();                     // X() is an rvalue
    int a = 1+2;             // a is an lvalue; 1+2 is an rvalue
    foo( x );                // x is an lvalue
    foo( bar() );            // bar() is an rvalue
    4321                     // rvalue
    ++x;                     // lvalue
    x++;                     // rvalue
    *ptr                     // lvalue
    x+42                     // rvalue
    "abc"                    // lvalue
    std::string( "abc" )     // rvalue
Speaker notes: Compare ++x with x++.
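T&&: rvalue references
T&: reference (before C++11)
T&: lvalue reference in C++11
T&&: rvalue reference; new in C++11
Objects referred to by rvalue references can safely be moved from
Rvalue references bind to rvalue expressions
Lvalue references bind to lvalue expressions
Speaker notes: && does not mean "reference to a reference".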
Binding
    foo( ParticleSystem&& );        // A: rvalue
    foo( const ParticleSystem&& );  // B: const rvalue
    foo( ParticleSystem& );         // C: lvalue
    foo( const ParticleSystem& );   // D: const lvalue
    ParticleSystem particleSys;
    const ParticleSystem cparticleSys;
    foo( particleSys );             // lvalue
    foo( StartExplosion() );        // rvalue
    foo( cparticleSys );            // const lvalue
Binding and overload resolution rules
Which expressions each reference type can bind to:
    Reference type    rvalue    const rvalue    lvalue    const lvalue    Priority
    T&&               yes                                                 highest
    const T&&         yes       yes
    T&                                          yes
    const T&          yes       yes             yes       yes             lowest
Speaker notes: const T& binds to everything; that's why we use it a lot. T&& binds only to non-const rvalues; those are the objects we typically move from. Two simple rules: obey const correctness, and prevent rvalues from binding to modifiable lvalue references.
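The table can be checked with a small overload probe. The following is a minimal sketch, not code from the talk: ParticleSystem is reduced to an empty stand-in, each foo overload returns a label instead of doing work, and StartConstExplosion is a hypothetical helper added only to produce a const rvalue.

```cpp
#include <cassert>
#include <string>

struct ParticleSystem {};   // empty stand-in for the real class

// Each overload reports which reference type overload resolution selected.
std::string foo( ParticleSystem&& )       { return "A: rvalue"; }
std::string foo( const ParticleSystem&& ) { return "B: const rvalue"; }
std::string foo( ParticleSystem& )        { return "C: lvalue"; }
std::string foo( const ParticleSystem& )  { return "D: const lvalue"; }

ParticleSystem StartExplosion() { return ParticleSystem(); }

// Hypothetical helper: returning a const class type yields a const rvalue.
const ParticleSystem StartConstExplosion() { return ParticleSystem(); }

// Exercises all four value categories from the binding slide.
void RunBindingDemo()
{
    ParticleSystem particleSys;
    const ParticleSystem cparticleSys = ParticleSystem();

    assert( foo( particleSys )           == "C: lvalue" );
    assert( foo( cparticleSys )          == "D: const lvalue" );
    assert( foo( StartExplosion() )      == "A: rvalue" );
    assert( foo( StartConstExplosion() ) == "B: const rvalue" );
}
```

If overloads A and B are removed, the two rvalue calls fall back to D, which is the "const T& binds to everything" row of the table.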
std::move
std::move ~= static_cast< T&& >(t)
It tells the compiler: treat this named variable as an rvalue
    ParticleSystem& operator=( ParticleSystem&& rhs ) {
        if( this != &rhs ) {
            mPar = std::move( rhs.mPar ); // Vector move assignment
            mTex = std::move( rhs.mTex ); // Texture move assignment
        }
        return *this;
    }
Reference collapsing, argument deduction and other arcane language rules make the actual implementation more involved:
    template< class T > inline
    typename std::remove_reference<T>::type&& move( T&& t ) noexcept {
        using ReturnType = typename std::remove_reference<T>::type&&;
        return static_cast< ReturnType >( t );
    }
Speaker notes: Inside operator=, rhs is an lvalue! std::move generates absolutely zero code, and it has a great name.
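The note that rhs is an lvalue inside the function is easy to miss, so here is a minimal sketch that makes it observable. Probe, Which, PassDirectly and PassWithMove are hypothetical names introduced for this illustration; only std::move comes from the standard library.

```cpp
#include <cassert>
#include <string>
#include <utility>

struct Probe {};   // hypothetical empty type, used only to observe overload selection

std::string Which( Probe& )  { return "lvalue"; }
std::string Which( Probe&& ) { return "rvalue"; }

// rhs is declared as Probe&&, but the expression `rhs` has a name,
// so it is an lvalue and the Probe& overload is chosen.
std::string PassDirectly( Probe&& rhs ) { return Which( rhs ); }

// std::move casts the named parameter back to an rvalue,
// so the Probe&& overload is chosen.
std::string PassWithMove( Probe&& rhs ) { return Which( std::move( rhs ) ); }

void RunMoveDemo()
{
    assert( PassDirectly( Probe() ) == "lvalue" );
    assert( PassWithMove( Probe() ) == "rvalue" );
}
```

This is why the move assignment operator has to write std::move( rhs.mPar ) rather than rhs.mPar: without the cast, the member's copy assignment would be selected.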
Move assignment
    ParticleSystem& operator=( ParticleSystem&& rhs ) {
        if( this != &rhs ) {
            mPar = std::move( rhs.mPar ); // Vector move assignment
            mTex = std::move( rhs.mTex ); // Texture move assignment
        }
        return *this;
    }
    std::vector<T>& operator=( std::vector<T>&& rhs ) {
        if( this != &rhs ) {
            DestroyRange( mpFirst, mpLast ); // call all dtors
            if( mpFirst != nullptr ) free( mpFirst );
            mpFirst = rhs.mpFirst; // eviscerate
            mpLast = rhs.mpLast;
            mpEnd = rhs.mpEnd;
            // rhs now empty shell
            rhs.mpFirst = rhs.mpLast = rhs.mpEnd = nullptr;
        }
        return *this;
    }
    Texture& Texture::operator=( Texture&& rhs ) {
        if( this != &rhs ) {
            if( mpBits != nullptr ) free( mpBits );
            mpBits = rhs.mpBits; // eviscerate
            mSize = rhs.mSize;
            rhs.mpBits = nullptr; // clear rhs
        }
        return *this;
    }
    // Standard assignment operator
    Texture& Texture::operator=( const Texture& rhs ) {
        if( this != &rhs ) {
            if( mpBits != nullptr ) free( mpBits );
            mSize = rhs.mSize;
            mpBits = static_cast<unsigned long*>( malloc( mSize ) ); // malloc returns void*; C++ needs the cast
            memcpy( mpBits, rhs.mpBits, mSize );
        }
        return *this;
    }
Speaker notes: We can safely eviscerate rhs; it's a temporary object that's about to go away. Notice why it's important that rhs be non-const: modifying the right-hand side is normally taboo, but it is critical for moves. The rhs destructor will still be called, so rhs needs to be left in a safe, sensible state. Imagine that the texture library were a third-party library, out of our control and not move-enabled.
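Halftime recap
Use rvalue reference semantics to enable moves
Use non-const rvalue references: rhs
std::move tells the compiler: "this really is an rvalue"
The binding rules allow gradual conversion
  Implement rvalue references as you go
  Start with low-level libraries
  Or start with high-level code; it's your choice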
Performance revisited
operator=( const ParticleSystem& ) compared with operator=( ParticleSystem&& ):
    Particles    Copy (ticks)    Move (ticks)
    1                   6,113           1,019
    10                  7,201           1,100
    100                 5,543             968
    1,000               8,579           1,200
    10,000             56,614             865
    100,000           635,962             993
    1,000,000       6,220,013           1,173
Speaker notes: Measured the cost of operator=(&) compared to operator=(&&). The move is always faster. But wait, we haven't even talked about move construction or perfect forwarding.
Move constructors
    ParticleSystem::ParticleSystem( ParticleSystem&& rhs ) :
        // invoke member move ctors
        mPar( std::move( rhs.mPar ) ),
        mTex( std::move( rhs.mTex ) )
    { }
    vector<T>::vector( vector<T>&& rhs ) :
        mpFirst( rhs.mpFirst ), // eviscerate
        mpLast ( rhs.mpLast ),
        mpEnd  ( rhs.mpEnd )
    {
        // rhs now an empty shell
        rhs.mpFirst = rhs.mpLast = rhs.mpEnd = nullptr;
    }
    Texture::Texture( Texture&& rhs ) :
        mSize( rhs.mSize ),     // initializers listed in member declaration order
        mpBits( rhs.mpBits )    // eviscerate
    {
        // rhs now an empty shell
        rhs.mpBits = nullptr;
    }
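Putting the Texture fragments from the slides together gives a class that can be compiled and exercised. This is a sketch under assumptions, not the talk's code: the sizing constructor and the destructor are additions the slides do not show, copy assignment is written via a temporary rather than free-then-malloc, mSize is also reset in the moved-from object, and allocation failure is not handled.

```cpp
#include <cassert>
#include <cstdlib>
#include <cstring>
#include <utility>

struct Texture
{
    unsigned long  mSize;    // buffer size in bytes
    unsigned long* mpBits;   // owned buffer, or nullptr when empty

    // Assumed constructor (not on the slides): allocates a zeroed buffer.
    explicit Texture( unsigned long size = 0 )
        : mSize( size ),
          mpBits( size ? static_cast<unsigned long*>( std::calloc( 1, size ) ) : nullptr )
    { }

    // Copy constructor: deep copy, O(mSize).
    Texture( const Texture& rhs )
        : mSize( rhs.mSize ),
          mpBits( rhs.mSize ? static_cast<unsigned long*>( std::malloc( rhs.mSize ) ) : nullptr )
    {
        if( mpBits != nullptr )
            std::memcpy( mpBits, rhs.mpBits, mSize );
    }

    // Move constructor: steal the buffer, leave rhs as an empty shell.
    Texture( Texture&& rhs ) noexcept
        : mSize( rhs.mSize ), mpBits( rhs.mpBits )
    {
        rhs.mSize  = 0;
        rhs.mpBits = nullptr;
    }

    // Copy assignment: copy into a temporary first, then swap it in.
    Texture& operator=( const Texture& rhs )
    {
        if( this != &rhs )
        {
            Texture tmp( rhs );
            std::swap( mSize,  tmp.mSize );
            std::swap( mpBits, tmp.mpBits );
        }
        return *this;
    }

    // Move assignment: release our buffer, then eviscerate rhs.
    Texture& operator=( Texture&& rhs ) noexcept
    {
        if( this != &rhs )
        {
            std::free( mpBits );
            mpBits = rhs.mpBits;
            mSize  = rhs.mSize;
            rhs.mpBits = nullptr;   // rhs: safe state for its destructor
            rhs.mSize  = 0;
        }
        return *this;
    }

    ~Texture() { std::free( mpBits ); }   // free( nullptr ) is a no-op
};
```

After a move, the source object holds a null pointer and a size of zero, so its destructor has nothing to release and the buffer is freed exactly once.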
The perfect forwarding problem
Suppose we have some setter functions
    void ParticleSystem::SetTexture( const Texture& texture ) {
        mTex = texture; // We'd like to move if texture is a temporary
    }
    void ParticleSystem::SetTexture( Texture&& texture ) {
        mTex = std::move( texture ); // Move
    }
    void ParticleSystem::Set( const A& a, const B& b ) {
        // Uh-oh, we need three new overloads...
    }
Speaker notes: This gets complicated if you have lots of setters, setters with multiple parameters, or setters with template parameters, some of which are movable (rvalues) and some of which are not.
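Solving it with function templates and rvalue references
A powerful new rule in C++11. Given:
    template< typename T > void f( T&& t ); // template function
a template rvalue reference parameter can bind to anything
    Reference type    rvalue    const rvalue    lvalue    const lvalue    Priority
    template T&&      yes       yes             yes       yes             highest
    T&&               yes
    const T&&         yes       yes
    T&                                          yes
    const T&          yes       yes             yes       yes             lowest
Speaker notes: Remember the binding rules (the green/red table); this is a new rule. Even better, the value category and constness are preserved. If the argument is an lvalue of type X, T is deduced as [const] X& and the parameter collapses to [const] X&, so we instantiate the lvalue-reference version of f. If the argument is an rvalue of type X, T is deduced as [const] X and the parameter is [const] X&&. Awesome. With a single function, we always get the right results. We simply need to pass arguments as template parameters. This is called perfect forwarding.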
Binding rvalue reference template parameters
Example:
    template< typename T > void f( T&& t ); // template function
    int a;
    const int ca = 42;
    f( a );                // instantiates f( int& );
    f( ca );               // instantiates f( const int& );
    f( StartExplosion() ); // instantiates f( ParticleSystem&& );
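The instantiations listed on this slide can be verified with type traits. The sketch below is an illustration only: BindsAsLvalueRef and BindsAsConst are hypothetical helpers, ParticleSystem is an empty stand-in, and the static_asserts spell out the reference collapsing rules that make the deduction work.

```cpp
#include <cassert>
#include <type_traits>

struct ParticleSystem {};                          // empty stand-in
ParticleSystem StartExplosion() { return ParticleSystem(); }

// For an lvalue argument T is deduced as X& (or const X&);
// for an rvalue argument T is deduced as plain X.
template< typename T >
constexpr bool BindsAsLvalueRef( T&& )
{
    return std::is_lvalue_reference<T>::value;
}

// True when the deduced type refers to a const object.
template< typename T >
constexpr bool BindsAsConst( T&& )
{
    return std::is_const<typename std::remove_reference<T>::type>::value;
}

// Reference collapsing: & combined with && gives &, && combined with && gives &&.
using IntLref = int&;
using IntRref = int&&;
static_assert( std::is_same<IntLref&&, int&>::value,  "& && collapses to &" );
static_assert( std::is_same<IntRref&&, int&&>::value, "&& && collapses to &&" );

void RunDeductionDemo()
{
    int a = 0;
    const int ca = 42;
    assert(  BindsAsLvalueRef( a )  && !BindsAsConst( a ) );   // f( int& )
    assert(  BindsAsLvalueRef( ca ) &&  BindsAsConst( ca ) );  // f( const int& )
    assert( !BindsAsLvalueRef( StartExplosion() ) );           // f( ParticleSystem&& )
}
```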
Perfect forwarding
    template< typename T >
    void ParticleSystem::SetTexture( T&& texture ) {
        mTex = std::forward<T>( texture ); // invokes right overload
    }
std::forward<T> is equivalent to
    static_cast<[const] T&&>(t) when t is an rvalue
    static_cast<[const] T&>(t) when t is an lvalue
    template< class T > inline T&& // typical std::forward implementation
    forward( typename std::remove_reference<T>::type& t ) noexcept {
        return static_cast<T&&>( t );
    }
Speaker notes: Remember that SetTexture function. Inside it, texture is an lvalue. std::forward preserves constness and lets you copy lvalues and move rvalues. The name states the intent, and absolutely no code is generated. Now we've seen an example of a "perfect setter", but that's not all perfect forwarding is good for.
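To see the perfect setter choose between copy and move, the Texture below is replaced by a hypothetical counting stand-in whose assignment operators only record which one ran. This is a minimal sketch of the idiom, not the talk's classes.

```cpp
#include <cassert>
#include <utility>

// Hypothetical stand-in: counts how it has been assigned to.
struct Texture
{
    int copies = 0;
    int moves  = 0;
    Texture() = default;
    Texture& operator=( const Texture& ) { ++copies; return *this; }
    Texture& operator=( Texture&& )      { ++moves;  return *this; }
};

struct ParticleSystem
{
    Texture mTex;

    // One setter for every value category: lvalues are copied, rvalues are moved.
    template< typename T >
    void SetTexture( T&& texture )
    {
        mTex = std::forward<T>( texture );
    }
};

void RunSetterDemo()
{
    ParticleSystem ps;
    Texture tex;

    ps.SetTexture( tex );               // lvalue: copy assignment
    assert( ps.mTex.copies == 1 && ps.mTex.moves == 0 );

    ps.SetTexture( Texture() );         // temporary: move assignment
    assert( ps.mTex.copies == 1 && ps.mTex.moves == 1 );

    ps.SetTexture( std::move( tex ) );  // explicitly moved lvalue: move assignment
    assert( ps.mTex.copies == 1 && ps.mTex.moves == 2 );
}
```

Replacing std::forward<T>( texture ) with plain texture would compile, but every call would then count as a copy, because the named parameter is an lvalue.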
Perfect constructors
Typical multi-argument constructor; doesn't handle rvalues:
    ParticleSystem::ParticleSystem( const std::vector<Particle>& par, const Texture& texture ) :
        mPar( par ),
        mTex( texture )
    { }
Perfect constructor; handles everything you throw at it:
    template< typename V, typename T >
    ParticleSystem::ParticleSystem( V&& par, T&& texture ) :
        mPar( std::forward<V>( par ) ),
        mTex( std::forward<T>( texture ) )
    { }
Speaker notes: We could add an overload to handle rvalues, but there's an even better solution.
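The same forwarding pattern can be checked for a two-argument constructor. In this sketch Part and Holder are hypothetical types: Part counts copy and move constructions, and Holder plays the role of ParticleSystem with two forwarded members.

```cpp
#include <cassert>
#include <utility>

// Hypothetical member type that counts how it is constructed.
struct Part
{
    static int copies;
    static int moves;
    Part() = default;
    Part( const Part& )     { ++copies; }
    Part( Part&& ) noexcept { ++moves; }
};
int Part::copies = 0;
int Part::moves  = 0;

struct Holder
{
    Part ma;
    Part mb;

    // Binds to any two arguments and forwards each with its own value category.
    template< typename A, typename B >
    Holder( A&& a, B&& b )
        : ma( std::forward<A>( a ) ),
          mb( std::forward<B>( b ) )
    { }
};

void RunPerfectCtorDemo()
{
    Part::copies = Part::moves = 0;
    Part named;
    Holder h( named, Part() );   // first argument copied, second moved
    (void)h;
    assert( Part::copies == 1 );
    assert( Part::moves  == 1 );
}
```

With the const-reference constructor from the first half of the slide, the same call would record two copies.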
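Implicit special member functions
    Function        Implicitly generated when
    Default ctor    No other constructor is explicitly declared
    Copy ctor       No move ctor or move assignment is explicitly declared
    Copy assign     No move ctor or move assignment is explicitly declared
    Move ctor       No copy ctor, copy assignment, move assignment or destructor is explicitly declared
    Move assign     No copy ctor, copy assignment, move ctor or destructor is explicitly declared
    Dtor            No destructor is explicitly declared
Rule of Three: if you define any one of the three member functions (copy ctor, copy assignment, destructor), define the other two as well
Rule of Two Moves: if you define either move function, define the other one as well
Speaker notes: Do the move ctor and move assignment ever get generated automatically by the compiler? There has been a fair amount of controversy regarding implicit move functions.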
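One row of the table is easy to trip over in practice: declaring a destructor suppresses the implicit move constructor, and "moves" then silently fall back to copies. The sketch below demonstrates this with hypothetical types (Tracker, Plain, WithDtor) that are not part of the talk.

```cpp
#include <cassert>
#include <utility>

// Hypothetical member that counts how it is constructed.
struct Tracker
{
    static int copies;
    static int moves;
    Tracker() = default;
    Tracker( const Tracker& )     { ++copies; }
    Tracker( Tracker&& ) noexcept { ++moves; }
};
int Tracker::copies = 0;
int Tracker::moves  = 0;

struct Plain    { Tracker t; };                  // no special members declared
struct WithDtor { Tracker t; ~WithDtor() { } };  // user-declared destructor

// Constructs one T from an rvalue T and lets Tracker record what happened.
template< typename T >
void MoveConstructOnce()
{
    T a;
    T b( std::move( a ) );
    (void)b;
}

void RunImplicitMoveDemo()
{
    Tracker::copies = Tracker::moves = 0;

    MoveConstructOnce<Plain>();     // implicit move ctor exists: member is moved
    assert( Tracker::moves == 1 && Tracker::copies == 0 );

    MoveConstructOnce<WithDtor>();  // no implicit move ctor: the copy ctor binds instead
    assert( Tracker::moves == 1 && Tracker::copies == 1 );
}
```

Adding `WithDtor( WithDtor&& ) = default;` (and, per the Rule of Two Moves, the matching move assignment) restores the move.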
Be explicit about implicit special functions
    struct ParticleSystem {
        std::vector< Particle > mPar; // Copyable/movable object
        Texture mTex;                 // Copyable/movable object
        // Ctors
        ParticleSystem() = delete;
        ParticleSystem( const ParticleSystem& ) = default;
        ParticleSystem( ParticleSystem&& ) = default;
        // Assign
        ParticleSystem& operator=( const ParticleSystem& ) = default;
        ParticleSystem& operator=( ParticleSystem&& ) = default;
        // Destruction
        ~ParticleSystem() = default;
    };
Speaker notes: = default and = delete are new notation in C++11. If you had a raw pointer in this class, all bets would be off.
C++11
STL containers are move-enabled
  Including std::string
STL algorithms are move-enabled
  Including sort, partition and swap
You get an immediate speed advantage from a simple recompile
    template< typename T >
    void swap( T& a, T& b ) {
        T tmp( std::move( a ) );
        a = std::move( b );
        b = std::move( tmp );
    }
Speaker notes: Remember that I said you couldn't move from lvalues? swap is a case where it is safe. Wicked fast, super cool. I've found that rvalue references are more challenging than they look. We're so used to copy semantics that move semantics take a while to click.
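The swap shown on the slide performs one move construction and two move assignments, and no copies. A minimal sketch that counts this with std::swap follows; Counted is a hypothetical instrumented type.

```cpp
#include <cassert>
#include <utility>

// Hypothetical type that counts every copy and move operation.
struct Counted
{
    static int copies;
    static int moves;
    Counted() = default;
    Counted( const Counted& )                { ++copies; }
    Counted( Counted&& ) noexcept            { ++moves; }
    Counted& operator=( const Counted& )     { ++copies; return *this; }
    Counted& operator=( Counted&& ) noexcept { ++moves;  return *this; }
};
int Counted::copies = 0;
int Counted::moves  = 0;

void RunSwapDemo()
{
    Counted::copies = Counted::moves = 0;
    Counted a, b;
    std::swap( a, b );              // tmp( move(a) ); a = move(b); b = move(tmp);
    assert( Counted::copies == 0 );
    assert( Counted::moves  == 3 );
}
```

For a deep type each of those three moves is a few pointer assignments, whereas the pre-C++11 swap made three deep copies.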
Recommended idioms: movable types
    struct Deep {
        Deep( const Deep& );             // Copy ctor
        Deep( Deep&& );                  // Move ctor
        template< typename A, typename B >
        Deep( A&&, B&& );                // Perfect forwarding ctor
        Deep& operator=( const Deep& );  // Copy assignment
        Deep& operator=( Deep&& );       // Move assignment
        ~Deep();
        template< typename A >           // Deep setters
        void SetA( A&& );
    };
Recommended idioms: raw pointers
Move constructor:
    T( T&& rhs ) :
        ptr( rhs.ptr )         // eviscerate
    {
        rhs.ptr = nullptr;     // rhs: safe state
    }
Move assignment:
    T& operator=( T&& rhs ) {
        if( this != &rhs ) {
            if( ptr != nullptr ) free( ptr );
            ptr = rhs.ptr;     // eviscerate
            rhs.ptr = nullptr; // rhs: safe state
        }
        return *this;
    }
Speaker notes: shared_ptr<T> is move-enabled.
Recommended idioms: higher-level objects
Move constructor:
    T( T&& rhs ) :
        base( std::move( rhs ) ), // base
        m   ( std::move( rhs.m ) ) // members
    { }
Move assignment:
    T& operator=( T&& rhs ) {
        if( this != &rhs ) {
            base::operator=( std::move( rhs ) ); // base
            m = std::move( rhs.m );              // eviscerate
        }
        return *this;
    }
Speaker notes: You don't have to "clear" rhs; the individual move operations will handle that.
Recommended idioms: perfect forwarding
Constructor:
    template< typename A, typename B >
    T( A&& a, B&& b ) :          // binds to any 2 params
        ma( std::forward<A>( a ) ),
        mb( std::forward<B>( b ) )
    { }
Setter:
    template< typename A >
    void SetA( A&& a )           // binds to anything
    {
        ma = std::forward<A>( a );
    }
Compilers and move support
Columns: Feature, Microsoft, GCC, Intel, Clang
    Rvalue references                   VS 2010    4.3    11.1    2.9
    STL move semantics
    nullptr                             4.6    12.1
    Variadic templates
    Defaulted and deleted functions     4.4    12.0    3.0
    noexcept
Detailed list: http://wiki.apache.org/stdcxx/C++0xCompilerSupport
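Takeaways
By overloading on rvalue references, you can branch at compile time on the condition that x is movable (x is a temporary)
You can implement the overloads incrementally
The benefits accrue to deep objects
Performance can improve significantly
Speaker notes: Now you know how to move-enable your game engine. Significantly here means orders of magnitude better.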
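Further research: topics not covered in this talk
xvalues, glvalues, prvalues
Emplacement (e.g. "placement insertion")
  Create elements within containers, with no moves or copies
  Uses perfect forwarding and variadic functions
Other cases where moving from lvalues is OK
Moves and exceptions
Perfect forwarding is not always so perfect
  e.g. integral and pointer types; also bitfields
noexcept and implicit move
Speaker notes: Consider std::move for the final use of deep locals.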
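Emplacement is the easiest of these topics to try out. The sketch below compares push_back of a temporary with emplace_back for a hypothetical instrumented Particle (two floats and counters, not the talk's Particle). Capacity is reserved up front so that reallocation does not add moves of its own.

```cpp
#include <cassert>
#include <vector>

// Hypothetical particle that counts constructions, copies and moves.
struct Particle
{
    static int ctors;
    static int copies;
    static int moves;
    float x, y;
    Particle( float px, float py ) : x( px ), y( py )        { ++ctors; }
    Particle( const Particle& rhs ) : x( rhs.x ), y( rhs.y ) { ++copies; }
    Particle( Particle&& rhs ) noexcept : x( rhs.x ), y( rhs.y ) { ++moves; }
};
int Particle::ctors  = 0;
int Particle::copies = 0;
int Particle::moves  = 0;

void RunEmplaceDemo()
{
    Particle::ctors = Particle::copies = Particle::moves = 0;

    std::vector<Particle> v;
    v.reserve( 4 );                          // avoid reallocation during the demo

    v.push_back( Particle( 1.0f, 2.0f ) );   // construct a temporary, then move it in
    assert( Particle::ctors == 1 && Particle::moves == 1 && Particle::copies == 0 );

    v.emplace_back( 3.0f, 4.0f );            // construct in place: no move, no copy
    assert( Particle::ctors == 2 && Particle::moves == 1 && Particle::copies == 0 );
}
```

emplace_back forwards its arguments to the element's constructor, which is the combination of perfect forwarding and variadic templates mentioned on the slide.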
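Best practices
Update to compilers that support rvalue references
Return by value is now reasonable: both readable and fast
Add move ctor, move assignment and setters to deep objects
Move idiom: this = rhs pointers; rhs pointers = null
Use non-const rvalue references
When moving, satisfy the moved-from object's invariants
Avoid returning const T; it inhibits move semantics
Be explicit about implicit special functions
Step through new move code to ensure correctness
Speaker notes: Invariants: a valid empty state, the same as default constructed.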
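The advice against returning const T can be demonstrated directly: a const rvalue cannot bind to T&&, so the copy assignment is chosen. The following is a minimal sketch with a hypothetical Deep type that counts assignments; MakeDeep and MakeConstDeep are illustrative factory functions.

```cpp
#include <cassert>

// Hypothetical deep type that records which assignment operator ran.
struct Deep
{
    int copyAssigns = 0;
    int moveAssigns = 0;
    Deep() = default;
    Deep( const Deep& )     { }
    Deep( Deep&& ) noexcept { }
    Deep& operator=( const Deep& )     { ++copyAssigns; return *this; }
    Deep& operator=( Deep&& ) noexcept { ++moveAssigns; return *this; }
};

Deep       MakeDeep()      { return Deep(); }   // non-const rvalue: movable
const Deep MakeConstDeep() { return Deep(); }   // const rvalue: cannot be moved from

void RunConstReturnDemo()
{
    Deep d;

    d = MakeDeep();        // binds to Deep&&: move assignment
    assert( d.moveAssigns == 1 && d.copyAssigns == 0 );

    d = MakeConstDeep();   // binds only to const Deep&: copy assignment
    assert( d.moveAssigns == 1 && d.copyAssigns == 1 );
}
```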
Thanks!
Contact me: pkisensee@msn.com
Home page: http://www.tantalon.com/pete.htm
Scott Meyers: http://www.aristeia.com
Stephan Lavavej: http://blogs.msdn.com
Dave Abrahams: http://cpp-next.com
Thomas Becker: http://thbecker.net
Marc Gregoire: http://www.nuonsoft.com
Please let me know what results you see once you have move-enabled your code
Speaker notes: We've come to the end of our crash course in move semantics and rvalue references. There are no books yet. Go forth and make your games faster.
Additional reference material
C++ standard references
N1610 (v0.1) 2004: http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2004/n1610.html
N2118 (v1.0) 2006: http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2006/n2118.html
N2844 (v2.0) 2009 (VC10 impl): http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2009/n2844.html
N3310 (sections 840, 847, 858) (v2.1) 2011 (VC11 impl): http://www.open-std.org/jtc1/sc22/wg21/docs/cwg_defects.html
N3053 (v3.0) 2010: http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2010/n3053.html
Rvalue references
Scott Meyers on move semantics and rvalue references: http://www.aristeia.com/TalkNotes/ACCU2011_MoveSemantics.pdf and http://skillsmatter.com/podcast/home/move-semanticsperfect-forwarding-and-rvalue-references
Scott Meyers on perfect forwarding (C++ and Beyond 2011)
Thomas Becker's explanation of rvalue references: http://thbecker.net/articles/rvalue_references/section_01.html
STL's blog: http://blogs.msdn.com/b/vcblog/archive/2009/02/03/rvalue-references-c-0x-features-in-vc10-part-2.aspx?PageIndex=3
Marc Gregoire's blog: http://www.nuonsoft.com/blog/2009/06/07/the-move-constructor-in-visual-c-2010/
New C++11 features in Visual Studio: http://blogs.msdn.com/b/vcblog/archive/2011/09/12/10209291.aspx
Mikael Kilpelainen on lvalues and rvalues: http://accu.org/index.php/journals/227
Moving from lvalues: http://cpp-next.com/archive/2009/09/move-it-with-rvalue-references
Binary operators: http://cpp-next.com/archive/2009/09/making-your-next-move/
Emplacement: http://stackoverflow.com/questions/4303513/push-back-vs-emplace-back