|
|
-
As the designer of base class, you may hesitate whether to use private or protect access control. Then, let's try the following examples:
1. Call protected member function
#include <cstdio>
class A { protected: void b() {printf("Oops!\n");} };
void f(A* a) { class A_hack:public A { friend void f(A*); }; static_cast<A_hack *>(a)->b(); }
class B { public: void f(A* a) { class A_hack:public A { friend B; }; static_cast<A_hack *>(a)->b(); } };
int main() { f(NULL); B().f(NULL); }
Although the result of the cast is undefined as stated in the standard, if no this pointer adjustment happens and the layout of A is the same as A_hack which are normally the case, the code will break the access control.
For the evil user, his purpose is archieved.
2. Call pure virtual function
class A { protected: virtual void Fun() =0; };
class B:public A { public: B() {Dummy();}
private: void Dummy() {Fun();} }; class C:public B { public: virtual void Fun() {} };
int main() { C c; }
Do you ever wonder how to call pure virtual function? Do you like to see what does _purecall in VC do? Just try the above code.
The problem here is because of the protect access of "Fun" in base class. Of course, you should not call virtual function in ctor (they will not have polymorphic behavior), but when you call Dummy in ctor, you may not realize the fact that Dummy will call virtual function internally. Then the disaster happens.
If what you want by using virtual is to allow the derived class to provide some specific behavior, you'd better declare the function as private.
Although both the above code are nonconformant, they at least show the possibility that your user can do more than what you expect for protect.
If you don't even have confidence with private (Like evil #define private public which definitely violates One Definition Rule), you'd better use the PImpl Idiom to implement your interface.
|
-
-
The previous post: Minimize the size of your program - high level
Do you know the smallest x86 PE file in the world? Here is the answer: http://www.phreedom.org/solar/code/tinype/
- Smallest possible PE file: 97 bytes
- Smallest possible PE file on Windows 2000: 133 bytes
- Smallest PE file that downloads a file over WebDAV and executes it: 133 bytes
BTW: You can use IDA Pro, WinDBG or OllyDbg to open and debug the tiny PE :P
The main idea of tiny PE is simple: Striping unnecessary bytes and collapsing data structure by overlapping. So the overhead of PE format will be minimized. Here, we are more concerned about how to decrease the size of our code.
To investigate the code generated by the compiler, you can use the /FAsc flag.
First, we can apply the technique used in tiny PE to collapse the PE Header. There're lots of unused fields left, so we can move our data into these fields to save space. Another benefit is that, the address difference of these data will be within 0x80, which is suitable for our further optimization of the code.
Now let's begin our optimization!
1. Algorithm
a. Local variable initialization
Normally, the initialization involves lots of bytes of code.
unsigned long keyT[4]={0xCBDCEDFE,0x8798A9BA,0x43546576,0x00102132};
The above code will be translated to assembly like:
mov DWORD PTR [ebp+offset8], imm32
It is 7 bytes. Then 7*4 = 28 bytes to initialize the whole array.
Instead, we can change the initialization into for-loop.
for (size_t i=0;i<16;++i) ((char *)keyT)[i]=0xFE-i*0x11; ((char *)keyT)[15]++;
The above code can be implemented as:
lea edx, [ebp+keyT-1] xor ecx, ecx mov cl, 16 mov al, 0FEh loc_4: inc edx mov [edx], al sub al, 11h loop loc_4 inc byte [edx]
28 -> 18 bytes!
b. memcpy
By default, the compiler will call the API memcpy in msvcrt.dll. But the cost is high. The alternative is to use intrinsic version of memcpy which takes advantage of "rep movs" instruction instead. If we take a closer look at the algorithm, it turns out that the code simply rotates keyT array t bytes left.
memcpy(Buf1,(char *)keyT+t,0x10-t); memcpy((char *)Buf1+0x10-t,keyT,t); memcpy(keyT,Buf1,16);
That means most of the parameter for memcpy can be reused, which will save lots of store/load.
;ebx == t, edi == end of keyT xor ecx, ecx mov cl, 10h sub cl, bl mov esi, edi sub esi, ecx lea edi, [ebp+Buf1] push edi rep movsb
mov cl, bl lea esi, [ebp+keyT] rep movsb
pop esi mov cl, 10h rep movsb
Thanks to the value reuse, the last two memcpy only use 12 bytes, and 35 bytes in total! In comparison, 54 bytes before optimization, not including more than 10 bytes for PE format overhead related to reference memcpy API
2. 8-bit vs 32-bit
In 32-bit environment, all the data and address are 32-bit. Then the 32-bit immediate numbers and long jump instructions can contribute to large portion of our code. We can try to make these value small enough, so they can be held in 8-bit data type. For example, when we call:
printf("%08X",keyT[k]);
The compiler will generate the code like this:
push hexstr+ImageBase ; "%08X" call [printf+ImageBase]
Because we arrange the positions of hexstr and printf within 80 bytes, we can use the following code instead:
mov eax, hexstr+ImageBase push eax add al, printf-hexstr call [eax]
Because printf uses cdecl calling convention, the caller is responsible to balance the stack. When we pop up the arguments from stack, we can reuse the pushed "eax" for "printf("\n");"
add al, newline-hexstr push eax add al, printf-newline call [eax]
22 -> 17 bytes!
We can also split the long jmp into two short jmps (5 -> 4 bytes)
3. Misc
- Prevent using 16-bit instruction.
- Use "loop" instruction to do the loop.
- Use enter/leave the setup stack.
- Use "lods/stos" to load and store from the memory.
- Use "pusha/popa" for multiple push/pop.
- Sometimes, the instruction for "eax" will be a little shorter than for other registers.
- Sometimes, 8-bit instruction will be a little shorter than the 32-bit version.
- Return value is not needed here, and OS doesn't require us to save the registers. So we can omit both the "xor eax, eax" at the end of the program and the "push/pop" instructions to save the register.
- Register allocation is also very important. If chosen properly, we can share their values and prevent many load/store.
- Take advantage of al, ah and alike. Otherwise the limited number of registers on X86 will be a big issue.
For more detailed information on how to optimize your code for size, please see the excellent optimization manual by Agner Fog.
BTW: Don't forget to set the "Debug Directory Size" in the PE Header to 0, otherwise the program will crash on WinXP. Because that data member overlaps with our code, that means we have to insert instructions like this into our code:
jmp short skip_debug times 4 db 0 ;ensure debug directory size is 0 skip_debug:
Here is the source (you can compile it using "nasmw -f bin") and the result:
| |
PE Header |
Code |
Data |
Total size |
| Before optimization |
480 |
272 |
144 |
896 |
| After optimization |
136 |
180 |
7 |
323 |
|
-
; based on tiny.asm, modified by xiangfan ;e_cblp 2b ;tbl 1b ;hexstr 1b ;LoaderFlags 4b BITS 32 ; ; MZ header ; ; The only two fields that matter are e_magic and e_lfanew mzhdr: dw "MZ" ; e_magic dw 0 ; e_cblp UNUSED ; ; PE signature ; pesig: dd "PE" ; e_cp UNUSED ; PE signature ; e_crlc UNUSED ; ; PE header ; pehdr: dw 0x014C ; e_cparhdr UNUSED ; Machine (Intel 386) dw 1 ; e_minalloc UNUSED ; NumberOfSections ;dd 0xC3582A6A ; e_maxalloc UNUSED ; TimeDateStamp UNUSED <0xC> start: jmp short main tbl: db 1,1,2,3,7,5,6,4,8 db 0 ;align 4 ;dd 0 ; e_sp UNUSED ; PointerToSymbolTable UNUSED ; e_csum UNUSED ;dd 0 ; e_ip UNUSED ; NumberOfSymbols UNUSED ; e_cs UNUSED ;==4 dw sections-opthdr ; e_lsarlc UNUSED ; SizeOfOptionalHeader dw 0x10F ; e_ovno UNUSED ; Characteristics ; ; PE optional header ; ; The debug directory size at offset 0x94 from here must be 0 filealign equ 4 sectalign equ 4 ; must be 4 because of e_lfanew %define round(n, r) (((n+(r-1))/r)*r) opthdr: printf_: dw 0x10B ; e_res UNUSED ; Magic (PE32) ;db 8 ; MajorLinkerVersion UNUSED ;db 0 ; MinorLinkerVersion UNUSED ;dw 0x29E db "pr" ; ; PE code section ; sections: ;dd 0 ; SizeOfCode UNUSED ; Name UNUSED ;dd 0 ; e_oemid UNUSED ; SizeOfInitializedData UNUSED ; e_oeminfo UNUSED db "intf",0 db 0,0,0 ;align dd codesize ; e_res2 UNUSED ; SizeOfUninitializedData UNUSED ; VirtualSize ;==C dd start ; AddressOfEntryPoint ; VirtualAddress dd codesize ; BaseOfCode UNUSED ; SizeOfRawData ;==C dd start ; BaseOfData UNUSED ; PointerToRawData ; ; Import table (array of IMAGE_IMPORT_DESCRIPTOR structures) ; ImageBase equ 0x400000 idata: dd ImageBase ; ImageBase ; PointerToRelocations UNUSED ; OriginalFirstThunk UNUSED ;==4 dd sectalign ; e_lfanew ; SectionAlignment ; PointerToLinenumbers UNUSED ; TimeDateStamp UNUSED ;==4 dd filealign ; FileAlignment ; NumberOfRelocations UNUSED ; ForwarderChain UNUSED ; NumberOfLinenumbers UNUSED dd msvcrt ; MajorOperatingSystemVersion UNUSED ; Characteristics UNUSED ; Name ; MinorOperatingSystemVersion UNUSED ; FirstThunk dd iat ; MajoirImageVersion UNUSED ; MinorImageVersion UNUSED dw 4 ; MajorSubsystemVersion ; OriginalFirstThunk UNUSED ;dw 0 ; MinorSubsystemVersion UNUSED ;dd 0 ; Win32VersionValue UNUSED ; TimeDateStamp UNUSED hexstr: db "%08X",0 db 0 ;align dd round(hdrsize, sectalign)+round(codesize,sectalign) ; SizeOfImage ; ForwarderChain UNUSED ;==8C dd round(hdrsize, filealign) ; SizeOfHeaders ; Name UNUSED dd 0 ; CheckSum UNUSED ; FirstThunk idatasize equ $ - idata dw 3 ; Subsystem (Win32 Console) ;dw 0 ; DllCharacteristics UNUSED sstr: db "%s" dd 0 ; SizeOfStackReserve dd 0 ; SizeOfStackCommit dd 0 ; SizeOfHeapReserve dd 0 ; SizeOfHeapCommit ;dd 0 ; LoaderFlags UNUSED dstr: db "%d",0 db 0 dd 2 ; NumberOfRvaAndSizes ; ; The DLL name should be at most 16 bytes, including the null terminator ; ; ; Data directories ; ; The debug directory size at offset 0x34 from here must be 0 scanf_: newline: db 10,0 ;dw 0x2AB db "scanf",0 ;dd 0 ; Export Table UNUSED ;dd 0 ;==38 dd idata - $$ ; Import Table hdrsize equ $ - $$ main: var_42C equ -42Ch var_28 equ -28h var_18 equ -18h ;var_14 equ -14h ;var_10 equ -10h ;var_C equ -0Ch ;var_8 equ -8 ;var_4 equ -4 enter 42Ch, 0 push esp ;[ebp+42C] ;push dstr+ImageBase ; "%d" ;call [scanf+ImageBase] mov eax, dstr+ImageBase push eax add al, scanf-dstr call [eax] pop eax ;dstr pop ecx ;pushad ;ebx,esi,edi mov edi, [ecx] test edi, edi jle short loc_4002D0 ;times 4 db 0 add al, newline-dstr loc_400235: ; CODE XREF: start+100j ;lea esi, [esp] ;pushad==20h push esp add al, sstr-newline push eax nop jmp short skip_debug iat: printf: dd printf_ scanf: dd scanf_ ;dd 0 times 4 db 0 ;ensure debug directory size is 0 skip_debug: add al, scanf-sstr call [eax] pop esi ; "%s" pop esi ;push sstr+ImageBase ; "%s" ;call [scanf+ImageBase] ;pop ecx ;pop ecx ;mov dword [ebp+var_18], 0CBDCEDFEh ;mov dword [ebp+var_14], 8798A9BAh ;mov dword [ebp+var_10], 43546576h ;mov dword [ebp+var_C], 102132h lea edx, [ebp+var_18-1] ;xor ecx, ecx xchg eax, ecx mov cl, 16 mov al, 0FEh loc_4: inc edx mov [edx], al sub al, 11h loop loc_4 ;mov byte [edx], 0 inc byte [edx] loc_400272: ; CODE XREF: start+D0j ;movsx ebx, byte [esi] ;inc esi lodsb movsx ebx, al test ebx, ebx jz short loc_4002DA pusha ;edi,esi and al, 0Fh xchg eax, ebx ;mov eax, ebx cdq ;xor ecx, ecx ;ecx is alway =0 here mov cl, 9 idiv ecx mov cl, 4 test edx, edx lea edi, [ebp+var_18] jl short loc_40028F mov dl, byte [edx+tbl+ImageBase] loc_40028F: ; CODE XREF: start+9Aj mov eax, edx imul eax, dword [edi] add eax, 0FBFCh stosd loop loc_40028F ;xor ecx, ecx ;and bl, 0Fh mov cl, 10h sub cl, bl mov esi, edi sub esi, ecx lea edi, [ebp+var_28] push edi rep movsb mov cl, bl lea esi, [ebp+var_18] rep movsb ;lea esi, [ebp+var_28] pop esi mov cl, 10h rep movsb popa jmp short loc_400272 loc_4002D0: jmp short loc_400311 loc_4002D2: jmp short loc_400235 ; --------------------------------------------------------------------------- loc_4002DA: ; CODE XREF: start+6Cj lea esi, [ebp+var_18] mov bl, 4 loc_4002E0: ; CODE XREF: start+EBj lodsd push eax ;push hexstr+ImageBase ; "%08X" ;call [printf+ImageBase] mov eax, hexstr+ImageBase push eax add al, printf-hexstr call [eax] pop eax dec ebx ;ebx==0 after 2DA pop ecx jnz short loc_4002E0 ;push newline+ImageBase ; "\n" ;call [printf+ImageBase] add al, newline-hexstr push eax add al, printf-newline call [eax] dec edi pop eax jnz loc_4002D2 loc_400311: ; CODE XREF: start+1Ej ;popad ;xor eax, eax leave retn msvcrt: db "msvcrt", 0 codesize equ $ - start filesize equ $ - $$
|
-
Nowadays, the size of the program is no longer a big issue. But it is still very interesting to write program with extremely small size.
Here, I'll limit the discussion on X86 + PE format to simplify the analysis. And I choose VC9 as our C++ compiler. The target program reads the number of queries first, then calculates the digest for each query and outputs the result.
If you use the default setting to compile the code, you will get an executable of size 56320 bytes.
cl /c test.cpp link test.obj
That is small, right? But most of the code in the executable is not yours. By default, the compiler will statically link to the CRT library, so they will reside in your program. We'd better get rid of these stuff. Let's add /MD flag to cl.exe. WOW! test.exe is only 5632 bytes now! But that is not the end of the story, we can decrease the size even further.
If you check the file generated, you can still find lots of stuff not belong to you. For example, the entry point procedure "start" is added "magically" by the compiler to initialize the CRT (you can check the source code of this function in Microsoft Visual Studio 9.0\VC\crt\src\crt0.c). In our program, these work can be safely ignored. We can pass /entry:"main" flag to tell the linker to use our main function as the entry point directly. Unfortunately, you will get the following linker errors if you do that:
MSVCRT.lib(gs_report.obj) : error LNK2019: unresolved external symbol __imp__TerminateProcess@8 referenced in function ___report_gsfailure MSVCRT.lib(gs_report.obj) : error LNK2019: unresolved external symbol __imp__GetCurrentProcess@0 referenced in function ___report_gsfailure MSVCRT.lib(gs_report.obj) : error LNK2019: unresolved external symbol __imp__UnhandledExceptionFilter@4 referenced in function ___report_gsfailure MSVCRT.lib(gs_report.obj) : error LNK2019: unresolved external symbol __imp__SetUnhandledExceptionFilter@4 referenced in function ___report_gsfailure MSVCRT.lib(gs_report.obj) : error LNK2019: unresolved external symbol __imp__IsDebuggerPresent@0 referenced in function ___report_gsfailure
Don't worry. That is what /GS flag does, and we will discuss it later. Here we simply pass kernel32.lib to link.exe to resolve the failure. The size of test.exe is now 3072 bytes.
/GS is used to detect buffer overrun, and is on by default. This flag is very useful, but not suitable when size is an issue. So we turn it off by passing /GS- to cl.exe and test.exe decrease to 2560 bytes!
There are no redundant code now, and we can focus on other stuff in our program. In PE format, different sections are used to store different kinds of data. For example, VC store code into ".text" and data into ".data". Each section will align to the 512-byte boundary in the file. That is why all the previous size of test.exe are multiply of 512. We can merge them to save the space. These flags for link.exe are what we need: /MERGE:.rdata=.text /MERGE:.data=.text. We will then have test.exe of 1536 bytes!
With /O1 for cl.exe (minimize size), we can decrease the size to 1024 bytes. After loosing the section alignment requirement to 4 by linker flag /ALIGN:4, we achieve the size of 896.
All of these tunings are high level, and finally we decrease the size of the original exe by more than 98%!
cl /O1 /MD /GS- /c test.cpp link test.obj /entry:"main" /MERGE:.rdata=.text /MERGE:.data=.text /ALIGN:4
BTW: If the main is completely empty, the above flag setting can generate a program of only 468 bytes. That is the lower bound of high level size optimization. To go even further, we have to resort to low level approach. That will be described in the next post.
The program:
#include <cstdio> #include <cstring>
static unsigned char Tbl[9]={1,1,2,3,7,5,6,4,8};
int main() { unsigned char BufT[0x401]; int n; scanf("%d",&n); for (;n>0;--n) { scanf("%s",BufT);
unsigned long keyT[4]={0xCBDCEDFE,0x8798A9BA,0x43546576,0x00102132}; unsigned char *Buf=BufT; int tt; while (tt=*Buf++) { { int t=Tbl[tt%9]; for (int k=0;k<4;++k) keyT[k]=keyT[k]*t+0xFBFC; } unsigned long Buf1[4]; int t=tt&0xF; memcpy(Buf1,(char *)keyT+t,0x10-t); memcpy((char *)Buf1+0x10-t,keyT,t); memcpy(keyT,Buf1,16); } for (size_t k=0;k<4;++k) printf("%08X",keyT[k]); printf("\n"); } return 0; }
|
-
Obfuscation is widely used to protect your code from reverse engineering. Here is one example which takes advantage of indirected call and opcode overlap in X86: __declspec(naked) void Fun1() { __asm { //obfuscation chunk call LABEL1 LABEL1: pop eax add eax, 6 jmp eax //real function body ret } } __declspec(naked) void Fun2() { __asm { //obfuscation chunk call LABEL1 LABEL1: pop eax add eax, 7 /* jmp -1 jmp eax (merged with the previous instruction) 0xEB 0xFF 0xFF 0xE0 -> 0xEB 0xFF 0xE0 */ __emit 0xEB __emit 0xFF __emit 0xE0
//real function body ret } }
int main() { #if 0 Fun1(); #else Fun2(); #endif } In Fun1, the obfuscation chunk do the following work: - Get the current EIP (LABEL1)
- Adjust the EIP to the target function
- Jump to the target
In Fun2, it merges the two jump instructions which will confuse the disassembler. To make the analysis even harder, we can rely on the fact that “add eax, 7” (assume the real function body is after our obfuscation chunk) will never overflow, and replace “jmp –1” with “jno -1”. Or we can use more complicated arithmetic to compute the target address. Other kinds of obfuscations include: - API Call Obfuscation (LoadLibrary+Encrypted Parameter)
- Data Encryption (String Literal)
- Code CheckSum + AntiDebugger
- Dynamic Data Extraction
|
-
Playing with the compiler is interesting. One challenge for the compiler writer is compilation performance. There're many kinds of C++ code which are the nightmare for the compiler. Now let's go!
1. Preprocess
a. Self Inclusion (GCC only) Normally, the compiler will limit the nest level of inclusion. In order to challenge the compiler, you have to stop the inclusion at appropriate stage. GCC provides a useful macro __INCLUDE_LEVEL__ to do the work. (We can write similar code for other compiler, although less elegant)
#if __INCLUDE_LEVEL__<199 #include __FILE__ #include __FILE__ #endif
b. Macro Expansion Explosion By using macro, the code after the preprocess may reach O(2n). For example:
#define F1(x) x,x #define F2(x) F1(x),F1(x) #define F3(x) F2(x),F2(x) #define F4(x) F3(x),F3(x) #define F5(x) F4(x),F4(x) #define F6(x) F5(x),F5(x) #define F7(x) F6(x),F6(x) #define F8(x) F7(x),F7(x) #define F9(x) F8(x),F8(x) #define G1(x) F9(x),F9(x) #define G2(x) G1(x),G1(x) #define G3(x) G2(x),G2(x) #define G4(x) G3(x),G3(x) #define G5(x) G4(x),G4(x) #define G6(x) G5(x),G5(x) #define G7(x) G6(x),G6(x) #define G8(x) G7(x),G7(x) #define G9(x) G8(x),G8(x)
int main() { G9(1); }
Of course, different compilers handle this kind of overflow differently. GCC will fail directly (when all of the resource are eaten up), and VC will give ICE (Internal Compiler Error).
2. Template
a. Nesting Similarly, there're nest level limit for template, but we can easily deceive the compiler. Some version of GCC will enter infinite loop when compiles the following code:
#include <cstddef> template <class T> struct Test { static const size_t Value=Test<Test<T> >::Value; };
VC is wise enough to stop when the nest level limit is exceeded in the above code. But we can use one of VC's bug (you can also treat it as an extension) to write template which requires O(na) compilation time:
#include <cstddef>
#define INNER(A3,N3,A2,N2) \ template<size_t N3>\ struct A3\ {\ enum {N=A3<N3-1>::N+1};\ };\ template<>\ struct A3<0>\ {\ enum {N=A2<N2-1>::N};\ };
#define OUTER(A2,N2,A1,N1,A3,CONTENT) \ template<size_t N2>\ struct A2\ {\ CONTENT\ \ enum {N=A3<N2>::N};\ };\ template<>\ struct A2<0>\ {\ enum {N=A1<N1-1>::N};\ };
#define LEVEL2(a,b,c) INNER(A##b,N##b,A##a,N##a) #define LEVEL3(a,b,c) OUTER(A##b,N##b,A##a,N##a,A##c,LEVEL2(a##1,b##1,c##1)) #define LEVEL4(a,b,c) OUTER(A##b,N##b,A##a,N##a,A##c,LEVEL3(a##1,b##1,c##1)) #define LEVEL5(a,b,c) OUTER(A##b,N##b,A##a,N##a,A##c,LEVEL4(a##1,b##1,c##1))
template<size_t N1> struct A1 { LEVEL5(1,11,111) enum {N=A11<N1>::N}; }; template<> struct A1<0> { enum {N=0}; };
int main() { return A1<20>::N; }
The interesting thing is that, the standard doesn't allow this kind of specialization in template class. Is this a bug or extension of VC? A member or a member template may be nested within many enclosing class templates. In an explicit specialization for such a member, the member declaration shall be preceded by a template<> for each enclosing class template that is explicitly specialized.
Based on similar idea, we can write portable version for both GCC and VC.
#define INNER(A3,N3) \ template<class N3>\ struct A3\ {\ static void f() {}\ };
#define OUTER(A2,N2,A1,CONTENT) \ template<class N2>\ struct A2\ {\ CONTENT\ \ static void f() {\ A1<char>::f();\ A1<short>::f();\ A1<int>::f();\ A1<long>::f();\ A1<unsigned char>::f();\ A1<unsigned short>::f();\ A1<unsigned int>::f();\ A1<unsigned long>::f();\ }\ };
#define LEVEL2(a,b) INNER(A##a,T##a) #define LEVEL3(a,b) OUTER(A##a,T##a,A##b,LEVEL2(a##1,b##1)) #define LEVEL4(a,b) OUTER(A##a,T##a,A##b,LEVEL3(a##1,b##1)) #define LEVEL5(a,b) OUTER(A##a,T##a,A##b,LEVEL4(a##1,b##1)) #define LEVEL6(a,b) OUTER(A##a,T##a,A##b,LEVEL5(a##1,b##1)) #define LEVEL7(a,b) OUTER(A##a,T##a,A##b,LEVEL6(a##1,b##1)) #define LEVEL8(a,b) OUTER(A##a,T##a,A##b,LEVEL7(a##1,b##1))
LEVEL8(1,11)
int main() { A1<int>::f(); }
b. Multiple inheritance C++ allows multiple inheritance, it can be used to hang the compiler. Try the following "Fibonacci" inheritance:
template<size_t N> class Evil:virtual public Evil<N-1>,virtual public Evil<N-2> { public: virtual ~Evil() {} };
template<> class Evil<1> { }; template<> class Evil<0> { };
int main() { Evil<100> evil; }
c. OLE (Output limit exceeded) Normally, the compile time for template is O(n), but in contrast, the amount of error message will be O(n2). Taking advantage of this, you can make the error message "Output Limit Exceeded". For example:
#define ClassName A
template <int N> class ClassName { enum {Value=ClassName<N-1>::Value}; };
int main() { int n=ClassName<0>::Value; }
If I change "ClassName" to a very long name (which is widely supported by modern compilers), the error message will be easily OLE. PS: The above code will cause ICE in VC8. This is the bug of VC8, and VC9 is happy with the code.
|
-
In "Effective C#", Bill Wagner says "Always create an override of ValueType.Equals() whenever you create a value type". His main consideration is the performance, because reflection is needed to compare two value types memberwisely.
In fact, the framework provides optimization for "simple" value type. Let's find out the magic.
Here is the code disassembled by Reflector:
bool Equals(object obj) { //Compare type
object a = this; if (CanCompareBits(this)) { return FastEqualsCheck(a, obj); }
//Compare using reflection }
The magic is the two functions "CanCompareBits" and "FastEqualsCheck". They both have attribute "[MethodImpl(MethodImplOptions.InternalCall)]", which indicates that their implementations are in native dlls.
Walkthrough the source of rotor, you can find that these functions are in "clr\src\vm\comutilnative.cpp"
The comment of CanCompareBits says "Return true if the valuetype does not contain pointer and is tightly packed". And FastEqualsCheck use "memcmp" to speed up the comparison.
Then you may think you can safely rely on this optimization and stick to the default ValueType.Equals implementation. But wait a minute. Do you find anything wrong in CanCompareBits?
The problem is that the condition in the comment doesn't ensure the bitwise comparison to work.
Imagine you have a structure which only contains a float. What will occur if one contains +0.0, and the other contains -0.0? They should be the same, but the underlying binary representation are different. If you nest other structure which override the Equals method, that optimization will also fail.
This should be a bug. But it also provide another reason for why you should always implement your own instance Equals for ValueType.
|
-
Someone may wonder what the difference is between Debug and Release mode, and whether it is possible to mix them.
Here is one example: http://forums.msdn.microsoft.com/en-US/vcgeneral/thread/775ce067-b225-4141-8b86-2d7e9b61db97/
syperk said: "As a result, I've switched to compiling my debug builds using the /MD switch. If you do this, you also have to undefine the _DEBUG preprocessor macro. Otherwise you get lots of errors about _CrtDebugReportW() not found. This function is apparently inserted into your code if _DEBUG is defined, but is only defined in the debug CRT. As long as optimization is off it seems to be perfectly possible to debug programs compiled this way, and there are no concerns about CRT incompatibility.".
But that is not true. There are many problems. In my opion, the most important one is that it is not a good idea to link different modules compiled with different compiler options in C++, especially when conditional compilation is involved.
In fact, "Debug" & "Release" are only two sets of predefined compiler flags and macros definitions provided by the IDE (_DEBUG and NDEBUG are two representing macros). The compiler doesn't aware of that (but it has some magic behind compile option "MD" and "MDd"). The main problem is that program compiled using "Debug" setting and the release runtime don't share the same compiler flags, and they are highly possible to be incompatible.
For example, if you change "MDd" -> "MD" in your debug configuration, and compile the following code (don't remove the _DEBUG macro, otherwise the "Debug" version is no longer a debug version (/MDd will define this macro implicitly, and IDE will explicitly define it via compiler flag)) :
#include <iostream> #include <string> using namespace std; int main() { std::string str; return 0; }
You'll find that the program function properly. If you check the generated exe, you'll find that it refer to two dlls, one is msvcp90d.dll, the other is msvcr90.dll. Oops! You're mixing the debug and release runtime libraries! And that hide the potential incompatibility!
The magic is here (in use_ansi.h):
#ifdef _DEBUG #pragma comment(lib,"msvcprtd") #else /* _DEBUG */ #pragma comment(lib,"msvcprt") #endif /* _DEBUG */
If you disable this trick, and link to msvcp90.dll, you'll get unresolved symbol error, that is because the implementation of string class is different in "Debug" and "Release" mode. This is one example of the incompatibility.
error LNK2019: unresolved external symbol "__declspec(dllimport) public: __thiscall std::basic_string<char,struct std::char_traits<char>,class std::allocator<char> >::basic_string<char,struct std::char_traits<char>,class std::allocator<char> >(struct std::basic_string<char,struct std::char_traits<char>,class std::allocator<char> >::_Has_debug_it)" (__imp_??0?$basic_string@DU?$char_traits@D@std@@V?$allocator@D@2@@std@@QAE@U_Has_debug_it@01@@Z) referenced in function _main
It's fortunate that this is a linker error. Otherwise, you'll waste lots of time in debugging to find out the subtle incompatibility.
STL in VC9 devotes lots of efforts to prevent break like this, then you can disable macros like "_HAS_ITERATOR_DEBUGGING" as you like, and keep your code compatible with the runtime (which is compiled with the macro enabled). But it only applies when you stick to debug or stick to release. It is really tricky to guarantee that, and I think you should never rely on this.
|
-
In VC STL, there are two macros called "_HAS_IMMUTABLE_SETS" and "_HAS_STRICT_CONFORMANCE" which are defined in yvals.h. They are related to some defects in the C++ standard 2003.
1. _HAS_IMMUTABLE_SETS will influence the constness of set::iterator.
Associative containers require its elements to be sorted, so modify them will be dangerous. But the standard doesn't prohibit the programmer to do that.
In C++0x, the following statement is added for associative containter: For associative containers where the value type is the same as the key type, both iterator and const_iterator are constant iterators (In the standard, it is said: Constant iterators do not satisfy the requirements for output iterators, and the result of the expression *i (for constant iterator i) cannot be used in an expression where an lvalue is required.) That means you cannot modify the value of the elements using iterator any more.
Discussion: http://www.cpptalk.net/image-vp135370.html
2. _HAS_STRICT_CONFORMANCE will influence two places:
a. For set, multiset, map and multimap
void erase(const_iterator position); size_type erase(const key_type& x); void erase(const_iterator first, const_iterator last);
->
iterator erase(const_iterator position); size_type erase(const key_type& x); iterator erase(const_iterator first, const_iterator last);
In C++03, erase in associative containers will not return the iterator, which is incompatible with sequence containers. It is fixed in C++0x VC's extension provides the return value for erase. _HAS_STRICT_CONFORMANCE will disable this extension.
Discussion: http://www.tech-archive.net/Archive/VC/microsoft.public.vc.stl/2005-10/msg00030.html Further Discussion: http://www.open-std.org/jtc1/sc22/wg21/docs/lwg-defects.html#103
b. For codecvt::do_length(stateT& state, const externT* from, const externT* from_end, size_t max) const;
In C++03, codecvt<wchar_t, char, mbstate_t> and codecvt<char, char, mbstate_t> should return the lesser of max and (from_end-from). This is weakened in C++0x. Only codecvt<char, char, mbstate_t> is required to behave like that.
BTW: VC's STL implementation is provided by Dinkumware. You can see "P.J. Plauger" at the end of each STL file.
|
-
There are many useful online c++ compilation service providers. You can use them to do conformance test and portability test without purchasing and installing various kinds of compilers.
The following two sites are the most famous:
Online Judge systems are also good places to do the test:
You may be curious about the compiler version and compile options they use to compile your code. Some sites will show you the information. But some won't. Then you're on your own way to discover the secret.
First, let's check some tricks to ask the compiler to tells us some useful information
1. Output macro In C++, string literal can't be used as template argument. Most of the compilers will include the content of the string literal when they generate the error message. The following code snippet shows the magic.
template<const char *> class A {}; #define F0(x) #x #define F1(x) F0(x) #define V1 F1(__GNUC__) int main() { A<V1> a1; }
Here, macro F0 and F1 are used to convert the integral into string. The compiler will show message like "literal "3" is not a valid template argument because it is the address of an object with static linkage". And "3" is what you want.
2. Output constant Similarly, float is not allowed as template argument. And the compiler will show the float in the error message. This can be used to output float constant. Unfortunately, VC will implicitly convert float into int when it is used as template argument (see https://connect.microsoft.com/VisualStudio/feedback/ViewFeedback.aspx?FeedbackID=296008 for details) Here is another trick to output integral constant:
template<int N> class A { private: A(); }; int main() { A<sizeof(char)> a; }
3. Output type Sometimes, you may not be sure about the type of some variables. Then you can use the following code to detect their type.
struct Dummy {}; void Fun(const Dummy &); int main() { Fun(1+1U); }
BTW, gcc will show "not convert `2' to `const Dummy&'", then you can use "template<typename T> void Fun(T);" for the declaration of Fun (that means you can use the above code to output constant in gcc)
4. Find out include path For STL, you can use
#include <ext/hash_set> using __gnu_cxx::hash_set;
int main() { hash_set<> m; }
vector is OK, too.
Now it's time to find out the compiler information. To find out the compiler version, you can check this page: Pre-defined Compiler Macros
The information of dinkumware online test:
1. VC Version (_MSC_FULL_VER): VC8 140050727 VC7.1 13103077 VC7 13009466 VC6 12008804 Include path: D:\cplt501_vc_source\include (with _CPPLIB_VER=501)
2. EDG Version (__EDG_VERSION__): 3.05 Compile option: edgcc --alt -D_WIN32 -D_M_IX86 --no_inlining --diag_suppress=1290 --long_long --wchar_t_keyword -D_C99 -ID:\cplt501_gen_source/include/c -ID:\cplt501_gen_source/include -IC:\PROGRA~1\MICROS~2.NET\VC7/include --exceptions --microsoft -c sourceFile.cpp Because VC compatible mode is used, it may emulate bugs in VC, too
3. GCC Version: 3.2.0 Include path: D:/cplt501_gen_source/include and D:/Mingw/include/c++/3.2/ This version is really out of date
|
|
|
|