正文

time2007-02-21 21:46:00

【评论】 【打印】 【字体: 】 本文链接:http://blog.pfan.cn/aurora/23427.html

分享到:

你想知道你的代码究竟执行了多长时间吗?是的,有的时候我们需要确定我们的代码到底执行了多长时间。或许你会想这有何难,在代码的前后增加GetTickCount来得到时间不就可以了吗?没错,如果系统是线形执行的话,这样做肯定可以得到,但我们怎么能够保证系统是线形的来?

我们这里只讨论WINDOWSWINDOWS是典型的多线程、抢占式的多任务操作系统。如果你的计算机只有一块CPU的话,那么你的应用程序是这样运行的:一个线程执行一个时间片,然后再由另外一个线程执行一个时间片(这个时间片是20毫秒),然后又可能暂停这个线程回到开始的哪个线程,也有可能执行其他的线程;这就是操作系统的调度算法了,调度算法是复杂的,目的是保证所有的线程都有机会被执行,从而看起来感觉每个线程都在运行,但实际上每个时刻只可能有一个线程在运行,即占用CPU时间(所以线程是分配CPU资源的最小单位了)。上面的是单CPU的,如果在多CPU上就更复杂了,但这个时候就不能保证一个时刻只有一个线程被执行了,因为可能在每个CPU上运行一个线程(WINDOWS2000就可以在多个CPU上运行),从而更好的保证了程序的并行性。

从我们对WINDOWS的分析可以看出,程序不是线性方式执行的。或许你会问:单线程、单CPU的情况下,程序是线性的吧?乍听起来说的有道理,但事实并不是这样,因为你的程序运行在操作系统上,操作系统要能够正确的运行,它本身就有许多线程在运行(EXPLORE.EXE),在必要的时候被唤醒,所以你的程序根本不可能在单线程的理想环境下运行。

好了,既然我们已经了解了上面的本质之后,你就会想,是的真的没有办法来得到我的代码到底执行了多长时间?其实不然,下面我们就几种测试方法来比较一下,究竟该如何计算?

用来计算代码执行时间有三种方法:

1.       使用GetTickCount,得到当前的时间,单位是毫秒

代码示例:

DWORD startTime = GetTickCount();

//被测试的代码

DWORD totalTime = GetTtickCount() – startTime;

2.       使用GetThreadTimes;该函数得到的时间包括两部分,内核执行的时间和用户代码的执行时间。

代码示例:

FILETIME m_ftKernelTimeStart;

FILETIME m_ftKernelTimeEnd;

FILETIME m_ftUserTimeStart;

FILETIME m_ftUserTimeEnd;

 

 

 

 

 

GetThreadTimes(GetCurrentThread(),&m_ftDummy,&m_ftDummy,

        &m_ftKernelTimeStart,&m_ftUserTimeStart);

//被测试的代码

GetThreadTimes(GetCurrentThread(),&m_ftDummy,&m_ftDummy,

        & m_ftKernelTimeEnd,& m_ftUserTimeEnd);

使用GetThreadTimes得到的时间单位也是毫秒级别的

3.       使用QueryPerformanceCounter

代码示例:

LARGE_INTEGER m_liPerfFreq;

LARGE_INTEGER m_liPerfStart;

QueryPerformanceFrequency(&m_liPerfFreq);

QueryPerformanceCounter(&m_liPerfStart);

//被测试的代码

LARGE_INTEGER liPerfNow;

QueryPerformanceCounter(&liPerfNow);

__int64 totalTime =  (((liPerfNow.QuadPart- m_liPerfStart.QuadPart)*1000000)

                      /m_liPerfFreq.QuadPart);

 

 

 

 

 

 

 

 

 

 

使用第一种方法的缺点我们已经说过了,它假定你的程序不被打断的执行,这样才能得到正确的时间,而这种情况在一个同时运行了其他程序的环境下几乎是不存在的,所以这个方式的误差最大,得到的值也没有什么规律,如果使用第一种方法,最好要多次测试,取平均值是有帮助的。

使用第二种方法一般情况下可以得到线程的执行时间,该函数被设计就是用来得到线程得到了多少CPU时间,所以相对于第一种方法就准确的多;但我在多次测试的时候发现,使用该方法在测试比较短的时间(小于1毫秒)时会出现一些奇怪的现象,你可能多次得到一个0的结果,但突然得到一个15ms这样的测试结果(到几天我也没有明白这个时间是怎么计算的)。

使用第三种方法的要求和第一种方法一样,它也要求你的程序不被打断,但使用该函数可以得到更精确的时间(100纳秒为单位),所以它也是有用武之地的。我们前面说过WINDOWS调度线程是按时间片来调度,这个时间片是20ms,因此如果你的代码执行时间在20ms内,那么我们就有很大的可能得到正确的时间了;例如:你的代码需要执行20微秒,那么你这样编写代码:

Int I =0;

While(i<2000)

{

    QueryPerformance

       //被测试的代码

   QueryPerformance

}

你会惊奇的发现,你的那段被测试代码执行时间是:

20,20,20,…….,1000,20,20,20……..9000,20,20,….

这种现象的原因是你的这段代码整个时间超过了20ms这个时间片,所以在中间的某次执行时,你的线程被暂停去执行别的代码去了,然后又回来,这个时候就将别的代码执行时间作为你这段测试代码的执行时间了(啊,真冤枉啊,其实我执行的很快被测试代码说)。

好了我们基本知道该如何得到我们代码的执行时间,同样我们没有办法非常准确的得到我们代码执行了多长时间,如果你想使你的代码尽量不被打断,那么你可以试着将你的线程的优先级设置高一些(理论上是线程优先级越高得到CPU的时间就越多,但还是不能保证你的线程不被打断,而且我在测试的时候效果也不明显)。

好了,就写到这里了,欢迎大家来讨论,来让我们对WINDOWS更熟悉。



Trackback: http://tb.blog.csdn.net/TrackBack.aspx?PostId=479190

 

 

Why GetThreadTimes is wrong

Indrodution

Some applications are using GetThreadTimes for doing time measurement. Most people does not know: GetThreadTimes can produce incorrect values!

Some background

To verify this statement, we need to know, how GetThreadTimes is working.

kernel32.dll

Internally GetThreadTimes calls NtQueryInformationThread with ThreadTimes. This returns the THREAD_TIMES_INFORMATION structure.

ntoskrnl.exe

NtQueryInformationThread then gets the Thread´s Thread-Information-Block (TCB or TIB) and retrives the Kernel- and User-Times from the TCB and then it will multiply with the “nt!KeMaximumIncrement” and return the result.

You can also dig into GetSystemTimeAdjustment and you will see that for “lpTimeIncrement” the same value is retrived as which is used in the multiply operation in “NtQueryInformationThread”...

Conclusion

The resolution of the returned values is never better than the “TimeIncrement” of the OS. On single-processor systems this is normally 10 ms; on multi-processor-systems (and hypterthreading) it is normally 15 ms.

More investigation

If you dig deeper into the kernel, you find out, that in the timer- (or scheduler-) interrupt (normally every 10/15 ms), the increment of the thread-time values in the current running thread (TCB) is taken place.

This leads to the conculsion, that GetThreadTimes will only produce correct values, if each thread would consume all of its quantum (or at least for the whole time-slice)! So, a thread should never go into wait-states or should be interrupted by other (higher) priority threads. This always leads to wrong values in GetThreadTimes. And you can imagine: In most of the times, your thread will either go into wait-states (best examples are UI programs) or will be interrupted by other (higher priority) threads or interrupts.

Example

With this knowlegde, we can easily produce an example, in which GetThreadTimes will return 0 (zero) times for this thread, even if the thread was running for many seconds or minutes! The only think we must do, is to end our thread quantum, before the time tick is executed.

We can achieve this by doing a very short calculation (for example 2 ms) and the call “Sleep(1)”, which will apruptly end our qunatum and will reschedule or thread the next time the timer-tick is over. And this we reapeat many times (for example 10000; then we have used the CPU for at least 20 seconds; but GetThreadTimes will return 0!). To be sure that the scheduler will not reschedule our thread immediately, we need a second thread which must be ready-to-run all the time (and the times are now counted to this thread).

To cut a long story short: Here is the code:

  #include <windows.h>

#include <stdio.h> DWORD loopCounter = 0; DWORD loopCounterMax = 1000; DWORD internalCounter = 0xFFF00000; DWORD __stdcall CalcThread(LPVOID) { while(loopCounter <= loopCounterMax) { DWORD cnt = internalCounter; while(cnt != 0) cnt++; Sleep(1); loopCounter++; } return 0; } DWORD WINAPI IdleThread(LPVOID) { while(loopCounter <= loopCounterMax) { Sleep(0); // just do something... } return 0; } int _tmain(int argc, _TCHAR* argv[]) { // be sure we only use 1 processor! SetProcessAffinityMask(GetCurrentProcess(), 1); LARGE_INTEGER liStart, liEnd, liFreq; // test, how much time the inc is using... QueryPerformanceCounter(&liStart); DWORD cnt = internalCounter; while(cnt != 0) cnt++; QueryPerformanceCounter(&liEnd); QueryPerformanceFrequency(&liFreq); double ms = ((double) (liEnd.QuadPart-liStart.QuadPart) * 1000) / (double)liFreq.QuadPart; printf("Inc duration: %.3f msnn", ms); // test-end DWORD id; HANDLE hThread[ 2 ]; QueryPerformanceCounter(&liStart); hThread[ 0 ] = CreateThread(NULL, 0, CalcThread, 0, 0, &id); hThread[ 1 ] = CreateThread(NULL, 0, IdleThread, 0, 0, &id); WaitForMultipleObjects(2, hThread, TRUE, INFINITE); QueryPerformanceCounter(&liEnd); QueryPerformanceFrequency(&liFreq); ms = ((double) (liEnd.QuadPart-liStart.QuadPart) * 1000) / (double) liFreq.QuadPart; printf("Duration: %.3f msnn", ms); FILETIME ftCreate, ftExit, ftKernel, ftUser; for(DWORD i=0; i<2; i++) { GetThreadTimes(hThread[i], &ftCreate, &ftExit, &ftKernel, &ftUser); printf("Reported time for thread %dn", i+1); SYSTEMTIME st; FileTimeToSystemTime(&ftKernel, &st); printf("Kernel: %2.2d:%2.2d.%3.3dn", st.wMinute, st.wSecond, st.wMilliseconds); FileTimeToSystemTime(&ftUser, &st); printf("User: %2.2d:%2.2d.%3.3dnn", st.wMinute, st.wSecond, st.wMilliseconds); } return 0; }

h5. And here is the output

  timeAdjustment: 156250, timeIncrement: 156250, timeAdjustmentDisabled: 1

Inc duration: 2.404 ms Duration: 15679.180 ms Reported time for thread 1 Kernel: 00:00.000 User: 00:00.000 Reported time for thread 2 Kernel: 00:13.093 User: 00:02.437

And you see: thread 1 has 0 (zero) in its thread times (even it was running for 2 seconds)!

Q.E.D.

There is also a discussion about this in microsoft.public.win32.programmer.kernel

 

#include <windows.h>
#include <stdio.h>

class showtime_
{
public:
    showtime_()
    {
        QueryPerformanceFrequency( (LARGE_INTEGER*)&frequency_ );
        QueryPerformanceCounter( (LARGE_INTEGER*)&time1_ );
        GetThreadTimes( GetCurrentThread(), (LPFILETIME)&ftCreationTime1_, (LPFILETIME)&ftExitTime1_, (LPFILETIME)&ftKernelTime1_, (LPFILETIME)&ftUserTime1_ );
    }
    ~showtime_()
    {
        QueryPerformanceCounter( (LARGE_INTEGER*)&time2_ );
        GetThreadTimes( GetCurrentThread(), (LPFILETIME)&ftCreationTime2_, (LPFILETIME)&ftExitTime2_, (LPFILETIME)&ftKernelTime2_, (LPFILETIME)&ftUserTime2_ );
        printf( "PerformanceCounter: %lf\n", (time2_-time1_)*1.0/frequency_ );
        printf( "Kernel: %llu.%07llu; User: %llu.%07llu; Total: %llu.%07llu\n"
            , (ftKernelTime2_-ftKernelTime1_)/10000000, (ftKernelTime2_-ftKernelTime1_)%10000000
            , (ftUserTime2_-ftUserTime1_)/10000000, (ftUserTime2_-ftUserTime1_)%10000000
            , (ftKernelTime2_-ftKernelTime1_+ftUserTime2_-ftUserTime1_)/10000000, (ftKernelTime2_-ftKernelTime1_+ftUserTime2_-ftUserTime1_)%10000000 );
    }
private:
    unsigned long long frequency_, time1_, time2_;
    unsigned long long ftCreationTime1_,ftExitTime1_,ftCreationTime2_,ftExitTime2_;
    unsigned long long ftKernelTime1_,ftUserTime1_,ftKernelTime2_,ftUserTime2_;
};
#define showfollowtime showtime_ _;

int main( void )
{
    {
        showfollowtime
        for( int i=0; i<10000; ++i ) printf("%d",i);
    }
}

阅读(383) | 评论(0)


版权声明:编程爱好者网站为此博客服务提供商,如本文牵涉到版权问题,编程爱好者网站不承担相关责任,如有版权问题请直接与本文作者联系解决。谢谢!

评论

暂无评论
您需要登录后才能评论,请 登录 或者 注册