
Is Performance a "functional" requirement?

We had an internal thread on this yesterday. Maybe I was a little too glib, but here is what I had to say about the topic [edited so it can be read standalone]:

The original question, “Is performance a functional requirement”, is highly unexciting for me.  However the revised question, “What is the best way to capture performance requirements”, is profoundly interesting.  It is precisely the (anti-)correlations between attainable functionality and level of responsiveness that sometimes make for the very best engineering work.  How do you capture all this?  Sometimes pervasively. I would argue that pervasively is in fact the only answer that works in non-trivial cases.  And this is not limited to performance as the quality metric of interest – it applies to any quality that is not going to be trivially achieved.

The nomenclature to me is uninteresting except to the extent that it assists designers and implementers in capturing  and understanding the various kinds of requirements – i.e. I know where to put them, I know where to find them, I know that I got them all, because I am, literally, on a first name basis with all the requirement types. 

I don’t want to trivialize that aspect, but, beyond that, meh, whatever… functional schmuntional, I don’t care what you call it, just do it :)

Posted by ricom | 3 Comments

Linq Compiled Queries Q & A

I did a series of postings on Linq Compiled Queries last year, and I recently got some questions on those postings that I thought would be of general interest.

Q1:

Why use the 'new' keyword in this snippet?

var q = from o in nw.Orders
select new {o.everything …};

A:

If you wrote just:

var q = from o in nw.Orders
select o;

You're getting editable orders. Linq then has to track them in case you change them and want to submit the changes. If you use new, you're effectively making a copy of the orders that is not going to be change-tracked. That's faster for read-only cases. The other thing you can do is mark the query context as read-only, and then you get the same effect.  When I wrote that test case, that feature wasn't available yet, so I used new to simulate it.
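A minimal sketch of the read-only alternative, assuming the feature referred to is the DataContext.ObjectTrackingEnabled property of Linq to SQL. The Northwinds context, the trimmed-down Order mapping, and the connection string passed in are hypothetical stand-ins, not the code from the original test case.

```csharp
using System;
using System.Data.Linq;
using System.Data.Linq.Mapping;
using System.Linq;

// Hypothetical minimal mapping; the real Orders table has more columns.
[Table(Name = "Orders")]
public class Order
{
    [Column(IsPrimaryKey = true)] public int OrderID;
    [Column] public string CustomerID;
    [Column] public int? EmployeeID;
    [Column] public DateTime? ShippedDate;
}

// Hypothetical strongly typed context standing in for the one in the post.
public class Northwinds : DataContext
{
    public Northwinds(string connection) : base(connection) { }
    public Table<Order> Orders { get { return GetTable<Order>(); } }
}

public static class ReadOnlyExample
{
    // Creates a context that will not track the entities it materializes.
    public static Northwinds OpenReadOnly(string connection)
    {
        Northwinds nw = new Northwinds(connection);
        // Has to be set before the first query runs on this context.
        nw.ObjectTrackingEnabled = false;
        return nw;
    }

    public static void PrintFirstOrders(string connection)
    {
        using (Northwinds nw = OpenReadOnly(connection))
        {
            // Full Order entities, but no change tracking behind them.
            foreach (Order o in nw.Orders.Take(5))
                Console.WriteLine("{0} {1}", o.OrderID, o.CustomerID);
        }
    }
}
```

With tracking switched off, "select o" returns plain entities without the bookkeeping, so the "select new" trick is no longer needed just to avoid it. The trade-off is that such a context cannot be used to submit changes.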

Q2: 

What do you mean when you say that linq will 'Create custom methods that bind the data perfectly' ?

A:

Whenever you use Linq to SQL to read data from a database it has to do two important things for you. The first is to convert your Linq query into SQL. The second is to make a method that takes the stream of data that comes back from the database and converts it into the managed objects you asked for. That's the data-binding step. Linq creates the necessary methods automatically, and the code it generates is tailored exactly to the shape of the data your query returns.

Q3:

How did Linq to SQL beat your ADO.Net code for insert times?  Shouldn't a tie be the best possible result?

A:

The SQL I used in my test case was pretty much the standard simplest SQL you would use for such a job. The automatically generated SQL from Linq was better than what I wrote by hand because they had parameterized the insert statements which I never bothered to do. Had I changed my SQL to what they created it would have been a tie. This is kind of like when the C++ compiler finds a machine code pattern that is better than what you would have written doing it by hand because it did something you don't usually bother doing with hand tuned machine code. But you *could* replace what you wrote with what the compiler generated.

Q4:

What are the downsides to precompiled queries?

A:

There is essentially no penalty to precompiling a query that you reuse (see Quiz #13, where the break-even point was about 1.5 uses). The only way you might lose performance is if you precompile a zillion queries and then hardly use them at all -- you'd be wasting a lot of memory for no good reason. 

But measure :)

Posted by ricom | 5 Comments

Hard and Soft Mode Debugging or The Woes of Soft Mode

I had to explain this a little while ago and I wrote up something that I thought was generally interesting.   This is only approximately correct (even the examples are flawed) but I think you can get the idea.

I had first-hand experience trying to get a soft-mode debugger working when I was the debugger lead on Visual C++ 1.0  -- that was just oodles and oodles of fun.

The easiest way to contrast “hard-mode” and “soft-mode” is as follows:

In hard-mode the debuggee never needs to run in response to any debugger inquiry.  In soft-mode it is normal/natural for the debuggee to "help" the debugger in various ways, usually via an agent. Although some hard-mode debuggers allow you to force the debuggee to run code, they don't require it in the normal performance of their job.


Notably, hard-mode debuggers can debug a core dump; soft-mode debuggers cannot. Hard-mode debuggers find it easy to attach to an already-running process and do not ever cause deadlocks in the debuggee.  Soft-mode debuggers can provide access to more/richer debuggee information, may or may not be able to attach to already running processes (because it may be “too late” to insert the needed agent) and broadly suffer from reentrancy problems that can introduce every imaginable failure mode up to and including total destruction of the debuggee’s execution environment (by virtue of calling debuggee functions at a time when data invariants are not valid.)


Classically, soft-mode debuggers place limitations on the scope in which they will debug so as to avoid obvious reentrancy problems and generally give the user of the debugger a decent experience.  They can be very effective, but soft-mode debugging is rarely as easy as it superficially appears.


Let me give a simple example of how soft-mode can "mess things up."


I am debugging function f1, so I put a breakpoint in it.  It is halfway through its job when it hits the breakpoint; the debuggee is now “stopped”.  I ask for the value of some innocuous looking property p1 via evaluation.  The debugger causes the debuggee to execute the correct code for p1.  However the program logic in p1 assumes that p1 can never be called while f1 is running, much less while f1 is half complete. As a result p1 corrupts its internal state. The debug session is now useless.
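To make the f1/p1 scenario concrete, here is a small self-contained sketch. Everything in it is hypothetical: Ledger, Transfer (playing the role of f1), Total (playing the role of p1), and the atBreakpoint callback, which stands in for the debugger running an evaluation inside the stopped debuggee.

```csharp
using System;

public class Ledger
{
    private int a = 100, b = 0;
    private int? cachedTotal;          // lazily filled in by Total

    // p1: looks innocuous, but writes cachedTotal as a side effect.
    public int Total
    {
        get
        {
            if (cachedTotal == null) cachedTotal = a + b;
            return cachedTotal.Value;
        }
    }

    // f1: moves amount from a to b. It invalidates the cache up front,
    // which is only safe if nobody reads Total before it finishes.
    public void Transfer(int amount, Action atBreakpoint)
    {
        cachedTotal = null;
        a -= amount;
        if (atBreakpoint != null) atBreakpoint();   // "stopped" half done
        b += amount;
    }
}

public static class ReentrancyDemo
{
    public static int TotalAfterTransfer(bool evaluateAtBreakpoint)
    {
        Ledger ledger = new Ledger();
        Action peek = null;
        if (evaluateAtBreakpoint)
            peek = delegate { int ignored = ledger.Total; };
        ledger.Transfer(30, peek);
        return ledger.Total;
    }

    public static void Main()
    {
        Console.WriteLine(TotalAfterTransfer(false)); // prints 100
        Console.WriteLine(TotalAfterTransfer(true));  // prints 70
    }
}
```

In the second run nothing crashes at the breakpoint. The damage is that the evaluation caches a value computed from half-updated state, and the program carries that wrong answer forward after it resumes.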


The unwanted reentrancy I described above is typically less fatal than a total corruption and in practice you tend to get pretty good results if you use some care.  There are more subtle kinds of reentrancy and other safeguards that soft-mode debuggers require; in some sense, the case I described was an “easy” one. To avoid the bigger disasters, soft-mode debuggers place significant limitations on what they will debug.  For instance, you cannot use Visual Studio in mixed mode to break in the VM itself; it’s “out of bounds”.


Contrariwise native-only mode (“hard-mode”) has no such limits.  It’s normal/natural to debug the VM like that. Hard-mode is impervious to reentrancy issues.


The trouble is that if you naively combine hard- and soft-modes in a joint harness (mixed mode) you have dragged the “native-only” half into the world of soft-mode.  Allow me to illustrate.


The user starts the mixed debugger and uses the hard-mode features to set a breakpoint somewhere in the method dispatch code of the managed VM.  The debugger dutifully stops the VM when that breakpoint is hit.  Now the user tries to evaluate a managed expression… this has no hope of working: the VM is not in a consistent state and cannot be asked to do work.   Most places in the VM are not safe stopping points.


OK so you might say, “You can’t debug the VM itself, that’s fair” but that isn’t the extent of the damage.  Suppose we instead set a native breakpoint in the system memory allocator.   Any attempt to evaluate managed expressions while the allocator is in an interim state is likely to result in a corrupt system.


All right, so not system methods, and not the VM.  What about if I set a breakpoint in some other native library that isn’t a crucial system resource? I still can fail because even if dispatching the managed expression succeeds that code could call, via interop, any native method at all, including ones related to the system I am trying to debug, so that system, whatever it is, is now faced with reentrancy issues.  Effectively I’ve injected soft-mode problems into my hard-mode debugger because the soft-mode features have “contaminated” the hard-mode approach.


The trouble is that the managed expressions you might most want to evaluate are likely to be precisely the ones that involve the use of the interop features that access the native code you are trying to debug.


There are other cases that are worth mentioning. You might say, “well people shouldn’t set breakpoints in weird places like that and expect their methods to work, it’s unreasonable.”  But, tragically, the effective stopping point is often not directly chosen by the user.


Let me give another example:  I observe some strangeness in one of my finalizer methods.  I wish to investigate, so I set a breakpoint in the middle of my finalizer.  Shortly afterwards the debugger stops my program, having never hit my breakpoint.  I inspect the stack and find that I am in the context of an exception that originated in the memory manager, which threw because of an invalid de-allocation request in a destructor my finalizer called via interop.  Now, since an allocator operation is in flight, the allocator’s internal data structures happen to not be consistent at this point -- I now attempt to evaluate a managed expression to look at state associated with my finalizer but I’m back to the same problem I had before – I can’t because the allocator is out of commission.


Things only get worse in a system that supports multi-threading.  Reentrancy issues can and do create deadlocks in addition to corruption. A normally impossible deadly embrace can be completed by a debugger induced resource grab.


In practice, a composite debugger system faces daunting challenges to keep working robustly in a wide variety of circumstances; limiting damage due to unexpected reentrancy is perhaps the biggest problem.

Posted by ricom | 5 Comments

Celebrating Twenty with a Pint

No it isn't what you think. :)

Today is my 20th anniversary at Microsoft. 

Wow. 

20 years of doing anything is pretty amazing and 20 of this definitely counts as amazing, to me anyway.  It's been a lot of fun.

I got to celebrate in a very special way.  You see, I have Hereditary Hemochromatosis -- an Iron Overload condition -- and I've been having therapy for it for many years.  It's really very pleasant therapy: I go to the blood center every few weeks and they take a pint of blood, which keeps the iron under control.  I even get cookies and juice!

Until today they had to throw that blood away because I was a therapeutic donor.  Today was special. Many years ago studies had shown that the blood of hemochromatic donors was safe to use and the FDA recently approved my local blood center to join others in accepting donations from people with my condition.

So, although they've been taking my blood for 15 years (sometimes as often as once per week!) today was the first time I *gave* blood that could be used by others.

I thought it was a great way to celebrate, and I hope lots of you join me.

Posted by ricom | 4 Comments

Rebecca Norlander on Behind the Code

Rebecca is a lovely person and she interviews very well here on the latest Behind the Code episode.  And I don't just like her because she can do good performance work; she actually makes a great friend and even says yes when I ask her to dance sometimes, which is always a nice feature :)

So, for all those reasons, and more you will discover for yourself, go check her out:  http://channel9.msdn.com/shows/Behind+The+Code/Rebecca-Norlander-Challenge-and-Success/

Posted by ricom | 3 Comments

Visual Studio Extenders Conference

If you ever wanted to see Redmond and meet some of the folks on the Visual Studio team this might be just the ticket for you.  I'll be speaking about the future of Visual Studio there, especially from the perspective of those that make extensions for it (which I soon hope will be everyone), and several of my colleagues will be talking about different initiatives coming in the next version of Visual Studio and beyond (though, of course, we're talking about directions and themes and not so much about specific features).

And I'll probably talk about performance too at least a little, it's in my DNA :)

Anyway, if that sounds interesting you can check out the details. Here's some info from the original page plus the link for you.

From the event page:

This fall, Microsoft is hosting a developer conference to teach developers how to extend Visual Studio. Visual Studio is an open, extensible platform that can be customized for everything from individual productivity tools to enterprise-wide development needs. The Visual Studio Shell enables developers to leverage a world class IDE to build specialized tools for any number of vertical industries, and can even be custom-branded with your own splash screen and app icon.

When: September 15 & 16, Where: Building 33 on campus.

http://msdn.com/vsx/conference/

Posted by ricom | 1 Comments

My Last Words to Bill

We had a little internal yearbook thing you could sign for Bill last week.  This is what I wrote:

 

Dear Bill,

In not too many weeks now I’ll be celebrating my 20th anniversary at Microsoft.  I think I owe you some thanks for these 20 years, and some from before.

In fall of 1979 I got my first real access to a computer. It was a Commodore PET and it was running Microsoft BASIC.  For me, and many others like me, that exposure caused a radical change in our life trajectories. 

By Christmas I was learning 6502 assembler and those MOS tech handbooks were not exactly rich in examples.  If you wanted to see *real* code you had to disassemble/understand the ROMs.  So I guess what I’m saying is that, at the tender age of 15, I was ripping off your intellectual property.  Sorry about that.

I did manage to get pretty good at 6502 assembler and I like to think some of that code was yours, so I tell my friends I got my first low level programming lessons from Bill Gates.  Of course you didn’t know it, but it was nonetheless successful long-distance education through the magic of software.

Eight years, one diploma, and one degree later, I landed in Redmond.  That was 1988.  Since then, I’ve had many chances to meet, learn from, and work with some great people inside and outside of Microsoft – even Melinda for a time – and in turn affect the lives of others. 

Thank you for the education, the opportunities, and the inspiration.

-Rico

Posted by ricom | 4 Comments

Just another day in the perf lab

Even though I've been doing general architecture work on Visual Studio for nearly a year now, my friends in DDPERF are still plugging away on performance problems and finding some interesting results.

This most recent thread is very interesting because it shows yet another example of how the consequences of hardware changes can be subtle and very hard to predict.

Just today I was working with Cameron McColl again-- this time we were trying to understand why a particular benchmark was sometimes mysteriously slower than normal for no apparent reason.  To our delight we found (well, mostly Cameron, I was the "consultant" <grin>) that the problem was in how the timing was being triggered and so the bulk of the variability seems to be measurement error and not actual test variability.  But to our chagrin, the slower time seems to more accurately reflect reality.  Well at least now we know.

How are these things related?

They remind us all that it's very important to track down anomalies in your reported results because otherwise you have little understanding of why you are making things faster, what works and what doesn't.

In the words of Alastor "Mad Eye" Moody:  "Constant Vigilance!"

P.S. If you're looking for the further adventures of the devdiv perf team, you could do worse than subscribe to their blog.

Posted by ricom | 0 Comments

Shutdown Is No Time For Spring Cleaning

I think my current performance pet peeve is shutdown.  Assorted flavors of it, they all seem to have the same kind of problem.  Sometimes we're stuck with it... but maybe we shouldn't be?

This is one time when our basic training, which normally I love so much, tends to let us down.  We were all taught to clean up our own messes -- programming wise that means freeing your resources after you are done with them.  But this backfires in the shutdown case.

Many times I watch as I hit the [X] close button on some application and my poor computer starts to swap as the application goes about paging in vast amounts of its code and then dutifully walking all its data structures-- more paging -- and giving them back to the operating system.  My reaction to this in a word:  ACK!!!!

When your application is ordered to shut down, the last thing you should do is enumerate every piece of memory you have ever allocated and systematically give it all back to the operating system.  Your program has a death sentence, and soon your resources are going back to the operating system whether you like it or not:  what you must do is look at the minimum possible amount of memory necessary to get to a nice safe stable state and then exit as quickly as possible.  Abandoning your memory like this gives the operating system the best chance to get your process unloaded while swapping in the least amount of memory and causing the least impact to the system's disk and memory caches. 

In short, shutdown is no time for spring cleaning.

And why all this cleaning anyway?  Many people report that they have all these important resources that need flushing and so forth.  They couldn't possibly get to a safe state without considerable work, but usually that in itself is symptomatic of assorted problems.  Any application that has important data to manage almost certainly needs to be tolerant of power failure, and if that's the case, when the user makes important edits they should likely be automatically saved to a durable location.  In fact at any given moment, probably only a few seconds' worth of data should be at risk.  If your application has been idle for any length of time it should be fully saved -- and even if the user hasn't chosen to save their work it's still effectively saved somewhere so that you could restore the current "dirty" state.

So if you have to do all this work to be resilient to power failures, then take advantage of that logic to simplify your shutdown paths.  Your users will thank you.
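One way to picture that idea is the sketch below, which is an illustration under assumptions rather than a prescription. EditJournal and FastShutdown are hypothetical names, and the journal format is invented. The point is only that if every edit is pushed to durable storage as it happens, the shutdown path has almost nothing left to do.

```csharp
using System;
using System.IO;

// Hypothetical edit journal: each edit is appended and flushed to disk as it
// happens, so a power failure can cost at most the edit in progress.
public sealed class EditJournal
{
    private readonly FileStream stream;
    private readonly StreamWriter writer;

    public EditJournal(string path)
    {
        stream = new FileStream(path, FileMode.Append, FileAccess.Write);
        writer = new StreamWriter(stream);
    }

    public void Record(string edit)
    {
        writer.WriteLine(edit);
        writer.Flush();       // writer buffer -> file stream
        stream.Flush(true);   // file stream -> disk
    }

    // The entire shutdown work for user data: release the file.
    public void Close()
    {
        writer.Dispose();
    }
}

public static class FastShutdown
{
    public static void OnCloseRequested(EditJournal journal)
    {
        journal.Close();
        // Exit without walking or tearing down the rest of the object graph.
        Environment.Exit(0);
    }
}
```

A close handler would call FastShutdown.OnCloseRequested and nothing else. Restoring the "dirty" state on the next launch means replaying the journal, which is the same code a power-failure recovery would need anyway.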

Posted by ricom | 18 Comments

Cycles in Computer Science, or Am I Ancient?

It's been a strange couple of weeks.  No, really.

It all started when a friend of mine, let's call him "Desi", posted a question asking about what he should do in his last few months of college.  A reasonable enough question and I guess I'm as qualified as anyone to give him some advice because it was a question about computer science courses he could take rather than general career guidance.  Well maybe I'd be qualified to give advice on that too but anyway... let's at least try to stay on topic.

So far so good.  But here's the kicker, he was considering taking a course in C++ but was dubious about its value because it was "ANCIENT" [emphasis in original].

Eeep.

But... but... it can't be ancient.  I mean, wasn't it just last week I was reading some OOPSLA notes on C++ -- no wait that was 20 years ago.  Could it be true?

But wait it gets better.

Then I went to this other talk where people were talking about the importance of C++ and how many companies [many of which are important MS customers] have great investments in C++; they have highly valuable and important codebases that their company's future is based on.  Microsoft of course has huge investments in C++ that we expect to endure for some time.

Wow ok so all that is true but I had this feeling of déjà vu.  Hadn't I heard a talk just like this one while I was in college?  I think I had... it was about COBOL.  We were too cool for COBOL back then, or perhaps too unenlightened is more accurate.

Let's see, if we take 1960 as the birth of COBOL (1959-61 seems to be the range of initial activity) that would mean that in 1985 it was 25 years old.  Let's say that my opinion of COBOL in 1985 was not especially high.  But wait, that same year could arguably be called the birthday of C++ because the first commercial C++ compiler became available then.  So let's see -- it's now 2008.... so C++ is 23 years old.  About the same age as COBOL when I started college.

The fact that I can even offer this perspective is already making me feel old :)

But, aside from nostalgia and pointing out what a nub I was in college, what's my point?  I always have a point, right?  And I assure you it isn't slamming folks still working on COBOL, because high quality COBOL implementations have continued to show the viability/future of that programming language for my entire career.

My point is this:  In many ways the same kinds of problems face our industry today as always.  Migration of existing codebases to the latest technology while preserving their value isn't a new problem, it's an old one.  And I use the word migration loosely because often it's not really some kind of conversion or retirement of code but rather more like a meeting of worlds.

When I was in college one of the big buzzwords we heard about was 4th generation languages, and sometimes even (ooo, aaah) 5th generation languages.  I don't know if I would really score the industry very high in terms of actual *language* evolution in the last 25 years (though I'll be the first to give Anders kudos for enabling your average VB programmer to use functional programming constructs without even having to know it).  Where I will score us highly is in runtime evolution.

So programming languages are somewhat similar, at least broadly speaking, to what they were say 25 years ago.  What's changed is the environments we like to run in.  GUI environments drove event-based programming, which caused a need to express those notions.  Object oriented programming fit that need well and had other benefits, and so many embraced it or borrowed notions from it (tell me, what's the difference between a MessageProc, with its associated switch statement, and an object with a vtable?  Mostly syntactic sugar).  But the programming model is fundamentally different from that of, say, a console application.
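For readers who have not seen both styles side by side, here is a small hypothetical sketch of the comparison. Msg, ButtonProc, Widget, and Button are made-up names for illustration and are not any real windowing API.

```csharp
using System;

public enum Msg { Paint, Click }

// Message-procedure style: one function and a switch on the message code.
public static class ButtonProc
{
    public static string Handle(Msg msg)
    {
        switch (msg)
        {
            case Msg.Paint: return "button painted";
            case Msg.Click: return "button clicked";
            default: return "ignored";
        }
    }
}

// Object style: the same dispatch, done through virtual method slots.
public abstract class Widget
{
    public virtual string OnPaint() { return "ignored"; }
    public virtual string OnClick() { return "ignored"; }
}

public class Button : Widget
{
    public override string OnPaint() { return "button painted"; }
    public override string OnClick() { return "button clicked"; }
}

public static class DispatchDemo
{
    public static void Main()
    {
        Widget w = new Button();
        Console.WriteLine(ButtonProc.Handle(Msg.Click)); // prints "button clicked"
        Console.WriteLine(w.OnClick());                  // prints "button clicked"
    }
}
```

In both versions the environment calls the code when something happens, and the code only says how to react. The switch and the virtual methods are two spellings of the same table lookup.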

Now it's happening again, and the real need facing C++ programmers is somewhat the same as what faced COBOL programmers say 25 years ago.  It's not that the language is out of joint -- it isn't.  I mean, ok maybe you like or don't like COBOL syntax but that doesn't doom a language and surely C++ syntax is not the zenith of wonderfulness.  But that isn't what's holding C++ programmers back.  The biggest problem, at least in my opinion, is one of accessing new/modern runtime features, which may come with a different programming model, from within the context of an existing environment.  Hybrid applications are the norm.

Most programming languages you can name have natural environments that they like to run in and natural kinds of application notions they can express well. 

And now, GUI environments are changing; more declarative models like the WPF/XAML programming model are becoming popular.  Multi-threaded needs are becoming more mainstream.  Then there are all the rich client models like AJAX.  And many others.  Many of these have complicated resource management problems that make you want to reach for languages that have automatic management of resources (like the .NET family and Java).

The truly successful environments of the future may have more to do with their ability to mesh together many different kinds of assets than defining some new uber language and uber runtime.  I'd love to do this nowish because, 25 years from now, I bet there will be something newer/better and I don't want to have to migrate an industry again.  :)

Posted by ricom | 12 Comments

Sara Ford goes retro

In honor of the leap year, she's posted a couple of articles I wrote back around 1993 describing things you could do in VC++ 2.0.  Oddly enough most of them are still applicable.

Cheers, Sara

Posted by ricom | 0 Comments

Computer Measurement Group offers past papers to the public!

As you know I presented my performance signatures paper at the Computer Measurement Group's (CMG) conference a little over a year ago and I am happy to announce that CMG is now offering previous proceedings from 2005 and older to the public.  I think this is a fabulous move by them because there is some first rate content here that just wasn't seeing nearly the light that it deserved to see.

I hope to provide more specific recommendations when I have had the chance to look at the site myself in more detail and of course

If you're looking for a way to grow as a performance professional this could be a fabulous resource for you.  I thoroughly enjoyed the event and the presentations when I went.

See http://www.cmg.org/proceedings/

Posted by ricom | 1 Comments

Performance Quiz #13 -- Linq to SQL compiled query cost -- solution

Well is there really a "solution" at all in general?  This particular case I think I constrained enough that you can claim an answer but does it generalize?  Let's look at what I got first, the raw results are pretty easy to understand.

The experiment I conducted was to run a fixed number of queries (5000 in this case) but to break them up so that the compiled query was reused a decreasing number of times.  The first run is the "best": 1 batch of 5000 selects, all using the compiled query.  Then 2 batches of 2500, and so on down to 5000 batches of 1.  As a control I also run the uncompiled case at each step, expecting of course that the batching makes no difference.  Note the output indicates we selected a total of 25000 rows of data -- that is 5 per select as expected.  Here are the raw results:

Testing 1 batches of 5000 selects
5000 selects uncompiled 9200.0ms 25000 records total 543.48 selects/sec
5000 selects compiled 5401.0ms 25000 records total 925.75 selects/sec

Testing 2 batches of 2500 selects
5000 selects uncompiled 9181.0ms 25000 records total 544.60 selects/sec
5000 selects compiled 5402.0ms 25000 records total 925.58 selects/sec

Testing 5 batches of 1000 selects
5000 selects uncompiled 9169.0ms 25000 records total 545.32 selects/sec
5000 selects compiled 5432.0ms 25000 records total 920.47 selects/sec

Testing 100 batches of 50 selects
5000 selects uncompiled 9184.0ms 25000 records total 544.43 selects/sec
5000 selects compiled 5511.0ms 25000 records total 907.28 selects/sec

Testing 1000 batches of 5 selects
5000 selects uncompiled 9166.0ms 25000 records total 545.49 selects/sec
5000 selects compiled 6526.0ms 25000 records total 766.17 selects/sec

Testing 2500 batches of 2 selects
5000 selects uncompiled 9165.0ms 25000 records total 545.55 selects/sec
5000 selects compiled 7892.0ms 25000 records total 633.55 selects/sec

Testing 5000 batches of 1 selects
5000 selects uncompiled 9157.0ms 25000 records total 546.03 selects/sec
5000 selects compiled 10825.0ms 25000 records total 461.89 selects/sec

And there you have it.  Even at 2 uses the compiled query still wins but at 1 use it loses.  In fact, the magic number for this particular query is about 1.5 average uses to break even.  But why?  And how might it change?
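The break-even figure can be roughly reconstructed from the raw numbers above. The sketch below assumes a simple linear cost model (total time = selects x per-select cost + batches x per-batch cost), which is an assumption for illustration and not necessarily how the number was originally derived. The BreakEven class and its parameter names are hypothetical.

```csharp
using System;

public static class BreakEven
{
    // Fits the linear model to the two extreme compiled runs, then solves for
    // the number of uses at which a compiled query matches uncompiled cost.
    public static double Uses(
        double compiledOneBatchMs,     // 1 batch of N selects
        double compiledAllBatchesMs,   // N batches of 1 select
        double uncompiledMs,           // N uncompiled selects
        int selects)
    {
        double perBatch = (compiledAllBatchesMs - compiledOneBatchMs) / (selects - 1);
        double perSelect = (compiledOneBatchMs - perBatch) / selects;
        double perUncompiled = uncompiledMs / selects;
        // perBatch + n * perSelect == n * perUncompiled, solved for n
        return perBatch / (perUncompiled - perSelect);
    }

    public static void Main()
    {
        double n = Uses(5401.0, 10825.0, 9200.0, 5000);
        Console.WriteLine("break-even at about {0:F2} uses", n);
    }
}
```

With the figures from the first and last runs this lands a little above 1.4 uses, which is consistent with the "about 1.5" quoted above and with the compiled query winning at 2 uses and losing at 1.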

Well, as has been observed in the comments, Linq query compilation isn't like regular expression compilation.  Compiling the query doesn't do anything that isn't going to have to happen anyway.  In fact, actually creating the compiled query with CompiledQuery.Compile hardly does anything at all; it's all deferred until the query is run, just as it would have been had the query not been compiled.  So what is the overhead?  Why is it slower at all?  And what's the point of it?

Well the main purpose of that compiled query object is to have an object, of the correct type, that also has the correct lifetime.  The compiled query can live across DataContexts, in fact it could potentially live for the entire life of your program.  And since it has no shared state in it, it's thread-safe and so forth.  It exists to:

1) Give the Linq to SQL system a place to store the results of analyzing the query (i.e. the actual SQL plus the delegate that will be used to extract data from the result set)

2) Allow the user to specify the "variable parts" of the query.  The most common case isn't that the query is exactly the same from run to run, usually it's "nearly" the same... That is it's the same except that perhaps the search string is different in the where clause, or the ID being fetched is different.  The shape is the same.  Creating a delegate with parameters allows you to specify which things are fixed and which things are variable.

Now there was some debate about how to make compiled queries durable. Automatically caching them was considered, but this was something I was strongly against, largely because of the object lifetime issues it would cause.  First, you would have to do complicated matching of a created query against something that was already in the cache -- something I'd like to avoid.  Secondly, you have to decide where to store the cache. If you associate it with the DataContext then you get much less query re-use, because you only get a benefit if you run the same query twice in the same DataContext.  To get the most benefit you want to be able to re-use the query across DataContexts.  But then, do you make the cache global?  If you do, you have threading issues accessing it, and you have the terrible problem that you don't know when is a good time to discard items from the cache.  Ultimately this was my strongest point: at the Linq data level we do not know enough about the query patterns to choose a good caching policy, and, as I've written many times before, when it comes to caching, good policy is crucial.  In fact, analogously, we had to make changes in the regular expression caching system back in Whidbey precisely because we were seeing cases where our caching assumptions were resulting in catastrophically bad performance (Mid Life Crisis due to retained compiled regular expressions in our cache) --  I didn't want to make that mistake again.

So that's roughly how we end up at our final design.  Any Linq to SQL user can choose how much or how little caching is done.  They control the lifetime, they can choose an easy mechanism (e.g. stuff it in a static variable forever) or a complicated recycling method depending on their needs.  Usually the simple choice is adequate.  And they can easily choose which queries to compile and which to just run in the usual manner.
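As an illustration of the "static variable forever" option combined with a "variable part", here is a minimal sketch. The Northwinds context, the trimmed Order mapping, the OrderSummary type, and the Queries class are all hypothetical. Only CompiledQuery.Compile and the DataContext base class come from Linq to SQL itself.

```csharp
using System;
using System.Data.Linq;
using System.Data.Linq.Mapping;
using System.Linq;

// Hypothetical minimal mapping; the real Orders table has more columns.
[Table(Name = "Orders")]
public class Order
{
    [Column(IsPrimaryKey = true)] public int OrderID;
    [Column] public string CustomerID;
    [Column] public DateTime? ShippedDate;
}

// Hypothetical strongly typed context.
public class Northwinds : DataContext
{
    public Northwinds(string connection) : base(connection) { }
    public Table<Order> Orders { get { return GetTable<Order>(); } }
}

// A named result type, because a static field cannot have an anonymous type.
public class OrderSummary
{
    public int OrderID { get; set; }
    public DateTime? ShippedDate { get; set; }
}

public static class Queries
{
    // Created once and kept for the life of the program. The customerId
    // parameter is the variable part; the shape of the query is fixed.
    public static readonly Func<Northwinds, string, IQueryable<OrderSummary>> OrdersByCustomer =
        CompiledQuery.Compile((Northwinds nw, string customerId) =>
            from o in nw.Orders
            where o.CustomerID == customerId
            select new OrderSummary { OrderID = o.OrderID, ShippedDate = o.ShippedDate });

    // Each call may use a fresh DataContext; the compiled query is shared.
    public static void PrintOrders(string connection, string customerId)
    {
        using (Northwinds nw = new Northwinds(connection))
        {
            foreach (OrderSummary s in OrdersByCustomer(nw, customerId))
                Console.WriteLine("{0} {1}", s.OrderID, s.ShippedDate);
        }
    }
}
```

The lifetime decision is entirely in the caller's hands here: the query lives as long as the static field does, and swapping in a recycling scheme later only means changing where the delegate is stored.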

Let's get back to the overhead of compiled queries.  Besides the one-time cost of creating the delegate there is also a little extra delegate indirection on each run of the query, plus the more complicated thing we have to do: since the compiled query can span DataContexts we have to make sure that the DataContext we are being given in any particular execution of a compiled query is compatible with the DataContext that was provided when the query was compiled the first time.

Other than that the code path is basically the same, which means you come out ahead pretty quickly.  This test case was, as usual, designed to magnify the typical overheads so we can observe them.  The result set is a small number of rows, it is always the same rows, the database is local, and the query itself is a simple one.  All the usual costs of doing a query have been minimized.  In the wild you would expect the query to be more complicated, the database to be remote, the actual data returned to be larger and not always the same data.  This of course both reduces the benefit of compilation in the first place but also, as a consolation prize, reduces the marginal overhead.

In short, if you expect to reuse the query at all, there is no performance-related reason not to compile it.
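To make "come out ahead" concrete, here is a rough sketch of the break-even arithmetic.  It assumes a very simple cost model (a fixed one-time compile cost plus a constant cost per run), and the names BreakEven and Runs are invented for the example; plug in your own measurements, in whatever unit you like, as long as all three use the same one.

```csharp
using System;

public static class BreakEven
{
    // Returns the smallest number of runs n for which
    //   compileCost + n * compiledPerRun < n * directPerRun,
    // or null when the compiled form is never cheaper.
    public static long? Runs(double compileCost, double compiledPerRun, double directPerRun)
    {
        double savedPerRun = directPerRun - compiledPerRun;
        if (savedPerRun <= 0) return null;   // each compiled run saves nothing
        return (long)Math.Floor(compileCost / savedPerRun) + 1;
    }
}
```

With purely hypothetical inputs of 10 units to compile, 1 unit per compiled run and 3 units per direct run, Runs returns 6: after five runs the two approaches are tied at 15 units, and the sixth run is the first where the compiled form is strictly cheaper.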

Posted by ricom | 15 Comments

Performance Quiz #13 -- Linq to SQL compiled queries cost

I've written a few articles about Linq now, and you know I was a big fan of compiled queries in Linq, but what do they cost?  Or more specifically, how many times do you have to use a compiled query in order for the cost of compilation to pay for itself?  With regular expressions, for instance, it's usually a mistake to compile a regular expression if you only intend to match it against a fairly small amount of text.

Let's do a specific experiment to get an idea.  Using the ubiquitous Northwinds database and getting the same data over and over to control for the cost of the database accesses (and magnify any Linq overheads) we run this query:

var q = (from o in nw.Orders
         select new {
             OrderID = o.OrderID,
             CustomerID = o.CustomerID,
             EmployeeID = o.EmployeeID,
             ShippedDate = o.ShippedDate
         }).Take(5);

and compare it against:

var fq = CompiledQuery.Compile
(
    (Northwinds nw) =>
            (from o in nw.Orders
            select new
                   {
                       OrderID = o.OrderID,
                       CustomerID = o.CustomerID,
                       EmployeeID = o.EmployeeID,
                       ShippedDate = o.ShippedDate
                   }).Take(5)
);

So now the quiz:  How many times do I have to use the compiled version of the query in order for it to be cheaper to compile than it would have been to just use the original query directly?

Posted by ricom | 28 Comments

Rico's Instrumentation Aphorisms

A few months ago, Mary Gray of the Management Practices Team came to talk to me about good practices for creating performance counters and doing measurements generally.  She interviewed me on the topic for about an hour and was madly scribbling notes the whole time while I talked a mile a minute.  What's below is a slightly edited version of what she took away from the interview.  I thought it was interesting enough that you guys might like to see it, so here it is.

Mary, thank you for allowing me to share.

Adding instrumentation in the form of events and performance counters to your software is one of the most important things you can do to make your component or application more manageable by IT personnel, more supportable by CSS, and more easily tuned and debugged by developers and testers.

The OS already has performance counters you can use for resources such as CPU, disk, memory, and network. These are the primary resources that you will need to track for most software. You don't need to add a lot of performance counters or events to your software for raw resources; the trick is to correlate what your software thinks it is doing with the operating system resource impact of those operations.

Judiciously added instrumentation allows you to more easily pinpoint the states that lead to poor performance or failure. Well designed events inform monitoring software and IT admins about whether the software is operating normally, in a degraded state, or has failed completely. Good tracing events in conjunction with perf counters related to the work of the software allow diagnosis and tracking of trends. Events targeted to the administrator can identify what work was being done for which user context when a failure occurs.

Rico's Instrumentation Aphorisms

Instrumentation aphorism #1: Attribute the cost, don't describe it.

To attribute costs, the important word is "correlation". You want to correlate what your software thinks it is doing to what the operating system knows about resource usage. You can use (e.g.) ETW tracing events to mark the beginning and end of "jobs" or transactions in your software's work life.
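As a rough sketch of marking the beginning and end of a job: in managed code, one way to emit such events is the System.Diagnostics.Tracing.EventSource class, which on Windows writes to ETW.  This assumes a .NET version that includes EventSource; the source name "Example-OrderService", the event names and the Jobs.Run helper are all hypothetical.

```csharp
using System;
using System.Diagnostics.Tracing;

[EventSource(Name = "Example-OrderService")]
public sealed class JobEvents : EventSource
{
    public static readonly JobEvents Log = new JobEvents();

    // Marks the beginning of one unit of work.
    [Event(1, Level = EventLevel.Informational)]
    public void JobBegin(string jobName, int jobId) { WriteEvent(1, jobName, jobId); }

    // Marks the end of the same unit of work.
    [Event(2, Level = EventLevel.Informational)]
    public void JobEnd(string jobName, int jobId) { WriteEvent(2, jobName, jobId); }
}

public static class Jobs
{
    // Brackets `work` with begin/end events, even if it throws.
    public static T Run<T>(string jobName, int jobId, Func<T> work)
    {
        JobEvents.Log.JobBegin(jobName, jobId);
        try { return work(); }
        finally { JobEvents.Log.JobEnd(jobName, jobId); }
    }
}
```

Anything the OS records between a begin event and its matching end event (CPU, disk, network) can then be attributed to that job by whatever tool collects the trace.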

What is a “transaction” in the runtime life of your software?  Is it a mouse click event?  A business transaction of some kind?  An HTML page delivered to the user?  A database query performed?  Whatever it is, look at your critical resources and consider the cost per unit of work.  For example, consider CPU cycles per transaction, network bytes per transaction, disk i/o’s per transaction, etc.  
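A minimal sketch of the per-unit-of-work idea, assuming the "transaction" can be invoked as a delegate inside the current process: it reads process CPU time and elapsed time around a batch of transactions and divides by the count.  The names UnitCost, CostMeter, PerUnit and Measure are illustrative only.

```csharp
using System;
using System.Diagnostics;

public sealed class UnitCost
{
    public double CpuMsPerUnit;   // process CPU milliseconds per transaction
    public double WallMsPerUnit;  // elapsed milliseconds per transaction
}

public static class CostMeter
{
    // Pure arithmetic: total resource consumed divided by units of work done.
    public static double PerUnit(double total, long units)
    {
        if (units <= 0) throw new ArgumentOutOfRangeException(nameof(units));
        return total / units;
    }

    // Runs `transaction` `count` times and attributes the process CPU time
    // and elapsed time consumed during the batch to each run.
    public static UnitCost Measure(Action transaction, int count)
    {
        Process self = Process.GetCurrentProcess();
        TimeSpan cpuBefore = self.TotalProcessorTime;
        Stopwatch wall = Stopwatch.StartNew();
        for (int i = 0; i < count; i++) transaction();
        wall.Stop();
        self.Refresh();
        TimeSpan cpuAfter = self.TotalProcessorTime;
        return new UnitCost
        {
            CpuMsPerUnit = PerUnit((cpuAfter - cpuBefore).TotalMilliseconds, count),
            WallMsPerUnit = PerUnit(wall.Elapsed.TotalMilliseconds, count)
        };
    }
}
```

The same division applies to any other resource you can read before and after: network bytes per transaction, disk I/Os per transaction, and so on.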

Tracing events, to be useful, need to be associated with the higher level transactions of the software rather than with the life of single objects. You can have too many events, events at too low a level, or events marking time intervals that are too short to be useful. Using events and perf counters this way just creates overwhelming noise and does not allow you to see trends easily.

This correlation between the work of the software and resources should also be used in administrator events marking changes of state, not just tracing events. Administrators are running the software for a reason and have every interest in knowing why (e.g.) MOM 2005 is reporting a degraded state for it - why the system is slowing down or why the software is banging away at the disks continuously. These events, as opposed to tracing events, should provide actionable advice.

Instrumentation aphorism #2: Account for consumption.

To account for consumption, you will want to calculate rates rather than just measure occurrences. Look at the resource costs per unit of work. What is your software accomplishing to justify its consumption of CPU, memory, disk, network, or other resources? Expressing resource costs in a per-unit-of-work fashion will help you to see which costs are reasonable and which are problems. You want to be able to trace, or to inform administrators, what resources are being used.

The operating system already gives you a variety of performance counters that measure CPU consumption, disk I/Os, memory usage, and network activity. These are the primary measuring sticks you need to compare to what your software is doing. The performance counters you add are most useful when they calculate the rate of work accomplished.
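As an illustration of a rate rather than an occurrence count, the sketch below takes two samples of a cumulative "work completed" counter and turns them into units of work per second.  The CounterSample and RateCounter types are invented for the example and are not part of any Windows or .NET counter API.

```csharp
using System;

public struct CounterSample
{
    public long Count;          // cumulative units of work completed so far
    public TimeSpan Timestamp;  // when the sample was taken
}

public static class RateCounter
{
    // Units of work per second between two samples of a cumulative counter.
    public static double PerSecond(CounterSample earlier, CounterSample later)
    {
        double seconds = (later.Timestamp - earlier.Timestamp).TotalSeconds;
        if (seconds <= 0)
            throw new ArgumentException("later sample must follow earlier sample");
        return (later.Count - earlier.Count) / seconds;
    }
}
```

For example, a counter that reads 100 at time zero and 350 five seconds later gives 50 units of work per second, a number you can set directly against the OS counters for CPU, disk, memory and network over the same interval.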

You can generate tracing events that tell you the rate of work and what the user context is. The combination of events that mark the start and end of transactions with rate counters allows developers and CSS people to pinpoint the resource that is being pinched and wrecking performance or starting a death spiral to failure.

If you are considering a design which sequesters a chunk of memory that your software manages itself, you may want to think twice about it. The OS already tracks memory resources. If you manage your own memory, then you have to duplicate the operating system plumbing to be able to diagnose performance problems and failures. The programming and maintenance costs for this may outweigh the hoped-for design benefits.

Posted by ricom | 1 Comments