Predicting and measuring processor utilization



By paul ~ October 8th, 2007. Filed under: Best Practices, FAQ, Tips & Tricks.

One of the areas where I have been kept busiest as a Foresight modeling consultant is in creating models that predict processor utilization.  Frequently, system designers or customers want to design-in some CPU “head-room” capacity.  This may be motivated either by a desire to over-design slightly in order to ensure sufficient capacity in peak load situations and minor software revs, or a need to plan for future growth.  These requirements appear in a number of forms, but they all imply the need to predict and measure processor utilization.

In complex embedded systems with multi-team applications development, such as software-defined radio, “design to CPU utilization” can be very challenging.  One cannot simply allocate processor bandwidth to software units in the same way that one can allocate memory.  (Actually, you can, if the system is simple enough.  You can even enforce it like you can with memory.  An RTOS like Green Hills’ Integrity will let you create different “time” partitions.  Unfortunately, this methodology breaks down if the system is complex and has unpredictable loading behaviors.)  It becomes necessary to predict (and often to track) compliance with the CPU utilization requirement from the early stages of software design through test and signoff.

If you’re faced with a CPU utilization requirement, what are your options?  Here is a quick summary of a few methodologies that you should consider:

Adding Up MIPS

This mechanism is the easiest, most obvious, and most often employed CPU utilization prediction methodology.  In its simplest form, it is the application of the following formula:

% Util = SUM_OF_ALL_TASKS(Task Instructions * Task Rate) / Available MIPS

This is typically accomplished with an Excel workbook model. I've seen all levels of sophistication, from the simple to the "WOW!".  For each task (or at least each task considered significant), an estimate is made of the number of lines of code that will be executed per invocation and the frequency of execution. These models usually contain some conversion factor that turns lines of code into machine instructions.
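
For illustration only, here is the same arithmetic as a short C sketch rather than a spreadsheet.  The task names, instruction counts, rates, and MIPS budget below are purely hypothetical placeholders; a real model would substitute your own estimates and lines-of-code conversion factor.

    #include <stdio.h>

    /* One row of the "workbook": estimated instructions per execution
       and executions per second for each significant task.            */
    struct task {
        const char *name;
        double instructions_per_exec;  /* e.g. lines-of-code * conversion factor */
        double execs_per_second;       /* task rate */
    };

    int main(void)
    {
        /* Hypothetical numbers -- replace with your own estimates. */
        struct task tasks[] = {
            { "sampler",    20000.0, 1000.0 },
            { "filter",    150000.0,  100.0 },
            { "reporting", 500000.0,    2.0 },
        };
        double available_mips = 200.0;   /* processor budget, in MIPS */
        double demand_mips = 0.0;

        for (size_t i = 0; i < sizeof tasks / sizeof tasks[0]; i++)
            demand_mips += tasks[i].instructions_per_exec *
                           tasks[i].execs_per_second / 1e6;

        printf("Demand: %.1f MIPS, Utilization: %.1f%%\n",
               demand_mips, 100.0 * demand_mips / available_mips);
        return 0;
    }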

Unfortunately, any given configuration of the workbook reflects only one of many possible paths through the system.  This often gives a very poor picture of system performance.  It is possible for a 50% utilized processor to miss a hard deadline, making the system non-functional.  Also, it may be that two tasks never run concurrently, or that the shortest path through a task is taken 90% of the time.  If the workbook model does not accurately reflect these facts, the system can be severely over-designed.  Except for the simplest systems, this methodology should not be used alone.

Advantages

  • Quick and easy for simple systems
  • No special software or tools necessary

Disadvantages

  • Does not take into account task interdependence or deadlines
  • If worst-case task rate is used, can result in severe over-design
  • Usually does not take into account “hidden” costs (inter-task communication, RTOS overhead, etc.)
  • Task cost estimates (lines of code) consider only an average or worst-case path through each task, not all paths.
  • Gives a false sense of security while telling the designer virtually nothing about how well the system will work.
  • If you overcome all of the previous disadvantages by creating a more sophisticated, complex model, your model will be expensive to create, hard to use, and very expensive to maintain.

Schedulability Analysis

Schedulability analysis verifies that a system of tasks is "schedulable", which means that it is possible for every task to be scheduled in such a way that it meets its deadline.  The most common means of accomplishing this is Rate-Monotonic Analysis (RMA).  For an example of a tool that supports/enables schedulability analysis, see PERTS, by Tri-Pacific Software, Inc.

Schedulability analysis is a static analysis mechanism that requires inputs similar to the "Adding Up MIPS" methodology described above.  However, it adds important information in the form of the sequencing of tasks, which takes task interaction into account.  This significantly improves the quality of the results in ensuring acceptable system performance.  For certain kinds of systems, schedulability analysis can ensure a "correct-by-design" system.  At the very least, RMA can be used to develop priority and schedule information for parts of even complex systems.
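
To give a flavor of what such a static check looks like, here is a small C sketch of the classic Liu and Layland utilization-bound test used in basic RMA (independent periodic tasks, deadlines equal to periods, rate-monotonic priorities).  The task set is hypothetical, and real tools such as PERTS go well beyond this simple test.

    #include <math.h>
    #include <stdio.h>

    /* A periodic task: worst-case execution time and period, in the same units. */
    struct task { double wcet; double period; };

    int main(void)
    {
        /* Hypothetical task set (times in milliseconds). */
        struct task tasks[] = { { 1.0, 4.0 }, { 2.0, 10.0 }, { 5.0, 30.0 } };
        size_t n = sizeof tasks / sizeof tasks[0];

        double util = 0.0;
        for (size_t i = 0; i < n; i++)
            util += tasks[i].wcet / tasks[i].period;

        /* Liu & Layland bound: n * (2^(1/n) - 1).  If total utilization is at
           or below this bound, the set is schedulable under RMA; above the
           bound, a more exact test (e.g. response-time analysis) is needed.   */
        double bound = n * (pow(2.0, 1.0 / (double)n) - 1.0);

        printf("Utilization %.3f vs RMA bound %.3f -> %s\n", util, bound,
               util <= bound ? "schedulable" : "needs further analysis");
        return 0;
    }

(Compile with -lm for the pow() call.)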

Advantages

  • Pretty quick and easy
  • Can yield schedule and priority information that can be invaluable in design
  • Works well for simple embedded systems

Disadvantages

  • Cannot easily handle complex interactions between tasks (control flow, for instance)
  • Prefers that tasks be periodic or describable as periodic
  • Many real-time systems do not fit and cannot be analyzed
  • Cannot give latency or other detailed information about system performance

Note that this is a very brief description of a pretty complex subject.  For more information on the capabilities and limitations of RMA, please visit the links above.

Resource-Based Simulation

In what I call "Resource-Based Simulation," a behavioral simulation model of the system is constructed that includes data and control flow, and this model is mapped to a platform model that includes the processors and other resources on which the application is implemented.  This is the methodology supported by the Foresight tool offered by Foresight Systems M & S.  Foresight provides a powerful, simple graphical modeling language for capturing system, application, platform, and resource behavior.  Both software and hardware behaviors can be captured simultaneously.  In fact, the behavior can be described in an implementation-independent fashion, and alternative mappings can be used to evaluate the optimal implementation strategy.  Much more information regarding the capabilities of Foresight can be found on the Foresight web site, above.

By having a complete data and control flow model mapped to a resource-based platform model, not only can processor and bus utilization be measured directly at the resources, but latency, throughput, and failure/error behaviors can also be observed.  Traditional simulation-based tradeoff analysis can be employed for system optimization.  The independence of the behavioral model and the platform model makes it possible to evaluate the application on a variety of platforms, or multiple applications on a single platform.  This can be invaluable for portability testing.

In the early stages of the system design, this methodology relies, like the other methodologies, on estimates of processing cost for each path through a component or thread.  One of the advantages of resource-based performance modeling, however, is that the control flow can be easily and accurately represented and these estimates applied along it.  The estimates can be refined over the course of the product design process in order to improve accuracy.  At the early stages of the design, the model can be used for budgeting and allocating processor bandwidth to software components.

One of the great advantages of having a simulatable model is the ability to exercise it with various load models.  Load models are models of the inputs that "load" the system in various ways.  These can be used to exercise the system with typical, worst-case, or exceptional inputs to predict system behavior under those circumstances.  Processor utilization can be measured directly at the resources under all conditions.  Sensitivity analysis can be performed on the estimated software cost parameters to identify the components to which the system is most sensitive.  Importantly, if I have a 50% processor utilization requirement, I can simply reduce the available processor bandwidth by 50%, re-run the simulations, and verify that the system still meets its deadlines and functions correctly.  This is a better verification of a 50% growth requirement than any other mechanism.
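
This is not Foresight, but as a minimal, hypothetical C sketch of the same principle, the following simulates periodic tasks tick by tick under fixed-priority preemptive scheduling and reports utilization and deadline misses.  Halving speed_factor crudely mimics taking away 50% of the processor to test a head-room requirement; a real resource-based model would of course also capture data flow, control flow, load models, and much richer resource behavior.

    #include <stdio.h>

    #define NTASKS    3
    #define SIM_TICKS 60000   /* length of the simulation, in ticks */

    /* Periodic task, listed in priority order (index 0 = highest). */
    struct task {
        int period;        /* ticks between releases; deadline = next release */
        int wcet;          /* execution demand per release, in ticks          */
        int remaining;     /* work left from released jobs                    */
        int next_release;  /* tick of the next release                        */
        int misses;        /* deadline misses observed                        */
    };

    int main(void)
    {
        double speed_factor = 1.0;   /* set to 0.5 to model a 50% head-room check */

        /* Hypothetical task set, shortest period (highest rate) first. */
        struct task t[NTASKS] = {
            {  10,  2, 0, 0, 0 },
            {  40, 10, 0, 0, 0 },
            { 200, 60, 0, 0, 0 },
        };
        long busy = 0;

        for (int tick = 0; tick < SIM_TICKS; tick++) {
            for (int i = 0; i < NTASKS; i++) {
                if (tick == t[i].next_release) {
                    if (t[i].remaining > 0)   /* previous job still unfinished */
                        t[i].misses++;
                    /* Derating processor speed inflates the effective demand. */
                    t[i].remaining += (int)(t[i].wcet / speed_factor);
                    t[i].next_release += t[i].period;
                }
            }
            /* Run one tick of the highest-priority task with pending work. */
            for (int i = 0; i < NTASKS; i++) {
                if (t[i].remaining > 0) {
                    t[i].remaining--;
                    busy++;
                    break;
                }
            }
        }

        printf("Utilization: %.1f%%\n", 100.0 * busy / SIM_TICKS);
        for (int i = 0; i < NTASKS; i++)
            printf("Task %d: %d deadline misses\n", i, t[i].misses);
        return 0;
    }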

A properly designed resource-based performance model will be highly flexible and able to predict not only the performance of the system in the current revision, but for the inevitable revisions and configuration options to come.

Advantages

  • Accurate prediction of resource utilization
  • Accurate prediction of latency, throughput and other timeliness metrics, even in the presence of resource contention and scheduling policies
  • Direct modeling of layered resource behavior is possible, increasing flexibility
  • Functionality-to-resource mapping tradeoffs can be performed
  • Model is very versatile and can (should?) be used for model-driven design
  • Model looks like the system and is easy to maintain

Disadvantages

  • Modeling is typically more expensive than the static analysis methodologies
  • Model validation can be challenging
  • The tools are more expensive and harder to learn to use effectively

I have used resource-based performance modeling (the Foresight tool specifically) extensively in very complex embedded system design and believe that it provides more “bang-for-the-buck” than the other two methods.  That being said, for many systems design efforts, I would still use RMA to help with priority assignment and use the simulation model to verify the strategy.

Model validation and calibration can be accomplished as implementation progresses by replacing estimates with measurements and comparing against measured results from various testing activities at the component and subsystem levels.

Direct Measurement

Analysis and modeling are fine for design and prediction, but what do you do if you or your customer require measurement for signoff?  If you have such a requirement, you're going to have to design in a measurement mechanism of some kind.  Your RTOS or development tool vendor may provide something that you can easily use.  If not, this article has some great tips on measuring processor utilization.

I recommend that you make processor utilization one of the standard metrics that you gather from all of your testing activities.  Build the mechanism in, and gather the information all the time.  This information can be used from even the simplest tests that exercise only a thread of functionality to help validate/calibrate simulation models.
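
One common, vendor-neutral mechanism is an idle counter: the lowest-priority idle loop increments a counter, a periodic timer samples and clears it, and the ratio of the sampled count to a calibrated "idle-only" count gives the busy percentage.  The C sketch below uses hypothetical hook names and a hypothetical calibration constant; your RTOS's idle hook and timer APIs will differ, and the calibration value must be re-measured whenever the compiler settings or clock change.

    #include <stdint.h>

    /* Incremented continuously by the idle loop (lowest-priority context). */
    static volatile uint32_t idle_count;

    /* Measured once with no application load running, over one full
       measurement window: the count an otherwise idle CPU can reach.  */
    #define IDLE_COUNT_MAX  1250000u   /* hypothetical calibration value */

    /* Most recently computed utilization, in percent. */
    static volatile uint32_t cpu_util_percent;

    /* Hook called by the RTOS idle task (name and mechanism vary by RTOS). */
    void idle_hook(void)
    {
        idle_count++;
    }

    /* Called from a periodic timer interrupt, once per measurement window
       (e.g. every 100 ms), while the idle hook runs in the background.    */
    void utilization_sample(void)
    {
        uint32_t idle = idle_count;
        idle_count = 0;

        if (idle > IDLE_COUNT_MAX)
            idle = IDLE_COUNT_MAX;     /* guard against calibration drift */

        cpu_util_percent = 100u - (100u * idle) / IDLE_COUNT_MAX;
    }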

Summary

Processor utilization can have a direct economic impact and is a very important design consideration.  Getting processor sizing wrong can result in outright product failure, or in products that are unnecessarily large, hot, expensive, and short on battery life.  Get it right the first time by designing for processor utilization from the outset, using the appropriate CPU utilization prediction methodologies.  Resource-based performance analysis (as supported by Foresight) is often the best methodology available for design-time performance analysis and processor utilization prediction.
