Architecture Quality Attributes — Performance

Oct 29, 2020

Performance is about time and the system’s ability to meet timing requirements. For a web-based system, performance can be measured as the number of events the system can service in a given period (throughput) or as the time taken to service an individual event (latency).
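As a rough illustration of these two measures, the sketch below times a handler over a batch of events. The names `measure`, `handler` and `events` are hypothetical and chosen only for this example; a real system would normally rely on its monitoring stack rather than a loop like this.

```python
import time
from statistics import mean


def measure(handler, events):
    """Return (throughput in events/second, mean latency in seconds).

    Assumes `events` is non-empty and that events are handled one at a time.
    """
    latencies = []
    start = time.perf_counter()
    for event in events:
        t0 = time.perf_counter()
        handler(event)  # the work being measured
        latencies.append(time.perf_counter() - t0)
    elapsed = time.perf_counter() - start
    return len(latencies) / elapsed, mean(latencies)
```

Calling `measure(my_handler, batch)` returns both numbers at once, which makes it easy to see that a system can have good throughput and still have poor latency for individual events, or the reverse.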

Performance is often linked with scalability, that is, increasing your system’s capacity for work while it continues to perform well. Even if performance is not explicitly mentioned, it is part of the basic requirements and expectations of the user.

Tactics for performance

At any instant between an event’s arrival in the system and the completion of its processing, the system is either working on the event or blocked waiting on some resource it needs.

Tactics for performance can therefore be considered in two basic categories: processing time and blocked time.

Processing Time

Processing consumes resources, either hardware or software, and that takes time. Each resource behaves differently as its utilization approaches its capacity limit.
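To give a feel for what happens near the capacity limit, here is a small sketch based on the textbook M/M/1 queueing model, where mean response time is service time divided by (1 - utilization). The model is an assumption made for this example (one server, random arrivals, exponentially distributed service times), and the function name is hypothetical. Real resources will behave differently in detail, but the shape of the curve is typical.

```python
def mm1_response_time(service_time, utilization):
    """Mean response time of an M/M/1 queue: S / (1 - rho)."""
    if not 0 <= utilization < 1:
        raise ValueError("utilization must be in [0, 1)")
    return service_time / (1 - utilization)


# Assumed example: 10 ms of pure processing time per event.
for rho in (0.5, 0.8, 0.9, 0.95, 0.99):
    millis = mm1_response_time(0.010, rho) * 1000
    print(f"utilization {rho:.0%}: about {millis:.0f} ms")
```

Under these assumptions the same 10 ms of work takes about 20 ms at 50% utilization, about 100 ms at 90%, and about one second at 99%. The extra time is spent waiting, not processing.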

Blocked Time

A computation can be blocked for the following reasons:

Contention for resources — Many resources can be used by only one client at a time, which means other clients have to wait, leading to idle (blocked) time.

Availability of resources — Even in the absence of contention, a computation cannot proceed if a resource it needs is unavailable, for example when a server goes offline.

Dependency on other computations — A computation may have to wait because it must synchronize with the results of another computation.

With this background, we can group the tactics into the following categories.

Control resource demand

One way to improve performance is to manage demand, which can be done in the following ways:

Throttle input events — We can throttle input events, limiting the rate at which they are accepted for processing.

Limit event response — We can limit the system’s response by queueing input events and responding to them at a controlled rate.

Prioritize events — If not all events have the same priority, we can process the high-priority ones first. The catch is that when resources are scarce, low-priority events may starve.

Reduce overhead — Remove intermediaries, for example reduce network latency by co-locating services. This is a trade-off between modifiability and performance.

Bound execution time — Place a limit on how much execution time an event can consume, for example with a timeout. The bulkhead pattern is related: it caps the resources that one class of work can consume.

Increase resource efficiency — Improve the algorithm or the efficiency of resource use, and so reduce resource overhead.
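As one possible way to throttle input events, the sketch below uses a token bucket, a common rate-limiting technique that this article does not prescribe. The class name, the parameters and the injectable `clock` are all assumptions made for the example.

```python
import time


class TokenBucket:
    """Admit about `rate` events per second, with bursts up to `capacity`."""

    def __init__(self, rate, capacity, clock=time.monotonic):
        self.rate = rate
        self.capacity = capacity
        self.clock = clock          # injectable so the bucket can be tested
        self.tokens = capacity      # start with a full bucket
        self.last = clock()

    def allow(self):
        """Return True if the event may be processed now, False otherwise."""
        now = self.clock()
        # Refill in proportion to elapsed time, never above capacity.
        refill = (now - self.last) * self.rate
        self.tokens = min(self.capacity, self.tokens + refill)
        self.last = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False
```

A caller would check `bucket.allow()` before handling each event. What to do with a rejected event (drop it, queue it, or return an error to the client) is a separate design decision and depends on the system.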

Manage resources

Even if the input events cannot be controlled, as in most public web applications, we can manage the resources.

Increase resources — Add processor capacity or use faster networks.

Introduce concurrency — If computations can be performed in parallel, blocked time is reduced. For this we can use multiple threads or parallelize wherever possible.

Maintain multiple copies of computations — Multiple servers in a client-server model are replicas of a computation. The purpose of this replication is to reduce the contention that would occur if all requests were serviced by a single instance.

Maintain multiple copies of data — Identify which data is accessed most and keep a copy of it in storage with faster access. For example, frequently read data can be cached to reduce load on the database.

Bound queue sizes — This controls the maximum number of queued arrivals. If you adopt this tactic, you should decide what happens when the queue overflows.

Schedule resources — Whenever there is contention for a resource, the resource must be scheduled. We can use strategies such as FIFO, fixed-priority scheduling, or dynamic-priority scheduling.
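The bounded-queue tactic, together with an explicit overflow decision, can be sketched with the standard library `queue.Queue`. The helper `submit` and the policy of rejecting the newest arrival are assumptions for this example. Other policies, such as dropping the oldest entry or blocking the producer, are equally valid.

```python
import queue


def submit(q, event):
    """Try to enqueue an event; return False when the queue is full."""
    try:
        q.put_nowait(event)
        return True
    except queue.Full:
        # Overflow policy assumed here: reject the new arrival.
        return False


pending = queue.Queue(maxsize=2)  # at most two queued arrivals
results = [submit(pending, e) for e in ("a", "b", "c")]
print(results)  # the third event is rejected because the queue is full
```

Because `queue.Queue` is FIFO, the events that were accepted are later served in arrival order, which also gives the simplest of the scheduling strategies listed above.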

Design checklist

Allocation of responsibilities — Determine which of the system’s responsibilities will involve heavy loading or time-critical operations.

Co-ordination model — Choose a communication and co-ordination model that supports concurrency and ensures the required performance can be obtained.

Data model — Determine which parts of the data model will come under heavy load, and whether maintaining multiple copies of them or partitioning them will benefit performance.

Mapping among architectural elements — Determine whether co-locating elements, or a particular choice of threads of control, will improve performance.

Resource Management — Determine which resources are critical, and make sure they are monitored and managed under both normal and heavy load.

Binding time — For elements that can be bound or linked late, determine the time the binding will take and the additional overhead that late binding introduces.

Choice of Technology — Will your choice of technology inhibit or enable your scheduling policy, priority settings, or policies for reducing demand?

Performance is about managing system resources in the face of particular load patterns of input events, to achieve system behaviour that is acceptable to end users.

Written by Jijoy