Following the data flow
While there are many different techniques and strategies for increasing system performance, the starting point is always improving visibility into the system's performance characteristics.
This usually means instrumenting the system so that statistics and telemetry can be gathered. This might be done via:
- application-specific counters published via tools like `statsd`, `Grafana`, `Prometheus`, etc.,
- existing tools like `VisualVM` for Java or `strace` for Unix processes,
- combinations of code changes and tools like `dtrace`,
- system measurements like CPU usage, network tx/rx values, etc.
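As a rough illustration of the first point, an application-specific counter can be as simple as an atomically updated value that a reporting path periodically publishes. This is a minimal sketch, assuming a hypothetical `RequestStats` class and a StatsD-style `name:value|c` line format; it is not tied to any particular metrics library:

```java
import java.util.concurrent.atomic.AtomicLong;

// Minimal sketch of application-specific counters (hypothetical class).
// A real deployment would flush these values to a collector such as statsd.
public class RequestStats {
    private final AtomicLong requests = new AtomicLong();
    private final AtomicLong errors = new AtomicLong();

    public void recordRequest() { requests.incrementAndGet(); }
    public void recordError()   { errors.incrementAndGet(); }

    // Render the counters in a StatsD-like "name:value|c" line format.
    public String publish() {
        return "requests:" + requests.get() + "|c\n"
             + "errors:" + errors.get() + "|c";
    }

    public static void main(String[] args) {
        RequestStats stats = new RequestStats();
        stats.recordRequest();
        stats.recordRequest();
        stats.recordError();
        System.out.println(stats.publish());
    }
}
```

The key property is that recording a sample is cheap and thread-safe, so the instrumentation can sit on hot paths without distorting the behaviour being measured.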
With this data in hand, we can start to look for opportunities for improvement. While this is an exploratory exercise, and finding optimisations cannot be guaranteed, there is often low-hanging fruit. This might include:
- accidental inclusion of debug logging,
- double copying of memory buffers,
- unnecessary allocation of objects,
- overly conservative synchronisation locks,
- etc.
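The debug-logging case is a common one: even when debug output is disabled, an unguarded log call still pays for string concatenation on every invocation. A minimal sketch, using a hypothetical `Logger` class (not any particular logging library), shows the guarded form:

```java
// Hypothetical minimal logger illustrating the cost of unguarded debug calls.
public class Logger {
    private final boolean debugEnabled;

    public Logger(boolean debugEnabled) { this.debugEnabled = debugEnabled; }

    public boolean isDebugEnabled() { return debugEnabled; }

    public void debug(String message) {
        if (debugEnabled) System.out.println(message);
    }

    public static void main(String[] args) {
        Logger log = new Logger(false);
        long item = 42;

        // Costly on hot paths: the concatenation runs even though
        // debug output is disabled.
        log.debug("processing item " + item);

        // Cheap: the guard skips the string construction entirely
        // when debug output is disabled.
        if (log.isDebugEnabled()) {
            log.debug("processing item " + item);
        }
    }
}
```

Many logging frameworks offer parameterised messages for the same reason; the point here is simply that "accidental" work can hide inside calls that look free.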
Once these avenues have been exhausted, we might find that the performance gains are sufficient for the business needs. That is, the results may lower latency to acceptable levels for individual users, or resource usage might make it viable for the business to engage in the next phase of scaling.
However, where the performance gains are still not sufficient, we can work towards a deeper understanding of the data flows and look for other candidates for change. This might involve more technical changes such as:
- converting from boxed types to primitive types in order to avoid memory allocation overhead or reduce the overall memory footprint,
- restructuring subsystems to reduce unnecessary data movement,
- inverting the use of I/O buffers so that populated buffers can be managed across the full I/O marshalling and transport stack rather than incurring additional data copies,
- converting data structures from lock-based synchronisation to lock-free implementations based on CAS operations and memory barriers,
- preallocating bookkeeping data structures and explicitly managing resource pools rather than offloading this work to the garbage collection layer,
- introducing back-pressure between concurrently operating subsystems so that they can run closer to capacity without breaching resource limits,
- etc.
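As one example of the lock-free direction, a lock-based stack can be converted to a Treiber-style stack built around a compare-and-set retry loop. This is a simplified sketch; production implementations must also address issues such as the ABA problem and memory reclamation:

```java
import java.util.concurrent.atomic.AtomicReference;

// Simplified Treiber stack: push/pop retry with CAS instead of taking a lock.
public class LockFreeStack<T> {
    private static final class Node<T> {
        final T value;
        Node<T> next;
        Node(T value) { this.value = value; }
    }

    private final AtomicReference<Node<T>> top = new AtomicReference<>();

    public void push(T value) {
        Node<T> node = new Node<>(value);
        do {
            node.next = top.get();
        } while (!top.compareAndSet(node.next, node)); // retry on contention
    }

    public T pop() {
        Node<T> current;
        do {
            current = top.get();
            if (current == null) return null; // empty stack
        } while (!top.compareAndSet(current, current.next));
        return current.value;
    }

    public static void main(String[] args) {
        LockFreeStack<Integer> stack = new LockFreeStack<>();
        stack.push(1);
        stack.push(2);
        System.out.println(stack.pop()); // last pushed value comes off first
    }
}
```

Under contention, a failed `compareAndSet` simply re-reads the top and retries, so no thread ever blocks holding a lock; the trade-off is wasted retries when contention is very high.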
As can be seen, there are many different techniques and avenues that can be explored to improve the performance of a system. Sometimes these performance gains span multiple orders of magnitude, and make the difference to the bottom line that moves a business from being unviable to having a competitive advantage.