Open questions about performance

Introduction

We live in a fantastic time, with cheap resources and advanced chipsets in almost every device. Moore’s law¹ predicted a bright future of continued growth in chip performance. That prediction promised a future where we could meet our performance expectations without changing our code, simply by making affordable investments in hardware infrastructure.

While this approach works well for most projects, after seeing tens or even hundreds of them, I have come to believe that these performance-neutral decisions only postpone architectural complexity.

How do we measure performance?

When we speak about the performance of an application, we can mean many different parts that affect our business flow. The specific topic to discuss depends on the use case we want to review and the business problem(s) that our application solves. It could be any mix of:

Network
Memory (RAM²)
Disk
CPU

Here are a few examples of performance measurements:

CPU utilization per task³
Execution time per task
Number of tasks executed per second⁴

If you measure CPU utilization per task³, this introductory article will likely not help you, because you have probably already applied all the usual optimizations. We will focus on the two other types of measurement.

Execution time per task

This is a straightforward way to measure the task.

We find a couple of code paths in the task and identify payloads that trigger them.
We trigger the task with the identified parameters.
We measure the execution time of our action (without transport time):
- We can collect the start and end times in nanoseconds (or micro- or milliseconds, depending on how fast the task is).
- We store the difference somewhere.
Or we can measure the execution time as seen by an external observer:
- For example, we can run curl with the time command.
- When the command finishes, it shows us the timing results.

$ time sleep 1

________________________________________________________
Executed in    1.01 secs      fish           external
   usr time    1.18 millis    0.16 millis    1.02 millis
   sys time    4.25 millis    1.88 millis    2.38 millis
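
The in-process variant, collecting start and end times and storing the difference, could look roughly like the sketch below. It is only an illustration in Python: `measure_ns` and `sample_task` are hypothetical names, and `sample_task` stands in for whatever real task you want to measure.

```python
import time
from typing import Callable


def measure_ns(task: Callable[[], object]) -> int:
    """Return how long one call of task() took, in nanoseconds."""
    start = time.perf_counter_ns()  # monotonic, high-resolution clock
    task()
    end = time.perf_counter_ns()
    return end - start


def sample_task() -> None:
    # Hypothetical stand-in for the real task (e.g. handling one request).
    sum(i * i for i in range(100_000))


# A single run is noisy, so we collect several samples.
durations_ns = [measure_ns(sample_task) for _ in range(5)]
for duration in durations_ns:
    print(f"{duration / 1_000_000:.3f} ms")
```

Because the clock is read right around the call, transport time is excluded, unlike the external measurement with time.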

Number of tasks executed per second

We will not dive deep into this topic, but it is worth mentioning some tools that can help you explore it.

To calculate an RPS⁴ number, we can use one of the many available tools, such as k6, JMeter, ApacheBench (ab), Siege, etc.

After running the chosen tool with the required configuration, we will get a result with some numbers, like RPS, average request execution time, total time, number of performed requests, etc.
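
To make these numbers less abstract, here is a minimal sketch of how they relate to each other. It is not a replacement for the tools above: it runs a task in a single thread inside one process, and `measure_throughput` and its parameters are hypothetical names chosen for this illustration.

```python
import time
from typing import Callable


def measure_throughput(task: Callable[[], object], window_s: float = 0.2) -> dict:
    """Run task() back to back for about window_s seconds and summarize."""
    count = 0
    start = time.perf_counter()
    deadline = start + window_s
    while time.perf_counter() < deadline:
        task()
        count += 1
    total_s = time.perf_counter() - start
    return {
        "requests": count,                 # number of performed calls
        "total_s": total_s,                # total time of the run
        "rps": count / total_s,            # calls per second
        "avg_ms": total_s / count * 1000,  # average time per call
    }


def sample_task() -> None:
    # Hypothetical stand-in for one request.
    sum(range(10_000))


print(measure_throughput(sample_task))
```

Real load-testing tools add concurrency, warm-up, and network transport on top of this, so their results will differ from such an in-process loop.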

Ongoing metrics

This is not a synthetic measurement but real data generated by actual users.

This is one of the most critical points. It is barely related to the topic, but I couldn’t skip it. Always collect at least high-level metrics for the real traffic your applications process. Use public services or Open Source solutions, but always collect high-level metrics per endpoint.

Ideally, we want to collect more detailed metrics for all incoming and outgoing IO⁵.
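
As a rough idea of what high-level metrics per endpoint can mean, the sketch below counts calls, errors, and timings in memory. Everything in it is an assumption made for illustration: the `track` decorator, the `metrics` dictionary, and the `list_users` handler are hypothetical, and a real setup would send these numbers to a metrics service instead of keeping them in a dictionary.

```python
import time
from collections import defaultdict
from functools import wraps

# In-memory store: endpoint name -> aggregated numbers (illustration only).
metrics = defaultdict(
    lambda: {"count": 0, "errors": 0, "total_ms": 0.0, "max_ms": 0.0}
)


def track(endpoint: str):
    """Decorator that records call count, errors, and timings per endpoint."""
    def decorator(handler):
        @wraps(handler)
        def wrapper(*args, **kwargs):
            start = time.perf_counter()
            try:
                return handler(*args, **kwargs)
            except Exception:
                metrics[endpoint]["errors"] += 1
                raise
            finally:
                # Runs for both successful and failed calls.
                elapsed_ms = (time.perf_counter() - start) * 1000
                entry = metrics[endpoint]
                entry["count"] += 1
                entry["total_ms"] += elapsed_ms
                entry["max_ms"] = max(entry["max_ms"], elapsed_ms)
        return wrapper
    return decorator


@track("GET /users")
def list_users():
    # Hypothetical handler standing in for a real endpoint.
    return ["alice", "bob"]


list_users()
list_users()
print(dict(metrics))
```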

Why is it important to measure performance?

There are many reasons to know the performance of the application:

Predict the hardware investments needed to scale in case of growth
Understand the limitations of an application
Find technical issues that could drive customers away from your service
- For example, response time is more than 1 second
Find places for improvement to:
- Solve technical issues
- Solve scaling issues
- Save on hardware by decreasing the number of required resources

Complex question

When we already have a system whose performance we want to improve significantly, this requires:

A deep understanding of performance analysis: collecting metrics, profiling the application, reviewing code, etc.
Good knowledge of the product (good to have).
An understanding of how to refactor with minimal effect on business processes.

Sooner or later, all applications require these complicated steps because of how they have evolved. People usually call projects at this stage legacy⁶. Unfortunately, this is the unavoidable fate of every project.

At the same time, we can keep writing efficient code so that performance stays as good as possible and performance issues do not appear prematurely.

The best practices

Following the best practices during everyday work can be challenging. But in reality, anything repeated many times turns into a habit that we apply without extra mental effort.

We will not discuss the actual best practices here, because that is a big enough topic for a book or a couple of books. Instead, snippets and suggestions will be covered in later articles.

However, basic knowledge of the following areas can help postpone tech debt:

Computer science
Performance testing
Code simplicity
Benchmarking

Follow the blog for upcoming articles with real snippets and examples.

Conclusion

This article was written as a reference for upcoming snippets and articles, to avoid repetition. It can be a good starting point for someone. It may also be updated over time to make it more useful. Any feedback is welcome!

¹ Moore’s law is not a law. It’s an observation. And it looks like it has already stopped working. Anyway, it’s too important a part of computing history to forget the name.
² RAM (random-access memory) is a type of memory usually used to store temporary data related to currently running applications.
³ By task, I mean any work to be done in the scope of the process - for example, HTTP request processing, RabbitMQ message processing, scheduled cron jobs, etc.
⁴ RPS (requests per second) or QPS (queries per second) is a measurement of how many requests we can process per second in a given environment, such as a cloud node, a laptop, or a dedicated server.
⁵ IO (input/output) is the communication between an information processing system and the outside world.
⁶ In this article, the word legacy does not carry any negative connotations. It is used here to highlight that the project is at the life-cycle stage where refactoring and improvements are required for future support and feature growth.