Diagnosing performance issues for Flow Production Tracking Enterprise Docker
Local installations of Flow Production Tracking are no longer offered. This documentation is intended only for those with existing instances of Flow Production Tracking Enterprise Docker.
Contents
- Scope
- Flow Production Tracking architecture at a glance
- Understanding performance issues
- Investigating performance issues
- Generic debugging tips
This page explains how to troubleshoot performance issues in Flow Production Tracking.
For urgent issues that require immediate support, please open a ticket on Flow Production Tracking Support.
Scope
This document is intended for Flow Production Tracking Enterprise System Administrators (local installs).
Flow Production Tracking architecture at a glance
At a very high level, Flow Production Tracking is composed of two main components:
- Application
- Database
Application - Containers inside which the Web Application Components are installed (Apache, Passenger, Rails, Flow Production Tracking App). Depending on your topology, you can have many application containers, specialized for serving the Web UI and API requests.
Database - The server on which the database server is installed. It can be the same server as the application server.
Requests are received by Apache and passed along to a Passenger process running an instance of the Flow Production Tracking Application. If no process is available, the request is queued. Each Flow Production Tracking process then issues queries to the database and returns a response once the work completes. A single request to Flow Production Tracking usually results in many queries to the database.
Understanding performance issues
The main causes of performance and connectivity issues are:
- Database locks
- Slow queries
- Contention on Flow Production Tracking servers
Database locks - While rare, some operations can lock tables, preventing even read access to these tables.
Slow queries - Queries that take a long time to execute not only slow down the pages and scripts that run them, they can also affect other pages and queries. Slow queries monopolize Passenger and database connections, and can force the database server to swap a lot of data between memory and disk. Swapping is a costly operation that affects all other requests being handled by the database.
System contention - High load can cause contention at various levels. The first thing to do when Flow Production Tracking is having periodic performance issues is to identify where that contention happens.
These issues have a similar impact on Flow Production Tracking: if more requests are made than the system can process, the request queue fills up and requests are eventually rejected. End users see this as a 503 error, which can mislead them into thinking the site is down for maintenance.
Investigating performance issues
Configure database logs
The first step is to configure PostgreSQL logs to output useful information.
When an incident happens, it is good practice to persist the database logs, as they are rotated on a weekly basis.
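The settings below are an illustrative sketch, not an official Flow Production Tracking recommendation; the data directory path matches the one used in the grep examples below, and the thresholds are arbitrary.
# In /var/lib/pgsql/15/data/postgresql.conf
log_lock_waits = on                  # log statements that wait longer than deadlock_timeout for a lock
log_min_duration_statement = 1000    # log statements slower than 1 second (value in milliseconds)
log_line_prefix = '%m [%p] %u@%d '   # timestamp, process ID, user, and database on every log line

# Reload the configuration without restarting PostgreSQL
psql <site_name> -c "SELECT pg_reload_conf();"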
Database locks
In situations where Flow Production Tracking becomes unresponsive, the first thing to investigate is database locks. The main blocking operation in Flow Production Tracking is adding new fields. Adding a new field requires PostgreSQL to lock the modified table, effectively blocking any subsequent requests on that table. The best way to find locks is in the database logs.
If logs are properly configured, the following commands on the database logs will help surface lock waits:
# Find requests that were waiting for a lock
grep "waiting for AccessShareLock" /var/lib/pgsql/15/data/pg_logs/postgres-Thu.log
# Find ALTER statements, which add the new field's column to the table
grep "ALTER" /var/lib/pgsql/15/data/pg_logs/postgres-Thu.log
Resolution
This behavior is a PostgreSQL limitation, not a Flow Production Tracking one. Flow Production Tracking is dynamic, allowing new fields to be added to a table. The downside is that access to the table is locked for the duration of the modification.
The only way around it is to create new fields outside your studio’s active hours.
Slow queries
From a simplified point of view, Flow Production Tracking is a GUI over a database, and it allows users to craft their own queries. If these queries are not built carefully, they can be slow. When slow requests come in faster than the system can process them, the request queue fills up and requests are eventually rejected.
Identifying slowest queries
The following script reports the most expensive queries, the most expensive pages, the users consuming the most resources, and more. Simply feed it a Flow Production Tracking production log and a simple report will be produced. See the script header for more details on how to use it.
It can be run on the complete log, or on specific time ranges extracted with grep to investigate specific periods.
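For example, assuming the production log lines carry timestamps in a YYYY-MM-DD HH format (the exact format may differ on your install), a one-hour window can be extracted before feeding it to the script; the file names below are illustrative.
# Extract the 14:00-15:00 window from the production log
grep "2024-05-21 14:" production.log > production-2024-05-21-14h.log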
Optimizing slow queries
When the slowest queries have been identified, the following actions can be taken:
- Optimize the query. See Flow Production Tracking query performance recommendations.
- Execute that query less often. This is a fallback solution for API calls that can’t be optimized.
System contention
Most common causes of contention are:
- Passenger contention
- Database contention
- High load on server
Passenger contention - Too many requests are executed and Passenger can’t process them all. Requests end up being queued, and are eventually rejected, resulting in 503 errors.
Database contention - Too many connections to the database at the same time. This usually happens because too many requests are being made, or because the requests are too slow to execute.
High load on server - The server is under heavy load. When this happens, it may also result in Passenger or Database contention.
Flow Production Tracking can scale by increasing the number of Application Servers or by improving the Database Server specifications. Most Flow Production Tracking clusters are single-server clusters, with all components running on the same server. As your studio’s Flow Production Tracking usage grows, this configuration may no longer be performant enough. Before scaling, however, the bottleneck must be identified.
Key metrics can be gathered on your Flow Production Tracking servers to help diagnose these issues. If not already in place, consider using monitoring tools on your servers. Key metrics to keep an eye on are:
- System Load
- CPU Usage
- Memory Usage
- Disk I/O
- Passenger Queue Size
- Postgres Active Connections
While most of these are standard system metrics, the Passenger queue size and the number of active PostgreSQL connections require Passenger- and PostgreSQL-specific tooling, as shown below.
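As a hedged example of collecting those two metrics, assuming passenger-status is available inside the application container and psql can reach the database:
# Passenger queue size, from the application container
passenger-status | grep "Requests in queue"

# PostgreSQL active connections
psql <site_name> -c "SELECT count(*) FROM pg_stat_activity WHERE state = 'active';"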
For more information on PostgreSQL monitoring, please refer to https://wiki.postgresql.org/wiki/Monitoring.
Generic debugging tips
pg_top
pg_top is the classic UNIX program top specialized for PostgreSQL. In addition to the information the classic top gives, it allows you to:
- View how many reads and writes are currently in progress.
- View currently running SQL statements of a process.
- View the query plan of a currently running SQL statement.
- View locks held by a process.
- View user index and table statistics.
We highly recommend you install it on your database server, if not already installed.
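As a sketch, assuming pg_top is available from your distribution's packages (the package name and the connection flags may vary between versions):
# Install pg_top
yum install -y pg_top

# Watch activity for the site database as the postgres user
pg_top -d <site_name> -U postgres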
Find running database queries
From the application server:
psql <site_name> -c "SELECT datname,query FROM pg_stat_activity where state = 'active';"