ClickHouse embedded database

Starting from version 4.3 Staffcop Enterprise has a built-in ability to enable Clickhouse database.

This open source DBMS allows the ability of a hybrid operating mode for StaffCop Enterprise. This means that the most frequently used part of the data is stored in the fast Clickhouse database, and most of the remaining data remains in reliable Postgresql.

In practice, the hybrid mode allows you to achieve outstanding performance even on databases with hundreds of millions and even several billion entries. In our first pilot implementation project, the speed of working with real data in some cases increased by 50 times.

Clickhouse database requires additional resources for operation:

  • additional space on hard disk, about 5-10% of the PostgreSQL database size

  • and memory, the consumption of which depends heavily on the size of the database and the load, but usually does not exceed the minimum system requirements for this database

Enable Clickhouse Database

By default, StaffCop Enterprise works in the usual mode using the Postgresql database. To enable the new database and work in hybrid mode, starting with StaffCop Enterprise 4.3, you need to run the command for enabling the new mode in Linux command line:

sudo staffcop clickhouse-install

The installation script will download Clickhouse from the Internet and install it on the system. Then it will start the process of migrating the necessary data from PostgreSQLto Clickhouse.

When the new database is installed, StaffCop Enterprise will start working in a hybrid mode of operation, the data in the Constructor and parts of events will appear with a delay of 5-10 minutes from real time, but the entire system will work much faster than it was before, especially if you have everything placed on usual HDDs.

The migrating process of a large database may take some time. To trace it, open the page “Admin - Server status”

This page will reflect the name of the polcicy and the amount of events that haven’t yet been processed. By default, policies are processed each 5 minutes. The page automatically updates when the amount of events still needed to be migrated to ClickHouse is close to zero and you can use the admin interface to work with the new database.

After that, an additional menu for switching database modes will appear in the admin panel:

../_images/clickhouse_1.png

Although in Admin - About you will see that the database is changed from PostgreSQL to ClickHouse.

../_images/clickhouse_2.png

Note

Keep in mind that for large databases (from 10 and more GB), the migration process to operation hybrid mode can take considerable time, from dozens of minutes to several hours, at this time events and reports may not be displayed in the web-based admin interface.

Dedicate large memory volume for Clickhouse

There may be situations when the amount of memory dedicated to Clickhouse doesn’t suffice. In this case you can enlarge the amount of memory. To do that open file /etc/clickhouse-server/users.xml

<yandex>
<!-- Profiles of settings. -->
<profiles>
<!-- Default settings. -->
<default>
<!-- Maximum memory usage for processing single query, in bytes. -->
<max_memory_usage>10000000000</max_memory_usage>

In the line

<max_memory_usage>10000000000</max_memory_usage>

Specify the amount of memory you want to dedicate to ClickHouse, in bytes.

After resizing the database you need to restart staffcop services.

sudo service clickhouse-server restart
sudo staffcop restart

Disable Clickhouse database

To disable the hybrid database operation mode in StaffCop Enterprise you just need to stop the Cliskhouse service.

sudo service clickhouse-server stop && sudo service staffcop restart

To completely remove the database from the system - you need to remove the ClickHouse package.

sudo service clickhouse-server stop && sudo apt-get -y purge clickhouse-server-common && sudo service staffcop restart

If the connection to the ClickHouse database is lost, StaffCop Enterprise will work with Postgresql.

Assessing the resources occupied by the Clickhouse database

To estimate the amount of disk space occupied by a new database, you can run this query

sudo du -sm /var/lib/clickhouse/

The number displayed indicates the size of the Clickhouse database in megabytes.

To estimate how many bytes the new database takes in RAM - you need to run the command

sudo ps -aux | grep -i clickhouse-server | awk '{print $5}'

Advanced commands for managing Clickhouse в Staffcop