ClickHouse embedded database

Starting with version 4.3 Staffcop Enterprise has a built-in ability to enable the new Clickhouse database.

This open source DBMS allows the ability of a hybrid operating mode for StaffCop Enterprise. This means that the most frequently used part of the data is stored in the fast Clickhouse database, and most of the remaining data remains in reliable Postgresql.

In practice, the hybrid mode allows you to achieve outstanding performance even on databases with hundreds of millions and even several billion entries. In our first pilot project, the speed of working with real data in some cases increased by 50 times.

Clickhouse database requires additional resources for operation

  • additional space on hard disk, about 5-10% of the Postgres database size
  • and memory, the consumption of which depends heavily on the size of the database and the load, but usually does not exceed the minimum system requirements for this database

Enable Clickhouse Database

By default, StaffCop Enterprise works in the usual mode, using the Postgresql database, to enable the new database and work in hybrid mode, starting with StaffCop Enterprise 4.3, you need to run the command for enabling the new mode in Linux command line

sudo staffcop clickhouse-install

The installation script will download a new Clickhouse database from the Internet and install it in the system. Then it will start the process of migrating the necessary data from Postgresql to Clickhouse.

When the new database is installed, StaffCop Enterprise will start working in a hybrid mode of operation, the data in the Constructor and parts of events will appear with a delay of 5-10 minutes from real time, but the entire system will work much faster than it was before, especially if you have everything placed on usual HDDs.

You can notice the work of the new database in lines of the log in the Linux command line

sudo tail -f /var/log/staffcop/process.log | grep -i pump

If you see lines with the output

2018-06-09 17:15:55,397 [INFO] process: Clickhouse pump done
2018-06-09 17:17:03,800 [INFO] process: Clickhouse pump done

This will mean that the Clickhouse is installed and works.

After that an additional menu of switching database mode in the interface will open:

../_images/clickhouse_1.png

Also in Menu -> About, you can see that the current database has changed from Postgresql to Clickhouse.

../_images/clickhouse_2.png

Note

It should be understood that for large databases (from 10 and more GB), the migration process to operation hybrid mode can take considerable time, from dozens of minutes to several hours, at this time events and reports may not be displayed in the web-based admin interface.

Dedicate large memory volume for Clickhouse

There may be situations when the amount of memory dedicated to Clickhouse doesn’t suffice. In this case you can enlarge the amount of memory. To do that open file /etc/clickhouse-server/users.xml

<yandex>
<!-- Profiles of settings. -->
<profiles>
<!-- Default settings. -->
<default>
<!-- Maximum memory usage for processing single query, in bytes. -->
<max_memory_usage>10000000000</max_memory_usage>

In the line

<max_memory_usage>10000000000</max_memory_usage>

Specify the amount of memory you want to dedicate to StaffCop, in bytes.

After resizing the database you need to restart staffcop services.

sudo service clickhouse restart
sudo staffcop restart

Disable Clickhouse database

To disable the hybrid database operation mode in StaffCop Enterprise you just need to stop the Cliskhouse service.

sudo service clickhouse-server stop && sudo service staffcop restart

To completely remove the database from the system - you need to remove the Clickhouse package.

sudo service clickhouse-server stop && sudo apt-get -y purge clickhouse-server-common && sudo service staffcop restart

If the connection to the Clickhouse database is lost, StaffCop Enterprise will work with Postgresql.

Assessing the resources occupied by the Clickhouse database

To estimate the amount of disk space occupied by a new database, you can run this query

sudo du -sm /var/lib/clickhouse/

The number displayed indicates the size of the Clickhouse database in megabytes.

To estimate how many bytes the new database takes in RAM - you need to run the command

sudo ps -aux | grep -i clickhouse-server | awk '{print $5}'

advanced commands for managing Clickhouse in StaffCop