ClickHouse embedded database¶
Starting with version 4.3 Staffcop Enterprise has a built-in ability to enable the new Clickhouse database.
This open source DBMS allows the ability of a hybrid operating mode for StaffCop Enterprise. This means that the most frequently used part of the data is stored in the fast Clickhouse database, and most of the remaining data remains in reliable Postgresql.
In practice, the hybrid mode allows you to achieve outstanding performance even on databases with hundreds of millions and even several billion entries. In our first pilot project, the speed of working with real data in some cases increased by 50 times.
Clickhouse database requires additional resources for operation
- additional space on hard disk, about 5-10% of the Postgres database size
- and memory, the consumption of which depends heavily on the size of the database and the load, but usually does not exceed the minimum system requirements for this database
Enable Clickhouse Database¶
By default, StaffCop Enterprise works in the usual mode, using the Postgresql database, to enable the new database and work in hybrid mode, starting with StaffCop Enterprise 4.3, you need to run the command for enabling the new mode in Linux command line
sudo staffcop clickhouse-install
The installation script will download a new Clickhouse database from the Internet and install it in the system. Then it will start the process of migrating the necessary data from Postgresql to Clickhouse.
When the new database is installed, StaffCop Enterprise will start working in a hybrid mode of operation, the data in the Constructor and parts of events will appear with a delay of 5-10 minutes from real time, but the entire system will work much faster than it was before, especially if you have everything placed on usual HDDs.
You can notice the work of the new database in lines of the log in the Linux command line
sudo tail -f /var/log/staffcop/process.log | grep -i pump
If you see lines with the output
2018-06-09 17:15:55,397 [INFO] process: Clickhouse pump done
2018-06-09 17:17:03,800 [INFO] process: Clickhouse pump done
This will mean that the Clickhouse is installed and works.
After that an additional menu of switching database mode in the interface will open:
Also in Menu -> About, you can see that the current database has changed from Postgresql to Clickhouse.
Note
It should be understood that for large databases (from 10 and more GB), the migration process to operation hybrid mode can take considerable time, from dozens of minutes to several hours, at this time events and reports may not be displayed in the web-based admin interface.
Dedicate large memory volume for Clickhouse¶
There may be situations when the amount of memory dedicated to Clickhouse doesn’t suffice. In this case you can enlarge the amount of memory. To do that open file /etc/clickhouse-server/users.xml
<yandex>
<!-- Profiles of settings. -->
<profiles>
<!-- Default settings. -->
<default>
<!-- Maximum memory usage for processing single query, in bytes. -->
<max_memory_usage>10000000000</max_memory_usage>
In the line
<max_memory_usage>10000000000</max_memory_usage>
Specify the amount of memory you want to dedicate to StaffCop, in bytes.
After resizing the database you need to restart staffcop services.
sudo service clickhouse restart
sudo staffcop restart
Disable Clickhouse database¶
To disable the hybrid database operation mode in StaffCop Enterprise you just need to stop the Cliskhouse service.
sudo service clickhouse-server stop && sudo service staffcop restart
To completely remove the database from the system - you need to remove the Clickhouse package.
sudo service clickhouse-server stop && sudo apt-get -y purge clickhouse-server-common && sudo service staffcop restart
If the connection to the Clickhouse database is lost, StaffCop Enterprise will work with Postgresql.
Assessing the resources occupied by the Clickhouse database¶
To estimate the amount of disk space occupied by a new database, you can run this query
sudo du -sm /var/lib/clickhouse/
The number displayed indicates the size of the Clickhouse database in megabytes.
To estimate how many bytes the new database takes in RAM - you need to run the command
sudo ps -aux | grep -i clickhouse-server | awk '{print $5}'