
Logpoint Service Parameter Tuning

Basu Nepal, September 30, 2024

Introduction

Logpoint is a platform consisting of multiple products. Each product is made up of interconnected components. These individual components, each with its own functionality, are called Services.

A Service is designed to perform a specific function and works with other services to provide a fully functional Logpoint. If any one of the services is not performing efficiently, it can impact other dependent services, leading to issues in Logpoint's functionalities. For that reason, it is essential for these services to always function efficiently.

To ensure the overall efficiency of Logpoint, these services can be tuned, for example, by adjusting the percentage of CPU and RAM a service is allowed to use. Increasing or decreasing the CPU and RAM allocation prevents bottlenecks for the service while also avoiding underutilization of the allocated resources. If one of the services in your Logpoint does not use many resources, its CPU and RAM allocation can be set low so that other services can use that memory and CPU. However, if that service is under heavy load, it can be tuned to use more CPU and RAM so that it does not become a system bottleneck. For that reason, service parameters are configurable to match system load and requirements.

Every service has its own parameters, stored in its Service Config File. Configuring these parameters can optimize the performance of a service, leading to an overall efficient system.

Service Config File

Each service has its own config file that stores all the configuration parameters the service needs to run. These config files are generated every time the system boots up or whenever changes in the MongoDB collection need to be reflected in the config files. If the configuration changes and the service does not respond to the change, the config file may be the cause. In such cases, Logpoint recommends manually regenerating the config file, which ensures that the service picks up the changed configuration parameters.

Service config regeneration command:

 /opt/immune/installed/config_updater/apps/config_updater/regenerate_config.sh

Autotuner

Autotuner is a service that automatically tunes service parameters depending on system requirements. It uses a defined threshold to determine whether a parameter needs tuning. For example, if the Premerger service is using up all of its allocated RAM and is running out of memory, Autotuner can tune the parameter to increase the allocated RAM and keep the service running smoothly. From Logpoint v7.2.0 onwards, Autotuner has been significantly enhanced to tune services based on their requirements and the complexity of the tasks they are performing. If a service is not delivering results at the set efficiency, Autotuner can add memory to the service to increase its efficiency and improve both the service's and the overall system's performance. Autotuner can now dynamically restart a service or allocate additional memory to it if its performance falls below a defined threshold, preventing service crashes and ensuring stable performance.

Manual Tuning of Service

Autotuner can tune service parameters to prevent crashes and increase efficiency, but it is not perfect. Depending on the use case and system requirements, human intervention through manual tuning is essential to ensure every service works efficiently. Manual tuning is performed through the lp_services_config.json file.

LP Services Config

You can also tune the services from one config file, lp_services_config.json, located at /opt/immune/storage.

To do so,

  1. Access Logpoint via the terminal using ssh support@<Machine IP> and enter the support password.
  2. Go to /opt/immune/storage and check whether lp_services_config.json exists. By default, it does not.
  3. Create a file named lp_services_config.json and add the parameters in proper JSON format.
  4. Regenerate the config using the command: /opt/immune/bin/lido /opt/immune/installed/config-updater/apps/config_updater/regenerate_all.sh
  5. Restart the corresponding service: sudo sv restart /opt/immune/etc/service/<service_name>

Syntax:

{
"<service_name>": {"<parameter>": "<value>"}
}

Tuning one service parameter:

{
"merger": {"heap_size": 10240}
}

Tuning multiple service parameters:

{
"premerger": {"heap_size": 4096, "jsonThreads":4},
"index_searcher_PaloAlto":{"heap_size":10240, "num_of_indexing_threads": 5, "num_index_cache": 32},
"normalizer": {"no_of_services": 20},
"file_keeper":{"no_of_threads": 3,"heap_size": 2048}
}
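
Putting the steps above together, a minimal end-to-end session might look like the following sketch. The machine IP, the service being tuned, and the values are illustrative; adjust them to your environment.

 ssh support@<Machine IP>
 cd /opt/immune/storage
 # Create lp_services_config.json with the parameters to tune
 # (root privileges may be required depending on directory permissions):
 cat > lp_services_config.json <<'EOF'
 {
 "merger": {"heap_size": 10240}
 }
 EOF
 # Regenerate the configs and restart the tuned service:
 /opt/immune/bin/lido /opt/immune/installed/config-updater/apps/config_updater/regenerate_all.sh
 sudo sv restart /opt/immune/etc/service/merger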

Frequently Tuned Services

These are the commonly tuned services:

  • Premerger
  • Merger
  • Analyzer
  • Indexsearcher
  • Normalizer
  • File Keeper
  • Syslog Collector
  • Enrich_db_populator
  • Autotuner
  • Batch Processor
  • Enrichment Service

Premerger

It is responsible for processing the queries used in alerts and dashboards. It sends the query request, receives the query result, aggregates the result, and gives the response. Based on the response, the dashboard is populated or an alert incident is triggered.

If Premerger is not running efficiently, it affects alerts and dashboards. For that reason, it is crucial to tune its parameters to ensure optimal performance.

Example

{
"premerger": {"heap_size": 1024}
}
heap_size
  • What is it? Amount of heap memory allocated to Premerger.
  • When to tune? G1 Garbage Collector (G1GC): if Full GC is running continuously. Shenandoah GC: if memory consumption by the service equals the allocated heap size or if Out Of Memory (OOM) occurs frequently for the service.
  • Values: Depends on system load, EPS, and number of searches. Min. 512 MB; Max. 32 GB (Recommended).
  • Effect: Overall system memory usage increases or decreases.

max_free_heap_ratio
  • What is it? Percentage of the available heap that the service can hold after running GC.
  • When to tune? Only for G1GC; only tuned if heap_size is tuned.
  • Values: 60 (Recommended).
  • Effect: Excess memory is released back to the OS.

maxClauseCount
  • What is it? Maximum number of clauses permitted per BooleanQuery.
  • When to tune? When there is a large number of entries in a search query list.
  • Values: Default: 1024. Recommended: number of entries in the biggest list * 2.5.
  • Effect: Search may be slow if the value is increased.
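
As an illustration of the maxClauseCount guideline, if the largest list used in a Premerger query has 2,000 entries (a hypothetical figure), the recommended value would be roughly 2000 * 2.5 = 5000:

{
"premerger": {"maxClauseCount": 5000}
}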

Merger

Merger takes a request from the requester, forwards it to IndexSearcher, merges all the responses, and returns the combined response to the requester. The requester can be another service such as Premerger, Websearcher, or the API.

If Merger is not running efficiently, it affects alerts, dashboards, search, and reports.

Example

{
"merger": {"heap_size": 1024}
}
heap_size
  • What is it? Amount of heap memory allocated to Merger.
  • When to tune? G1 Garbage Collector (G1GC): if Full GC is running continuously. Shenandoah GC: if memory consumption by the service equals the allocated heap size or if Out Of Memory (OOM) occurs frequently for the service.
  • Values: Depends on system load, EPS, and number of searches. Min. 512 MB; Max. 32 GB (Recommended).
  • Effect: Overall system memory usage increases or decreases.

max_free_heap_ratio
  • What is it? Percentage of the available heap that the service can hold after running GC.
  • When to tune? Only for G1GC; only tuned if heap_size is tuned.
  • Values: 60 (Recommended).
  • Effect: Excess memory is released back to the OS.

no_of_threads
  • What is it? Number of Merger services to run in parallel in different threads.
  • When to tune? When Logpoint has enough resources and a large number of searches.
  • Values: Default: depends on the number of cores. Min: 1. Max: number of cores / 2 (Recommended).
  • Effect: CPU and memory usage increase.
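
For example, on a hypothetical 16-core machine the recommended ceiling for no_of_threads would be 16 / 2 = 8, which could be combined with a larger heap (values are illustrative):

{
"merger": {"heap_size": 8192, "no_of_threads": 8}
}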

 

Index Searcher

Index Searcher performs two functions: indexing and searching. It indexes the normalized log and searches the indexed log. It is a repo-dependent service, which means each repo has its own index searcher. 

Indexing

It receives the normalized log forwarded by the file keeper, indexes it based on log_ts, adds index_ts to the key-value pair, and stores it in the storage. index_ts is the time at which the log is indexed by the index searcher.

Searching

When a search request comes to the index searcher, it searches the index and returns the log as per the search query.

If IndexSearcher is down or performing poorly, it can affect alerts, search, and dashboards. At worst, incidents aren’t generated, search doesn’t work and dashboards aren’t populated.

Example

{
"index_searcher_default": {"heap_size": 1024}
}
heap_size
  • What is it? Amount of heap memory allocated to Index Searcher.
  • When to tune? G1 Garbage Collector (G1GC): if Full GC is running continuously. Shenandoah GC: if memory consumption by the service equals the allocated heap size or if Out Of Memory (OOM) occurs frequently for the service.
  • Values: Depends on system load, EPS, and number of searches. Min. 512 MB; Max. 32 GB (Recommended).
  • Effect: Overall system memory usage increases or decreases.

max_free_heap_ratio
  • What is it? Percentage of the available heap that the service can hold after running GC.
  • When to tune? Only for G1GC; only tuned if heap_size is tuned.
  • Values: 60 (Recommended).
  • Effect: Excess memory is released back to the OS.

merge_factor
  • What is it? Defines how often segments are merged. With the default value of 10, a new segment is created for every 10 documents, and when the number of segments reaches 10, the segments themselves are merged into a single segment.
  • When to tune? When there are a lot of active merging threads, or when merging takes significant time.
  • Values: Default: 10. Increase when there are many active merging threads or the merge time is high; decrease when there are few active merging threads or the merge time is low.
  • Effect: Increasing it increases CPU and Disk Read/Write.

num_of_indexing_threads
  • What is it? Number of indexing threads to run for indexing logs.
  • When to tune? When the indexing MPS is high and a queue is seen in the index searcher.
  • Values: Default: Max(1, number of cores / 8). Suggested tuning: Min(number of cores / 2, 20).
  • Effect: CPU usage increases or decreases.

num_live_threads
  • What is it? Number of searching threads to run for search.
  • When to tune? When there is a large number of search requests in the index searcher's benchmarker log, or if the number of responses is significantly lower than the number of requests in the premerger benchmarker log.
  • Values: Default: Max(10, number of cores / 2). Suggested tuning: up to the number of cores.
  • Effect: CPU usage increases or decreases; heap size increases or decreases.

num_index_cache
  • What is it? Maximum number of indexes that the cache can hold.
  • When to tune? If Logpoint has a large number of live searches with a large time range and there is sufficient system memory available.
  • Values: Depends on the average time range/search interval of the alerts. Default: Max(5, number of cores / 2).
  • Effect: Heap memory of the index searcher increases.

maxClauseCount
  • What is it? Maximum number of clauses permitted per BooleanQuery.
  • When to tune? When there is a large number of entries in a search query list.
  • Values: Default: 1024. Recommended: number of entries in the largest list * 2.5.
  • Effect: Search may be slow if the value is increased.

num_of_db_indexing_threads
  • What is it? Number of indexing threads to run for indexing delayed logs (logs older than max_normal_time_period defined in File Keeper).
  • When to tune? Increase when the indexing MPS is high and a queue is seen in the index searcher; decrease when the indexing MPS is low and there is no queue.
  • Values: Depends on system load. Default: Max(1, number of cores / 8). Suggested tuning: Min(number of cores / 2, 20).
  • Effect: CPU usage increases or decreases.
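
As a sketch, on a hypothetical 32-core system the default for num_of_indexing_threads would be Max(1, 32 / 8) = 4 and the suggested tuning Min(32 / 2, 20) = 16. Since Index Searcher is repo-dependent, the entry targets a specific repo; the repo name below is illustrative:

{
"index_searcher_windows": {"heap_size": 8192, "num_of_indexing_threads": 16}
}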

Analyzer

It processes pattern-finding queries. When Merger receives a pattern-finding query, it forwards it to Analyzer, which can process a maximum of 10 of these queries at a time.

If the Analyzer service performs poorly, alerts and dashboards that use pattern-finding queries are affected.

Example

{
"analyzer": {"heap_size": 1024}
}
heap_size
  • What is it? Amount of heap memory allocated to Analyzer.
  • When to tune? G1 Garbage Collector (G1GC): if Full GC is running continuously. Shenandoah GC: if memory consumption by the service equals the allocated heap size or if Out Of Memory (OOM) occurs frequently for the service.
  • Values: Depends on system load, EPS, and number of searches. Min. 512 MB; Max. 32 GB (Recommended).
  • Effect: Overall system memory usage increases or decreases.

max_free_heap_ratio
  • What is it? Percentage of the available heap that the service can hold after running GC.
  • When to tune? Only for G1GC; only tuned if heap_size is tuned.
  • Values: 60 (Recommended).
  • Effect: Excess memory is released back to the OS.

allowable_concurrent_analysis
  • What is it? Number of correlation queries that can run concurrently.
  • When to tune? When the system is processing a large number of correlation queries, or when it is processing far fewer correlation queries than the set value of allowable_concurrent_analysis.
  • Values: Default: 10. Increase if a large number of correlation queries run in the system; decrease if fewer correlation queries run than the set value, or if you want to run the analyzer service without allocating too much memory.
  • Effect: CPU, memory, and Disk Read/Write increase or decrease.

max_num_queuable_analysis
  • What is it? Number of correlation queries that can be queued in the buffer.
  • When to tune? When the system is processing a large number of correlation queries.
  • Values: Default: 100. Increase if a large number of correlation queries run in the system.
  • Effect: CPU, memory, and Disk Read/Write increase or decrease.

searcher_response_timeout
  • What is it? Number of seconds after which the searcher/merger response is timed out.
  • When to tune? When a large number of queries are timing out.
  • Values: Default: 300 s.
  • Effect: CPU, memory, and Disk Read/Write increase or decrease.

maxClauseCount
  • What is it? Maximum number of clauses permitted per BooleanQuery.
  • When to tune? When there is a large number of entries in a list in a search query.
  • Values: Default: 1024. Recommended: number of entries in the largest list * 2.5.
  • Effect: Search may be slow if the value is increased.
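
For instance, a system running many correlation queries might raise both the concurrency and the queue limits above their defaults of 10 and 100 (illustrative values only):

{
"analyzer": {"allowable_concurrent_analysis": 20, "max_num_queuable_analysis": 200}
}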

Normalizer

It checks the normalization_policy attached to the log and uses it to extract key-value pairs from the raw log in order to normalize it. It also adds log_ts to the key-value pairs and checks whether the log needs to be enriched. If enrichment is required, it forwards the log to the enrichment service; if not, it sends the log to the store handler. Normalizer can also forward the normalized log to a remote Logpoint if it is set as a Raw Syslog Forwarder.

If Normalizer performs poorly, logs build up in a buffer in the log collection pipeline, which can ultimately halt log collection.

Example

{
"normalizer": {"no_of_services": 8}
}
no_of_services
  • What is it? Number of normalizer services to run.
  • When to tune? When there is a queue in the Normalization Layer (port 5502) and each individual normalizer already has a large enough throughput (500 MPS).
  • Values: Default: number of cores / 4. Min: 1. Max: number of cores / 2 (Recommended).
  • Effect: CPU and memory usage increase.
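
For example, on a hypothetical 32-core Logpoint the default no_of_services would be 32 / 4 = 8 and the recommended maximum 32 / 2 = 16:

{
"normalizer": {"no_of_services": 16}
}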

File Keeper

It is a repo-dependent service, which means each repo has its own file keeper. It receives the normalized log from the store handler, stores the log in the repo (according to the routing_policy attached to the log), adds an _offset (location of the stored raw logs) to the key-value pair, and forwards it to the index searcher.

If File Keeper performs poorly, it can cause a buffer build-up in the previous layer and eventually halt log collection.

Example

{
"file_keeper_default": {"no_of_threads": 16}
}
no_of_threads
  • What is it? Number of File Keeper threads to run for storing logs.
  • When to tune? When the indexing MPS is high and there is a queue in File Keeper.
  • Values: Depends on system load. Default: Max(1, number of cores / 8). Suggested tuning: Min(number of cores / 2, 20).
  • Effect: CPU usage increases.

heap_size
  • What is it? Amount of heap memory allocated to File Keeper.
  • When to tune? G1 Garbage Collector (G1GC): if Full GC is running continuously. Shenandoah GC: if memory consumption by the service equals the allocated heap size or if Out Of Memory (OOM) occurs frequently for the service.
  • Values: Depends on system load, EPS, and number of searches. Min. 512 MB; Max. 32 GB (Recommended).
  • Effect: Overall system memory usage increases or decreases.

max_free_heap_ratio
  • What is it? Percentage of the available heap that the service can hold after running GC.
  • When to tune? Only for G1GC; only tuned if heap_size is tuned.
  • Values: 60 (Recommended).
  • Effect: Excess memory is released back to the OS.

storage.base.path
  • What is it? Main repo path; determines where the primary and the buffered logs are stored.
  • When to tune? When File Keeper buffers are consistently high at the default location /opt/makalu/storage.
  • Values: Default: /opt/makalu/storage. The new path should be writable by the log inspect user.
  • Effect: Changes the location of the repo where logs are stored.

max_normal_time_period
  • What is it? Defines the maximum interval (in hours) up to which the difference between log_ts and the current time is considered "real-time". If the difference between the current time and log_ts is greater than this value, the logs are treated as delayed logs and sent to the OldLogsKeeper DB.
  • When to tune? When you want to redefine real-time logs and delayed logs for File Keeper, or when you need to process old logs as real-time logs.
  • Values: Default: 15.
  • Effect: If this value is set low, most of the logs are treated as delayed logs; if it is set high, most of the logs are treated as real-time logs.

no_of_db_storage_threads
  • What is it? Number of File Keeper threads to run for handling delayed logs.
  • When to tune? When there is a large buffer in OldLogsKeeper.
  • Values: Default: 2. Min: 1. Max: 5 (Recommended).
  • Effect: CPU and Disk Read/Write increase or decrease.

max_open_files
  • What is it? Determines the number of files that File Keeper can open at one time. If the number of open files exceeds this value, the oldest entry is added to the CompressionQueue.
  • When to tune? When you want to change the number of files that File Keeper can open at one time. If the value is set too high, it can cause the "max open files reached" error; if it is set too low, it can cause the "Compression Queue Size Limit Reached" error.
  • Values: Default: 75.
  • Effect: The number of open files increases.
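
For example, to treat logs up to a day old as real-time and give delayed logs more storage threads, the repo-specific File Keeper entry might look like this (illustrative values within the documented ranges):

{
"file_keeper_default": {"max_normal_time_period": 24, "no_of_db_storage_threads": 3}
}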

Syslog Collector

It collects logs from external devices or other Logpoint machines, parses the collected logs, and forwards them to either the normfront or the store handler. If Syslog Collector is performing poorly, syslog collection is impacted and can eventually stop entirely.

Example

{
"syslog_collector": {"no_of_threads": 16}
}
ssl_ciphers
  • What is it? The list of supported cipher suites accepted by the Syslog Collector for TLS communication.
  • When to tune? When you need to add, remove, or change the supported TLS ciphers.
  • Values: Default: TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384, TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256, TLS_ECDHE_RSA_WITH_CHACHA20_POLY1305_SHA256, TLS_ECDHE_ECDSA_WITH_AES_128_GCM_SHA256, TLS_ECDHE_ECDSA_WITH_AES_256_GCM_SHA384, TLS_ECDHE_ECDSA_WITH_AES_256_CCM, TLS_ECDHE_ECDSA_WITH_AES_128_CCM, TLS_ECDHE_ECDSA_WITH_CHACHA20_POLY1305_SHA256.
  • Effect: Changes the cipher suites used for TLS communication.

no_of_threads
  • What is it? Number of processing threads for the UDP server.
  • When to tune? When EPS is high but the UDP server is unable to process it, and only if enough CPU resources are available.
  • Values: Default: Min(number of cores, 8).
  • Effect: CPU usage increases.

queue_size
  • What is it? Queue size for incoming UDP logs.
  • When to tune? If EPS is high and the "UDP Task pool full" warning is seen in the syslog collector service.
  • Values: Default: 1000000. Max: 5000000 (Recommended).
  • Effect: Memory consumption increases.
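
For example, a collector receiving a high UDP EPS on a machine with spare CPU might be tuned like this (illustrative values within the documented ranges):

{
"syslog_collector": {"no_of_threads": 8, "queue_size": 2000000}
}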

Enrich_db_populator

It takes information from enrichment sources such as CSV, ThreatIntelligence, or LDAP and populates the enrichment database. The enrichment database holds additional data that can be added to logs. If enrich_db_populator performs poorly, logs may not be enriched.

Example

{
"enrich_db_populator": {"max_total_enrich_db_size": 5}
}
max_total_enrich_db_size
  • What is it? The size limit of enrichment.db.
  • When to tune? When the amount of enrichment data fills the default database size.
  • Values: Default: 4 GB.
  • Effect: Disk usage increases.

Enrichment Service

It enriches logs by using the data from the enrichment database. It adds new information from the enrichment database to the log and forwards the log to the store handler. It is also called the enrichment layer and is an optional layer.

If it is not performing well, it can impact the whole log collection pipeline and can even halt log collection.

Example

{
"enrichment_service": {"heap_size": 1024}
}
heap_size
  • What is it? Amount of heap memory allocated to Enrichment Service.
  • When to tune? G1 Garbage Collector (G1GC): if Full GC is running continuously. Shenandoah GC: if memory consumption by the service equals the allocated heap size or if Out Of Memory (OOM) occurs frequently for the service.
  • Values: Depends on system load, EPS, and number of searches. Min: 512 MB; Max: 32 GB (Recommended).
  • Effect: Overall system memory usage increases or decreases.

number_of_primary_threads
  • What is it? Number of threads used for the enrichment service.
  • When to tune? If EPS is very high and the number of enrichment requests is very high, and only if a queue exists on port 5540.
  • Values: Default: 4. Min: 2. Max: number of cores (Recommended).
  • Effect: CPU usage increases.
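
For instance, on a hypothetical 8-core system under heavy enrichment load, number_of_primary_threads could be raised toward the recommended ceiling of 8 (values are illustrative):

{
"enrichment_service": {"heap_size": 2048, "number_of_primary_threads": 8}
}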

Batch Processor

It processes the files collected and forwarded by collectors/fetchers. Some collectors/fetchers send files to the batch processor for processing, and it parses the logs inside each file by applying the parsing rule attached to the log. If it is performing poorly, the files a collector or fetcher retrieves are not processed.

Example

{
"batch_processor": {"max_workers": 8}
}
max_workers
  • What is it? Number of worker processes for the batch processor.
  • When to tune? When it is lagging behind during log collection.
  • Values: Default: 4. Min: 1. Max: number of CPUs / 4 (Recommended).
  • Effect: CPU and memory usage increase.

Autotuner

Autotuner is a service responsible for automatically tuning service parameters depending on the requirements of the system. It uses a defined threshold to determine whether a parameter needs tuning.

Example

{
"autotuner": {"min_allowed_throughput": 60}
}
heap_increment_pct
  • What is it? Percentage by which the heap is increased during one up-tuning.
  • When to tune? When a service is constantly facing a lack of heap memory.
  • Values: Default: 20%, with a minimum of 512 MB (for systems with less than 64 GB RAM) or 1024 MB (for systems with more than 64 GB RAM).
  • Effect: Increases the heap memory by the applied percentage.

min_allowed_throughput
  • What is it? Minimum threshold for service throughput, below which Autotuner tunes the parameter.
  • When to tune? When a service is not performing efficiently.
  • Values: Default: 50%.
  • Effect: Increases the heap if service efficiency falls below the set threshold.

restart_action
  • What is it? Whether to restart the service when its throughput falls below min_allowed_throughput.
  • When to tune? To allow Autotuner to restart the service.
  • Values: Enable (default) or Disable.
  • Effect: Allows Autotuner to restart the service.

restart_count_threshold
  • What is it? Minimum number of times a service is restarted before its heap size is increased.
  • When to tune? When you want to increase the heap size immediately after a service restarts once, or only after it restarts multiple times.
  • Values: Default: 5.
  • Effect: Increases the heap size after a service has restarted the set number of times.

run_interval
  • What is it? How frequently the Autotuner service loop runs.
  • When to tune? To change how often Autotuner checks service throughput.
  • Values: Default: 300 s.
  • Effect: Autotuner runs at the specified interval.
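
For example, to make Autotuner act earlier, the throughput threshold and the restart count might be tuned together (illustrative values; min_allowed_throughput of 60 matches the example above):

{
"autotuner": {"min_allowed_throughput": 60, "restart_count_threshold": 3}
}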

 

 
