Load balancing refers to the distribution of workloads across multiple servers within a network infrastructure, including both backend and front-end servers. Using a load balancer (LB) in a Wazuh server cluster enhances the cluster's inherent load distribution capabilities by optimizing the use of processing resources (Wazuh server cluster nodes) and avoiding node overload.
A Wazuh server cluster consists of interconnected Wazuh server nodes that collaborate to improve the availability, scalability, and reliability of security event management across distributed systems. By incorporating a load balancer, we can efficiently manage a larger number of Wazuh agents by distributing their incoming traffic, which includes security event data and log information, across multiple Wazuh server nodes.
In this blog post, we demonstrate how to configure an NGINX network load balancer using the hash algorithm. This algorithm ensures that requests from the same client IP address are consistently routed to the same Wazuh server node. This method optimizes the handling of repeated tasks by maintaining session persistence and distributing the load more evenly.
Infrastructure
We use the following infrastructure to distribute Wazuh agent traffic, including security data and log information, across a Wazuh server cluster using a load balancer:
- Wazuh central components made up of a Wazuh indexer 4.7.4, a Wazuh server 4.7.4 cluster with three nodes (master and two (2) workers), and a Wazuh dashboard 4.7.4. Each component is installed on a separate Amazon Linux 2 server. Follow the step-by-step installation guides for Wazuh indexer, Wazuh server cluster, and Wazuh dashboard.
- An Amazon Linux 2 server to host the NGINX load balancer. We use this endpoint to test the load balancing of the Wazuh server cluster.
- Three (3) Amazon Linux 2 endpoints with Wazuh agent 4.7.4 installed. Refer to the Wazuh agent installation guide for installing the Wazuh agent. We name the Wazuh agents Amazon1, Amazon2, and Amazon3 respectively when enrolling them to the Wazuh server master node.
Configuration
Typically, Wazuh agents are configured to report security event data directly to a specific Wazuh server, usually the master server, using its IP address or Fully Qualified Domain Name (FQDN). In this section, we configure the load balancer to receive all Wazuh agents’ events and handle their distribution across the available Wazuh server nodes. All agents are initially enrolled to the master Wazuh server node to streamline the setup and ensure proper enrollment.
Wazuh server
In this setup, the Wazuh server nodes and Wazuh agents operate independently (blind to the existence of each other), with the load balancer serving as an intermediary.
To ensure the Wazuh agents enroll correctly with the Wazuh server master node, we set use_source_ip in the Wazuh server master /var/ossec/etc/ossec.conf configuration file to no. This allows the master node to see the Wazuh agent’s IP address instead of the load balancer address.
1. Run the Wazuh /var/ossec/bin/cluster_control tool on any Wazuh server node to confirm which Wazuh server node is the master:
# /var/ossec/bin/cluster_control -l
Output:
NAME     TYPE    VERSION  ADDRESS
wazuh-1  master  4.7.4    172.20.10.12
wazuh-3  worker  4.7.4    172.20.10.11
wazuh-2  worker  4.7.4    172.20.10.13
2. Ensure <use_source_ip> is set to no in the /var/ossec/etc/ossec.conf file of the Wazuh server master node:
<auth>
  <disabled>no</disabled>
  <port>1515</port>
  <use_source_ip>no</use_source_ip>
3. Restart the Wazuh manager to implement the changes:
# systemctl restart wazuh-manager
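After the restart, you can optionally re-run the cluster control tool to confirm that all nodes are connected and the cluster is healthy:

# /var/ossec/bin/cluster_control -l

All three nodes should be listed, as in step 1.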
NGINX LB
Install the NGINX package on the Amazon Linux 2 endpoint and configure two upstream servers, master and mycluster, with the hash distribution algorithm. Each serves a distinct purpose:
- master facilitates Wazuh agent enrollment via the authd process on port 1515, utilizing the Wazuh server master node.
- mycluster redirects Wazuh agent traffic to the specified cluster nodes using port 1514, which, in this case, are both the Wazuh server master node and the Wazuh server worker nodes.
For further details on Wazuh enrollment and connection services, refer to the Wazuh documentation.
1. Run the following commands to install the NGINX package on the Amazon Linux 2 endpoint:
# yum update -y
# amazon-linux-extras list | grep nginx
# amazon-linux-extras enable nginx1
# yum clean metadata
# yum -y install nginx
# systemctl enable nginx
# systemctl start nginx
2. Run the command below to confirm that the installed version of NGINX supports stream as a dynamic module (--with-stream=dynamic). This module is essential for handling TCP/UDP traffic in NGINX:
# nginx -V 2>&1 | grep dynamic
Output:
configure arguments: --prefix=/usr/share/nginx --sbin-path=/usr/sbin/nginx --modules-path=/usr/lib64/nginx/modules --conf-path=/etc/nginx/nginx.conf --error-log-path=/var/log/nginx/error.log --http-log-path=/var/log/nginx/access.log --http-client-body-temp-path=/var/lib/nginx/tmp/client_body --http-proxy-temp-path=/var/lib/nginx/tmp/proxy --http-fastcgi-temp-path=/var/lib/nginx/tmp/fastcgi --http-uwsgi-temp-path=/var/lib/nginx/tmp/uwsgi --http-scgi-temp-path=/var/lib/nginx/tmp/scgi --pid-path=/run/nginx.pid --lock-path=/run/lock/subsys/nginx --user=nginx --group=nginx --with-compat --with-debug --with-file-aio --with-google_perftools_module --with-http_addition_module --with-http_auth_request_module --with-http_dav_module --with-http_degradation_module --with-http_flv_module --with-http_geoip_module=dynamic --with-stream_geoip_module=dynamic --with-http_gunzip_module --with-http_gzip_static_module --with-http_image_filter_module=dynamic --with-http_mp4_module --with-http_perl_module=dynamic --with-http_random_index_module --with-http_realip_module --with-http_secure_link_module --with-http_slice_module --with-http_ssl_module --with-http_stub_status_module --with-http_sub_module --with-http_v2_module --with-http_xslt_module=dynamic --with-mail=dynamic --with-mail_ssl_module --with-pcre --with-pcre-jit --with-stream=dynamic --with-stream_realip_module --with-stream_ssl_module --with-stream_ssl_preread_module --with-threads --with-cc-opt='-O2 -g -pipe -Wall -Wp,-D_FORTIFY_SOURCE=2 -fexceptions -fstack-protector-strong --param=ssp-buffer-size=4 -grecord-gcc-switches -specs=/usr/lib/rpm/redhat/redhat-hardened-cc1 -m64 -mtune=generic' --with-ld-opt='-Wl,-z,relro -specs=/usr/lib/rpm/redhat/redhat-hardened-ld -Wl,-E'
3. Create the load balancer configuration file wazuh_load_balancer.conf within the /etc/nginx/conf.d/ directory:
# touch /etc/nginx/conf.d/wazuh_load_balancer.conf
4. Add the following hash algorithm configuration with the consistent parameter to the NGINX /etc/nginx/conf.d/wazuh_load_balancer.conf file:
stream {
    upstream master {
        server <MASTER_NODE_IP_ADDRESS>:1515;
    }
    upstream mycluster {
        hash $remote_addr consistent;
        server <MASTER_NODE_IP_ADDRESS>:1514;
        server <WORKER_NODE_IP_ADDRESS>:1514;
        server <WORKER_NODE_IP_ADDRESS>:1514;
    }
    server {
        listen 1515;
        proxy_pass master;
    }
    server {
        listen 1514;
        proxy_pass mycluster;
    }
}
Where:
- master is an upstream server listening for agent enrollment connections on port 1515.
- mycluster is another upstream server listening on port 1514 and distributing security event data and heartbeat messages from the Wazuh agents between the master and worker nodes.
Replace:
- The <MASTER_NODE_IP_ADDRESS> variable with the IP address of the Wazuh server master node in your cluster.
- The <WORKER_NODE_IP_ADDRESS> variables with the IP addresses of the Wazuh server worker nodes in your cluster.
Note: This configuration is scalable by adding more Wazuh server nodes to increase failover points.
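For example, to add another Wazuh server node to the pool, append one more server line to the mycluster upstream block. A minimal sketch, where <NEW_WORKER_NODE_IP_ADDRESS> is a placeholder for the IP address of a worker node newly joined to the Wazuh cluster:

upstream mycluster {
    hash $remote_addr consistent;
    server <MASTER_NODE_IP_ADDRESS>:1514;
    server <WORKER_NODE_IP_ADDRESS>:1514;
    server <WORKER_NODE_IP_ADDRESS>:1514;
    # Placeholder: an additional worker node joined to the Wazuh cluster
    server <NEW_WORKER_NODE_IP_ADDRESS>:1514;
}

With consistent hashing, adding a node this way remaps only a fraction of the agents to the new node.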
5. Test the configuration to check for correct syntax:
# nginx -t
If the output of the test is as seen below, follow step 6 to correct the error. If not, continue to step 7.
Output:
nginx: [emerg] "stream" directive is not allowed here in /etc/nginx/conf.d/wazuh_load_balancer.conf:1
6. Install the nginx-mod-stream.x86_64 module. It is necessary for handling stream traffic:
# yum install nginx-mod-stream.x86_64 -y
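You can confirm that the module is now available by searching the module configuration snippets that NGINX loads at startup; the exact snippet file name may vary with the package version:

# grep -r stream /usr/share/nginx/modules/

The output should reference ngx_stream_module.so in a load_module directive.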
7. Modify the /etc/nginx/nginx.conf file and move the include /etc/nginx/conf.d/*.conf; line from within the http { } directive to above it, as seen below:
# For more information on configuration, see:
#   * Official English Documentation: http://nginx.org/en/docs/
#   * Official Russian Documentation: http://nginx.org/ru/docs/

user nginx;
worker_processes auto;
error_log /var/log/nginx/error.log;
pid /run/nginx.pid;

# Load dynamic modules. See /usr/share/doc/nginx/README.dynamic.
include /usr/share/nginx/modules/*.conf;

# Load modular configuration files from the /etc/nginx/conf.d directory.
# See http://nginx.org/en/docs/ngx_core_module.html#include
# for more information.
include /etc/nginx/conf.d/*.conf;

events {
    worker_connections 1024;
}

http {
    log_format  main  '$remote_addr - $remote_user [$time_local] "$request" '
                      '$status $body_bytes_sent "$http_referer" '
                      '"$http_user_agent" "$http_x_forwarded_for"';

    access_log  /var/log/nginx/access.log  main;

    sendfile            on;
    tcp_nopush          on;
    tcp_nodelay         on;
    keepalive_timeout   65;
    types_hash_max_size 4096;

    include             /etc/nginx/mime.types;
    default_type        application/octet-stream;

    server {
        listen       80;
        listen       [::]:80;
        server_name  _;
        root         /usr/share/nginx/html;

        # Load configuration files for the default server block.
        include /etc/nginx/default.d/*.conf;

        error_page 404 /404.html;
        location = /404.html {
        }

        error_page 500 502 503 504 /50x.html;
        location = /50x.html {
        }
    }

# Settings for a TLS enabled server.
#
#    server {
#        listen       443 ssl http2;
#        listen       [::]:443 ssl http2;
#        server_name  _;
#        root         /usr/share/nginx/html;
#
#        ssl_certificate "/etc/pki/nginx/server.crt";
#        ssl_certificate_key "/etc/pki/nginx/private/server.key";
#        ssl_session_cache shared:SSL:1m;
#        ssl_session_timeout  10m;
#        ssl_ciphers PROFILE=SYSTEM;
#        ssl_prefer_server_ciphers on;
#
#        # Load configuration files for the default server block.
#        include /etc/nginx/default.d/*.conf;
#
#        error_page 404 /404.html;
#        location = /40x.html {
#        }
#
#        error_page 500 502 503 504 /50x.html;
#        location = /50x.html {
#        }
#    }
}
8. Test the configuration again:
# nginx -t
Output:
nginx: the configuration file /etc/nginx/nginx.conf syntax is ok nginx: configuration file /etc/nginx/nginx.conf test is successful
9. Restart the NGINX service to implement the changes:
# systemctl restart nginx
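Optionally, verify that NGINX is now listening on the Wazuh enrollment and connection ports, for example with the ss utility:

# ss -tlnp | grep -E ':(1514|1515)'

Both ports should appear in a LISTEN state, bound to nginx processes.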
Similar to this NGINX configuration example, an LB for a Wazuh cluster could be set up using several other technologies.
Wazuh agent
In a typical setup, Wazuh agents report to a designated Wazuh server. However, we are shifting to a load balancer model for improved performance and reliability. To implement this change, we need to add the load balancer’s IP address to each Wazuh agent’s /var/ossec/etc/ossec.conf file. Also, for consistent data transmission, we recommend using the TCP protocol over UDP due to its reliability and error checking.
1. Edit each Wazuh agent’s /var/ossec/etc/ossec.conf file and add the load balancer’s IP address within the <address> tag. Additionally, confirm <protocol> is set to tcp:
<ossec_config>
  <client>
    <server>
      <address><NGINX_LOAD_BALANCER_IP></address>
      <port>1514</port>
      <protocol>tcp</protocol>
2. Restart the Wazuh agent to implement the changes:
# systemctl restart wazuh-agent
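To confirm that the agent now connects through the load balancer, check the latest connection message in the agent log; the reported address should be the load balancer's IP:

# grep "Connected to the server" /var/ossec/logs/ossec.log | tail -1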
Utilizing a configuration management tool, such as Ansible, helps streamline the setup process for environments with numerous Wazuh agents. For detailed instructions on configuring Wazuh agents on multiple endpoints using Ansible, refer to this use case: standalone Wazuh agents deployment. In step 3 of the configuration, replace <WAZUH_SERVER_IP> with the IP address of the load balancer.
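For a small number of endpoints, a simple shell loop over SSH can apply the same change without a configuration management tool. This is a minimal sketch, assuming root SSH access, hypothetical agent hostnames amazon1 through amazon3, and a single <address> entry in each agent's ossec.conf:

# Hypothetical hostnames; replace <NGINX_LOAD_BALANCER_IP> with the LB IP address.
LB_IP="<NGINX_LOAD_BALANCER_IP>"
for host in amazon1 amazon2 amazon3; do
  ssh root@"$host" "sed -i 's|<address>.*</address>|<address>${LB_IP}</address>|' /var/ossec/etc/ossec.conf && systemctl restart wazuh-agent"
done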
Testing the configuration
In our current setup, utilizing the hash distribution algorithm in the NGINX configuration, three (3) Wazuh agents, Amazon1, Amazon2, and Amazon3, are enrolled and reporting their event data to the Wazuh servers through the LB. We conduct tests to see how the hash distribution algorithm distributes Wazuh agent connections and maintains persistence to reduce disruptions. We stop the Wazuh manager service on one of the Wazuh server nodes and observe the Wazuh agent connections on one of the Wazuh agents. We also use the API Console tool on the Wazuh dashboard to see how each Wazuh agent is assigned to an available Wazuh server node. You can find more information on the Wazuh server API and its requests in the reference section.
Wazuh dashboard
To determine how the Wazuh agent connections are assigned to the Wazuh server nodes, we query the Wazuh server API from the API console on the Wazuh dashboard.
1. Navigate to the API Console by clicking the dropdown beside the Wazuh logo, then clicking Tools > API Console.
2. List your Wazuh agents’ information from the API Console with the following request and click the run icon:
GET /agents?pretty&select=node_name
Output:
{ "data": { "affected_items": [ { "node_name": "wazuh-1", "id": "000" }, { "node_name": "wazuh-3", "id": "001" }, { "node_name": "wazuh-2", "id": "002" }, { "node_name": "wazuh-3", "id": "003" } ], "total_affected_items": 4, "total_failed_items": 0, "failed_items": [] }, "message": "All selected agents information was returned", "error": 0 }
The output from the API call indicates there are four items.
Where:
- ID 001 is Wazuh agent Amazon1.
- ID 002 is Wazuh agent Amazon2.
- ID 003 is Wazuh agent Amazon3.
- wazuh-1 is the Wazuh master server node, given ID 000.
- wazuh-2 and wazuh-3 are the Wazuh worker server nodes.
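The same information can also be retrieved outside the dashboard by querying the Wazuh server API directly with curl. This is a sketch, assuming the default API port 55000 on the master node and valid API credentials:

# Authenticate against the Wazuh server API; raw=true returns the token as plain text
TOKEN=$(curl -s -k -u <API_USER>:<API_PASSWORD> -X POST "https://<MASTER_NODE_IP_ADDRESS>:55000/security/user/authenticate?raw=true")
# List each agent's assigned node, as in the API Console request above
curl -s -k -X GET "https://<MASTER_NODE_IP_ADDRESS>:55000/agents?pretty=true&select=node_name" -H "Authorization: Bearer $TOKEN"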
We observe that the consistent hash algorithm maps one Wazuh agent’s connections to wazuh-2 and the others to wazuh-3. These connections include security event data and information logs sent from the Wazuh agents. It’s worth noting that the consistent hashing algorithm aims to evenly distribute requests among the available Wazuh servers based on the input data (in this case, the IP addresses of the agents). This can also be seen on the Agents page of the Wazuh dashboard.
- It also ensures that each Wazuh agent’s connections remain consistently mapped to its assigned Wazuh server.
- This means that even if servers are added or removed from the cluster, only a small number of Wazuh agents need to be reassigned to different servers, minimizing the disruption to the distribution of requests.
To show this, we stop one of the servers and bring it back up to see how the load balancer effectively redistributes the agents and ensures consistency. We stop the wazuh-2 Wazuh server worker node, which has the Amazon2 agent with ID 002 reporting to it.
Wazuh server
On the wazuh-2 worker node, we stop and start the Wazuh manager service to see how the Wazuh agents are distributed among the other Wazuh server nodes.
1. Stop the Wazuh manager service on the wazuh-2 worker node:
# systemctl stop wazuh-manager
Wazuh agent
On the Amazon2 Wazuh agent, whose connections are mapped to the wazuh-2 Wazuh server worker node, we investigate how the load balancer redistributes its connections and whether the session persists on the new Wazuh server node it is mapped to.
Check the /var/ossec/logs/ossec.log file on the Amazon2 Wazuh agent:
# tail -f /var/ossec/logs/ossec.log
Output:
2024/04/10 09:20:50 wazuh-agentd: WARNING: Server unavailable. Setting lock.
2024/04/10 09:20:50 wazuh-agentd: INFO: Closing connection to server ([172.20.10.8]:1514/tcp).
2024/04/10 09:20:50 wazuh-agentd: INFO: Trying to connect to server ([172.20.10.8]:1514/tcp).
2024/04/10 09:20:53 wazuh-agentd: INFO: (4102): Connected to the server ([172.20.10.8]:1514/tcp).
2024/04/10 09:20:53 wazuh-agentd: INFO: Server responded. Releasing lock.
Observe that the Wazuh agent almost immediately reconnects to the cluster through the load balancer, this time to the wazuh-1 Wazuh master server node. This is seen on the Wazuh dashboard, as described in the next section.
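The connection status of the agents can also be listed from any Wazuh server node with the agent_control tool, assuming default installation paths:

# /var/ossec/bin/agent_control -l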
Wazuh dashboard
1. On the Wazuh dashboard, use the API Console once more to confirm which nodes the Wazuh agents are now reporting to:
GET /agents?pretty&select=node_name
Output:
{ "data": { "affected_items": [ { "node_name": "wazuh-1", "id": "000" }, { "node_name": "wazuh-3", "id": "001" }, { "node_name": "wazuh-1", "id": "002" }, { "node_name": "wazuh-3", "id": "003" } ], "total_affected_items": 4, "total_failed_items": 0, "failed_items": [] }, "message": "All selected agents information was returned", "error": 0 }
We observe that the Amazon2 Wazuh agent with ID 002 is now reporting to the wazuh-1 Wazuh server master node, while the other Wazuh agents remain consistent, reporting to the wazuh-3 Wazuh server worker node.
You can also view these changes on the Agents page on the Wazuh dashboard.
Wazuh server
We start the Wazuh manager service on the wazuh-2 worker node to ensure all three (3) Wazuh server nodes are available. This is to see how the distribution algorithm of the load balancer maintains the persistence of Wazuh agent connections.
Start the Wazuh manager service on the wazuh-2 worker node:
# systemctl start wazuh-manager
Wazuh dashboard
After the Wazuh manager service is restarted on the wazuh-2 Wazuh worker node, the Wazuh agents remain consistent, reporting to the same nodes: Amazon1 and Amazon3 continue reporting to the wazuh-3 Wazuh worker server node, while Amazon2 continues reporting to the wazuh-1 Wazuh master server node. This is seen on the Agents page on the Wazuh dashboard.
Conclusion
In this post, we cover the essential steps for adding a load balancer to a Wazuh server cluster using NGINX. We demonstrate how to configure the NGINX network load balancer to ensure efficient traffic distribution and enhanced resource utilization between the Wazuh agents and Wazuh server nodes. The consistent hashing algorithm used in this blog post maintains stable Wazuh agent connections, even when Wazuh server nodes are added or removed, minimizing disruptions and ensuring continuous monitoring.
Wazuh is an open source SIEM and XDR platform that provides comprehensive visibility and analysis of security events. If you have any questions, check out our documentation to learn more about Wazuh or join our community where our team and contributors will help you.
References