03、ElasticSearch 实战:初探ES的主要配置文件(以6.6.0版本为例)

1 elasticsearch.yml(ES服务配置)

文件位置: ${ES_HOME}/config/elasticsearch.yml

# ======================== Elasticsearch Configuration =========================
#
# NOTE: Elasticsearch comes with reasonable defaults for most settings.
#       Before you set out to tweak and tune the configuration, make sure you
#       understand what are you trying to accomplish and the consequences.
#
# The primary way of configuring a node is via this file. This template lists
# the most important settings you may want to configure for a production cluster.
#
# Please consult the documentation for further information on configuration options:
# https://www.elastic.co/guide/en/elasticsearch/reference/index.html

1.1 Cluster集群配置

# ---------------------------------- Cluster -----------------------------------
#
# Use a descriptive name for your cluster:
# 集群名称, 具有相同名称的节点才能组成一个逻辑集群, 默认是"elasticsearch", 建议修改为与项目相关的名称.
cluster.name: heal_es

1.2 Node节点配置

# ------------------------------------ Node ------------------------------------
#
# Use a descriptive name for the node:
# 本节点的名称, 同一集群中各个node的名称不允许重复. 不配置则系统默认分配.
# node.name: node-1
#
# Add custom attributes to the node:
# 指定节点的部落属性 --- 一个比集群更大的范围, 即定义一些通用属性, 用于集群分配碎片时的过滤. 
#node.attr.rack: r1


Elasticsearch 6.6.0版本中取消了下述配置:

# 指定本节点是否有资格被选举为主节点, 默认是true. 
# ES集群中第一台机器被默认为master, 若此节点挂掉, 将重新选举master. 
# node.master=true
# 
# 指定本节点在集群中是否存储数据, 默认为true. 
# node.data=true
#
# 配置文件中给出了三种配置高性能集群拓扑结构的模式, 如下: 
# 1. 如果想让节点不被选举为主节点, 只用来存储数据, 可作为负载器:  
# node.master: false 
# node.data: true 
# node.ingest: true  默认为true
#
# 2. 如果想让节点成为主节点, 且不存储任何数据, 并保有空闲资源,可作为协调器: 
# node.master: true 
# node.data: false
# node.ingest: true
#
# 3. 如果想让节点既不作主节点, 又不作数据节点, 那么可将其作为搜索器, 从节点中获取数据, 生成搜索结果等:  
# node.master: false 
# node.data: false 
# node.ingest: true
#
# 4. 仅作为协调器: 
# node.master: false 
# node.data: false
# node.ingest: false

1.3 Paths路径配置

# ----------------------------------- Paths ------------------------------------
#
# Path to directory where to store the data (separate multiple locations by comma):
# 如果不配置下述两项, ES将在其主目录下创建. 建议将程序与数据分离配置, 方便系统迁移与升级. 
# 存放索引数据的目录
path.data: /data/elk-6.6.0/data
#
# Path to log files: 存放日志信息的目录
path.logs: /data/elk-6.6.0/logs

1.4 Memory内存配置

# ----------------------------------- Memory -----------------------------------
#
# Lock the memory on startup:
# 启动时是否锁定ES运行所需的内存, 默认为false. 
# true: 锁定---防止ES使用Swap交换空间, 效率较高. 此时要确保当前用户具有memlock的权限. 
# false: 将使用Swap交换空间.
# bootstrap.memory_lock: false 
bootstrap.system_call_filter: false
# 
# 确保ES_HEAP_SIZE参数的值设置为系统可用内存的一半左右, 不要超过, 因为Lucene底层索引和检索还需要一定的内存. 
# Make sure that the heap size is set to about half the memory available
# on the system and that the owner of the process is allowed to use this
# limit.
# 
# 当系统使用内存交换, ES的性能将变得很差
# Elasticsearch performs poorly when the system is swapping the memory.

1.5 Network网络配置

# ---------------------------------- Network -----------------------------------
#
# Set the bind address to a specific IP (IPv4 or IPv6):
# 对外网关的IP, 默认为localhost, 此时只能通过localhost或127.0.0.1访问. 
# 设置为0.0.0.0, 即可被外部所有网络访问, 如此但是不够安全, 可以指定某一个网段:
network.host: 0.0.0.0
#network.host: 192.168.0.1
#
# Set a custom port for HTTP:
# 对外提供的HTTP访问端口, 默认为9200. 为提高安全性, 建议设置为其他值.
# 可以指定一个值或一个区间, 如果是区间就会采取区间内第一个可用的端口.
#http.port: 9200
# 
# For more information, consult the network module documentation.


Elasticsearch 6.6.0版本取消了transport.tcp.port的设置:

# 
# 集群节点之间通信的TCP传输端口. 下述Discovery部分的设置、ES的Java API 也通过此端口传输数据. 默认为9300. 
# 可以指定一个值或一个区间, 如果是区间就会采取区间内第一个可用的端口.
# transport.tcp.port: 9300

1、 旧版本的Java API中客户端TransportClient 使用的是9300端口, 它执行的是序列化的Java请求, 性能较低, 在7.x版本中将过期, 8.x版本中将移除;
2、 新版本推荐使用Java High Level REST Client客户端, 也就是RestHighLevelClient, 它执行HTTP请求, 使用的端口是http.port, 默认就是9200.

1.6 Discovery节点发现配置

# --------------------------------- Discovery ----------------------------------
# 
# 启动新节点时, 通过IP列表进行节点发现, 组建集群
# Pass an initial list of hosts to perform discovery when new node is started:
# The default list of hosts is ["127.0.0.1", "[::1]"]
# '127.0.0.1'是ipv4的回环地址, '[::1]'是ipv6的回环地址
# 
# 1.x中默认使用多播(multicast)协议: 自动发现同一网段中的ES节点并组建集群; 
# 2.x中默认使用单播(unicast)协议: 要组建集群, 就需要在这里指定集群的节点信息 -- 安全高效, 但不够灵活. 
# 
# 默认已经关闭了自动发现节点的多播(组播)协议功能: 
# discovery.zen.ping.multicast.enabled: false
# 
# 指定单播模式的IP(或hostname)列表: 
# 也可配置为: ["ip:port", "ip:port"]. 若port未设置, 将使用transport.tcp.port的值. 
#discovery.zen.ping.unicast.hosts: ["host1", "host2"]
# 
# Prevent the "split brain" by configuring the majority of nodes (total number of master-eligible nodes / 2 + 1):
# 配置此参数以防止集群出现"脑裂现象": 集群中出现2个及以上master节点, 将导致数据不一致. 
# 官方推荐: 选举master的最少节点数 = (具有master资格的节点数 / 2) + 1
#discovery.zen.minimum_master_nodes: 
# 
# 设置集群中自动发现其他节点的连接超时时长, 默认是3秒.
# 在网络不佳时增加这个值, 会增加节点等待响应的时间, 可以减少误判.
# discover.zen.ping.timeout: 3s
# 
# For more information, consult the zen discovery module documentation.

1.7 Gateway网关配置

# ---------------------------------- Gateway -----------------------------------
# 
# Block initial recovery after a full cluster restart until N nodes are started:
# 配置集群中的N个节点启动后, 才允许进行数据恢复处理. 默认是1.
#gateway.recover_after_nodes: 3
#
# For more information, consult the gateway module documentation.

1.8 Various其他配置

# ---------------------------------- Various -----------------------------------
# 
# 在一台服务器上禁止启动多个es服务
# Disable starting multiple nodes on a single system:
#
# node.max_local_storage_nodes: 1
#
# 设置是否可以通过正则或者_all删除或者关闭索引库,默认true表示必须需要显式指定索引库名称
# 
# Require explicit names when deleting indices:
#action.destructive_requires_name: true
# 是否压缩TCP传输的数据, 默认是false: 
# transport.tcp.compress: false

# 是否使用HTTP协议对外提供服务, 默认是true:
# http.cors.enabled: true

# http传输内容的最大容量, 默认是100MB:
# http.max_content_length: 100mb

  • 注意:

在2.x版本的配置文件中, 存在 Index 配置项, 可配置包括分片数、副本分片数在内的配置.
在5.x版本中, 不支持在配置文件中设置此类配置项了, 请注意此区别.

  • 修改完之后使用命令查看具体修改了哪些值:
# 查看某个文件上次修改的内容: 
grep '^[a-z]' /data/elk-6.6.0/es-node/config/elasticsearch.yml

2 jvm.options(JVM参数配置)

文件位置: ${ES_HOME}/config/jvm.options

关于JVM常见参数的配置, 可参考博主文章:

对Tomcat 8.0进行JVM层面的优化(基于Oracle JDK 8)

关于JVM的垃圾回收(GC) 这可能是你想了解的

## JVM configuration

################################################################
## IMPORTANT: JVM heap size
################################################################
##
## You should always set the min and max JVM heap
## size to the same value. For example, to set
## the heap to 4 GB, set:
##
## -Xms4g
## -Xmx4g
##
## See https://www.elastic.co/guide/en/elasticsearch/reference/current/heap-size.html
## for more information
##
################################################################

# Xms represents the initial size of total heap space
# Xmx represents the maximum size of total heap space
# 下述配置最好不要超过节点物理内存的50%, 留出50%供Lucene底层索引与检索使用 
-Xms1g
-Xmx1g

################################################################
## Expert settings
################################################################
##
## All settings below this section are considered
## expert settings. Don't tamper with them unless
## you understand what you are doing
##
################################################################

## GC configuration
-XX:+UseConcMarkSweepGC
-XX:CMSInitiatingOccupancyFraction=75
-XX:+UseCMSInitiatingOccupancyOnly

## G1GC Configuration
# NOTE: G1GC is only supported on JDK version 10 or later.
# To use G1GC uncomment the lines below.
# 10-:-XX:-UseConcMarkSweepGC
# 10-:-XX:-UseCMSInitiatingOccupancyOnly
# 10-:-XX:+UseG1GC
# 10-:-XX:InitiatingHeapOccupancyPercent=75

## DNS cache policy
# cache ttl in seconds for positive DNS lookups noting that this overrides the
# JDK security property networkaddress.cache.ttl; set to -1 to cache forever
-Des.networkaddress.cache.ttl=60
# cache ttl in seconds for negative DNS lookups noting that this overrides the
# JDK security property networkaddress.cache.negative ttl; set to -1 to cache
# forever
-Des.networkaddress.cache.negative.ttl=10

## optimizations

# pre-touch memory pages used by the JVM during initialization
-XX:+AlwaysPreTouch

## basic

# explicitly set the stack size (reduce to 320k on 32-bit client JVMs)
-Xss1m

# set to headless, just in case
-Djava.awt.headless=true

# ensure UTF-8 encoding by default (e.g. filenames)
-Dfile.encoding=UTF-8

# use our provided JNA always versus the system one
-Djna.nosys=true

# turn off a JDK optimization that throws away stack traces for common
# exceptions because stack traces are important for debugging
-XX:-OmitStackTraceInFastThrow
# flags to configure Netty
-Dio.netty.noUnsafe=true
-Dio.netty.noKeySetOptimization=true
-Dio.netty.recycler.maxCapacityPerThread=0

# log4j 2
-Dlog4j.shutdownHookEnabled=false
-Dlog4j2.disable.jmx=true
-Dlog4j.skipJansi=true

## heap dumps

# generate a heap dump when an allocation from the Java heap fails
# heap dumps are created in the working directory of the JVM
-XX:+HeapDumpOnOutOfMemoryError

# specify an alternative path for heap dumps; ensure the directory exists and has sufficient space
# 生产环境中指定当发生OOM异常时, Heap的Dump Path, 默认是 -XX:HeapDumpPath=data
-XX:HeapDumpPath=/var/lib/elasticsearch

# specify an alternative path for JVM fatal error logs
-XX:ErrorFile=logs/hs_err_pid%p.log

## JDK 8 GC logging

8:-XX:+PrintGCDetails
8:-XX:+PrintGCDateStamps
8:-XX:+PrintTenuringDistribution
8:-XX:+PrintGCApplicationStoppedTime
8:-Xloggc:logs/gc.log
8:-XX:+UseGCLogFileRotation
8:-XX:NumberOfGCLogFiles=32
8:-XX:GCLogFileSize=64m

# JDK 9+ GC logging
9-:-Xlog:gc*,gc+age=trace,safepoint:file=logs/gc.log:utctime,pid,tags:filecount=32,filesize=64m
# due to internationalization enhancements in JDK 9 Elasticsearch need to set the provider to COMPAT otherwise
# time/date parsing will break in an incompatible way for some date patterns and locals
9-:-Djava.locale.providers=COMPAT

# temporary workaround for C2 bug with JDK 10 on hardware with AVX-512
10-:-XX:UseAVX=2

Elasticsearch 6.6.0中已经移除了下述优化配置:

# disable calls to System#gc
-XX:+DisableExplicitGC

# force the server VM (remove on 32-bit client JVMs)
-server

# use old-style file permissions on JDK9
-Djdk.io.permissionsUseCanonicalPath=true


其他说明:

-Xmx2g        这种参数没有限制JVM版本
8: -Xmx2g     限制JVM版本为8
8-: -Xmx2g    限制JVM版本为8及8以上
8-10: -Xmx2g  限制JVM版本在8-10之间

3 log4j2.properties(日志配置)

文件位置: ${ES_HOME}/config/log4j2.properties

一般使用默认日志配置即可.