02. SkyWalking Deployment on Windows

I. Environment Setup Resources

https://blog.csdn.net/zhangkang65/article/details/78991760

1. Download

1.1 Official download page

http://skywalking.apache.org/downloads/

Chinese documentation on GitHub:

https://github.com/apache/incubator-skywalking/blob/v5.0.0-alpha/docs/README_ZH.md

Official website:

http://skywalking.apache.org/

http://incubator.apache.org/projects/skywalking.html

GitHub project (.NET Core agent):

https://github.com/OpenSkywalking/skywalking-netcore

2. Architecture diagram

 

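II. Local Deployment on Windows

1. Version requirements

  • SkyWalking 5.0.0-GA
  • Elasticsearch 5.x
  • Note: Elasticsearch 6.x is not supported. Newer SkyWalking versions use Elasticsearch for storage, so install Elasticsearch first.
  • JDK 8+ (the SkyWalking collector and Web UI run on JDK 8 or later)
  • JDK 6+ (the monitored applications run on JDK 6 or later)
  • The system time (including the time zone) on the hosts running the monitored applications must be correct and identical to the system time on the hosts running the collectors and UIs.

2. Deployment procedure

[References]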

https://blog.csdn.net/y_h_d/article/details/83342846

https://blog.csdn.net/jilo88/article/details/81355265

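2.1 Elasticsearch

2.1.1 Configuration

Edit the file config/elasticsearch.yml.

1. Set the following:

(1) Set cluster.name: CollectorDBCluster

This name must match the clusterName in the collector configuration file.

For example, the collector configuration file is config\application.yml, and it contains

clusterName: CollectorDBCluster, as shown in the figure below.

 

(2) Set node.name: CollectorDBCluster1

Any name may be used. If Elasticsearch runs as a cluster, each node needs a different name.

2. Add the following settings: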

# IP address that Elasticsearch listens on
#network.host: 172.16.105.93 [??? did not work]
# Explanation: https://www.cnblogs.com/sunxucool/p/3799190.html
network.host: 0.0.0.0
thread_pool.bulk.queue_size: 1000

2.1.2 Verification

Open localhost:9200 or ip:9200 in a browser (for example http://172.21.123.99:9200/).
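The browser check can also be scripted. The following is a minimal sketch, assuming the root URL answers with the usual Elasticsearch JSON document containing cluster_name and version.number; the function names check_es_root and fetch are hypothetical helpers, not part of Elasticsearch or SkyWalking.

```python
import json
from urllib.request import urlopen


def check_es_root(body, expected_cluster="CollectorDBCluster"):
    """Check the JSON served at the Elasticsearch root URL.

    Returns (ok, message). The expected cluster name is the one set in
    config/elasticsearch.yml above; adjust it if yours differs.
    """
    info = json.loads(body)
    cluster = info.get("cluster_name")
    version = info.get("version", {}).get("number", "")
    if cluster != expected_cluster:
        return False, "cluster_name is %r, expected %r" % (cluster, expected_cluster)
    if not version.startswith("5."):
        return False, "Elasticsearch version %r is not 5.x" % version
    return True, "cluster %s, version %s" % (cluster, version)


def fetch(url, timeout=5):
    """Download a URL and return the body as text (hypothetical helper)."""
    with urlopen(url, timeout=timeout) as resp:
        return resp.read().decode("utf-8")
```

With Elasticsearch running, check_es_root(fetch("http://localhost:9200/")) reports whether the cluster name matches the collector's clusterName and whether the node is a 5.x release.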

 

2.2 SkyWalking collector

2.2.1 Port requirements

Make sure ports 10800, 11800 and 12800 are not in use.
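One way to check the ports before starting the collector is a small script. This is a sketch only: it treats a port as "in use" when a TCP connection to it on the local machine succeeds, and the names port_in_use and busy_ports are made up for illustration.

```python
import socket

# Ports the collector needs, as listed above.
COLLECTOR_PORTS = (10800, 11800, 12800)


def port_in_use(port, host="127.0.0.1", timeout=1.0):
    """Return True if something accepts TCP connections on host:port."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
        s.settimeout(timeout)
        # connect_ex returns 0 on success and an error code otherwise.
        return s.connect_ex((host, port)) == 0


def busy_ports(ports=COLLECTOR_PORTS, host="127.0.0.1"):
    """Return the subset of ports that are already taken."""
    return [p for p in ports if port_in_use(p, host)]
```

An empty list from busy_ports() suggests the three ports are free; any port it returns should be released or the collector configuration changed.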

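2.2.2 Storage requirements

The collector uses Elasticsearch as its runtime storage.

2.2.3 Time settings

The system time (including the time zone) on the hosts running the monitored applications must be correct and identical to the system time on the hosts running the collectors and UIs [???]

2.2.4 Configuration

  • Location

apache-skywalking-apm-incubating\config\application.yml

  • Configuration items (official explanation)

The collector has five kinds of connection settings:

<1> naming: agents connect to the collectors over HTTP

<2> agent_gRPC: agents connect to the collectors over gRPC

<3> remote: collectors connect to each other over gRPC

<4> ui: the UI connects to the collector over HTTP (in most cases no change is needed)

<5> agent_jetty: agents connect to the collectors over HTTP (optional connection)

  • Configuration 1 (official explanation, cluster mode)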

cluster:
  # The Zookeeper cluster for collector cluster management.
  zookeeper:
    hostPort: localhost:2181
    sessionTimeout: 100000
naming:
  # Host and port used for agent config
  jetty:
    # Lets agents discover the collector cluster; host must be the machine's real network IP. agent --(HTTP)--> collector
    host: localhost
    port: 10800
    contextPath: /
remote:
  gRPC:
    # Lets collector nodes communicate with each other in the cluster; host must be the machine's real network IP.
    # collectorN --(gRPC)--> collectorM
    host: localhost
    port: 11800
agent_gRPC:
  gRPC:
    # Lets agents upload data (traces and metrics) to the collector; host must be the machine's real network IP. agent --(gRPC)--> collector
    host: localhost
    port: 11800
agent_jetty:
  jetty:
    # Lets agents upload data (traces and metrics) to the collector; host must be the machine's real network IP. agent --(HTTP)--> collector
    # SkyWalking native Java/.Net/node.js agents don't use this.
    # Open this for other implementor.
    host: localhost
    port: 12800
    contextPath: /
analysis_register:
  default:
analysis_jvm:
  default:
analysis_segment_parser:
  default:
    bufferFilePath: ../buffer/
    bufferOffsetMaxFileSize: 10M
    bufferSegmentMaxFileSize: 500M
ui:
  jetty:
    # Lets the UI reach the collector; host must be the machine's real network IP.
    host: localhost
    port: 12800
    contextPath: /
# Elasticsearch cluster connection settings
storage:
  elasticsearch:
    clusterName: CollectorDBCluster
    clusterTransportSniffer: true
    clusterNodes: localhost:9300
    indexShardsNumber: 2
    indexReplicasNumber: 0
    highPerformanceMode: true
    # Expiry times for metric data; once data expires the system deletes it automatically.
    traceDataTTL: 90 # unit: minutes
    minuteMetricDataTTL: 45 # unit: minutes
    hourMetricDataTTL: 36 # unit: hours
    dayMetricDataTTL: 45 # unit: days
    monthMetricDataTTL: 18 # unit: months
configuration:
  default:
    # namespace: xxxxx
    # Alarm thresholds
    applicationApdexThreshold: 2000
    serviceErrorRateThreshold: 10.00
    serviceAverageResponseTimeThreshold: 2000
    instanceErrorRateThreshold: 10.00
    instanceAverageResponseTimeThreshold: 2000
    applicationErrorRateThreshold: 10.00
    applicationAverageResponseTimeThreshold: 2000
    # Heat map settings; after changing them, delete the heat map metric table so that the system rebuilds it
    thermodynamicResponseTimeStep: 50
    thermodynamicCountOfResponseTimeSteps: 40

  • Configuration 2 (reference example)

#cluster:
#  zookeeper:
#    hostPort: localhost:2181
#    sessionTimeout: 100000
naming:
  jetty:
    #OS real network IP(binding required), for agent to find collector cluster
    host: 172.21.123.99
    port: 10800
    contextPath: /
cache:
#  guava:
  caffeine:
remote:
  gRPC:
    # OS real network IP(binding required), for collector nodes communicate with each other in cluster. collectorN --(gRPC) --> collectorM
    host: 172.21.123.99
    port: 11800
agent_gRPC:
  gRPC:
    #OS real network IP(binding required), for agent to uplink data(trace/metrics) to collector. agent--(gRPC)--> collector
    host: 172.21.123.99
    port: 11800
    # Set these two setting to open ssl
    #sslCertChainFile: $path
    #sslPrivateKeyFile: $path
    # Set your own token to active auth
    #authentication: xxxxxx
agent_jetty:
  jetty:
    # OS real network IP(binding required), for agent to uplink data(trace/metrics) to collector through HTTP. agent--(HTTP)--> collector
    # SkyWalking native Java/.Net/node.js agents don't use this.
    # Open this for other implementor.
    host: 172.21.123.99
    port: 12800
    contextPath: /
analysis_register:
  default:
analysis_jvm:
  default:
analysis_segment_parser:
  default:
    bufferFilePath: ../buffer/
    bufferOffsetMaxFileSize: 10M
    bufferSegmentMaxFileSize: 500M
    bufferFileCleanWhenRestart: true
ui:
  jetty:
    # Stay in `localhost` if UI starts up in default mode.
    # Change it to OS real network IP(binding required), if deploy collector in different machine.
    host: 172.21.123.99
    port: 12800
    contextPath: /
storage:
  elasticsearch:
    clusterName: CollectorDBCluster
    clusterTransportSniffer: true
    clusterNodes: localhost:9300
    indexShardsNumber: 2
    indexReplicasNumber: 0
    highPerformanceMode: true
    # Batch process setting, refer to https://www.elastic.co/guide/en/elasticsearch/client/java-api/5.5/java-docs-bulk-processor.html
    bulkActions: 2000 # Execute the bulk every 2000 requests
    bulkSize: 20 # flush the bulk every 20mb
    flushInterval: 10 # flush the bulk every 10 seconds whatever the number of requests
    concurrentRequests: 2 # the number of concurrent requests
    # Set a timeout on metric data. After the timeout has expired, the metric data will automatically be deleted.
    traceDataTTL: 90 # Unit is minute
    minuteMetricDataTTL: 90 # Unit is minute
    hourMetricDataTTL: 36 # Unit is hour
    dayMetricDataTTL: 45 # Unit is day
    monthMetricDataTTL: 18 # Unit is month
#storage:
#  h2:
#    url: jdbc:h2:~/memorydb
#    userName: sa
configuration:
  default:
    #namespace: xxxxx
    # alarm threshold
    applicationApdexThreshold: 2000
    serviceErrorRateThreshold: 10.00
    serviceAverageResponseTimeThreshold: 2000
    instanceErrorRateThreshold: 10.00
    instanceAverageResponseTimeThreshold: 2000
    applicationErrorRateThreshold: 10.00
    applicationAverageResponseTimeThreshold: 2000
    # thermodynamic
    thermodynamicResponseTimeStep: 50
    thermodynamicCountOfResponseTimeSteps: 40
    # max collection's size of worker cache collection, setting it smaller when collector OutOfMemory crashed.
    workerCacheMaxSize: 10000
#receiver_zipkin:
#  default:
#    host: localhost
#    port: 9411
#    contextPath: /

2.2.5 Startup

To start the collector on its own, run bin/collectorService.bat.

2.2.6 Verification

http://172.21.123.99:10800/agent/jetty
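The response of this URL can be checked in a script as well. The sketch below assumes, without confirmation from these notes, that the naming endpoint answers with a JSON array of "host:port" strings; parse_naming_response is a hypothetical helper name.

```python
import json


def parse_naming_response(body):
    """Parse a JSON array of "host:port" strings into (host, port) tuples.

    Assumption: the collector naming endpoint returns such an array.
    Raises ValueError if the body has a different shape.
    """
    servers = json.loads(body)
    if not isinstance(servers, list):
        raise ValueError("expected a JSON array, got %s" % type(servers).__name__)
    result = []
    for entry in servers:
        # Split on the last colon so the host part is kept whole.
        host, sep, port = str(entry).rpartition(":")
        if not sep or not host or not port.isdigit():
            raise ValueError("not a host:port entry: %r" % (entry,))
        result.append((host, int(port)))
    return result
```

If the assumption holds, a body such as ["172.21.123.99:12800"] parses to one (host, port) pair that should match the agent_jetty settings in application.yml.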

 

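2.3 SkyWalking Web UI

2.3.1 Location

The Web UI settings are stored in **\webapp\webapp.yml.

2.3.2 Configuration changes

  • listOfServers

The collector service addresses (these must match the naming.jetty setting in config/application.yml). Separate multiple collector addresses with ','.

Change collector.ribbon.listOfServers as shown in the figure below:

 

  • Port

  • Reason for the change

The Web UI listens on port 8080 by default, which conflicts with the default Tomcat port, so change it.

  • Change: set the port to one that is free (the verification step below uses 8090).

  • collector.path

URI used to query the collector. The default is /graphql.

  • collector.ribbon.ReadTimeout

Query timeout. The default is 10 seconds.

  • security.user.*

Login user name and password. The default is admin/admin.

2.3.3 Startup

To start the UI on its own, run bin/webappService.bat.

2.3.4 Verification

http://ip:8090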

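2.4 SkyWalking agent

2.4.1 Copy the directory

Copy the agent directory to the desired location. Logs, plugins and configuration are all contained in the package, so do not change its directory structure.

2.4.2 Edit the agent configuration

  • Location

agent.config in the agent\config directory

  • Changes

agent.application_code=CollectorDBCluster

# The application name shown in the UI. Any name may be used; it does not have to match the Elasticsearch clusterName.

collector.servers=10.176.16.39:10800

# Corresponds to the naming settings (host and port) in the collector configuration [???]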

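2.5 Monitoring a Tomcat instance

2.5.1 Version requirements

JDK 6+ (the monitored applications run on JDK 6 or later)

2.5.2 Configuration

As a simple example of monitoring Tomcat, add one line below setlocal in the catalina script.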
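The line itself is not preserved in these notes. A common form is a CATALINA_OPTS entry carrying the -javaagent option; the sketch below builds such a line, assuming the agent jar is named skywalking-agent.jar and sits in the copied agent directory. The helper name and the example path are hypothetical.

```python
from pathlib import PureWindowsPath


def catalina_bat_line(agent_dir):
    """Build the line to place below setlocal in catalina.bat.

    Assumption: the jar is called skywalking-agent.jar and lives directly
    inside agent_dir; check the unpacked agent directory to confirm.
    """
    jar = PureWindowsPath(agent_dir) / "skywalking-agent.jar"
    return 'set "CATALINA_OPTS=-javaagent:%s"' % jar


# Example with a made-up installation path:
print(catalina_bat_line(r"D:\apps\skywalking-agent"))
```

The agent directory must stay intact next to the jar, since the agent reads its config and plugins from there.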

 

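2.5.3 Configuring the agent for the monitored application (reference 1)

Windows is used as the example. Unzip the downloaded skywalking-agent.zip. The agent consists of the whole directory, so do not change the directory structure. In agent.config, set agent.application_code (for example agent.application_code=xxl-job) to your own application name.

The configuration file looks like this: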

# The application name in UI
agent.application_code=my_job

# The number of sampled traces per 3 seconds
# Negative number means sample traces as many as possible, most likely 100%
# agent.sample_n_per_3_secs=-1

# The max amount of spans in a single segment.
# Through this config item, skywalking keep your application memory cost estimated.
# agent.span_limit_per_segment=300

# Ignore the segments if their operation names start with these suffix.
# agent.ignore_suffix=.jpg,.jpeg,.js,.css,.png,.bmp,.gif,.ico,.mp3,.mp4,.html,.svg

# If true, skywalking agent will save all instrumented classes files in `/debugging` folder.
# Skywalking team may ask for these files in order to resolve compatible problem.
# agent.is_open_debugging_class = true

# Server addresses.
# Mapping to `agent_server/jetty/port` in `config/application.yml` of Collector.
# Examples:
# Single collector:SERVERS="127.0.0.1:8080"
# Collector cluster:SERVERS="10.2.45.126:8080,10.2.45.127:7600"
collector.servers=127.0.0.1:10800

# Logging level
logging.level=DEBUG
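Before starting the application, the edited agent.config can be sanity-checked. This is a minimal sketch that treats the file as plain key=value lines with # comments; load_agent_config and check_agent_config are hypothetical helpers and do not reproduce how the agent itself reads the file.

```python
def load_agent_config(text):
    """Parse key=value lines, skipping blank lines and # comments."""
    settings = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        settings[key.strip()] = value.strip()
    return settings


def check_agent_config(settings):
    """Return a list of problems with the two settings edited above."""
    problems = []
    if not settings.get("agent.application_code"):
        problems.append("agent.application_code is not set")
    servers = settings.get("collector.servers", "")
    if not servers:
        problems.append("collector.servers is not set")
    # collector.servers is a comma-separated list of host:port entries.
    for server in filter(None, (s.strip() for s in servers.split(","))):
        host, sep, port = server.rpartition(":")
        if not (sep and host and port.isdigit()):
            problems.append("collector.servers entry %r is not host:port" % server)
    return problems
```

Reading the file with open(path, encoding="utf-8").read() and passing the text through both functions yields an empty list when the application name and collector address are both set and well formed.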

2.5.4 Deploying the Java agent (reference 2)

 

https://blog.csdn.net/jilo88/article/details/81355265

2.5.5 Starting a JAR (reference)

 

https://blog.csdn.net/jilo88/article/details/81355265

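III. Startup Order

1. Elasticsearch

Run elasticsearch.bat.

2. SkyWalking

A. Start collectorService.bat and webappService.bat separately, or

B. Run bin/startup.bat, which starts the collector and the Web UI together.

3. Monitored application

Start the application to be monitored.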
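The order matters because each component depends on the previous one being reachable. A rough way to confirm this is to wait for each port in turn; the sketch below uses the ports from these notes (9200, 10800, and 8090 for the Web UI after the port change), and the names wait_for_port, STARTUP_ORDER and first_not_ready are hypothetical.

```python
import socket
import time

# (component, port) in the order they should come up; ports as used in these notes.
STARTUP_ORDER = (("Elasticsearch", 9200), ("collector", 10800), ("Web UI", 8090))


def wait_for_port(port, host="127.0.0.1", timeout=60.0, interval=1.0):
    """Poll until host:port accepts a TCP connection or the timeout passes."""
    deadline = time.monotonic() + timeout
    while True:
        try:
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            if time.monotonic() >= deadline:
                return False
            time.sleep(interval)


def first_not_ready(order=STARTUP_ORDER, **kwargs):
    """Return the name of the first component that is not reachable, or None."""
    for name, port in order:
        if not wait_for_port(port, **kwargs):
            return name
    return None
```

Calling first_not_ready() after launching the batch files returns None once all three are listening; otherwise it names the component to look at before starting the monitored application.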

IV. Monitoring Results

 

 

 

V. Reference Resources

(1) Environment deployment: community

1. Web articles

https://blog.csdn.net/y_h_d/article/details/83342846

https://blog.csdn.net/zhangkang65/article/details/78991760

2. Changing the SkyWalking port 8080

https://my.oschina.net/ytqvip/blog/1793767

3. Community

Docker environment:

https://www.cnblogs.com/liguobao/p/9686310.html

4. Version 5.x

Category A

Elasticsearch installation:

http://blog.51cto.com/zero01/2130696

Advanced features

https://blog.csdn.net/jilo88/article/details/81355265

https://blog.csdn.net/SoberChina/article/details/79315242

https://blog.csdn.net/qq_42281649/article/details/82804703

5. An insightful summary

https://blog.csdn.net/qq_36236890/article/details/79647017

6. Official community

https://github.com/OpenSkywalking/Community

7. Advanced deployment

http://blog.51cto.com/536410/2318051

8. APM and Google

https://www.cnblogs.com/xiaoqi/p/apm.html

(2) Environment deployment: official documentation

1. Official

Chinese

https://github.com/apache/incubator-skywalking/blob/5.x/docs/README_ZH.md

English

https://github.com/apache/incubator-skywalking

2. Docker

https://github.com/JaredTan95/skywalking-docker

3. How to build the project

https://github.com/apache/incubator-skywalking/blob/master/docs/en/guides/How-to-build.md

(3) Advanced features

1. Custom service filtering

https://github.com/apache/incubator-skywalking/blob/5.x/apm-sniffer/optional-plugins/trace-ignore-plugin/README_CN.md

https://blog.csdn.net/u013095337/article/details/80452088

2. Versions

https://github.com/SkywalkingTest/agent-integration-test-report#dubbo

(4) Theory and in-depth articles

1. Architecture design (article series)

https://github.com/apache/incubator-skywalking/blob/5.x/docs/cn/Architecture-CN.md

https://blog.csdn.net/Saphulot/article/details/81739411

https://www.jianshu.com/p/2fd56627a3cf

2. Comprehensive in-depth analysis

https://juejin.im/post/5a7a9e0af265da4e914b46f1

3. Comprehensive study

http://www.iocoder.cn/categories/SkyWalking/

4. Ten-plus articles

https://juejin.im/post/5ab5b0e26fb9a028e25d7fcb

5. SkyWalking source code analysis: using ByteBuddy, the Java agent tool

http://www.kailing.pub/article/index/arcid/178.html

6. Google paper "Dapper, a Large-Scale Distributed Systems Tracing Infrastructure"

http://bigbully.github.io/Dapper-translation/

(5) Monitoring applications

https://www.jianshu.com/p/3ddd986c7581

https://www.cnblogs.com/huangxincheng/p/9666930.html

(6) Comparison of common APM technologies

https://blog.csdn.net/u012394095/article/details/79700200

https://www.jianshu.com/p/0fbbf99a236e

https://www.cnblogs.com/davidwang456/articles/8119047.html

(7) UI

https://blog.csdn.net/qq_36236890/article/details/79647017

http://blog.zollty.com/b/archive/apm-comparison-of-skywalking-and-pinpiont.html