chore: sync local changes
@@ -1,111 +1,197 @@
# ClickHouse + Fluent Bit User Manual (Ubuntu 22.04 / Amazon Linux 2023)
# ClickHouse + Fluent Bit Quick Deployment (Ubuntu 22.04 / Amazon Linux 2023)

## 1. Supported Platforms
## 1. Script Overview

- Ubuntu 22.04
- Amazon Linux 2023 (AWS)

Install script: `install_clickhouse_linux.sh` (detects the systems above automatically).

## 2. Install ClickHouse
- `setup_clickhouse.sh`: one-step entry point (recommended). By default it runs, in order: install ClickHouse -> configure HTTPS -> apply runtime settings -> initialize the log tables.
- `install_clickhouse_linux.sh`: installs `clickhouse-server` and `clickhouse-client`, then starts the service.
- `configure_clickhouse_https.sh`: generates a self-signed `server.crt` + `server.key`, writes the HTTPS configuration, and restarts the service.
- `configure_clickhouse_runtime.sh`: by default sets the log level to `warning` and disables the high-overhead system log tables (`text_log`, `part_log`, `metric_log`, `asynchronous_metric_log`, `trace_log`).
- `init_waf_logs_tables.sh`: runs the table-creation script.
- `init_waf_logs_tables.sql`: table definitions for `logs_ingest` and `dns_logs_ingest`.

Change to the directory that contains the scripts:
```bash
cd /path/to/waf-platform/deploy/clickhouse
chmod +x install_clickhouse_linux.sh
sudo ./install_clickhouse_linux.sh
cd /opt/waf-platform/deploy/clickhouse
chmod +x setup_clickhouse.sh
```

Optional: set the `default` user's password during installation:
## 2. One-Step Deployment

### 2.1 Option A: no ClickHouse password (the user name is fixed to `default`)

```bash
sudo CLICKHOUSE_DEFAULT_PASSWORD='YourStrongPassword' ./install_clickhouse_linux.sh
```

## 3. Enable HTTPS (crt + key only by default)

By default the script generates `server.crt` + `server.key` (with SAN) and enables port 8443:

```bash
cd /path/to/waf-platform/deploy/clickhouse
chmod +x configure_clickhouse_https.sh
sudo CH_HTTPS_PORT=8443 \
  CH_CERT_CN=clickhouse.example.com \
  CH_CERT_DNS=clickhouse.example.com \
  CH_CERT_IP=<CLICKHOUSE_IP> \
  ./configure_clickhouse_https.sh
```
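
The SAN-building helper that the script calls (`build_san_line`) does not appear in this diff. As a rough sketch only, assuming `CH_CERT_DNS` and `CH_CERT_IP` are comma-separated lists, a helper along these lines could turn them into a `subjectAltName` value. The function name `sketch_san_line` is hypothetical and is not taken from the repository.

```bash
#!/usr/bin/env bash
# Hypothetical sketch: NOT the real build_san_line from configure_clickhouse_https.sh.
# Builds a subjectAltName value such as "DNS:a,DNS:b,IP:1.2.3.4"
# from two comma-separated lists (DNS names, then IP addresses).
sketch_san_line() {
  local dns_list="${1:-}" ip_list="${2:-}"
  local -a entries=() items=()
  local item

  # Split the DNS list on commas and prefix each non-empty entry with "DNS:".
  IFS=',' read -r -a items <<<"${dns_list}"
  for item in "${items[@]:-}"; do
    item="${item//[[:space:]]/}"   # drop stray whitespace around entries
    if [[ -n "${item}" ]]; then entries+=("DNS:${item}"); fi
  done

  # Same for the IP list, prefixed with "IP:".
  IFS=',' read -r -a items <<<"${ip_list}"
  for item in "${items[@]:-}"; do
    item="${item//[[:space:]]/}"
    if [[ -n "${item}" ]]; then entries+=("IP:${item}"); fi
  done

  # Join with commas; print nothing when both lists are empty.
  if (( ${#entries[@]} > 0 )); then
    local IFS=','
    printf '%s\n' "${entries[*]}"
  fi
}
```

For example, `sketch_san_line "${CH_CERT_DNS}" "${CH_CERT_IP}"` would produce the value to place after `subjectAltName=` in an OpenSSL extension file.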

Use an existing certificate:

```bash
sudo SRC_CERT=/path/to/server.crt \
  SRC_KEY=/path/to/server.key \
  CH_HTTPS_PORT=8443 \
  ./configure_clickhouse_https.sh
```

## 4. Initialize the Log Tables (optimizations included)

```bash
cd /path/to/waf-platform/deploy/clickhouse
chmod +x init_waf_logs_tables.sh
sudo CH_HOST=127.0.0.1 \
  CH_PORT=9000 \
  CH_USER=default \
  CH_PASSWORD='YourStrongPassword' \
  CH_DATABASE=default \
  ./init_waf_logs_tables.sh
sudo ./setup_clickhouse.sh
```

Notes:
- `init_waf_logs_tables.sql` already includes the main optimizations (`CODEC`, `LowCardinality`, data-skipping indexes).
- `optimize_schema.sql` is mainly for retrofitting these optimizations onto existing tables; it is not required when creating the tables for the first time.
- The ClickHouse connection user is `default`.
- If no password is set, leave the password blank when configuring the platform connection later.

## 5. Platform-Side Configuration (EdgeAdmin)
### 2.2 Option B: set a user name and password (the example uses `default`)

Configure the following on the ClickHouse settings page:
```bash
sudo CH_USER='default' \
  CH_PASSWORD='YourStrongPassword' \
  CH_DATABASE='default' \
  ./setup_clickhouse.sh
```

- Host: the ClickHouse address
- Port: `8443`
- Database: `default`
- Scheme: `https`
Notes:
- `CH_USER`/`CH_PASSWORD`: used to connect to ClickHouse when initializing the log tables.
- If you use a custom user, set `CH_USER` to that user name and make sure the user already has privileges on the target database.

Notes on the current implementation:
- The frontend no longer exposes the `Skip TLS verification` and `TLS Server Name` options.
- The backend hard-codes `TLSSkipVerify=true` (certificates are not verified by default).
Optional: apply only the runtime settings (log level / system log table switches):

After saving, click "Test Connection".
```bash
sudo CH_LOG_LEVEL=warning ./setup_clickhouse.sh runtime
```

## 6. Fluent Bit Configuration
## 3. Key Directories After ClickHouse Installation

Platform-managed mode is recommended (the configuration is pushed automatically when Node or DNS is installed or upgraded online):
- Configuration directory: `/etc/clickhouse-server/`
- Client configuration directory: `/etc/clickhouse-client/`
- Data directory: `/var/lib/clickhouse/`
- Log directory: `/var/log/clickhouse-server/`
- HTTPS override config: `/etc/clickhouse-server/config.d/waf-https.xml`
- Runtime override config: `/etc/clickhouse-server/config.d/waf-runtime.xml`
- HTTPS certificate and private key: `/etc/clickhouse-server/server.crt`, `/etc/clickhouse-server/server.key`
- Intermediate files from certificate generation: `/etc/clickhouse-server/pki/`

- `/etc/fluent-bit/fluent-bit.conf`
- `/etc/fluent-bit/.edge-managed.env`
- `/etc/fluent-bit/.edge-managed.json`
## 4. Admin Platform Configuration (EdgeAdmin)

Check status:
Page path:
- Left menu: `System Settings` -> `Advanced Settings`
- Top tab: `Log Database (ClickHouse)`

Form fields:
- `Connection address (Host)`: the ClickHouse address (IP or domain name), e.g. `10.0.0.8` or `clickhouse.example.com`
- `Protocol (Scheme)`: `https`
- `Port (Port)`: `8443`
- `User name (User)`: `default` (or your custom user name)
- `Password (Password)`: the password of that user
- `Database (Database)`: `default` (or the database you used when initializing the log tables)

Submission order:
1. Click "Test Connection".
2. After the connection succeeds, click "Save".
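
Before filling in the form, it can help to confirm from a shell that the HTTPS endpoint answers. The sketch below assumes the same Scheme/Host/Port values as the form and uses ClickHouse's HTTP `/ping` endpoint; the helper names `ch_ping_url` and `ch_ping` are hypothetical.

```bash
#!/usr/bin/env bash
# Hypothetical helper names; only the /ping endpoint and the curl flags are standard.

# Build the health-check URL from the Scheme/Host/Port entered in the form.
ch_ping_url() {
  local scheme="${1:-https}" host="${2:?host is required}" port="${3:-8443}"
  printf '%s://%s:%s/ping\n' "${scheme}" "${host}" "${port}"
}

# Call the endpoint. -k skips certificate verification, which suits the
# self-signed certificate generated by the setup script; drop -k if the
# certificate is trusted on this host.
ch_ping() {
  curl -fsS -k --max-time 5 "$(ch_ping_url "$@")"
}
```

Running `ch_ping https 10.0.0.8 8443` from the platform host should print `Ok.` when port 8443 is reachable; a connection error points to the listener or the network path rather than to credentials.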
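
## 5. Fluent Bit (Two Options)

### 5.1 Automatic installation along with online node installation (recommended)

Notes:
- When Node / DNS is installed or upgraded online, the platform automatically installs or upgrades Fluent Bit and pushes its configuration.
- Fluent Bit is platform-managed by default; there is no need to edit configuration files host by host.

Key files on the node after installation:
- `/etc/fluent-bit/fluent-bit.conf`: main Fluent Bit configuration (input log paths, ClickHouse output, performance parameters).
- `/etc/fluent-bit/parsers.conf`: log parser definitions (currently mainly the JSON parser).
- `/etc/fluent-bit/.edge-managed.env`: ClickHouse authentication environment variables pushed by the platform (`CH_USER`/`CH_PASSWORD`).
- `/etc/fluent-bit/.edge-managed.json`: metadata pushed by the platform (role, config hash, version, update time).


Notes:
- During online installation, `/etc/fluent-bit/fluent-bit.conf` on the node is overwritten by the version pushed from the platform.

How ClickHouse credentials are delivered to Fluent Bit and updated:
- Source: the account and password saved under Admin Platform - Log Database (ClickHouse).
- Destination file: during online installation or upgrade, the platform writes `CH_USER` and `CH_PASSWORD` to `/etc/fluent-bit/.edge-managed.env` on the node.
- Update trigger: after the ClickHouse account or password changes on the platform, run a node install/upgrade task once to push the new credentials.

- Common issue: if the password is changed only on the ClickHouse side and the platform configuration is not updated to match, Fluent Bit fails authentication (401/unauthorized).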
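
When authentication fails, a first check is whether the managed env file on the node actually carries both keys. The sketch below assumes the file uses plain `KEY=value` lines (the same shape as the manual-install env file in this guide); the function name `missing_ch_credentials` is hypothetical.

```bash
#!/usr/bin/env bash
# Hypothetical check: report which of CH_USER / CH_PASSWORD are missing from
# an env file, without printing the values themselves.
# Assumes one KEY=value pair per line.
missing_ch_credentials() {
  local env_file="${1:-/etc/fluent-bit/.edge-managed.env}"
  local key
  local -a missing=()

  if [[ ! -r "${env_file}" ]]; then
    echo "unreadable: ${env_file}"
    return 2
  fi

  for key in CH_USER CH_PASSWORD; do
    # A key counts as present only if some line starts with "KEY=".
    if ! grep -q "^${key}=" "${env_file}"; then
      missing+=("${key}")
    fi
  done

  if (( ${#missing[@]} > 0 )); then
    echo "missing: ${missing[*]}"
    return 1
  fi
  echo "ok"
}
```

Run it with `sudo` on the node, since the file may not be readable by ordinary users. An `ok` result only shows that the keys exist; whether the values match ClickHouse still has to be confirmed against the platform settings.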

Tuning for higher-spec machines (the current defaults target 4C8G):
- Current defaults: `Flush=1`, `storage.backlog.mem_limit=512MB`, `Mem_Buf_Limit=256MB`, `workers=2`.
- After upgrading the machine, tune these 4 parameters first:
  - `storage.backlog.mem_limit`: total buffer limit (increase this first to lower the risk of losing logs when a burst piles up).
  - `Mem_Buf_Limit`: memory buffer per tail input (change both the HTTP and DNS sections).
  - `workers`: number of concurrent output writer threads (change both the HTTP and DNS sections).
  - `Flush`: flush/send interval (smaller values are closer to real time but cost more CPU and network).
- For 8C16G, use `deploy/fluent-bit/fluent-bit-sample-8c16g.conf` as a reference:
  - `storage.backlog.mem_limit=1024MB`
  - `Mem_Buf_Limit=512MB`
  - `workers=4`
  - `Refresh_Interval=1`
- How to change:
  1. Edit `renderManagedConfig()` in `EdgeAPI/internal/installers/fluent_bit.go`.
  2. Apply the parameters above to both the Node and DNS `[INPUT]` and `[OUTPUT]` sections.
  3. Redeploy the API and trigger a node install/upgrade task to push the new configuration.
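
To make the two parameter sets easier to compare, the sketch below prints the values listed above for a given core count. The 8-core threshold and the function name `suggest_fluent_bit_tuning` are assumptions for illustration; the values themselves are the 4C8G defaults and the 8C16G reference values from this guide.

```bash
#!/usr/bin/env bash
# Hypothetical helper: print the buffer/worker values from this guide for a
# given CPU core count. 8 cores or more -> the 8C16G reference values;
# anything smaller -> the current 4C8G defaults.
suggest_fluent_bit_tuning() {
  local cores="${1:-4}"
  if (( cores >= 8 )); then
    printf '%s\n' \
      'storage.backlog.mem_limit=1024MB' \
      'Mem_Buf_Limit=512MB' \
      'workers=4'
  else
    printf '%s\n' \
      'storage.backlog.mem_limit=512MB' \
      'Mem_Buf_Limit=256MB' \
      'workers=2'
  fi
}
```

On a node, `suggest_fluent_bit_tuning "$(nproc)"` gives a starting point; the real values still have to be set in `renderManagedConfig()` and pushed through an install/upgrade task as described above.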

Check:

```bash
sudo systemctl status fluent-bit --no-pager
sudo cat /etc/fluent-bit/.edge-managed.json
sudo journalctl -u fluent-bit -n 100 --no-pager
```

## 7. Verification and Troubleshooting
### 5.2 Manual installation (when automatic installation fails)

View the Fluent Bit logs:
Notes:
- Use this when the online automatic installation of Fluent Bit on a node fails.
- Fluent Bit is still installed online, but you install it and maintain its configuration yourself.

Steps:

1. Install Fluent Bit online.

Ubuntu 22.04:

```bash
curl https://raw.githubusercontent.com/fluent/fluent-bit/master/install.sh | sh
sudo apt-get update -y
sudo apt-get install -y fluent-bit
```

Amazon Linux 2023 (AWS):

```bash
curl https://raw.githubusercontent.com/fluent/fluent-bit/master/install.sh | sh
sudo dnf makecache -y
sudo dnf install -y fluent-bit
```

2. Place the configuration files:

```bash
sudo mkdir -p /etc/fluent-bit
sudo cp /opt/waf-platform/deploy/fluent-bit/fluent-bit.conf /etc/fluent-bit/
sudo cp /opt/waf-platform/deploy/fluent-bit/clickhouse-upstream.conf /etc/fluent-bit/
sudo cp /opt/waf-platform/deploy/fluent-bit/parsers.conf /etc/fluent-bit/
```

3. Edit the ClickHouse `Host` and `Port` (for example `8443`) in `/etc/fluent-bit/clickhouse-upstream.conf`.
4. Configure the authentication environment variables (as needed):

```bash
sudo tee /etc/fluent-bit/fluent-bit.env >/dev/null <<'EOF'
CH_USER=default
CH_PASSWORD=YourStrongPassword
EOF
```

5. Make systemd load the environment variables:

```bash
sudo mkdir -p /etc/systemd/system/fluent-bit.service.d
sudo tee /etc/systemd/system/fluent-bit.service.d/override.conf >/dev/null <<'EOF'
[Service]
EnvironmentFile=/etc/fluent-bit/fluent-bit.env
EOF
```

6. Start the service and check it:

```bash
sudo systemctl daemon-reload
sudo systemctl enable fluent-bit
sudo systemctl restart fluent-bit
sudo systemctl status fluent-bit --no-pager
sudo journalctl -u fluent-bit -n 100 --no-pager
```

## 6. Verification

```bash
sudo journalctl -u fluent-bit -f
```

Check that rows are being written:

```sql
SELECT count() FROM default.logs_ingest;
SELECT count() FROM default.dns_logs_ingest;
```

Common errors:
- `connection refused`: port 8443 is not listening, or the network does not allow the traffic.
- `legacy Common Name`: the certificate has no SAN and must be re-issued.
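
The error patterns mentioned in this guide can be folded into a small triage helper. This is only a sketch: the function name `triage_fluent_bit_error` is hypothetical, the matched substrings are the ones quoted in this guide, and the exact wording of real log lines may differ between Fluent Bit versions.

```bash
#!/usr/bin/env bash
# Hypothetical triage helper: map one Fluent Bit log line to the likely cause
# described in this guide. Matching is case-insensitive.
triage_fluent_bit_error() {
  local line
  line="$(printf '%s' "${1:-}" | tr '[:upper:]' '[:lower:]')"
  case "${line}" in
    *"connection refused"*)
      echo "port 8443 not listening or blocked by network rules" ;;
    *"legacy common name"*)
      echo "certificate has no SAN; re-issue it with SAN entries" ;;
    *"401"*|*"unauthorized"*)
      echo "credentials out of sync; update the platform settings and re-run the node install/upgrade task" ;;
    *)
      echo "unknown; inspect the full journalctl output" ;;
  esac
}
```

Pass it a single suspicious line copied from `journalctl -u fluent-bit`; anything it reports as unknown still needs a manual read of the surrounding log.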

@@ -46,18 +46,12 @@ CH_CERT_CN="${CH_CERT_CN:-$(hostname -f 2>/dev/null || hostname)}"
CH_CERT_DNS="${CH_CERT_DNS:-}"
CH_CERT_IP="${CH_CERT_IP:-}"
CH_CERT_DAYS="${CH_CERT_DAYS:-825}"
CH_GENERATE_CA="${CH_GENERATE_CA:-false}"

SRC_CERT="${SRC_CERT:-}"
SRC_KEY="${SRC_KEY:-}"
SRC_CA="${SRC_CA:-}"

CH_DIR="/etc/clickhouse-server"
CH_CONFIG_D_DIR="${CH_DIR}/config.d"
PKI_DIR="${CH_DIR}/pki"
SERVER_CERT="${CH_DIR}/server.crt"
SERVER_KEY="${CH_DIR}/server.key"
CA_CERT="${CH_DIR}/ca.crt"
OVERRIDE_FILE="${CH_CONFIG_D_DIR}/waf-https.xml"

mkdir -p "${CH_CONFIG_D_DIR}" "${PKI_DIR}"
@@ -117,72 +111,13 @@ EOF

  cp -f "${server_crt}" "${SERVER_CERT}"
  cp -f "${server_key}" "${SERVER_KEY}"
  rm -f "${CA_CERT}"
}

generate_cert_with_ca() {
  echo "[INFO] generating local CA and server certificate ..."
  local ca_key="${PKI_DIR}/ca.key"
  local ca_crt="${PKI_DIR}/ca.crt"
  local server_key="${PKI_DIR}/server.key"
  local server_csr="${PKI_DIR}/server.csr"
  local server_crt="${PKI_DIR}/server.crt"
  local ext_file="${PKI_DIR}/server.ext"
  local san_line
  san_line="$(build_san_line)"

  openssl genrsa -out "${ca_key}" 4096
  openssl req -x509 -new -nodes -key "${ca_key}" -sha256 -days 3650 \
    -out "${ca_crt}" -subj "/CN=ClickHouse Local CA"

  openssl genrsa -out "${server_key}" 2048
  openssl req -new -key "${server_key}" -out "${server_csr}" -subj "/CN=${CH_CERT_CN}"

  cat >"${ext_file}" <<EOF
subjectAltName=${san_line}
keyUsage=digitalSignature,keyEncipherment
extendedKeyUsage=serverAuth
EOF

  openssl x509 -req -in "${server_csr}" -CA "${ca_crt}" -CAkey "${ca_key}" -CAcreateserial \
    -out "${server_crt}" -days "${CH_CERT_DAYS}" -sha256 -extfile "${ext_file}"

  cp -f "${server_crt}" "${SERVER_CERT}"
  cp -f "${server_key}" "${SERVER_KEY}"
  cp -f "${ca_crt}" "${CA_CERT}"
}

if [[ -n "${SRC_CERT}" || -n "${SRC_KEY}" ]]; then
  if [[ -z "${SRC_CERT}" || -z "${SRC_KEY}" ]]; then
    echo "[ERROR] SRC_CERT and SRC_KEY must be provided together"
    exit 1
  fi
  echo "[INFO] using provided certificate files ..."
  cp -f "${SRC_CERT}" "${SERVER_CERT}"
  cp -f "${SRC_KEY}" "${SERVER_KEY}"
  if [[ -n "${SRC_CA}" ]]; then
    cp -f "${SRC_CA}" "${CA_CERT}"
  else
    rm -f "${CA_CERT}"
  fi
else
  case "$(echo "${CH_GENERATE_CA}" | tr '[:upper:]' '[:lower:]')" in
    1|true|yes|on)
      generate_cert_with_ca
      ;;
    *)
      generate_self_signed_cert
      ;;
  esac
fi
generate_self_signed_cert

chown clickhouse:clickhouse "${SERVER_CERT}" "${SERVER_KEY}" || true
chmod 0644 "${SERVER_CERT}"
chmod 0640 "${SERVER_KEY}"
if [[ -f "${CA_CERT}" ]]; then
  chown clickhouse:clickhouse "${CA_CERT}" || true
  chmod 0644 "${CA_CERT}"
fi

echo "[INFO] writing ClickHouse HTTPS override config ..."
cat >"${OVERRIDE_FILE}" <<EOF
@@ -221,7 +156,3 @@ echo "[OK] ClickHouse HTTPS setup finished"
echo " HTTPS port : ${CH_HTTPS_PORT}"
echo " cert file : ${SERVER_CERT}"
echo " key file : ${SERVER_KEY}"
if [[ -f "${CA_CERT}" ]]; then
  echo " CA file : ${CA_CERT}"
  echo " import this CA file into API/Fluent Bit hosts if tls.verify=On"
fi

50
deploy/clickhouse/configure_clickhouse_runtime.sh
Normal file
@@ -0,0 +1,50 @@
#!/usr/bin/env bash
set -euo pipefail

if [[ "${EUID}" -ne 0 ]]; then
  echo "[ERROR] please run as root"
  exit 1
fi

CH_LOG_LEVEL="${CH_LOG_LEVEL:-warning}"
CH_DIR="/etc/clickhouse-server"
CH_CONFIG_D_DIR="${CH_DIR}/config.d"
OVERRIDE_FILE="${CH_CONFIG_D_DIR}/waf-runtime.xml"

case "${CH_LOG_LEVEL}" in
  none|fatal|critical|error|warning|notice|information|debug|trace|test)
    ;;
  *)
    echo "[ERROR] invalid CH_LOG_LEVEL: ${CH_LOG_LEVEL}"
    echo " allowed: none,fatal,critical,error,warning,notice,information,debug,trace,test"
    exit 1
    ;;
esac

mkdir -p "${CH_CONFIG_D_DIR}"

echo "[INFO] writing ClickHouse runtime override config ..."
cat >"${OVERRIDE_FILE}" <<EOF
<clickhouse>
    <logger>
        <level>${CH_LOG_LEVEL}</level>
    </logger>

    <text_log remove="1"/>
    <part_log remove="1"/>
    <metric_log remove="1"/>
    <asynchronous_metric_log remove="1"/>
    <trace_log remove="1"/>
</clickhouse>
EOF

echo "[INFO] restarting clickhouse-server ..."
systemctl restart clickhouse-server
sleep 2

echo "[INFO] service status ..."
systemctl --no-pager -l status clickhouse-server | sed -n '1,15p'

echo "[OK] ClickHouse runtime config applied"
echo " file : ${OVERRIDE_FILE}"
echo " logger level: ${CH_LOG_LEVEL}"
@@ -1,123 +0,0 @@
-- =============================================================================
-- ClickHouse logs_ingest table optimization script
--
-- Notes:
--   - All ALTER operations run online; no service downtime is required
--   - Run the stages in order and check system.parts after each stage to confirm it took effect
--   - Codec changes only affect newly written parts; existing data changes after a merge or a manual OPTIMIZE
--
-- How to run:
--   clickhouse-client --host 127.0.0.1 --port 9000 --user default --password 'xxx' < optimize_schema.sql
-- =============================================================================

-- =============================================
-- Stage 1: compression for large columns (largest effect)
-- =============================================

-- Large text columns switch to ZSTD(3), which compresses JSON / HTTP text far better than the default LZ4
-- Expected effect: 40%-60% less disk usage
ALTER TABLE logs_ingest MODIFY COLUMN request_headers String CODEC(ZSTD(3));
ALTER TABLE logs_ingest MODIFY COLUMN request_body String CODEC(ZSTD(3));
ALTER TABLE logs_ingest MODIFY COLUMN response_headers String CODEC(ZSTD(3));
ALTER TABLE logs_ingest MODIFY COLUMN response_body String CODEC(ZSTD(3));

-- Medium-length text columns use ZSTD(1) to balance compression ratio and CPU cost
ALTER TABLE logs_ingest MODIFY COLUMN ua String CODEC(ZSTD(1));
ALTER TABLE logs_ingest MODIFY COLUMN path String CODEC(ZSTD(1));
ALTER TABLE logs_ingest MODIFY COLUMN referer String CODEC(ZSTD(1));

-- Low-cardinality columns switch to LowCardinality (less memory and less disk)
-- method has very low cardinality (GET/POST/PUT/DELETE, etc.); host cardinality depends on the number of sites
ALTER TABLE logs_ingest MODIFY COLUMN method LowCardinality(String);
ALTER TABLE logs_ingest MODIFY COLUMN log_type LowCardinality(String);
ALTER TABLE logs_ingest MODIFY COLUMN host LowCardinality(String);

-- Numeric columns use Delta + ZSTD encoding (exploits the time/size correlation between adjacent rows)
ALTER TABLE logs_ingest MODIFY COLUMN bytes_in UInt64 CODEC(Delta, ZSTD(1));
ALTER TABLE logs_ingest MODIFY COLUMN bytes_out UInt64 CODEC(Delta, ZSTD(1));
ALTER TABLE logs_ingest MODIFY COLUMN cost_ms UInt32 CODEC(Delta, ZSTD(1));

-- =============================================
-- Stage 2: add skipping indexes (speeds up frequent filter queries)
-- =============================================

-- Exact lookup by trace_id (log detail view, FindByTraceId)
-- bloom_filter(0.01) = 1% false-positive rate; GRANULARITY 4 = one bloom block per 4 granules
ALTER TABLE logs_ingest ADD INDEX IF NOT EXISTS idx_trace_id trace_id TYPE bloom_filter(0.01) GRANULARITY 4;

-- Exact lookup by IP
ALTER TABLE logs_ingest ADD INDEX IF NOT EXISTS idx_ip ip TYPE bloom_filter(0.01) GRANULARITY 4;

-- Fuzzy host queries (tokenbf_v1 can help LIKE '%xxx%' when the pattern contains whole tokens)
ALTER TABLE logs_ingest ADD INDEX IF NOT EXISTS idx_host host TYPE tokenbf_v1(10240, 3, 0) GRANULARITY 4;

-- Filter on firewall_policy_id (HasFirewallPolicy: > 0)
ALTER TABLE logs_ingest ADD INDEX IF NOT EXISTS idx_fw_policy firewall_policy_id TYPE minmax GRANULARITY 4;

-- Range filter on status (HasError: status >= 400)
ALTER TABLE logs_ingest ADD INDEX IF NOT EXISTS idx_status status TYPE minmax GRANULARITY 4;

-- =============================================
-- Stage 3: materialize the indexes on existing data (so they apply to stored data)
-- =============================================
-- Note: MATERIALIZE INDEX starts a background mutation; large tables may take a while
-- Monitor progress with SELECT * FROM system.mutations WHERE is_done = 0

ALTER TABLE logs_ingest MATERIALIZE INDEX idx_trace_id;
ALTER TABLE logs_ingest MATERIALIZE INDEX idx_ip;
ALTER TABLE logs_ingest MATERIALIZE INDEX idx_host;
ALTER TABLE logs_ingest MATERIALIZE INDEX idx_fw_policy;
ALTER TABLE logs_ingest MATERIALIZE INDEX idx_status;


-- =============================================================================
-- dns_logs_ingest table optimization (DNS log table)
-- =============================================================================

-- Compression for large text columns
ALTER TABLE dns_logs_ingest MODIFY COLUMN content_json String CODEC(ZSTD(3));
ALTER TABLE dns_logs_ingest MODIFY COLUMN error String CODEC(ZSTD(1));

-- Low-cardinality columns
ALTER TABLE dns_logs_ingest MODIFY COLUMN question_type LowCardinality(String);
ALTER TABLE dns_logs_ingest MODIFY COLUMN record_type LowCardinality(String);
ALTER TABLE dns_logs_ingest MODIFY COLUMN networking LowCardinality(String);

-- Exact lookup by request_id
ALTER TABLE dns_logs_ingest ADD INDEX IF NOT EXISTS idx_request_id request_id TYPE bloom_filter(0.01) GRANULARITY 4;

-- Exact lookup by remote_addr
ALTER TABLE dns_logs_ingest ADD INDEX IF NOT EXISTS idx_remote_addr remote_addr TYPE bloom_filter(0.01) GRANULARITY 4;

-- Fuzzy queries on question_name
ALTER TABLE dns_logs_ingest ADD INDEX IF NOT EXISTS idx_question_name question_name TYPE tokenbf_v1(10240, 3, 0) GRANULARITY 4;

-- Filter on domain_id
ALTER TABLE dns_logs_ingest ADD INDEX IF NOT EXISTS idx_domain_id domain_id TYPE minmax GRANULARITY 4;

-- Materialize the indexes on existing data
ALTER TABLE dns_logs_ingest MATERIALIZE INDEX idx_request_id;
ALTER TABLE dns_logs_ingest MATERIALIZE INDEX idx_remote_addr;
ALTER TABLE dns_logs_ingest MATERIALIZE INDEX idx_question_name;
ALTER TABLE dns_logs_ingest MATERIALIZE INDEX idx_domain_id;


-- =============================================================================
-- Verification commands (run after the ALTER statements above)
-- =============================================================================

-- Show each column's compression codec
-- SELECT name, type, compression_codec FROM system.columns WHERE table = 'logs_ingest' AND database = currentDatabase();

-- Show each table's compression ratio
-- SELECT table, formatReadableSize(sum(data_compressed_bytes)) AS compressed, formatReadableSize(sum(data_uncompressed_bytes)) AS uncompressed, round(sum(data_uncompressed_bytes) / sum(data_compressed_bytes), 2) AS ratio FROM system.columns WHERE table IN ('logs_ingest', 'dns_logs_ingest') GROUP BY table;

-- Show disk usage per column (to find the largest columns)
-- SELECT name, formatReadableSize(sum(data_compressed_bytes)) AS compressed, formatReadableSize(sum(data_uncompressed_bytes)) AS uncompressed FROM system.columns WHERE table = 'logs_ingest' GROUP BY name ORDER BY sum(data_compressed_bytes) DESC;

-- Show mutation progress
-- SELECT database, table, mutation_id, command, is_done, parts_to_do FROM system.mutations WHERE is_done = 0;

-- Force a merge (optional; applies the codec changes to existing data)
-- OPTIMIZE TABLE logs_ingest FINAL;
-- OPTIMIZE TABLE dns_logs_ingest FINAL;
108
deploy/clickhouse/setup_clickhouse.sh
Normal file
@@ -0,0 +1,108 @@
#!/usr/bin/env bash
set -euo pipefail

SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
INSTALL_SCRIPT="${SCRIPT_DIR}/install_clickhouse_linux.sh"
HTTPS_SCRIPT="${SCRIPT_DIR}/configure_clickhouse_https.sh"
RUNTIME_SCRIPT="${SCRIPT_DIR}/configure_clickhouse_runtime.sh"
TABLES_SCRIPT="${SCRIPT_DIR}/init_waf_logs_tables.sh"

usage() {
  cat <<'EOF'
Usage:
  sudo ./setup_clickhouse.sh [all|install|https|runtime|tables]

Modes:
  all      Install ClickHouse, configure HTTPS, apply runtime config, init ingest tables (default)
  install  Install ClickHouse only
  https    Configure HTTPS only
  runtime  Apply ClickHouse runtime config only
  tables   Initialize ingest tables only

Common env vars:
  CLICKHOUSE_DEFAULT_PASSWORD  Default user password set during install
  CH_HTTPS_PORT                HTTPS port (default: 8443)
  CH_CERT_CN                   Certificate CN
  CH_CERT_DNS                  Certificate SAN DNS list (comma-separated)
  CH_CERT_IP                   Certificate SAN IP list (comma-separated)
  CH_CERT_DAYS                 Certificate validity days (default: 825)
  CH_LOG_LEVEL                 ClickHouse logger level (default: warning)
  CH_HOST                      ClickHouse host for table init (default: 127.0.0.1)
  CH_PORT                      ClickHouse port for table init (default: 9000)
  CH_USER                      ClickHouse user for table init (default: default)
  CH_PASSWORD                  ClickHouse password for table init
  CH_DATABASE                  Database for table init (default: default)
EOF
}

require_script() {
  local script="$1"
  if [[ ! -f "${script}" ]]; then
    echo "[ERROR] required file not found: ${script}"
    exit 1
  fi
}

# Step counters use /4 throughout: the "all" mode runs four steps.
run_install() {
  echo "[INFO] step 1/4: install ClickHouse ..."
  bash "${INSTALL_SCRIPT}"
}

run_https() {
  echo "[INFO] step 2/4: configure ClickHouse HTTPS ..."
  bash "${HTTPS_SCRIPT}"
}

run_runtime() {
  echo "[INFO] step 3/4: apply ClickHouse runtime config ..."
  bash "${RUNTIME_SCRIPT}"
}

run_tables() {
  echo "[INFO] step 4/4: initialize ingest tables ..."
  bash "${TABLES_SCRIPT}"
}

MODE="${1:-all}"

case "${MODE}" in
  -h|--help|help)
    usage
    exit 0
    ;;
  all|install|https|runtime|tables)
    ;;
  *)
    echo "[ERROR] invalid mode: ${MODE}"
    usage
    exit 1
    ;;
esac

require_script "${INSTALL_SCRIPT}"
require_script "${HTTPS_SCRIPT}"
require_script "${RUNTIME_SCRIPT}"
require_script "${TABLES_SCRIPT}"

case "${MODE}" in
  all)
    run_install
    run_https
    run_runtime
    run_tables
    ;;
  install)
    run_install
    ;;
  https)
    run_https
    ;;
  runtime)
    run_runtime
    ;;
  tables)
    run_tables
    ;;
esac

echo "[OK] setup completed: mode=${MODE}"