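The ELK logging stack is relatively heavyweight: installing and deploying it is not simple, and it consumes a lot of CPU and memory. Loki, by contrast, is lightweight, is designed along the same lines as Prometheus, and uses few resources. Below, we use it to collect logs.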

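Create NAS storage

Purchase a cloud NAS file storage instance.

Create the PV and PVC

Create a PV that mounts the purchased NAS, then create a PVC bound to that PV.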
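
The PV and PVC step can be sketched roughly as follows. This is a minimal sketch that assumes the NAS is reachable over NFS; the server address `nas.example.internal`, the export path `/loki`, and the 100Gi size are placeholders, not values from this setup. The claim name `loki` and the namespace `devops` match what the StatefulSet further down expects.

```yaml
# Statically provisioned PV backed by the NAS (assumed to be exported over NFS).
apiVersion: v1
kind: PersistentVolume
metadata:
  name: loki
spec:
  capacity:
    storage: 100Gi                    # placeholder size
  accessModes:
    - ReadWriteMany
  persistentVolumeReclaimPolicy: Retain
  storageClassName: ""                # empty class: bind statically, no dynamic provisioning
  nfs:
    server: nas.example.internal      # placeholder: the NAS mount target address
    path: /loki                       # placeholder: export path on the NAS
---
# PVC that binds to the PV above by name.
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: loki
  namespace: devops
spec:
  accessModes:
    - ReadWriteMany
  storageClassName: ""
  volumeName: loki
  resources:
    requests:
      storage: 100Gi
```

Cloud providers usually also offer a CSI driver for their NAS product; if you use one, the `nfs` section would be replaced by the driver's own `csi` section.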

Create the ConfigMap for the Loki configuration file

local-config.yaml

auth_enabled: false

server:
  http_listen_port: 3100

ingester:
  lifecycler:
    address: 127.0.0.1
    ring:
      kvstore:
        store: inmemory
      replication_factor: 1
    final_sleep: 0s
  chunk_idle_period: 1h # Any chunk not receiving new logs in this time will be flushed
  max_chunk_age: 1h # All chunks will be flushed when they hit this age, default is 1h
  chunk_target_size: 1048576 # Loki will attempt to build chunks up to 1 MiB (1048576 bytes), flushing first if chunk_idle_period or max_chunk_age is reached first
  chunk_retain_period: 30s # Must be greater than index read cache TTL if using an index cache (Default index read cache TTL is 5m)
  max_transfer_retries: 0 # Chunk transfers disabled

schema_config:
  configs:
    - from: 2020-10-24
      store: boltdb-shipper
      object_store: filesystem
      schema: v11
      index:
        prefix: index_
        period: 24h

storage_config:
  boltdb_shipper:
    active_index_directory: /loki/boltdb-shipper-active
    cache_location: /loki/boltdb-shipper-cache
    cache_ttl: 24h # Can be increased for faster performance over longer query periods, uses more disk space
    shared_store: filesystem
  filesystem:
    directory: /loki/chunks

compactor:
  working_directory: /loki/boltdb-shipper-compactor
  shared_store: filesystem

chunk_store_config:
  max_look_back_period: 0s

table_manager:
  retention_deletes_enabled: false
  retention_period: 0s

ruler:
  storage:
    type: local
    local:
      directory: /loki/rules
  rule_path: /loki/rules-temp
  alertmanager_url: http://localhost:9094
  ring:
    kvstore:
      store: inmemory
  enable_api: true
Create the Loki StatefulSet workload

loki:

apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: loki
  namespace: devops
spec:
  podManagementPolicy: OrderedReady
  replicas: 1
  revisionHistoryLimit: 10
  selector:
    matchLabels:
      app: loki
  serviceName: ''
  template:
    metadata:
      labels:
        app: loki
      name: loki
    spec:
      containers:
        - image: 'grafana/loki:latest'
          imagePullPolicy: IfNotPresent
          name: loki
          ports:
            - containerPort: 3100
              protocol: TCP
          resources: {}
          terminationMessagePath: /dev/termination-log
          terminationMessagePolicy: File
          volumeMounts:
            # Note: the Loki config stores its data under /loki; if chunks and
            # index should persist on the NAS, the PVC likely needs to be mounted there.
            - mountPath: /loki-pre
              name: volume-loki
            - mountPath: /etc/loki/local-config.yaml
              name: volume-1638773405322
              subPath: local-config.yaml
      dnsPolicy: ClusterFirst
      restartPolicy: Always
      schedulerName: default-scheduler
      securityContext:
        runAsGroup: 0
        runAsNonRoot: false
        runAsUser: 0
      terminationGracePeriodSeconds: 30
      volumes:
        - name: volume-loki
          persistentVolumeClaim:
            claimName: loki
        - configMap:
            defaultMode: 420
            name: local-config.yaml
          name: volume-1638773405322
  updateStrategy:
    rollingUpdate:
      partition: 0
    type: RollingUpdate
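
The promtail configuration in the next step pushes to port 3100 at a fixed IP address; how that address is exposed is not shown here. One possible way to give Loki a stable in-cluster address is a ClusterIP Service like the sketch below. The Service name `loki` is an assumption; the selector and port come from the StatefulSet above.

```yaml
apiVersion: v1
kind: Service
metadata:
  name: loki                # assumed name, not part of the original setup
  namespace: devops
spec:
  type: ClusterIP
  selector:
    app: loki               # matches the pod label in the StatefulSet
  ports:
    - name: http
      port: 3100
      targetPort: 3100
      protocol: TCP
```

With such a Service in place, clients inside the cluster could use `http://loki.devops.svc:3100` instead of a pod or node IP.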

Use promtail in the service to ship logs to Loki

The /etc/promtail/config.yml configuration file

config.yml

server:
  http_listen_port: 9080
  grpc_listen_port: 0

positions:
  filename: /tmp/positions.yaml

clients:
  - url: http://192.168.96.52:3100/loki/api/v1/push

scrape_configs:
  - job_name: demo
    static_configs:
      - targets:
          - localhost
        labels:
          job: demo
          __path__: /logs/demo/*log

  - job_name: demo-user
    static_configs:
      - targets:
          - localhost
        labels:
          job: demo-user
          __path__: /logs/demo-user/*log

Business service:

apiVersion: apps/v1
kind: Deployment
metadata:
  annotations:
  labels:
    app: springboot
  name: springboot # assumed; the original manifest omits the name
  namespace: c
spec:
  progressDeadlineSeconds: 600
  replicas: 1
  revisionHistoryLimit: 10
  selector:
    matchLabels:
      app: springboot
  strategy:
    rollingUpdate:
      maxSurge: 1
      maxUnavailable: 0
    type: RollingUpdate
  template:
    metadata:
      annotations:
        redeploy-timestamp: '1637846723068'
      labels:
        app: springboot
    spec:
      containers:
        # promtail sidecar: reads the application's log files from the shared volume
        - image: 'grafana/promtail:latest'
          imagePullPolicy: IfNotPresent
          name: promtail
          resources:
            limits:
              cpu: 100m
              memory: 100Mi
            requests:
              cpu: 100m
              memory: 100Mi
          terminationMessagePath: /dev/termination-log
          terminationMessagePolicy: File
          volumeMounts:
            - mountPath: /etc/promtail/config.yml
              name: volume-1638775609723
              subPath: config.yml
            # This path must match a __path__ pattern in the promtail config
            - mountPath: /logs/springboot
              name: app-logs
        # application container: writes its logs into the shared volume
        - env:
            - name: JAVA_OPTS
              value: >-
                -Duser.timezone=Asia/Shanghai -XX:+UnlockExperimentalVMOptions
                -XX:+UseCGroupMemoryLimitForHeap -Xmx512m -Xms512m
                -Dspring.profiles.active=pre -Dserver.port=8080
          image: >-
            registry-vpc.cn-hangzhou.cr.aliyuncs.com/test/springboot:20211202163025-3
          imagePullPolicy: Always
          livenessProbe:
            failureThreshold: 3
            httpGet:
              path: /springboot-service/monitor/health
              port: 8080
              scheme: HTTP
            initialDelaySeconds: 60
            periodSeconds: 10
            successThreshold: 1
            timeoutSeconds: 1
          name: springboot
          ports:
            - containerPort: 8080
              name: http
              protocol: TCP
          readinessProbe:
            failureThreshold: 3
            httpGet:
              path: /springboot-service/monitor/health
              port: 8080
              scheme: HTTP
            initialDelaySeconds: 60
            periodSeconds: 10
            successThreshold: 1
            timeoutSeconds: 1
          resources: {}
          terminationMessagePath: /dev/termination-log
          terminationMessagePolicy: File
          volumeMounts:
            - mountPath: /mnt/logs/springboot
              name: app-logs
      dnsPolicy: ClusterFirst
      imagePullSecrets:
        - name: cr-regsecret
        - name: regsecret
      restartPolicy: Always
      schedulerName: default-scheduler
      securityContext: {}
      terminationGracePeriodSeconds: 30
      volumes:
        # emptyDir shared between the application and the promtail sidecar
        - emptyDir: {}
          name: app-logs
        - configMap:
            defaultMode: 420
            name: promtail
          name: volume-1638775609723

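Fetching and displaying logs in Grafana:

Once Grafana can reach Loki over the network, add a suitable dashboard template to monitor and display the logs.

Loki dashboard template: 13639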
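
Before a dashboard can show anything, Grafana needs Loki as a data source. This can be done in the UI, or with a data source provisioning file such as the sketch below. The URL reuses the Loki address from the promtail configuration above and is an assumption about how Grafana reaches Loki in your network; the data source name `Loki` is also just an example.

```yaml
# Grafana data source provisioning file, e.g. placed under
# /etc/grafana/provisioning/datasources/ (path depends on your installation).
apiVersion: 1

datasources:
  - name: Loki                          # example display name
    type: loki
    access: proxy                       # Grafana's backend talks to Loki
    url: http://192.168.96.52:3100      # assumed: same Loki address promtail pushes to
    isDefault: false
```

After the data source exists, the dashboard can be imported by its ID in Grafana's dashboard import dialog.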

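Afterword

  • If you run multiple k8s clusters and do not want to use too many SLBs, handle it flexibly according to your situation: for example, use iptables or port forwarding to forward a physical host port so that the service is reachable, then collect the logs through it.

  • To collect logs from hosts outside the cluster, such as nginx or other gateway servers, install promtail manually: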

    wget https://github.com/grafana/loki/releases/download/v2.2.1/promtail-linux-amd64.zip

    cat > /usr/lib/systemd/system/promtail.service <<EOF
    [Unit]
    Description=promtail sends log files to Loki
    Wants=network-online.target
    After=network-online.target

    [Service]
    ExecStart=/usr/local/promtail-linux-amd64/promtail-linux-amd64 -config.file=/usr/local/promtail-linux-amd64/config.yml

    [Install]
    WantedBy=multi-user.target

    EOF

    grafana:
    https://github.com/sunny0826/cms-grafana-builder
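
For the out-of-cluster case, the `config.yml` that the systemd unit above points to could look like the following sketch. It mirrors the in-cluster promtail configuration; the job name `nginx` and the log path `/var/log/nginx/*log` are assumptions about a typical nginx host, and the Loki address must be reachable from that host.

```yaml
server:
  http_listen_port: 9080
  grpc_listen_port: 0

positions:
  # File where promtail records how far it has read in each log file
  filename: /tmp/positions.yaml

clients:
  - url: http://192.168.96.52:3100/loki/api/v1/push   # Loki address as used in-cluster above

scrape_configs:
  - job_name: nginx                     # assumed job name
    static_configs:
      - targets:
          - localhost
        labels:
          job: nginx
          __path__: /var/log/nginx/*log # assumed nginx log location
```

Keeping the positions file under `/tmp` means it may be lost on reboot, in which case promtail re-reads the files from the start; a persistent path avoids that.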