1. Requirements Analysis

1.1 Current Situation

As a core business system, the iidp platform's storage architecture must satisfy the following key requirements:

  • Shared storage across business pods: a unified store for engine base JARs, application JARs (both business and built-in applications), and front-end resource packages (base and business zip packages)
  • Concurrent multi-pod access: multiple pods, such as the application market and the business engine, need to read and write the same directories simultaneously
  • Mixed file sizes
    • Small files (2 KB-10 KB): configuration files, metadata files, front-end dist files
    • Medium files (1-10 MB): front-end resource packages, lightweight applications
    • Large files (50-500 MB): business application packages, engine base packages, server packages
  • High-concurrency access: multiple pods read and write the same files while the application market uploads and publishes; the online IDE edits metamodel files in real time, with uploads, Git management, downloads, and sharing

Core problems with the existing NFS solution:

  1. Performance bottlenecks: high latency on small-file I/O, low throughput on large-file transfers
  2. Single point of failure: an NFS server outage takes down the entire platform
  3. Limited scalability: performance degrades sharply as the number of clients grows
  4. Operational complexity: tuning is difficult and effective monitoring is lacking

2. Technology Selection Analysis

2.1 Core Advantages of JuiceFS

| Dimension | JuiceFS | NFS | Value |
| --- | --- | --- | --- |
| Architecture | Separated metadata/data planes | Monolithic | Horizontal scalability |
| Performance | Local cache acceleration | Pure network transfer | 10x+ speedup |
| High availability | Multi-replica / automatic failover | Single point of failure | 99.95%+ availability |
| Kubernetes integration | Native CSI driver | No native integration | Dynamic provisioning |
| Storage backend | MinIO/S3 and others | Local disk only | Seamless reuse of existing infrastructure |
| Monitoring | Prometheus + dashboard | Basic monitoring | Deep observability |

2.2 Fit of Key Features

  1. Mount-point reuse

    ```mermaid
    graph TB
      subgraph Kubernetes cluster
        A[App pod 1] -->|shared| C[JuiceFS PVC]
        B[App pod 2] -->|shared| C
        D[App market pod] -->|shared| C
      end
      C --> E[Single mount pod]
      E --> F[Metadata: Redis]
      E --> G[Object storage: MinIO]
    ```

    With the apps and apps-frontend PVCs, only two mount points need to be maintained, regardless of how many pods attach to them.

  2. Tiered caching
    • Local SSD cache: accelerates access to hot data
    • Distributed cache: cache shared across nodes
    • Transparent cache synchronization: keeps multiple clients consistent
  3. Proven in production
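The two-PVC layout described above can be sketched as plain manifests; the PVC names follow the text, while the storage class name and sizes are illustrative. Every application pod that mounts a given PVC on a node reuses that PVC's mount pod.

```yaml
# Sketch: two shared RWX volumes; sizes and class name are examples.
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: apps
spec:
  accessModes: [ReadWriteMany]   # required for concurrent multi-pod access
  storageClassName: juicefs-sc
  resources:
    requests:
      storage: 200Gi
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: apps-frontend
spec:
  accessModes: [ReadWriteMany]
  storageClassName: juicefs-sc
  resources:
    requests:
      storage: 50Gi
```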

3. Performance Benchmarks

3.1 Test Environment

  • Hardware: company-provided server (8 vCPU / 16 GB RAM)
  • Network: Gigabit Ethernet (unconfirmed)
  • Compared configurations
    • NFSv4.1 (current solution)
    • MinIO direct access
    • JuiceFS with MinIO backend

3.2 Performance Comparison

Large-file throughput and latency:

| Metric | NFS | MinIO direct | JuiceFS | vs. NFS |
| --- | --- | --- | --- | --- |
| Write throughput | 52.23 MiB/s | 134.42 MiB/s | 825.47 MiB/s | 15.8x |
| Read throughput | 71.97 MiB/s | 108.87 MiB/s | 587.95 MiB/s | 8.2x |
| Per-file time | 1.91 s | 0.15 s | 0.12 s | 94% lower |
| Operation latency | 19.58 ms | 114.78 ms | 1.37 ms | 93% lower |

Small-file and metadata operations:

| Metric | NFS | MinIO direct | JuiceFS | Change |
| --- | --- | --- | --- | --- |
| Write IOPS | 102.1 | 112.74 | 152.6 | 49% higher |
| Read IOPS | 240.9 | 368.76 | 254.3 | 5.6% higher |
| Metadata ops | 8,468.5 ops | 2,298.5 ops | 4,567 ops | 2.0x (vs. MinIO direct) |
| Latency | 0.87 ms | - | 0.19 ms | 78% lower |

3.3 JuiceFS-Specific Advantages

  1. Metadata operation optimization
    • File attribute lookup: 0.19 ms/op (NFS: 0.87 ms)
    • List operations: 18,598.6/s (MinIO baseline)
  2. Cache efficiency

    ```mermaid
    pie title Cache hit breakdown
      "JuiceFS cache hits" : 75
      "Disk reads" : 15
      "Network reads" : 10
    ```

| Operation | MinIO baseline | JuiceFS | Change | Key mechanism |
| --- | --- | --- | --- | --- |
| Small-object write | 112.6 ops/s | 152.6 ops/s | +35.5% | Client-side write merging plus batched metadata commits reduce object-storage API calls |
| Small-object read | 368.0 ops/s | 254.3 ops/s | -30.9% | Local cache was not enabled in this test, so reads hit object storage directly and latency rose |
| Delete | 1,441.0 ops/s | 1,458.48 ops/s | +1.2% | The metadata engine (Redis) handles the transaction, decoupled from object storage |

4. Implementation Details

4.1 Kubernetes Integration

```yaml
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: juicefs-sc
provisioner: csi.juicefs.com
parameters:
  storage: minio
  bucket: "http://minio-service:9000/jfs-bucket"
  metaurl: "redis://redis-service:6379/8"
# mountOptions is a top-level StorageClass field, not a parameter
mountOptions:
  - cache-dir=/var/juicefs
  - cache-size=20480
  - max-uploads=50
  - writeback_cache
```

Note: in production, the backend settings (storage, bucket, metaurl, and credentials) normally live in a Secret referenced through the csi.storage.k8s.io/* parameters, as shown in the deployment section below.

4.2 Recommended Parameter Tuning

  1. Cache strategy

    --cache-dir=/mnt/ssd_cache  # put the cache on SSD
    --cache-size=20480          # 20 GiB cache
    --free-space-ratio=0.2      # keep 20% of the disk free
  2. I/O tuning

    -o max_uploads=100          # more concurrent uploads
    -o writeback_cache          # kernel-level write merging
    -o keep_pagecache           # retain the page cache
  3. Monitoring

    --metrics=localhost:9567    # Prometheus scrape endpoint
    --consul=192.168.1.100:8500 # service registration
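Under the CSI driver these flags are not passed on a command line; they are supplied through the StorageClass mountOptions. A sketch combining the same values (the cache path and sizes are examples to adjust per node):

```yaml
# mountOptions fragment for a JuiceFS StorageClass; values mirror the
# tuning suggestions above and are examples, not mandated settings.
mountOptions:
  - cache-dir=/mnt/ssd_cache   # SSD-backed cache directory
  - cache-size=20480           # 20 GiB local cache
  - free-space-ratio=0.2       # keep 20% of the cache disk free
  - max-uploads=100            # concurrent upload connections
  - writeback_cache            # kernel-level write merging (FUSE)
  - metrics=localhost:9567     # Prometheus scrape endpoint
```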

4.3 High-Availability Architecture

```mermaid
graph TD
  subgraph K8s cluster
    JFS[Mount pod] --> Redis[Redis Sentinel cluster]
    JFS --> MinIO[MinIO cluster]
  end
  subgraph Monitoring
    Prometheus -->|scrape| JFS
    Grafana --> Prometheus
    Alerts[Alert center] --> Grafana
  end
```

5. Migration and Rollout Path

5.1 Phased Migration Plan

| Phase | Goal | Window | Rollback |
| --- | --- | --- | --- |
| Parallel running | Dual-write new data | 1-2 weeks | Remove the JuiceFS route |
| Historical data migration | rsync incremental sync | Maintenance window | Scripted automatic rollback |
| Traffic cutover | DNS switch | 5 minutes | DNS switch back |
| Verification | Monitoring comparison | 48 hours | Alert-triggered automatic rollback |

5.2 Sample Migration Script

```shell
#!/bin/bash
# Incremental migration: keep the JuiceFS copy in sync with NFS.
# --delete mirrors deletions as well, so verify the paths before running.
while true; do
  rsync -avz --delete /nfs/apps/ /jfs/apps/
  rsync -avz --delete /nfs/frontend/ /jfs/frontend/
  sleep 300  # sync every 5 minutes
done
```

6. Risk Control

6.1 Risks and Mitigations

| Risk | Mitigation |
| --- | --- |
| Cache inconsistency | Enable fsync for strong synchronization |
| Metadata latency | Tune the Redis cluster / switch to MySQL |
| Mount-point failure | Automatic restart mechanism |
| Capacity exhaustion | Automatic scale-out policy |

6.2 Monitoring Thresholds

```yaml
monitors:
  - name: juicefs_cache_hit_ratio
    warn: "<0.6"
    crit: "<0.4"

  - name: juicefs_used_buffer_ratio
    warn: ">0.8"
    crit: ">0.9"

  - name: juicefs_fuse_ops
    crit: "latency_ms > 1000"
```
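The thresholds above can be wired into Prometheus as alerting rules. A sketch: the metric names below simply mirror the ones in this document and should be checked against what your JuiceFS version actually exports on its /metrics endpoint.

```yaml
# Prometheus rule sketch mirroring the thresholds above.
# Metric names are taken from this document, not verified against a live exporter.
groups:
  - name: juicefs
    rules:
      - alert: JuiceFSCacheHitRatioLow
        expr: juicefs_cache_hit_ratio < 0.6   # warn threshold
        for: 10m
        labels:
          severity: warning
      - alert: JuiceFSCacheHitRatioCritical
        expr: juicefs_cache_hit_ratio < 0.4   # crit threshold
        for: 10m
        labels:
          severity: critical
      - alert: JuiceFSBufferPressure
        expr: juicefs_used_buffer_ratio > 0.8
        for: 5m
        labels:
          severity: warning
```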

7. Conclusions and Recommendations

7.1 Feasibility Conclusions

  1. Performance
    • 8-15x faster large-file reads and writes
    • 30-50% higher small-file IOPS
    • Roughly 80% lower metadata-operation latency
  2. Architecture
    • Fits the Kubernetes multi-pod shared-storage scenario well
    • Reuses the existing MinIO storage infrastructure
    • Removes the single point of failure
  3. Operations
    • Visual monitoring dashboards for deep insight
    • Automated failure recovery
    • Transparent capacity expansion

7.2 Rollout Recommendations

  1. Phased deployment
    • Phase 1: migrate the application market module first
    • Phase 2: migrate the business engine
    • Phase 3: migrate front-end resource management
  2. Tuning priorities

    ```mermaid
    graph LR
      A[SSD cache disk] --> B[Write merging]
      C[In-memory cache] --> D[Concurrent uploads]
      E[Metadata cluster] --> F[Monitoring and alerting]
    ```
  3. Longer-term evolution
    • Automatic tiered storage (hot/warm/cold data)
    • Cross-region replication

8. Benchmark Reference Data

Single-node MinIO and Redis instances were set up with Docker on a company server. MinIO was benchmarked with warp, its official load-testing tool.
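The single-node backend can be reproduced with a small Compose file. This is a sketch of such a setup, not the exact one used: image tags, ports, credentials, and volume paths are all placeholders.

```yaml
# docker-compose.yaml sketch for a single-node MinIO + Redis test backend.
# All values here are illustrative.
services:
  minio:
    image: minio/minio
    command: server /data --console-address ":9001"
    environment:
      MINIO_ROOT_USER: minioadmin
      MINIO_ROOT_PASSWORD: minioadmin
    ports:
      - "9000:9000"   # S3 API
      - "9001:9001"   # web console
    volumes:
      - ./minio-data:/data
  redis:
    image: redis:7
    ports:
      - "6379:6379"
```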

warp MinIO benchmark results

Reqs: 26887, Errs:0, Objs:26887, Bytes: 157.50GiB
-  DELETE Average: 9 Obj/s; Current 5 Obj/s, 77.0 ms/req
-  GET Average: 40 Obj/s, 403.3MiB/s; Current 40 Obj/s, 395.4MiB/s, 194.0 ms/req, TTFB: 88.5ms
-  PUT Average: 13 Obj/s, 134.4MiB/s; Current 14 Obj/s, 136.2MiB/s, 773.3 ms/req
-  TAT Average: 27 Obj/s; Current 30 Obj/s, 46.3 ms/req


Report: DELETE. Concurrency: 20. Ran: 4m57s
* Average: 8.95 obj/s
* Reqs: Avg: 71.4ms, 50%: 63.3ms, 90%: 133.6ms, 99%: 234.0ms, Fastest: 2.6ms, Slowest: 382.1ms, StdDev: 43.9ms

Throughput, split into 297 x 1s:
* Fastest: 18.48 obj/s
* 50% Median: 8.82 obj/s
* Slowest: 2.06 obj/s

──────────────────────────────────

Report: GET. Concurrency: 20. Ran: 4m57s
* Average: 403.29 MiB/s, 40.33 obj/s
* Reqs: Avg: 189.2ms, 50%: 172.3ms, 90%: 321.1ms, 99%: 475.3ms, Fastest: 24.9ms, Slowest: 1211.2ms, StdDev: 94.7ms
* TTFB: Avg: 81ms, Best: 4ms, 25th: 36ms, Median: 63ms, 75th: 112ms, 90th: 167ms, 99th: 275ms, Worst: 610ms StdDev: 60ms

Throughput, split into 297 x 1s:
* Fastest: 686.4MiB/s, 68.64 obj/s
* 50% Median: 390.8MiB/s, 39.08 obj/s
* Slowest: 136.5MiB/s, 13.65 obj/s

──────────────────────────────────

Report: PUT. Concurrency: 20. Ran: 4m57s
* Average: 134.42 MiB/s, 13.44 obj/s
* Reqs: Avg: 785.0ms, 50%: 771.7ms, 90%: 1034.3ms, 99%: 1285.9ms, Fastest: 300.7ms, Slowest: 2193.2ms, StdDev: 176.9ms

Throughput, split into 297 x 1s:
* Fastest: 207.4MiB/s, 20.74 obj/s
* 50% Median: 136.5MiB/s, 13.65 obj/s
* Slowest: 49.9MiB/s, 4.99 obj/s

──────────────────────────────────

Report: STAT. Concurrency: 20. Ran: 4m57s
* Average: 26.88 obj/s
* Reqs: Avg: 46.6ms, 50%: 38.9ms, 90%: 86.6ms, 99%: 167.7ms, Fastest: 2.0ms, Slowest: 454.3ms, StdDev: 31.0ms

Throughput, split into 297 x 1s:
* Fastest: 49.00 obj/s
* 50% Median: 26.43 obj/s
* Slowest: 7.00 obj/s


──────────────────────────────────

Report: Total. Concurrency: 20. Ran: 4m57s
* Average: 537.72 MiB/s, 89.61 obj/s

Throughput, split into 297 x 1s:
* Fastest: 833.6MiB/s, 146.81 obj/s
* 50% Median: 511.6MiB/s, 89.18 obj/s
* Slowest: 236.0MiB/s, 35.61 obj/s


Cleanup
Cleanup Done

juicefs objbench results

Start Performance Testing ...
put small objects: 100/100 [==============================================================]  112.6/s   used: 887.91245ms
get small objects: 100/100 [==============================================================]  368.0/s   used: 271.76285ms
upload objects: 25/25 [==============================================================]  20.5/s    used: 1.218708009s
download objects: 25/25 [==============================================================]  27.2/s    used: 919.960997ms
list objects: 500/500 [==============================================================]  18598.6/s used: 27.061319ms
head objects: 125/125 [==============================================================]  2209.6/s  used: 56.73503ms
delete objects: 125/125 [==============================================================]  1441.0/s  used: 86.870648ms
Benchmark finished! block-size: 4.0 MiB, big-object-size: 100 MiB, small-object-size: 128 KiB, small-objects: 100, NumThreads: 4
+--------------------+--------------------+-----------------------+
|        ITEM        |        VALUE       |          COST         |
+--------------------+--------------------+-----------------------+
|     upload objects |        82.12 MiB/s |      194.84 ms/object |
|   download objects |       108.87 MiB/s |      146.97 ms/object |
|  put small objects |   112.74 objects/s |       35.48 ms/object |
|  get small objects |   368.76 objects/s |       10.85 ms/object |
|       list objects | 19537.05 objects/s | 25.59 ms/ 125 objects |
|       head objects |  2284.84 objects/s |        1.75 ms/object |
|     delete objects |  1458.48 objects/s |        2.74 ms/object |
| change permissions |        not support |           not support |
| change owner/group |        not support |           not support |
|       update mtime |        not support |           not support |
+--------------------+--------------------+-----------------------+

juicefs bench results on NFS

juicefs bench . --big-file-size 50M -p 2
  Write big blocks: 100/100 [====================================================]  51.7/s   used: 1.934549134s
  Read big blocks: 100/100 [====================================================]  71.9/s   used: 1.391573646s
  Write small blocks: 200/200 [====================================================]  102.1/s  used: 1.95950659s
  Read small blocks: 200/200 [====================================================]  240.2/s  used: 832.645761ms
  Stat small files: 200/200 [====================================================]  8434.9/s used: 23.787622ms
Benchmark finished!
BlockSize: 1.0 MiB, BigFileSize: 50 MiB, SmallFileSize: 128 KiB, SmallFileCount: 100, NumThreads: 2
+------------------+-----------------+---------------+
|       ITEM       |      VALUE      |      COST     |
+------------------+-----------------+---------------+
|   Write big file |     52.23 MiB/s |   1.91 s/file |
|    Read big file |     71.97 MiB/s |   1.39 s/file |
| Write small file |   102.1 files/s | 19.58 ms/file |
|  Read small file |   240.9 files/s |  8.30 ms/file |
|        Stat file | 8468.5 files/s |  0.24 ms/file |
+------------------+-----------------+---------------+

juicefs bench results on JuiceFS

juicefs bench . --big-file-size 50M -p 2
  Write big blocks: 100/100 [====================================================]  801.2/s  used: 124.823072ms
  Read big blocks: 100/100 [====================================================]  583.9/s   used: 171.326371ms
  Write small blocks: 200/200 [====================================================]  152.6/s  used: 1.311017584s
  Read small blocks: 200/200 [====================================================]  253.9/s  used: 787.66073ms
  Stat small files: 200/200 [====================================================]  2273.5/s used: 88.041224ms
Benchmark finished!
BlockSize: 1.0 MiB, BigFileSize: 50 MiB, SmallFileSize: 128 KiB, SmallFileCount: 100, NumThreads: 2
Time used: 6.2 s, CPU: 26.5%, Memory: 310.7 MiB
+------------------+-----------------+---------------+
|       ITEM       |      VALUE      |      COST     |
+------------------+-----------------+---------------+
|   Write big file |    825.47 MiB/s |   0.12 s/file |
|    Read big file |    587.95 MiB/s |   0.17 s/file |
| Write small file |   152.6 files/s | 13.10 ms/file |
|  Read small file |   254.3 files/s |  7.87 ms/file |
|        Stat file |  2298.5 files/s |  0.87 ms/file |
|   FUSE operation | 4567 operations |    1.05 ms/op |
|      Update meta |  608 operations |    3.66 ms/op |
|       Put object |  226 operations |  114.78 ms/op |
|       Get object |    0 operations |    0.00 ms/op |
|    Delete object |    0 operations |    0.00 ms/op |
| Write into cache |  226 operations |    1.37 ms/op |
|  Read from cache |  229 operations |    5.46 ms/op |
+------------------+-----------------+---------------+

9. Deployment Procedure

Below is a values.yaml template. It worked in my test environment only; parameters such as private image registries must be adjusted for the target environment.

# Default values for juicefs-csi
# This is a YAML-formatted file
# Declare variables to be passed into your templates

# Overrides the chart's computed name
# nameOverride: ""
# Overrides the chart's computed fullname
# fullnameOverride: ""

image:
  repository: dockerhub.kubekey.local/test/juicefs-csi-driver
  tag: "v0.28.3"
  pullPolicy: ""

dashboardImage:
  repository: dockerhub.kubekey.local/test/csi-dashboard
  tag: "v0.28.3"
  pullPolicy: ""

sidecars:
  livenessProbeImage:
    repository: dockerhub.kubekey.local/test/livenessprobe
    tag: "v2.12.0"
    pullPolicy: ""
  nodeDriverRegistrarImage:
    repository: dockerhub.kubekey.local/test/csi-node-driver-registrar
    tag: "v2.9.0"
    pullPolicy: ""
  csiProvisionerImage:
    repository: dockerhub.kubekey.local/test/csi-provisioner
    tag: "v2.2.2"
    pullPolicy: ""
  csiResizerImage:
    repository: dockerhub.kubekey.local/test/csi-resizer
    tag: "v1.9.0"
    pullPolicy: ""

imagePullSecrets: []

# The way JuiceFS Client runs. choose between:
# - mountpod: default, run JuiceFS Client in an independent pod
# - sidecar: run JuiceFS Client as a sidecar container in the same pod with application
# - process: run JuiceFS Client as a process in the JuiceFS CSI node service
# - serverless: a special "sidecar" mode that requires no privilege, creates no hostPath volumes, to allow full serverless deployment
# Ref: https://juicefs.com/docs/csi/introduction/
mountMode: mountpod

# The name of the JuiceFS CSI driver
driverName: "csi.juicefs.com"

# This file contains the configuration options for the JuiceFS CSI driver
# Ref: https://juicefs.com/docs/zh/csi/guide/configurations#configmap
globalConfig:
  # Set to false to disable global config
  enabled: true

  # Set to true to manage global config by Helm
  # If set to false:
  #    1. the global config will only be applied in the first installation, and will not be updated or deleted by Helm
  #    2. if you want to update it, you need to edit the configmap directly, or use csi-dashboard
  manageByHelm: true

  # Set to true to schedule mount pod to node with via nodeSelector, rather than nodeName
  enableNodeSelector: false
  
  # The mountPodPatch section defines the mount pod spec
  # Each item will be recursively merged into PVC settings according to its pvcSelector
  # If pvcSelector isn't set, the patch will be applied to all PVCs
  # Variable templates are supported, e.g.  ${MOUNT_POINT}, ${SUB_PATH}, ${VOLUME_ID}
  mountPodPatch:

    # Example configurations:
    # - pvcSelector:
    #     matchLabels:
    #       disable-host-network: "true"
    #   hostNetwork: false
  
    # - pvcSelector:
    #     matchLabels:
    #       apply-labels: "true"
    #   labels:
    #     custom-labels: "asasasa"
  
    # - pvcSelector:
    #     matchLabels:
    #       custom-resources: "true"
    #   resources:
    #     requests:
    #       cpu: 100m
    #       memory: 512Mi

    # - pvcSelector:
    #     matchLabels:
    #       custom-image: "true"
    #   eeMountImage: "juicedata/mount:ee-5.0.17-0c63dc5"
    #   ceMountImage: "juicedata/mount:ce-v1.2.0"
  
    # - pvcSelector:
    #     matchLabels:
    #       custom-liveness: "true"
    #   livenessProbe:
    #     exec:
    #       command:
    #       - stat
    #       - ${MOUNT_POINT}/${SUB_PATH}
    #     failureThreshold: 3
    #     initialDelaySeconds: 10
    #     periodSeconds: 5
    #     successThreshold: 1

# For some environment without DNS server and want to use /etc/hosts instead
# - ip: "127.0.0.1"
#   hostnames:
#   - "s3.juicefs.local"
#   - "redis.juicefs.local"
hostAliases: []

# The kubelet working directory, can be set using --root-dir when starting kubelet
kubeletDir: /var/lib/kubelet

# JuiceFS mount directory
jfsMountDir: /var/lib/juicefs/volume
# JuiceFS config directory
jfsConfigDir: /var/lib/juicefs/config

# Specifies whether JuiceFS is being deployed in an immutable Kubernetes environment.
# Immutable environments, such as Talos Linux, have read-only paths in the host filesystem.
immutable: false

dnsPolicy: ClusterFirstWithHostNet
dnsConfig: {}
# Example config which uses the AWS nameservers
# dnsPolicy: "None"
# dnsConfig:
#   nameservers:
#     - 169.254.169.253

serviceAccount:
  controller:
    # Specifies whether a service account of controller should be created
    create: true
    # Annotations to add to the service account
    annotations: {}
    # The name of the service account to use
    name: "juicefs-csi-controller-sa"
  node:
    # Specifies whether a service account of node service should be created
    create: true
    # Annotations to add to the service account
    annotations: {}
    # The name of the service account to use
    name: "juicefs-csi-node-sa"
  dashboard:
    # Specifies whether a service account of dashboard should be created
    create: true
    # Annotations to add to the service account
    annotations: {}
    # The name of the service account to use
    name: "juicefs-csi-dashboard-sa"

controller:
  enabled: true
  # Enable verbose logging
  debug: false
  leaderElection:
    # Enable leader election for controller, ref: https://juicefs.com/docs/csi/administration/going-production#leader-election
    enabled: true
    # The namespace where the leader election resource lives. Defaults to the pod namespace if not set
    leaderElectionNamespace: ""
    # The duration that non-leader candidates will wait to force acquire leadership. This is measured against time of last observed ack
    # Defaults to 15s, if not set
    leaseDuration: ""
    # The duration that the acting control-plane will retry refreshing leadership before giving up
  # Enable provisioner of controller service, must be set to true when pathPattern is used
  # Ref: https://juicefs.com/docs/csi/guide/pv/#using-path-pattern
  provisioner: true
  # Cache client auth config file in user's secret, only applicable to JuiceFS EE
  cacheClientConf: true
  replicas: 2
  resources:
    limits:
      cpu: 1000m
      memory: 1Gi
    requests:
      cpu: 100m
      memory: 512Mi
  # Grace period to allow the CSI Controller pod to shutdown before it is killed
  terminationGracePeriodSeconds: 30
  labels: {}
  annotations: {}
  metricsPort: "9567"
  # Affinity for CSI Controller pod
  affinity: {}
  # Node selector for CSI Controller pod
  nodeSelector: {}
  # Tolerations for CSI Controller pod
  tolerations:
  - key: CriticalAddonsOnly
    operator: Exists
  # CSI Controller service
  service:
    port: 9909
    type: ClusterIP
  # PriorityClass name for CSI Controller pod
  priorityClassName: system-cluster-critical
  # -- Extra envs of CSI Controller
  # Example:
  #  - name: ENABLE_APISERVER_LIST_CACHE
  #    value: "true"
  envs: []

node:
  # CSI Node Service will be deployed in every node
  enabled: true
  # Enable verbose logging
  debug: false
  hostNetwork: false
  # Set to true to run node-driver-registrar and liveness-probe sidecar in privileged mode (e.g. for SELinux systems)
  sidecarPrivileged: false
  # Enable transparent hugepage tuning
  # Set to true to configure transparent hugepage defrag to 'defer' mode
  tuneTransparentHugePage: false
  resources:
    limits:
      cpu: 1000m
      memory: 1Gi
    requests:
      cpu: 100m
      memory: 512Mi
  # When set true, enable application pods using same sc share the same mount pod
  storageClassShareMount: false
  # When set true, disable mount pods preempt application pods when in resource pressure
  mountPodNonPreempting: false
  # Grace period to allow the CSI Node Service pods to shutdown before it is killed
  terminationGracePeriodSeconds: 30
  labels: {}
  annotations: {}
  metricsPort: "9567"
  # Affinity for CSI Node Service pods
  affinity: {}
  # Node selector for CSI Node Service pods, ref: https://juicefs.com/docs/csi/guide/resource-optimization#csi-node-node-selector
  nodeSelector: {}
  # Tolerations for CSI Node Service pods
  tolerations:
  - key: CriticalAddonsOnly
    operator: Exists
  # PriorityClass name for CSI Node Service pods
  priorityClassName: system-node-critical
  # -- Extra envs of CSI Node
  # Example:
  #  - name: ENABLE_APISERVER_LIST_CACHE
  #    value: "true"
  envs: []
  updateStrategy:
    rollingUpdate:
      maxUnavailable: 50%
  ifPollingKubelet: true
  livenessProbe:
    failureThreshold: 5
    httpGet:
      path: /healthz
      port: 9909  # numeric value only
    initialDelaySeconds: 10
    periodSeconds: 10
    timeoutSeconds: 3

# Expose CSI Driver metrics
metrics:
  enabled: false
  port: 8080
  service:
    annotations: {}
    # prometheus.io/scrape: "true"
    # prometheus.io/port: "8080"
    servicePort: 8080

dashboard:
  # CSI Dashboard helps with CSI Driver observation, enabled by default
  enabled: true

  # Enable manager for dashboard
  # If enabled, the dashboard will watch and cache k8s resources in the dashboard, which is used to achieve better performance and more features.
  # If disabled, directly obtain resources from the k8s API server when the user accesses the dashboard, which will reduce the pressure on the API server under large-scale clusters.
  enableManager: true

  # Basic auth for dashboard
  auth:
    enabled: false
    # Set existingSecret to indicate whether to use an existing secret. If it is empty, a corresponding secret will be created according to the plain text configuration.
    existingSecret: ""
    username: admin
    password: admin

  replicas: 1
  leaderElection:
    # Enable leader election for dashboard.
    enabled: false
    # The namespace where the leader election resource lives. Defaults to the pod namespace if not set
    leaderElectionNamespace: ""
    # The duration that non-leader candidates will wait to force acquire leadership. This is measured against time of last observed ack
    # Defaults to 15s, if not set
    leaseDuration: ""
    # The duration that the acting control-plane will retry refreshing leadership before giving up
  hostNetwork: false
  resources:
    limits:
      cpu: 1000m
      memory: 1Gi
    requests:
      cpu: 100m
      memory: 200Mi
  labels: {}
  annotations: {}
  affinity: {}
  nodeSelector: {}
  tolerations:
  - key: CriticalAddonsOnly
    operator: Exists
  service:
    port: 8088
    type: ClusterIP
  ingress:
    enabled: false
    className: "nginx"
    annotations: {}
    # kubernetes.io/ingress.class: nginx
    # kubernetes.io/tls-acme: "true"
    hosts:
    - host: ""
      paths:
      - path: /
        pathType: ImplementationSpecific
    tls: []
    #  - secretName: chart-example-tls
    #    hosts:
    #      - chart-example.local
  priorityClassName: system-node-critical
  envs: []

  updateStrategy:
    type: RollingUpdate
    rollingUpdate:
      maxUnavailable: 1
      maxSurge: 25%
      
  # Configure the pod level securityContext.
  podSecurityContext: {}

  # Configure SecurityContext for Pod.
  # Ensure that required linux capability to bind port number below 1024 is assigned (`CAP_NET_BIND_SERVICE`).
  securityContext:
    allowPrivilegeEscalation: false
    capabilities:
      drop:
        - ALL
    readOnlyRootFilesystem: true
  
  hostAliases: []
  # - ip: "127.0.0.1"
  #   hostnames:
  #   - "foo.local"
  #   - "bar.local"

# Override mount image, ref: https://juicefs.com/docs/csi/guide/custom-image/
defaultMountImage:
  ce: "dockerhub.kubekey.local/test/mount:ce-v1.3.0"
  ee: ""

webhook:
  # Setup the webhook using cert-manager
  certManager:
    enabled: true
  # Helm will auto-generate these fields
  caBundlePEM: |

  crtPEM: |

  keyPEM: |

  # It is recommended that admission webhooks should evaluate as quickly as possible (typically in milliseconds),
  # since they add to API request latency. It is encouraged to use a small timeout for webhooks
  # https://kubernetes.io/docs/reference/access-authn-authz/extensible-admission-controllers/#timeouts
  timeoutSeconds: 5
  # FailurePolicy defines how unrecognized errors and timeout errors from the admission webhook are handled
  # https://kubernetes.io/docs/reference/access-authn-authz/extensible-admission-controllers/#failure-policy
  FailurePolicy: Fail

validatingWebhook:
  enabled: false
  # It is recommended that admission webhooks should evaluate as quickly as possible (typically in milliseconds),
  # since they add to API request latency. It is encouraged to use a small timeout for webhooks
  # https://kubernetes.io/docs/reference/access-authn-authz/extensible-admission-controllers/#timeouts
  timeoutSeconds: 5
  # FailurePolicy defines how unrecognized errors and timeout errors from the admission webhook are handled
  # https://kubernetes.io/docs/reference/access-authn-authz/extensible-admission-controllers/#failure-policy
  FailurePolicy: Ignore

# For production environment, manually create & manage storageClass outside Helm is recommended, ref: https://juicefs.com/docs/csi/guide/pv#create-storage-class
storageClasses:
- name: "juicefs-sc"
  # Set to true to actually create this StorageClass
  enabled: false
  # Set existingSecret to indicate whether to use an existing secret. If it is empty, a corresponding secret will be created according to the plain text configuration.
  existingSecret: ""
  # Either Retain or Delete, ref: https://juicefs.com/docs/csi/guide/resource-optimization#reclaim-policy
  reclaimPolicy: Retain
  # Set to true to allow PVC expansion
  allowVolumeExpansion: true
  # Additional annotations for this StorageClass, e.g. make it default
  # annotations:
  #   storageclass.kubernetes.io/is-default-class: "true"

  backend:
    # The JuiceFS file system name
    name: ""
    # Connection URL for metadata engine (e.g. Redis), for community edition use only, ref: https://juicefs.com/docs/community/databases_for_metadata
    metaurl: ""
    # Object storage type, such as s3, gs, oss, for community edition use only, ref: https://juicefs.com/docs/community/how_to_setup_object_storage
    storage: ""
    # Bucket URL, for community edition use only, ref: https://juicefs.com/docs/community/how_to_setup_object_storage
    bucket: ""
    # Token for JuiceFS Enterprise Edition token, ref: https://juicefs.com/docs/cloud/acl
    token: ""
    # Access key for object storage
    accessKey: ""
    # Secret key for object storage
    secretKey: ""
    # Environment variables for the JuiceFS Client
    # Example: {"a": "b"}
    # Ref: https://juicefs.com/docs/csi/guide/pv#volume-credentials
    envs: ""
    # Extra files for the mount pod, ref: https://juicefs.com/docs/csi/guide/pv/#mount-pod-extra-files
    configs: ""
    # The number of days which files are kept in the trash, for community edition use only, ref: https://juicefs.com/docs/community/security/trash
    trashDays: ""
    # Options passed to the "juicefs format" or "juicefs auth" command, depending on which edition you're using
    # Example: block-size=4096,capacity=10
    # Ref: https://juicefs.com/docs/community/command_reference#format and https://juicefs.com/docs/cloud/reference/commands_reference#auth
    formatOptions: ""

  # Options for the "juicefs mount" command
  # Example:
  # - debug
  # - cache-size=2048
  # - cache-dir=/var/foo
  # Ref: https://juicefs.com/docs/community/command_reference#mount and https://juicefs.com/docs/cloud/reference/commands_reference#mount
  mountOptions:

  # Customize PV directory format, ref: https://juicefs.com/docs/csi/guide/pv#using-path-pattern
  # If enabled, controller.provisioner must be set to true
  # Example: "${.PVC.namespace}-${.PVC.name}"
  pathPattern: ""

  # Using PVC as JuiceFS cache path, ref: https://juicefs.com/docs/csi/guide/cache#use-pvc-as-cache-path
  cachePVC: ""

  mountPod:
    # Mount pod resource requests & limits
    resources:
      limits:
        cpu: 5000m
        memory: 5Gi
      requests:
        cpu: 1000m
        memory: 1Gi
    # Override mount pod image, ref: https://juicefs.com/docs/csi/guide/custom-image
    image: ""
    # Set annotations for the mount pod
    annotations: {}

1. Online Deployment

If the cluster can reach GitHub over the public internet, follow the official documentation and deploy online:

```shell
helm repo add juicefs https://juicedata.github.io/charts/
helm repo update

# Works for both the initial install and later configuration changes
helm upgrade --install juicefs-csi-driver juicefs/juicefs-csi-driver -n kube-system -f ./values.yaml
```

2. Offline Deployment

First download the official chart, then install from the local directory:

```shell
cd charts\juicefs-csi-driver

helm install juicefs-csi-driver ./ -n kube-system

# Check the deployment
kubectl get pod -n kube-system

# Expect pods like:
# juicefs-csi-controller
# juicefs-csi-dashboard
# juicefs-csi-node
```

3. Deploy the StorageClass

The manifest below is only a reference template; replace the parameters as needed.

```yaml
apiVersion: v1
kind: Secret
metadata:
  name: juicefs-secret
  namespace: default
stringData:
  name: "my-juicefs"
  metaurl: "redis://192.168.168.176:6379/5"
  storage: "minio"
  bucket: "http://192.168.184.122:9000/122"
  access-key: "snest"
  secret-key: "snest123"

---
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: juicefs-sc
provisioner: csi.juicefs.com
parameters:
  csi.storage.k8s.io/provisioner-secret-name: juicefs-secret
  csi.storage.k8s.io/provisioner-secret-namespace: default
  csi.storage.k8s.io/node-publish-secret-name: juicefs-secret
  csi.storage.k8s.io/node-publish-secret-namespace: default
  juicefs/mount-image: dockerhub.kubekey.local/test/mount:ce-v1.3.0
mountOptions:
  - writeback # asynchronous writes: better performance, but risks data loss; use with care
  - max-uploads=50
  - buffer-size=1024
reclaimPolicy: Retain
```

```shell
kubectl apply -f .\juicefs-secret.yaml

kubectl get sc
# Output:
# juicefs-sc        csi.juicefs.com
```

With the StorageClass in place, you can create the PVCs that the business workloads need.
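Such a PVC is only a few lines. A sketch (the name, namespace, and requested size are placeholders):

```yaml
# Example PVC bound to the juicefs-sc StorageClass above.
# Name and size are placeholders; adjust per workload.
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: apps-pvc
  namespace: default
spec:
  accessModes:
    - ReadWriteMany   # shared read-write across multiple pods
  storageClassName: juicefs-sc
  resources:
    requests:
      storage: 100Gi
```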