We now have storage located in several geographic regions. Make sure your pods run on compute nodes in the same region as the storage they access to get the best speed!
Installing pip packages on any CephFS (shared) filesystem is strictly prohibited!
Please purge any data you don't need. This is not archival storage, and we can only store data that is actively used for computations.
Persistent data in Kubernetes comes in the form of Persistent Volumes (PV), which can only be seen by cluster admins. To request a PV, you have to create a PersistentVolumeClaim (PVC) of a supported StorageClass in your namespace, which will allocate storage for you.
Currently available storageClasses:
| StorageClass              | Filesystem Type | Region          | AccessModes   | Restrictions            | Storage Type                   | Size   |
|---------------------------|-----------------|-----------------|---------------|-------------------------|--------------------------------|--------|
| rook-cephfs               | CephFS          | US West         | ReadWriteMany |                         | Spinning drives with NVMe meta | 2.5 PB |
| rook-cephfs-east          | CephFS          | US East         | ReadWriteMany |                         | Mixed                          | 1 PB   |
| rook-cephfs-pacific       | CephFS          | Hawaii+Guam     | ReadWriteMany |                         | Spinning drives with NVMe meta | 384 TB |
| rook-cephfs-haosu         | CephFS          | US West (local) | ReadWriteMany | Hao Su and Ravi cluster | NVMe                           | 131 TB |
| rook-cephfs-suncave       | CephFS          | US West (local) | ReadWriteMany | UCSD Suncave data only  | SSD                            | 10 TB  |
| rook-ceph-block (default) | RBD             | US West         | ReadWriteOnce |                         | Spinning drives with NVMe meta | 2.5 PB |
| rook-ceph-block-east      | RBD             | US East         | ReadWriteOnce |                         | Mixed                          | 1 PB   |
| rook-ceph-block-pacific   | RBD             | Hawaii+Guam     | ReadWriteOnce |                         | Spinning drives with NVMe meta | 384 TB |
| seaweedfs-storage         | SeaweedFS       | US West         | ReadWriteMany |                         | NVMe                           | 300 TB |
Ceph shared filesystem (CephFS) is the primary way of storing data in Nautilus and allows mounting the same volume from multiple pods in parallel (ReadWriteMany). The same applies to the BeeGFS mounts accessed using NFS.
Ceph block storage allows an RBD (RADOS Block Device) to be attached to a single pod at a time (ReadWriteOnce). It provides the fastest access to the data and is preferred for smaller datasets (below 500 GB) and for any dataset that does not need shared access from multiple pods.
Creating and mounting the PVC
Use kubectl to create the PVC:
    apiVersion: v1
    kind: PersistentVolumeClaim
    metadata:
      name: examplevol
    spec:
      storageClassName: <required storage class>
      accessModes:
      - <access mode, e.g. ReadWriteOnce>
      resources:
        requests:
          storage: <volume size, e.g. 20Gi>
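As an illustration of the template above, a filled-in claim for a shared CephFS volume in US West might look like the following. This is only a sketch: the claim name and the size are hypothetical, while the storage class and access mode are taken from the table of available storage classes.

```yaml
# Sketch of a filled-in claim; name and size are illustrative assumptions.
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: shared-data               # hypothetical claim name
spec:
  storageClassName: rook-cephfs   # CephFS in US West, per the table
  accessModes:
  - ReadWriteMany                 # access mode listed for rook-cephfs
  resources:
    requests:
      storage: 100Gi              # illustrative size, request what you need
```

For a dataset used by a single pod, the same manifest with a block storage class such as rook-ceph-block and the ReadWriteOnce access mode would be the closer match to the guidance above.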
After you've created a PVC, you can check its status with kubectl get pvc pvc_name. Once its status is Bound, you can attach it to your pod (claimName should match the name you gave your PVC):
    apiVersion: v1
    kind: Pod
    metadata:
      name: vol-pod
    spec:
      containers:
      - name: vol-container
        image: ubuntu
        args: ["sleep", "36500000"]
        volumeMounts:
        - mountPath: /examplevol
          name: examplevol
      restartPolicy: Never
      volumes:
      - name: examplevol
        persistentVolumeClaim:
          claimName: examplevol
Using the right region for your pod
Latency significantly affects I/O performance. For optimal access speed to Ceph, add a region affinity to your pod that matches the region of your storage:
    spec:
      affinity:
        nodeAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
            nodeSelectorTerms:
            - matchExpressions:
              - key: topology.kubernetes.io/region
                operator: In
                values:
                - us-west
You can list each node's region label using:
kubectl get nodes -L topology.kubernetes.io/region
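Putting the two pieces together, a pod that mounts the examplevol claim and is pinned to the US West region could be sketched as follows. It assumes the claim was created with a US West storage class (for example rook-cephfs or rook-ceph-block); the pod name is hypothetical, and the region value should be checked against the output of the command above.

```yaml
# Sketch: the earlier example pod plus a region affinity.
apiVersion: v1
kind: Pod
metadata:
  name: vol-pod-west              # hypothetical pod name
spec:
  affinity:
    nodeAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
        nodeSelectorTerms:
        - matchExpressions:
          - key: topology.kubernetes.io/region
            operator: In
            values:
            - us-west             # should match the region of the storage class
  containers:
  - name: vol-container
    image: ubuntu
    args: ["sleep", "36500000"]
    volumeMounts:
    - mountPath: /examplevol
      name: examplevol
  restartPolicy: Never
  volumes:
  - name: examplevol
    persistentVolumeClaim:
      claimName: examplevol       # assumed to be bound to a US West volume
```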
All Ceph volumes created since December 2020 can be expanded by modifying the storage field of the PVC, either with kubectl edit pvc ... or with kubectl apply -f updated_pvc_definition.yaml.
For older volumes, all rook-ceph-block and most rook-cephfs-east volumes can be expanded. If yours does not expand, ask the cluster admins to expand it manually.
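As a sketch of what updated_pvc_definition.yaml might contain, the example below grows the examplevol claim. It assumes the claim was originally created with the rook-ceph-block class, ReadWriteOnce access, and a 20Gi request; those values are illustrative and should be replaced with whatever your claim actually uses.

```yaml
# Sketch of an updated claim: only the storage request changes.
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: examplevol
spec:
  storageClassName: rook-ceph-block   # assumed original class, left unchanged
  accessModes:
  - ReadWriteOnce                     # assumed original access mode, left unchanged
  resources:
    requests:
      storage: 50Gi                   # increased from the assumed original 20Gi
```

Keep every field other than storage identical to the existing claim, and only increase the size: Kubernetes does not support shrinking a PVC.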
Mounting pre-assigned folders (deprecated)
If you were assigned a CephFS FOLDER with a secret CEPH_KEY, you first need to create a secret in your NAMESPACE before you can use it:
kubectl create secret -n NAMESPACE generic ceph-fs-secret --from-literal=key=CEPH_KEY
Then use the secret in your pod volume (by default the folder name in path corresponds to your user name):
    volumes:
    - name: fs-store
      flexVolume:
        driver: ceph.rook.io/rook
        fsType: ceph
        options:
          clusterNamespace: rook
          fsName: nautilusfs
          path: /FOLDER
          mountUser: USER
          mountSecret: ceph-fs-secret
Also add a volumeMounts section (see above) to mount the volume into your pod.
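For reference, a complete pod using the pre-assigned folder could look roughly like this. It is a sketch only: the pod name, container name, and mount path are hypothetical, and FOLDER and USER are the same placeholders used above.

```yaml
# Sketch: pod mounting a pre-assigned CephFS folder via flexVolume (deprecated).
apiVersion: v1
kind: Pod
metadata:
  name: fs-store-pod              # hypothetical pod name
spec:
  containers:
  - name: fs-container            # hypothetical container name
    image: ubuntu
    args: ["sleep", "36500000"]
    volumeMounts:
    - mountPath: /fs-store        # hypothetical mount path inside the container
      name: fs-store              # must match the volume name below
  restartPolicy: Never
  volumes:
  - name: fs-store
    flexVolume:
      driver: ceph.rook.io/rook
      fsType: ceph
      options:
        clusterNamespace: rook
        fsName: nautilusfs
        path: /FOLDER             # placeholder: your assigned folder
        mountUser: USER           # placeholder: your Ceph user
        mountSecret: ceph-fs-secret
```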