Posix

We now have storage located in several geographic regions. Make sure your pods run on compute nodes in the same region as your storage to get optimal access speed!

Installing conda and pip packages on any CephFS (shared) filesystem is strictly prohibited!

Cleaning up

Please purge any data you don’t need. Nautilus is not archival storage: we can only store data that is actively used for computations.
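
For example, once a dataset is no longer needed, you can free the storage by deleting the PVC (a sketch; examplevol is a hypothetical claim name, and deleting a PVC destroys the data on it):

kubectl delete pvc examplevol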

Posix volumes

Persistent data in Kubernetes comes in the form of Persistent Volumes (PVs), which can only be seen by cluster admins. To request a PV, create a PersistentVolumeClaim (PVC) of a supported StorageClass in your namespace; this will allocate storage for you.
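
You can list the StorageClasses available in the cluster using kubectl:

kubectl get storageclass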

Currently available StorageClasses:

StorageClass              | Filesystem Type | Region          | AccessModes   | Restrictions            | Storage Type                   | Size
rook-cephfs               | CephFS          | US West         | ReadWriteMany |                         | Spinning drives with NVME meta | 2.5 PB
rook-cephfs-east          | CephFS          | US East         | ReadWriteMany |                         | Mixed                          | 1 PB
rook-cephfs-pacific       | CephFS          | Hawaii+Guam     | ReadWriteMany |                         | Spinning drives with NVME meta | 384 TB
rook-cephfs-haosu         | CephFS          | US West (local) | ReadWriteMany | Hao Su and Ravi cluster | NVME                           | 131 TB
rook-cephfs-suncave       | CephFS          | US West (local) | ReadWriteMany | UCSD Suncave data only  | SSD                            | 10 TB
beegfs                    | BeeGFS          | US West         | ReadWriteMany |                         |                                | 2 PB
rook-ceph-block (default) | RBD             | US West         | ReadWriteOnce |                         | Spinning drives with NVME meta | 2.5 PB
rook-ceph-block-east      | RBD             | US East         | ReadWriteOnce |                         | Mixed                          | 1 PB
rook-ceph-block-pacific   | RBD             | Hawaii+Guam     | ReadWriteOnce |                         | Spinning drives with NVME meta | 384 TB
seaweedfs-storage         | SeaweedFS       | US West         | ReadWriteMany |                         | NVME                           | 300 TB

Ceph shared filesystem (CephFS) is the primary way of storing data in Nautilus and allows mounting the same volume from multiple pods in parallel (ReadWriteMany). The same applies to BeeGFS mounts, which are accessed using NFS.

Ceph block storage allows RBD (RADOS Block Device) volumes to be attached to a single pod at a time (ReadWriteOnce). It provides the fastest access to data and is preferred for smaller datasets (below 500 GB) and for any dataset not needing shared access from multiple pods.

Creating and mounting the PVC

Define the PVC in a YAML file and create it with kubectl (e.g. kubectl create -f pvc.yaml):

apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: examplevol
spec:
  storageClassName: <required storage class>
  accessModes:
  - <access mode, e.g. ReadWriteOnce>
  resources:
    requests:
      storage: <volume size, e.g. 20Gi>
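
As a filled-in sketch (the storage class, access mode and size here are example choices only), save the definition as pvc.yaml and create it:

apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: examplevol
spec:
  storageClassName: rook-cephfs
  accessModes:
  - ReadWriteMany
  resources:
    requests:
      storage: 20Gi

kubectl create -f pvc.yaml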

After you’ve created the PVC, you can check its status (kubectl get pvc examplevol). Once its status is Bound, you can attach it to your pod (claimName should match the name you gave your PVC):

apiVersion: v1
kind: Pod
metadata:
  name: vol-pod
spec:
  containers:
  - name: vol-container
    image: ubuntu
    args: ["sleep", "36500000"]
    volumeMounts:
    - mountPath: /examplevol
      name: examplevol
  restartPolicy: Never
  volumes:
    - name: examplevol
      persistentVolumeClaim:
        claimName: examplevol
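
Once the pod is running, you can verify the volume is mounted (assuming the pod definition above was saved as pod.yaml):

kubectl create -f pod.yaml
kubectl exec -it vol-pod -- df -h /examplevol
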
Using the right region for your pod

Latency significantly affects I/O performance. For optimal access speed to Ceph, add a node affinity to your pod for the correct region (us-east or us-west):

spec:
  affinity:
    nodeAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
        nodeSelectorTerms:
        - matchExpressions:
          - key: topology.kubernetes.io/region
            operator: In
            values:
            - us-west

You can list each node’s region label using: kubectl get nodes -L topology.kubernetes.io/region
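
To check which region your pod was actually scheduled in, look up its node (a quick check for the example pod above):

kubectl get pod vol-pod -o wide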

Expanding volumes

All Ceph volumes created since December 2020 can be expanded by simply modifying the storage field of the PVC (either with kubectl edit pvc ..., or kubectl apply -f updated_pvc_definition.yaml).

For older volumes, all rook-ceph-block and most rook-cephfs, rook-cephfs-haosu and rook-cephfs-east volumes can be expanded. If yours is not expanding, you can ask the cluster admins to expand it manually.
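
As a sketch (assuming the examplevol PVC from above), you can grow a volume with a single patch; the new size must be larger than the current one:

kubectl patch pvc examplevol -p '{"spec":{"resources":{"requests":{"storage":"40Gi"}}}}'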

Mounting pre-assigned folders (deprecated)

If you were assigned a CephFS FOLDER with a secret CEPH_KEY, you first need to create a secret in your NAMESPACE to use it:

kubectl create secret -n NAMESPACE generic ceph-fs-secret --from-literal=key=CEPH_KEY

Then use the secret in your pod volume (by default the folder name in path corresponds to your user name):

 volumes:
 - name: fs-store
   flexVolume:
     driver: ceph.rook.io/rook
     fsType: ceph
     options:
       clusterNamespace: rook
       fsName: nautilusfs
       path: /FOLDER
       mountUser: USER
       mountSecret: ceph-fs-secret

Also add a volumeMounts section (see above) to mount the volume into your pod.
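
For example, a minimal volumeMounts entry for the container referencing this volume (the mount path here is an arbitrary choice):

    volumeMounts:
    - mountPath: /mnt/fs-store
      name: fs-store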
