Mountpoint for Amazon S3 is an open-source client for mounting S3 buckets as local file systems. You can use Mountpoint for Amazon S3 to mount Object Storage buckets to your local machines, and to resources in Nebius AI Cloud (Compute virtual machines, Kubernetes clusters) and other cloud providers.

Use cases and limitations

Mountpoint for Amazon S3 is optimized for specific use cases, such as machine learning training, that often involve reading large datasets with high throughput. For example, it is a good fit for data lake applications that read large objects without using other file system features like locking or POSIX permissions, or that write objects sequentially from a single node. Mountpoint for Amazon S3 achieves this by parallelizing requests, both within a single file and across multiple files. Tests in Nebius AI Cloud show that Mountpoint for Amazon S3 achieves performance close to that of clients that natively implement S3 APIs.
Mountpoint for Amazon S3 doesn’t implement all features of a POSIX file system. This means certain file operations, such as file locks, directory renaming, symlinks, hardlinks, and full control over file modes, owners and groups, are not fully supported or may behave differently than expected in a traditional file system. If your applications require these features or collaborative editing across multiple instances and users, use Compute shared filesystems.
See Mountpoint file system behavior for a detailed description of Mountpoint for Amazon S3’s behavior and POSIX support, and how they may affect your application. To troubleshoot file operations that Mountpoint for Amazon S3 may not support, see the troubleshooting documentation.

Installing and mounting buckets

Prerequisites

  1. Create a service account and add it to a group that grants the required level of access, for example, the default viewers or editors group.
  2. Create an access key pair for the service account and save the key ID and the secret key.
The following commands create a service account, add it to the default viewers group and create an access key for it. Specify your tenant and project IDs in the commands:
TENANT_ID=<tenant-...>
PROJECT_ID=<project-...>

export SA_ID=$(nebius iam service-account create \
  --parent-id "$PROJECT_ID" \
  --name s3-mountpoint \
  --format jsonpath='{.metadata.id}')

export VIEWERS_ID=$(nebius iam group get-by-name \
  --parent-id "${TENANT_ID}" \
  --name 'viewers' \
  --format json \
  | jq -r '.metadata.id')
nebius iam group-membership create \
  --parent-id "$VIEWERS_ID" \
  --member-id "$SA_ID"

export ACCESS_KEY_ID=$(nebius iam v2 access-key create \
  --parent-id "$PROJECT_ID" \
  --name "s3-mountpoint" \
  --account-service-account-id "$SA_ID" \
  --description "Amazon S3 Mountpoint CSI Driver" \
  --format json \
  | jq -r '.metadata.id')
export AWS_ACCESS_KEY_ID=$(nebius iam v2 access-key get \
  --id "${ACCESS_KEY_ID}" \
  --format json \
  | jq -r '.status.aws_access_key_id')
export AWS_SECRET_ACCESS_KEY=$(nebius iam v2 access-key get \
  --id "${ACCESS_KEY_ID}" \
  --format json \
  | jq -r '.status.secret')
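To confirm that the key pair works, you can list the project’s buckets with the AWS CLI. This is a sketch, not part of the required setup: it assumes the AWS CLI is installed and uses eu-north1 as an example region; substitute your project’s region.

```shell
# Assumption: eu-north1; substitute your project's region.
REGION=eu-north1
ENDPOINT="https://storage.${REGION}.nebius.cloud:443"

# Uses the AWS_ACCESS_KEY_ID / AWS_SECRET_ACCESS_KEY exported above;
# guarded so the snippet is a no-op if the AWS CLI is not installed.
if command -v aws >/dev/null 2>&1; then
  aws s3 ls --endpoint-url "$ENDPOINT" --region "$REGION"
fi
```

If the command lists your buckets (or returns an empty list without an error), the service account and key pair are set up correctly.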

Local and virtual machines

Mountpoint for Amazon S3 is only available for Linux operating systems.
To mount an Object Storage bucket on your local machine or a Compute VM:
  1. Install Mountpoint for Amazon S3:
    wget https://s3.amazonaws.com/mountpoint-s3-release/latest/x86_64/mount-s3.rpm
    sudo yum install -y ./mount-s3.rpm
    
    For more installation instructions and details, see Getting started and Installing Mountpoint for Amazon S3 in the Mountpoint for Amazon S3 documentation.
  2. Create a credentials file, ~/.aws/credentials, with your access key pair:
    mkdir -p ~/.aws
    cat <<EOF > ~/.aws/credentials
    [default]
    aws_access_key_id=<key_ID>
    aws_secret_access_key=<secret_key>
    EOF
    
    This keeps the credentials persistent across shell sessions. If the AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY environment variables are set, they override ~/.aws/credentials. For more details, see AWS credentials in the Mountpoint for Amazon S3 documentation.
  3. Mount the bucket:
    mount-s3 <bucket_name> <directory> \
      --region <region> \
      --endpoint-url https://storage.<region>.nebius.cloud:443 \
      --maximum-throughput-gbps 10000 --max-threads 64
    
    Replace the following values:
    • <bucket_name>: Name of your bucket.
    • <directory>: Path to the directory on your machine where the bucket should be mounted.
    • --region: Nebius AI Cloud region where the parent project of your bucket is located. To get the region of a project, go to the web console and then expand the top-left list of tenants; the region, e.g. eu-north1, is displayed next to the project’s name.
    • --endpoint-url: Object Storage endpoint in the region. All the endpoints have the https://storage.<region>.nebius.cloud:443 format. For example, for buckets in the eu-north1 region, the endpoint is https://storage.eu-north1.nebius.cloud:443.
    --maximum-throughput-gbps 10000 and --max-threads 64 are recommended performance settings. For more details, see Performance. You can add more parameters to the command, for example, --foreground to run Mountpoint for Amazon S3 in the foreground instead of the background. For more details, see Configuring Mountpoint for Amazon S3.
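After mounting, you can sanity-check the mount point and unmount it when you are done. The bucket name and directory below are hypothetical; the snippet is guarded so it does nothing if mount-s3 is not installed.

```shell
MOUNT_DIR=./my-bucket   # hypothetical local directory
mkdir -p "$MOUNT_DIR"

if command -v mount-s3 >/dev/null 2>&1; then
  # Hypothetical bucket "my-bucket" in the eu-north1 region.
  mount-s3 my-bucket "$MOUNT_DIR" \
    --region eu-north1 \
    --endpoint-url https://storage.eu-north1.nebius.cloud:443 \
    --maximum-throughput-gbps 10000 --max-threads 64

  ls "$MOUNT_DIR"      # bucket objects appear as files
  umount "$MOUNT_DIR"  # Mountpoint exits once the file system is released
fi
```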

Kubernetes clusters

Before making any changes to a production environment, contact support or your personal manager.
  1. Install kubectl and configure it to work with your cluster. For Managed Kubernetes clusters in Nebius AI Cloud, see How to connect to Managed Service for Kubernetes® clusters using kubectl.
  2. Create a Kubernetes Secret with your access key pair:
    kubectl create secret generic aws-secret \
      --namespace kube-system \
      --from-literal "key_id=<key_id>" \
      --from-literal "access_key=<secret_key>"
    
  3. Install Helm.
  4. Install the Mountpoint for Amazon S3 CSI Driver:
    helm repo add aws-mountpoint-s3-csi-driver https://awslabs.github.io/mountpoint-s3-csi-driver
    helm repo update
    helm upgrade --install aws-mountpoint-s3-csi-driver \
      --namespace kube-system \
      aws-mountpoint-s3-csi-driver/aws-mountpoint-s3-csi-driver
    
    For more installation details and instructions, see Installing Mountpoint for Amazon S3 CSI Driver.
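To check that the driver is running, you can list its pods. The label below is an assumption based on the driver’s Helm chart defaults; adjust it if your installation uses different labels.

```shell
# The driver runs as a DaemonSet in kube-system; every node should show
# an s3-csi-node pod in the Running state (label is an assumption).
CSI_NAMESPACE=kube-system
if command -v kubectl >/dev/null 2>&1; then
  kubectl get pods -n "$CSI_NAMESPACE" -l app=s3-csi-node
fi
```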
  5. Create a PersistentVolume (PV):
    REGION=<region>
    kubectl apply -f - <<EOF
    apiVersion: v1
    kind: PersistentVolume
    metadata:
      name: s3-mountpoint
    spec:
      capacity:
        storage: 1Ti                     # Required by Kubernetes but effectively ignored
      accessModes:
        - ReadOnlyMany                   # Supported modes: ReadOnlyMany, ReadWriteMany
      storageClassName: ""               # Empty string required for static provisioning
      mountOptions:
        - endpoint-url https://storage.$REGION.nebius.cloud:443
        - region $REGION
        - maximum-throughput-gbps 10000  # raises max throughput as default limit is 1.25 GB/s (10 Gbps)
        - max-threads 64                 # raises max-threads as default is 16
        - allow-other                    # without this, only root can read
      csi:
        driver: s3.csi.aws.com           # Required
        volumeHandle: <volume_handle>    # Must be unique per volume
        volumeAttributes:
          bucketName: <bucket_name>
    EOF
    
    Replace the following values:
    • REGION: Nebius AI Cloud region where the parent project of your bucket is located. To get the region of a project, go to the web console and then expand the top-left list of tenants; the region, e.g. eu-north1, is displayed next to the project’s name.
    • .spec.accessModes: List of access modes supported by the PV. Supported modes are ReadOnlyMany (multiple nodes can mount the PV as read-only) and ReadWriteMany (multiple nodes can mount the PV as read-write).
      If your application requires the ReadWriteMany mode, contact support before creating the PV.
    • .spec.csi.volumeHandle: String that identifies the PV. The volume handle must be unique for each volume in your cluster.
    • .spec.csi.volumeAttributes.bucketName: Name of your bucket.
    .spec.mountOptions.maximum-throughput-gbps and .spec.mountOptions.max-threads are recommended performance settings. For more details, see Performance.
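You can check the PV after creating it. A statically provisioned PV reports the status Available until a PVC claims it:

```shell
# A freshly created PV should report STATUS "Available"; it becomes
# "Bound" once the PVC created in the next step claims it.
PV_NAME=s3-mountpoint
if command -v kubectl >/dev/null 2>&1; then
  kubectl get pv "$PV_NAME"
fi
```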
  6. Create a PersistentVolumeClaim (PVC):
    kubectl apply -f - <<EOF
    apiVersion: v1
    kind: PersistentVolumeClaim
    metadata:
      name: s3-mountpoint
      namespace: default             # Must be in the same namespace as the Pod that uses the PVC
    spec:
      accessModes:
        - ReadOnlyMany     # Must match the access modes of your PV
      storageClassName: "" # Empty string required for static provisioning
      resources:
        requests:
          storage: 1Ti          # Required by Kubernetes but effectively ignored
      volumeName: s3-mountpoint # Must match the name of your PV
    EOF
    
    .spec.accessModes and .spec.volumeName must match the access modes and the name of your PV, respectively.
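To confirm that the PVC bound to the PV, check its status. This assumes the PVC lives in your current kubectl namespace; add -n <namespace> otherwise.

```shell
# STATUS should be "Bound" and VOLUME should show your PV's name.
PVC_NAME=s3-mountpoint
if command -v kubectl >/dev/null 2>&1; then
  kubectl get pvc "$PVC_NAME"
fi
```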
  7. Use the PVC to mount the PV to your Pod:
    kubectl apply -f - <<EOF
    apiVersion: v1
    kind: Pod
    metadata:
      name: storage-test-app
    spec:
      volumes:
        - name: bucket
          persistentVolumeClaim:
            claimName: s3-mountpoint # Must match the name of your PVC
      containers:
        - name: app
          image: centos
          command: ["/bin/sh"]
          args: ["-c", "while true; do ls /data; sleep 5; done"] # Reads only, since the PV above uses ReadOnlyMany
          volumeMounts:
            - name: bucket
              mountPath: /data
    EOF
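    Once the Pod starts, you can verify that the bucket is mounted by listing its contents from inside the container:

```shell
POD_NAME=storage-test-app
if command -v kubectl >/dev/null 2>&1; then
  # Wait for the Pod to start, then list the mounted bucket contents.
  kubectl wait --for=condition=Ready "pod/$POD_NAME" --timeout=120s
  kubectl exec "$POD_NAME" -- ls /data
fi
```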
    

Performance

To optimize performance when mounting buckets, you can configure the following parameters:

maximum-throughput-gbps

The --maximum-throughput-gbps parameter sets the maximum throughput limit; the default is 1.25 GB/s (10 Gbps). For example, to raise it to 10000 Gbps:
mount-s3 bucket ./bucket --maximum-throughput-gbps 10000
In Kubernetes, specify this in the mountOptions of your PersistentVolume:
mountOptions:
  - maximum-throughput-gbps 10000

max-threads

The --max-threads parameter sets the maximum number of threads; the default is 16. For example, to raise it to 64:
mount-s3 bucket ./bucket --max-threads 64
In Kubernetes, specify this in the mountOptions of your PersistentVolume:
mountOptions:
  - max-threads 64

metadata-ttl

The --metadata-ttl parameter controls how long metadata is cached. Consider setting it to 120 seconds for better performance. For example:
mount-s3 bucket ./bucket --metadata-ttl 120
For more information, see the official documentation.

UNSTABLE_MOUNTPOINT_MAX_PREFETCH_WINDOW_SIZE

If you need to maximize the throughput within a single file, you can set the UNSTABLE_MOUNTPOINT_MAX_PREFETCH_WINDOW_SIZE environment variable. For example:
UNSTABLE_MOUNTPOINT_MAX_PREFETCH_WINDOW_SIZE=8589934592 mount-s3 bucket ./bucket
This feature is unstable and not recommended for default use, as it can lead to uncontrollable memory consumption.
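The variable takes a window size in bytes; the value above is 8 GiB. A quick shell sketch of computing it:

```shell
# 8 GiB expressed in bytes, as passed to the environment variable above.
PREFETCH_WINDOW=$((8 * 1024 * 1024 * 1024))
echo "$PREFETCH_WINDOW"   # 8589934592
```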