- From other storage providers to Nebius AI Cloud. The commands support S3-compatible storage services and Azure Blob Storage.
- From one bucket in Nebius AI Cloud to another. For example, between buckets in different regions.
Background
Each data transfer consists of consecutive iterations. At each iteration, Object Storage performs the following steps:
1. Makes a `ListObjects` request for every 1000 objects in the source bucket.
2. Makes `HeadObject` requests to both the source and destination buckets for every listed object. After the first successful iteration, objects with last modified timestamps before the last successful iteration are skipped to reduce synchronization costs.
3. For every object that needs to be transferred (depending on your overwrite strategy):
   - If the object is smaller than 100 MB, makes a single `GetObject` request to the source bucket and a single `PutObject` request to the destination bucket.
   - For larger objects, makes the following requests:
     - A `CreateMultipartUpload` request to the destination bucket.
     - `GetObject` requests to the source bucket and `UploadPart` requests to the destination bucket for approximately every 50 MB of the object.
     - A `CompleteMultipartUpload` request to the destination bucket after uploading all parts.
4. Checks whether the stop condition is satisfied:
   - If it is, the transfer is complete.
   - If it is not, the next iteration starts after the inter-iteration interval.
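The steps above determine how many billable requests one iteration makes. The following sketch (not the service's actual code) models the request counts in Python, using the 100 MB single-request threshold and the approximately 50 MB part size described above, and assuming every listed object needs to be transferred:

```python
MB = 1024 * 1024
MULTIPART_THRESHOLD = 100 * MB  # objects at or above this size use multipart upload
PART_SIZE = 50 * MB             # approximate part size per UploadPart request


def iteration_requests(object_sizes):
    """Estimate per-request-type counts for one iteration over objects
    with the given sizes (in bytes)."""
    n = len(object_sizes)
    counts = {
        "ListObjects": -(-n // 1000) if n else 0,  # one LIST per 1000 objects
        "HeadObject": 2 * n,                       # source + destination
        "GetObject": 0,
        "PutObject": 0,
        "CreateMultipartUpload": 0,
        "UploadPart": 0,
        "CompleteMultipartUpload": 0,
    }
    for size in object_sizes:
        if size < MULTIPART_THRESHOLD:
            counts["GetObject"] += 1
            counts["PutObject"] += 1
        else:
            parts = -(-size // PART_SIZE)          # ceil(size / 50 MB)
            counts["CreateMultipartUpload"] += 1
            counts["GetObject"] += parts
            counts["UploadPart"] += parts
            counts["CompleteMultipartUpload"] += 1
    return counts


# Example: a 10 MB object (single GET/PUT) and a 120 MB object (3 parts)
requests = iteration_requests([10 * MB, 120 * MB])
```

Subsequent iterations skip unmodified objects, so in steady state only the `ListObjects` and `HeadObject` counts recur in full.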
Costs
The cost of a data transfer is made up of the costs for the source and destination buckets. The `nebius storage v1alpha1 transfer` commands themselves do not incur additional costs.
Source bucket costs
If your source bucket is at another storage provider, it may charge you for egress traffic and requests made during the transfer. For details, see your provider's documentation. If your source bucket is in Nebius AI Cloud, Object Storage charges you for the following billing items:
- HTTP requests on data in the source bucket, as described in Background: class A (LIST) and class B (GET, HEAD) operations.
- Egress traffic from the source bucket (except when it is in the same Nebius AI Cloud region as the destination bucket).
Destination bucket costs
The destination bucket of a data transfer is always in Nebius AI Cloud. Object Storage charges you for the following billing items:
- HTTP requests on data in the destination bucket, as described in Background: class A (PUT, POST) and class B (HEAD) operations.
- Storing data in the destination bucket. Charges for each individual object or its part (for multipart uploads) start once it is uploaded to the destination bucket.
Prerequisites
- Make sure you are in a group that has at least the `editor` role within your tenant; for example, the default `editors` group. You can check this in the Administration → IAM section of the web console.
- Create a destination bucket in Nebius AI Cloud.
- Install and configure the Nebius AI Cloud CLI.
- Get the ID of the project that contains the destination bucket:
  - Web console
    In the sidebar, go to Storage → Object Storage and then find the destination bucket in the list:
    - If the bucket is in the list, open the project menu and then click Copy project ID next to the selected project.
    - If the bucket is not in the list, open the project menu and then select another project to find the bucket in.
- Set up access to the buckets:
  - For the destination bucket, create access keys for a user or service account from a group that has at least the `editor` role within your tenant; for example, the default `editors` group. This grants the account permissions to make HEAD and PUT requests to the bucket.
  - For the source bucket, create access keys that grant permissions to make GET, HEAD and LIST requests:
    - If your source bucket is in Nebius AI Cloud, you can create access keys for a user or service account from a group that has at least the `viewer` role within your tenant; for example, the default `viewers` group.
    - If your source bucket is in Azure Blob Storage, you can create access keys for an Azure storage account that has at least the Storage Blob Data Reader role.
Steps
Create the transfer configuration
Create a `transfer.json` file with the following contents and change the values in it:
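A minimal sketch of such a file, assembled from the fields documented in the sections below. The exact schema, including the key names inside the S3 credentials block (`aws_access_key` here), is an assumption — verify it against the CLI reference:

```json
{
  "spec": {
    "source": {
      "endpoint": "https://<source-endpoint>",
      "bucket_name": "<source-bucket-name>",
      "credentials": {
        "aws_access_key": {
          "access_key_id": "<source-access-key-id>",
          "secret_access_key": "<source-secret-access-key>"
        }
      }
    },
    "destination": {
      "bucket_name": "<destination-bucket-name>",
      "credentials": {
        "aws_access_key": {
          "access_key_id": "<destination-access-key-id>",
          "secret_access_key": "<destination-secret-access-key>"
        }
      }
    },
    "overwrite_strategy": "IF_NEWER",
    "infinite": {}
  }
}
```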
Access to source containers in Azure Blob Storage
If your source bucket (container) is in Azure Blob Storage, `.spec.source` should look like this:
- `.spec.source.endpoint`: the endpoint of the Azure storage account in the format `https://<storage-account-name>.blob.core.windows.net`.
- `.spec.source.bucket_name`: the name of the source container in Azure Blob Storage.
- `.spec.source.credentials.azure_access_key.account_name`: the name of the Azure storage account. Not to be confused with the name of your Azure account.
- `.spec.source.credentials.azure_access_key.access_key`: the access key that the storage account uses for authentication.
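Putting these fields together, a `.spec.source` block for Azure Blob Storage might look like this (the nesting is inferred from the field paths above):

```json
{
  "spec": {
    "source": {
      "endpoint": "https://<storage-account-name>.blob.core.windows.net",
      "bucket_name": "<container-name>",
      "credentials": {
        "azure_access_key": {
          "account_name": "<storage-account-name>",
          "access_key": "<access-key>"
        }
      }
    }
  }
}
```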
Anonymous access
If a bucket has anonymous access enabled, you can pass `{"anonymous": {}}` as the value for `.spec.source.credentials` or `.spec.destination.credentials`. For example:
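A sketch of a source section that uses anonymous access (the nesting follows the field path named above):

```json
{
  "spec": {
    "source": {
      "credentials": {
        "anonymous": {}
      }
    }
  }
}
```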
The credentials (or anonymous access) must allow the following requests:
- Source bucket: `ListObjects`, `HeadObject` and `GetObject`.
- Destination bucket: `HeadObject`, `PutObject`, `CreateMultipartUpload`, `UploadPart` and `CompleteMultipartUpload`.
Stop condition
After each iteration, the transfer evaluates its stop condition to decide whether it should stop. The example above uses the `.spec.infinite` condition. The configuration must include exactly one of the following stop conditions:
- `.spec.after_one_iteration`: the transfer stops after completing its first iteration. The value should be an empty object.
- `.spec.after_n_empty_iterations`: the transfer stops after a number of consecutive empty iterations (`.spec.after_n_empty_iterations.empty_iterations_threshold`).
- `.spec.infinite`: the transfer continues indefinitely until it is manually stopped.
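For illustration, the three options can be sketched as the following configuration fragments (exactly one per configuration; the threshold value 3 is an arbitrary example):

```json
{ "spec": { "after_one_iteration": {} } }

{ "spec": { "after_n_empty_iterations": { "empty_iterations_threshold": 3 } } }

{ "spec": { "infinite": {} } }
```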
Overwrite strategy
The overwrite strategy determines how the transfer handles objects already present in the destination bucket. The configuration must include `.spec.overwrite_strategy` with one of the following values:
- `NEVER`: the transfer does not overwrite objects that exist in the destination bucket. If an object with the same name already exists, it is skipped. This is the safest option to prevent accidental data loss.
- `IF_NEWER`: the transfer only overwrites an object if the source object has a newer timestamp (based on the `Last-Modified` header) than the existing object in the destination bucket. This strategy is recommended for incremental synchronization scenarios.
- `ALWAYS`: the transfer always overwrites objects in the destination bucket. Use this option with caution, since it may lead to data loss in the destination bucket. It is suitable when you want to fully rewrite the data in the destination bucket. After the first successful iteration, the transfer only checks objects modified since the last iteration. This means that objects manually added to the destination bucket will not be overwritten by older source versions, because those source objects are not processed unless they were modified after the last iteration.
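For example, an incremental synchronization setup would include this minimal fragment:

```json
{ "spec": { "overwrite_strategy": "IF_NEWER" } }
```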
Configuration examples
Transfer between regions (eu-north1 → us-central1)
Sample setup for transferring data from the `source_bucket` in the eu-north1 region to the `destination_bucket` in the us-central1 region.
Requires a project in the us-central1 region (for example, `project-u00xxxx`).
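A sketch of such a configuration. The endpoints are shown as placeholders, and the S3 credentials key names are assumptions; only the bucket names and field paths documented on this page are taken as given:

```json
{
  "spec": {
    "source": {
      "endpoint": "https://<eu-north1-object-storage-endpoint>",
      "bucket_name": "source_bucket",
      "credentials": {
        "aws_access_key": {
          "access_key_id": "<source-access-key-id>",
          "secret_access_key": "<source-secret-access-key>"
        }
      }
    },
    "destination": {
      "bucket_name": "destination_bucket",
      "credentials": {
        "aws_access_key": {
          "access_key_id": "<destination-access-key-id>",
          "secret_access_key": "<destination-secret-access-key>"
        }
      }
    },
    "overwrite_strategy": "IF_NEWER",
    "after_one_iteration": {}
  }
}
```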
Cross-region backup (us-central1 → eu-north1)
Sample setup for continuous backup from the `bucket` in the us-central1 region to the `backup_bucket` in the eu-north1 region.
Requires a project in the eu-north1 region (for example, `project-e00xxxx`).
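A sketch of such a configuration; because this is a continuous backup, it uses the `.spec.infinite` stop condition. As above, the endpoints are placeholders and the S3 credentials key names are assumptions:

```json
{
  "spec": {
    "source": {
      "endpoint": "https://<us-central1-object-storage-endpoint>",
      "bucket_name": "bucket",
      "credentials": {
        "aws_access_key": {
          "access_key_id": "<source-access-key-id>",
          "secret_access_key": "<source-secret-access-key>"
        }
      }
    },
    "destination": {
      "bucket_name": "backup_bucket",
      "credentials": {
        "aws_access_key": {
          "access_key_id": "<destination-access-key-id>",
          "secret_access_key": "<destination-secret-access-key>"
        }
      }
    },
    "overwrite_strategy": "IF_NEWER",
    "infinite": {}
  }
}
```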
Launch the transfer
Run the following command. In `--parent-id`, specify the ID of the project that contains the destination bucket. For details about getting the ID, see Prerequisites. You can also create the transfer in another project in the same Nebius AI Cloud region as the destination bucket, but not in a project in another region.
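A sketch of the launch command. Only `--parent-id` is named on this page; the `create` subcommand and the `--file` flag are assumptions, so check the CLI help for the exact syntax:

```shell
nebius storage v1alpha1 transfer create \
  --parent-id <project-id> \
  --file transfer.json
```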
If you get an error about the source bucket endpoint not being in the allowlist, contact support.
Manage the transfer
You can use the `nebius storage v1alpha1 transfer update` command to update certain parameters of a transfer, such as the name, inter-iteration interval, stop condition, overwrite strategy, and credentials for both the source and destination buckets. Changes to the configuration may take several minutes to apply, during which the old settings may still be in use. You cannot change the names or endpoints of the source and destination buckets of a transfer.
- To resume the transfer, run the following command:
- To delete the transfer, run the following command:
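A sketch of these commands, assuming the subcommand names mirror `transfer update` mentioned above; the exact names and flags are assumptions, so check the CLI help:

```shell
nebius storage v1alpha1 transfer resume --id <transfer-id>
nebius storage v1alpha1 transfer delete --id <transfer-id>
```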
Limits
You can limit the transfer's request rate and bandwidth in the `spec.source.limiters` section of the configuration. These values are maximum thresholds that the transfer will not exceed, not guaranteed throughput targets.
This is useful in the following cases:
- Your source provider imposes rate or bandwidth restrictions.
- You need to reserve request capacity and bandwidth for other processes.
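A sketch of a limiters section. Only the `spec.source.limiters` path is documented on this page; the key names and units inside it are assumptions, so check the CLI reference for the actual fields:

```json
{
  "spec": {
    "source": {
      "limiters": {
        "requests_per_second": 200,
        "bytes_per_second": 104857600
      }
    }
  }
}
```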
Incomplete multipart uploads
In rare cases, Object Storage might leave incomplete multipart uploads in the destination bucket. To prevent this, set a lifecycle rule on the destination bucket to automatically abort incomplete multipart uploads. For example, you can configure this rule by using the CLI:
Check transfer progress
To monitor the status of the current transfer iteration, check the `status.last_iteration` field of the transfer:
- `status.last_iteration.objects_discovered_count`: the number of objects identified for processing in the current iteration. In the first iteration, this is all objects in the bucket; in subsequent iterations, only objects created or modified since the start of the last successful iteration. The value increases as new objects are discovered.
- `status.last_iteration.objects_migrated_count`: the number of objects successfully transferred.
- `status.last_iteration.objects_skipped_count`: the number of objects skipped due to the overwrite strategy or because they were already transferred.
If an error occurs, check `status.error` for the endpoint where the error occurred (source or destination), the error code and the message.