Uploading Large Files to Amazon S3 Using Multipart Upload in .NET

Ercan Erdoğan
Sep 26, 2024

--

When working with large files, uploading them to Amazon S3 as a single operation can be inefficient, prone to failures, and may result in costly retries. Fortunately, Amazon S3 offers multipart upload, a feature that allows you to split large files into smaller, manageable parts (chunks) and upload them in parallel. This not only optimizes the upload process but also ensures data reliability and cost-efficiency.

In this post, we’ll explore how to leverage multipart upload in .NET using the AWS SDK to upload large files by slicing them into chunks.

Why Multipart Upload?

Multipart upload is particularly useful in these scenarios:

  1. Improved Resilience: Large file uploads can fail due to network issues. Multipart upload allows you to retry individual parts without re-uploading the entire file.
  2. Parallel Uploads: You can upload multiple parts simultaneously, speeding up the overall upload process.
  3. Efficient Large File Uploads: Amazon S3 requires multipart upload for objects larger than 5GB (the single-PUT limit), but it is also beneficial for smaller files because it provides more control and reliability.

How Multipart Upload Works

  1. Initiate a Multipart Upload: First, you start the upload session by initiating a multipart upload request.
  2. Upload File Parts: You split the file into smaller parts and upload each part independently.
  3. Complete the Upload: Once all parts are uploaded, you finalize the process by sending a request to complete the multipart upload.

Let’s implement this multipart upload functionality in C# using the AWS SDK.

1. Prerequisites

Make sure you have the following:

  • AWS SDK for .NET installed.
  • AWS credentials configured properly (either via environment variables or IAM roles).
  • A large file to upload.

Here is the full example code:

using Amazon.S3;
using Amazon.S3.Model;
using System;
using System.Collections.Generic;
using System.IO;
using System.Threading.Tasks;

public class S3MultipartUpload
{
    private static string bucketName = "your-bucket-name";
    private static string keyName = "your-key-name";
    private static string filePath = "your-file-path";
    private static IAmazonS3 s3Client;

    public static async Task UploadFileInChunksAsync()
    {
        s3Client = new AmazonS3Client(); // Assuming default credentials and region

        // 1. Start a multipart upload
        var initiateResponse = await s3Client.InitiateMultipartUploadAsync(new InitiateMultipartUploadRequest
        {
            BucketName = bucketName,
            Key = keyName
        });

        string uploadId = initiateResponse.UploadId;
        Console.WriteLine($"Initiated upload with ID: {uploadId}");

        const int partSize = 5 * 1024 * 1024; // 5MB chunks (minimum size for every part except the last)
        byte[] buffer = new byte[partSize];
        var partETags = new List<PartETag>();

        try
        {
            using (var fileStream = new FileStream(filePath, FileMode.Open, FileAccess.Read))
            {
                int partNumber = 1;

                // 2. Upload each chunk
                while (true)
                {
                    // Fill the buffer completely before uploading: a single ReadAsync
                    // call may return fewer bytes than requested, and every part
                    // except the last must be at least 5MB.
                    int bytesRead = 0;
                    int read;
                    while (bytesRead < buffer.Length &&
                           (read = await fileStream.ReadAsync(buffer, bytesRead, buffer.Length - bytesRead)) > 0)
                    {
                        bytesRead += read;
                    }

                    if (bytesRead == 0)
                        break; // End of file

                    var partRequest = new UploadPartRequest
                    {
                        BucketName = bucketName,
                        Key = keyName,
                        UploadId = uploadId,
                        PartNumber = partNumber,
                        PartSize = bytesRead,
                        InputStream = new MemoryStream(buffer, 0, bytesRead)
                    };

                    var partResponse = await s3Client.UploadPartAsync(partRequest);
                    partETags.Add(new PartETag(partNumber, partResponse.ETag));

                    Console.WriteLine($"Uploaded part {partNumber}, ETag: {partResponse.ETag}");
                    partNumber++;
                }
            }

            // 3. Complete the multipart upload
            var completeRequest = new CompleteMultipartUploadRequest
            {
                BucketName = bucketName,
                Key = keyName,
                UploadId = uploadId,
                PartETags = partETags
            };

            var completeResponse = await s3Client.CompleteMultipartUploadAsync(completeRequest);
            Console.WriteLine($"Upload completed, Location: {completeResponse.Location}");
        }
        catch (Exception e)
        {
            // Abort the multipart upload on error, so the parts already
            // uploaded don't keep accruing storage charges
            Console.WriteLine($"An error occurred: {e.Message}");
            var abortRequest = new AbortMultipartUploadRequest
            {
                BucketName = bucketName,
                Key = keyName,
                UploadId = uploadId
            };
            await s3Client.AbortMultipartUploadAsync(abortRequest);
        }
    }
}

Explanation of the Code

Step 1: Initiate a Multipart Upload

The InitiateMultipartUploadAsync method starts the multipart upload session. You’ll receive an UploadId, which uniquely identifies this upload process.

var initiateResponse = await s3Client.InitiateMultipartUploadAsync(new InitiateMultipartUploadRequest
{
    BucketName = bucketName,
    Key = keyName
});
string uploadId = initiateResponse.UploadId;

Step 2: Upload File Parts

The file is read in 5MB chunks (the minimum part size for every part except the last). Each part is uploaded using the UploadPartAsync method, and each response contains an ETag, which must be stored for the completion step.

while (true)
{
    // Fill the buffer completely; every part except the last must be at least 5MB
    int bytesRead = 0;
    int read;
    while (bytesRead < buffer.Length &&
           (read = await fileStream.ReadAsync(buffer, bytesRead, buffer.Length - bytesRead)) > 0)
    {
        bytesRead += read;
    }

    if (bytesRead == 0)
        break;

    var partRequest = new UploadPartRequest
    {
        BucketName = bucketName,
        Key = keyName,
        UploadId = uploadId,
        PartNumber = partNumber,
        PartSize = bytesRead,
        InputStream = new MemoryStream(buffer, 0, bytesRead)
    };
    var partResponse = await s3Client.UploadPartAsync(partRequest);
    partETags.Add(new PartETag(partNumber, partResponse.ETag));
    partNumber++;
}

Step 3: Complete the Multipart Upload

Once all parts are uploaded, you finalize the upload by calling the CompleteMultipartUploadAsync method, which stitches all parts together into a single object.

var completeRequest = new CompleteMultipartUploadRequest
{
    BucketName = bucketName,
    Key = keyName,
    UploadId = uploadId,
    PartETags = partETags
};
var completeResponse = await s3Client.CompleteMultipartUploadAsync(completeRequest);

Error Handling: Abort the Upload

If an error occurs during the upload, it’s essential to abort the multipart upload to clean up any incomplete uploads and avoid unnecessary storage costs.

var abortRequest = new AbortMultipartUploadRequest
{
    BucketName = bucketName,
    Key = keyName,
    UploadId = uploadId
};
await s3Client.AbortMultipartUploadAsync(abortRequest);
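Aborting only covers failures your own process observes; if the process crashes before the abort runs, the orphaned upload lingers and its parts are still billed. As a safety net, you can periodically list and abort stale in-progress uploads. The sketch below is an illustrative cleanup pass, assuming the same s3Client and bucketName as above; the one-day cutoff is an arbitrary example value:

```csharp
// List all in-progress multipart uploads for the bucket and abort stale ones
var listResponse = await s3Client.ListMultipartUploadsAsync(new ListMultipartUploadsRequest
{
    BucketName = bucketName
});

foreach (var upload in listResponse.MultipartUploads)
{
    if (upload.Initiated < DateTime.UtcNow.AddDays(-1)) // example cutoff: one day
    {
        await s3Client.AbortMultipartUploadAsync(new AbortMultipartUploadRequest
        {
            BucketName = bucketName,
            Key = upload.Key,
            UploadId = upload.UploadId
        });
        Console.WriteLine($"Aborted stale upload {upload.UploadId} for {upload.Key}");
    }
}
```

Alternatively, an S3 lifecycle rule with an AbortIncompleteMultipartUpload action can do this cleanup automatically on the bucket, with no code at all.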

Best Practices for Multipart Upload

  • Chunk Size: Set a reasonable part size. AWS requires each part (except the last one) to be at least 5MB. You can adjust the part size based on your file size and network conditions.
  • Parallelism: You can upload multiple parts in parallel to speed up the upload process.
  • Error Handling: Always handle exceptions and ensure failed multipart uploads are aborted, both to avoid leaving incomplete objects behind and to prevent unnecessary charges for the parts already stored.
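The example earlier uploads parts one at a time. The parallelism point above can be sketched by giving each part its own buffer and limiting in-flight uploads with a semaphore. This is a minimal sketch, not a production implementation: it assumes the same s3Client, bucketName, keyName, filePath, and uploadId variables as the earlier example, plus using directives for System.Linq and System.Threading; the degree of parallelism (4) is an illustrative value:

```csharp
const int partSize = 5 * 1024 * 1024;
var semaphore = new SemaphoreSlim(4); // at most 4 parts in flight
var uploadTasks = new List<Task<PartETag>>();

using (var fileStream = new FileStream(filePath, FileMode.Open, FileAccess.Read))
{
    int partNumber = 1;
    while (true)
    {
        // Each part needs its own buffer because uploads now overlap
        byte[] partBuffer = new byte[partSize];
        int bytesRead = 0;
        int read;
        while (bytesRead < partBuffer.Length &&
               (read = await fileStream.ReadAsync(partBuffer, bytesRead, partBuffer.Length - bytesRead)) > 0)
        {
            bytesRead += read;
        }
        if (bytesRead == 0)
            break;

        int currentPart = partNumber++;
        int currentSize = bytesRead;
        await semaphore.WaitAsync(); // throttle concurrency
        uploadTasks.Add(Task.Run(async () =>
        {
            try
            {
                var response = await s3Client.UploadPartAsync(new UploadPartRequest
                {
                    BucketName = bucketName,
                    Key = keyName,
                    UploadId = uploadId,
                    PartNumber = currentPart,
                    PartSize = currentSize,
                    InputStream = new MemoryStream(partBuffer, 0, currentSize)
                });
                return new PartETag(currentPart, response.ETag);
            }
            finally
            {
                semaphore.Release();
            }
        }));
    }
}

// Parts may finish out of order; sort by part number before completing
var partETags = (await Task.WhenAll(uploadTasks))
    .OrderBy(p => p.PartNumber)
    .ToList();
```

Note the two changes that parallelism forces: a fresh buffer per part (the sequential version could reuse one buffer), and sorting the collected PartETags by part number before passing them to CompleteMultipartUploadAsync.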

Conclusion

Uploading large files to S3 can be streamlined and optimized using the multipart upload feature. By splitting a file into smaller parts, you gain control over the upload process and improve reliability, especially for large files. The provided example with .NET demonstrates how easy it is to integrate this functionality using the AWS SDK.

Now you can upload large files to Amazon S3 with far more resilience to timeouts, retries, and network interruptions.
