Upload terabytes to the cloud with Azure Data Box

Learn how to use the Azure Data Box tools to move large data volumes efficiently between your on-premises environment and Microsoft Azure blob storage.
By Timothy Warner

You can use Azure Storage Explorer or the AzCopy command-line utility to transfer binary large objects (blobs) to your Microsoft Azure-based storage account. Sure, it might take a long time, depending on your Azure connectivity, and yeah, you might have timeouts that require retransmissions. But you can get the job done this way, right?

Well, if you need to move terabytes of data into Azure, you need a more "industrial strength" solution. You could purchase an ExpressRoute circuit, which offers line speeds into Azure of up to 10 gigabits per second (Gbps). However, relatively few businesses have either the finances or the business need for such a fast and secure Azure connection.
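To put numbers on the "it might take a long time" problem, here is a rough Python sketch of the arithmetic. The function name and the 80 percent efficiency default are my own assumptions for illustration; real throughput depends on your link, protocol overhead, and retransmissions.

```python
def transfer_days(data_tb: float, link_mbps: float, efficiency: float = 0.8) -> float:
    """Estimate the days needed to push data_tb terabytes over a link_mbps link.

    efficiency is an assumed fraction of the line rate you actually achieve.
    """
    if data_tb < 0 or link_mbps <= 0 or not 0 < efficiency <= 1:
        raise ValueError("data_tb must be >= 0, link_mbps > 0, 0 < efficiency <= 1")
    bits = data_tb * 1e12 * 8                      # decimal terabytes -> bits
    seconds = bits / (link_mbps * 1e6 * efficiency)
    return seconds / 86_400                        # seconds per day


if __name__ == "__main__":
    for mbps in (100, 1_000, 10_000):
        print(f"80 TB over {mbps} Mbps: {transfer_days(80, mbps):.1f} days")
```

Under these assumptions, 80 TB over a 100 Mbps link works out to about 93 days, which is why shipping a box full of disks starts to look attractive.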

Today I will introduce you to the Azure Data Box product family; this tool suite simplifies large-scale data ingress into Azure blob storage, and does so economically and with minimal hassle.

I choose to ignore the legacy Azure Import/Export service because, at least as far as I see it, Azure Data Box fully replaces any need for that old product. In case you're curious, the Import/Export service works like this: you ship your own BitLocker-protected hard drives to the datacenter in your preferred Azure region. Microsoft then unpacks your data into a storage account and returns your disks to you (all at your expense).

Let's begin!

Options for offline data transfer

Microsoft Azure offers us three products to facilitate offline data transfer into Azure, shown in the next screenshot.

Azure Data Box product family


  • Azure Data Box: Up to 80 TB capacity
  • Azure Data Box Disk (Preview): 8 TB solid-state drives (SSDs), available singly or up to five drives (40 TB)
  • Azure Data Box Heavy (Preview): 1 PB capacity (!)

Note: Whenever you see Preview appended to an Azure product name, you need to remember that Microsoft offers no service-level agreement (SLA) on pre-release products. Therefore, you should use public preview features only in test/dev environments. Read more about the Microsoft Azure preview program terms of use.

Let's take a closer look at each of these three possibilities.

Azure Data Box

Azure Data Box is the only Data Box product that is in generally available (GA) status. Physically, this is a ruggedized hardware appliance (effectively a hard drive enclosure) that connects to your local area network with up to three Ethernet data connections running at up to 10 Gbps.

The drives are organized as a RAID 5 array, which translates to 80 TB usable capacity from 100 TB overall. All data you put on the Data Box is encrypted with 256-bit AES; you connect to the Data Box by using an onboard web GUI and standard network transfer protocols such as Server Message Block (SMB).
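The 80-from-100 figure follows from how RAID 5 reserves one member's worth of space for parity. Here is a small sketch of that arithmetic; the five-member layout is my assumption (Microsoft's exact drive count isn't stated here), chosen only because it reproduces the quoted ratio.

```python
def raid5_usable_tb(raw_tb: float, members: int) -> float:
    """Usable capacity of a RAID 5 array: one member's worth of space holds parity."""
    if members < 3:
        raise ValueError("RAID 5 needs at least three members")
    return raw_tb * (members - 1) / members


if __name__ == "__main__":
    # Assumed layout: 100 TB raw spread across five equal members.
    print(raid5_usable_tb(100, 5))
```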

The general workflow for using Data Box is as follows:

  • Order a Data Box from the Azure portal (shown in the next screenshot)
  • Receive the device and connect it to your LAN
  • Copy data to the Data Box shares
  • Return the Data Box to your regional datacenter
  • Microsoft uploads your data into your chosen blob storage account and then securely erases the device for the next customer's use

Ordering a Data Box in the Azure portal

I hesitate to give you specific pricing because as you probably know, Azure pricing is variable. At any rate, check out the pricing page and know that you are essentially renting the appliance under the following terms:

  • You pay a shipping fee, a one-time import service fee, and for any extra days you keep the appliance over the lease term
  • The default lease term is 10 days; this period includes the day you receive the Data Box as well as the pickup date
  • As of fall 2018, the charge you incur if you lose or damage Azure Data Box is $40,000 USD
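Because the lease term counts both the delivery day and the pickup day, it's easy to miscount by one. The following sketch shows the inclusive day count; the function names are mine, and the daily rate is a value you supply from the pricing page, not a real price.

```python
from datetime import date

DEFAULT_LEASE_DAYS = 10  # default Data Box lease term quoted above


def extra_lease_days(received: date, picked_up: date,
                     lease_days: int = DEFAULT_LEASE_DAYS) -> int:
    """Days held beyond the lease, counting both the receipt and pickup dates."""
    if picked_up < received:
        raise ValueError("pickup date cannot precede the receipt date")
    days_held = (picked_up - received).days + 1    # inclusive of both ends
    return max(0, days_held - lease_days)


def overage_fee(received: date, picked_up: date, daily_rate_usd: float,
                lease_days: int = DEFAULT_LEASE_DAYS) -> float:
    """Extra-day charge; daily_rate_usd is a caller-supplied, hypothetical rate."""
    return extra_lease_days(received, picked_up, lease_days) * daily_rate_usd


if __name__ == "__main__":
    print(extra_lease_days(date(2018, 10, 1), date(2018, 10, 13)))
```

A box received on October 1 and picked up on October 10 has been held for exactly 10 days, so no extra days accrue.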

As I said, Azure Data Box includes one management RJ45 port and three data ports (RJ45 and SFP). After connecting the device to your management PC and the rest of your LAN, you're ready to access the Azure Data Box web portal and transfer data! I show you the portal interface in the next screen capture.

Azure Data Box portal (image credit Microsoft)


Azure Data Box makes various SMB shares available on your LAN:

  • Page blobs: For virtual machine virtual hard disk (VHD) files
  • Block blobs: For typical file-system data
  • File Service: For use with the Azure File Service
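Since these are ordinary SMB shares, any copy tool works. As an illustration only, here is a minimal Python sketch that copies a folder tree to a mounted share and compares SHA-256 hashes afterward. The function names are mine, and the share path is whatever your device exposes; nothing here is specific to the Data Box itself.

```python
import hashlib
import shutil
from pathlib import Path
from typing import List


def sha256_of(path: Path, chunk_size: int = 1024 * 1024) -> str:
    """Hash a file in chunks so large files never have to fit in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def copy_tree_verified(source: Path, share: Path) -> List[Path]:
    """Copy every file under source to share; return files whose hashes differ."""
    mismatches: List[Path] = []
    for src in sorted(p for p in source.rglob("*") if p.is_file()):
        dst = share / src.relative_to(source)
        dst.parent.mkdir(parents=True, exist_ok=True)
        shutil.copy2(src, dst)                     # copies data and timestamps
        if sha256_of(src) != sha256_of(dst):
            mismatches.append(src)
    return mismatches
```

You would call it with your source folder and the mounted share path, for example `copy_tree_verified(Path("D:/archive"), Path("Z:/"))`, where both paths are placeholders. An empty list back means every file matched.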

After you ship Azure Data Box to Microsoft and they unpack your data, you interact with the storage account(s) the same way you ordinarily do. The next screenshot shows me working with a storage account in the Azure portal.

Interacting with your blob data in Azure


Azure Data Box Disk

Azure Data Box Disk is aimed at smaller businesses that do not need the capacity offered by Azure Data Box or Azure Data Box Heavy. These are 8 TB SSDs that Microsoft ships to you. You connect them to a target system via USB 3.0 or SATA II or III. Data is encrypted on-disk with a 128-bit AES key.

As I mentioned earlier, you can specify up to five Data Box Disks per order, giving you up to a 40 TB capacity. As of fall 2018, Microsoft hasn't specified a lease duration; instead, they suggest in their pricing FAQ that you "return the disks after a reasonable time of use." The Azure Data Box Disk lost or damaged fee is $2,500 USD.

Azure Data Box Heavy

Azure Data Box Heavy is clearly an enterprise-only possibility (the appliance ships on a full-sized freight pallet and uses a wheeled cart as its enclosure). Although the overall capacity is 1 PB, the RAID implementation gives you 800 TB of usable capacity.

The setup and transfer workflow here is identical to that of Azure Data Box. The Azure Data Box Heavy lease period is 20 days, and the lost/damaged fee is $250,000 USD.
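With all three offline products described, the sizing decision comes down to simple thresholds. The sketch below encodes the capacities quoted in this article (five 8 TB disks, 80 TB usable, 800 TB usable). The function name and return strings are my own, and it treats the Disk capacity as fully usable, which is an optimistic assumption.

```python
import math

DISK_TB = 8               # per Data Box Disk
MAX_DISKS = 5             # maximum disks per order
DATA_BOX_USABLE_TB = 80
HEAVY_USABLE_TB = 800


def suggest_product(data_tb: float) -> str:
    """Pick the smallest offline Data Box product that fits data_tb terabytes."""
    if data_tb <= 0:
        raise ValueError("data_tb must be positive")
    if data_tb <= DISK_TB * MAX_DISKS:
        disks = math.ceil(data_tb / DISK_TB)       # round up to whole disks
        return f"Data Box Disk x{disks}"
    if data_tb <= DATA_BOX_USABLE_TB:
        return "Data Box"
    if data_tb <= HEAVY_USABLE_TB:
        return "Data Box Heavy"
    return "multiple orders"


if __name__ == "__main__":
    for size in (10, 60, 500, 900):
        print(size, "TB ->", suggest_product(size))
```

Price and lead time matter too, of course, so treat this as a first filter rather than a recommendation.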

Options for online data transfer

The products we've discussed thus far serve primarily for one-time, one-way data transfers from on-premises sites into Azure. By contrast, Azure Data Box Edge and Data Box Gateway serve for longer-running and perhaps bidirectional traffic between on-prem sites and Azure.

Azure Data Box Edge is a 1U rack-mounted device that you install permanently in your datacenter. You populate the appliance by using SMB or Network File System (NFS), and its connection to Azure means you can apply complex transformations to your data before moving it to Azure storage.

For instance, your Edge appliance might aggregate telemetry data from your internet-of-things (IoT) devices, run Azure machine-learning models against the data, perform transformations, and then finally upload the results to your Azure blob storage account.
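To make the aggregate-then-upload idea concrete, here is a purely illustrative Python sketch. It is not how the Edge appliance is programmed; the reading format, the function names, and the `upload` callable are all hypothetical, and the machine-learning step is left out.

```python
from collections import defaultdict
from statistics import mean
from typing import Callable, Dict, Iterable, List, Tuple

Reading = Tuple[str, float]  # hypothetical format: (device_id, value)


def summarize(readings: Iterable[Reading]) -> Dict[str, Dict[str, float]]:
    """Reduce raw telemetry to one small summary record per device."""
    by_device: Dict[str, List[float]] = defaultdict(list)
    for device_id, value in readings:
        by_device[device_id].append(value)
    return {
        device: {
            "count": len(values),
            "min": min(values),
            "max": max(values),
            "mean": mean(values),
        }
        for device, values in by_device.items()
    }


def process_batch(readings: Iterable[Reading],
                  upload: Callable[[dict], None]) -> dict:
    """Summarize locally, then hand only the summary to the upload step."""
    summary = summarize(readings)
    upload(summary)                                # stand-in for the cloud upload
    return summary


if __name__ == "__main__":
    batch = [("sensor-a", 1.0), ("sensor-a", 3.0), ("sensor-b", 5.0)]
    process_batch(batch, upload=print)
```

The point of the pattern is that only the small summary crosses the WAN link, not every raw reading.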

The pricing page says that Azure Data Box Edge will be available through a monthly subscription fee. However, the price is listed as "Coming soon" as of October 2018.

Azure Data Box Gateway offers most of the functionality contained in Azure Data Box Edge, but with no physical server requirement. That is to say, you deploy this preconfigured virtual machine in your local environment. A flat monthly fee also serves as the pricing model here.

As shown in the next screenshot, you order both online data-transfer products directly from the Azure portal.

Ordering a Data Box Edge appliance in the Azure portal


Wrap-up

If you've worked with Azure for a while, you may have thought, "What's the difference between Azure Data Box and StorSimple?" On one level, StorSimple and Data Box appear similar because both involve physical and virtual form factors.

With that said, StorSimple serves as a hybrid storage solution with a dash of disaster recovery built in. Azure Data Box serves as a vehicle for migrating data from on-premises sites into Azure, period.

I'm impressed by the thought that the Azure Data Box team put into this technology. The different product offerings suit businesses of any size. They squarely address one of the most common questions businesses moving to the cloud have: "How in the world can we get all of this data into Azure efficiently?"


© 4sysops 2006 - 2020
