Azure Storage

The cloud has become the first choice for most enterprises hosting workloads. Azure offers several options for storing different types of content in the cloud.

1. Storage Account

Azure Storage offers a massively scalable object store for data objects, a file system service for the cloud, a messaging store for reliable messaging, and a NoSQL store. A storage account is the entry point for using Azure Storage. Every storage account has a name, which must be unique across Azure. The name must also be between 3 and 24 characters long and can contain only numbers and lowercase letters. There are three types of storage accounts; each offers a different set of features and has its own pricing model.
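The naming rules above are easy to check locally before attempting to create an account. The following is a minimal sketch; the function name `is_valid_storage_account_name` is hypothetical, and the check covers only length and character set. Global uniqueness can only be confirmed by Azure itself.

```python
import re

# Rule from the text: 3 to 24 characters, lowercase letters and digits only.
_NAME_PATTERN = re.compile(r"[a-z0-9]{3,24}")


def is_valid_storage_account_name(name: str) -> bool:
    """Return True if `name` satisfies the length and character rules.

    This does NOT check that the name is unique across Azure.
    """
    return _NAME_PATTERN.fullmatch(name) is not None


if __name__ == "__main__":
    for candidate in ["mystorage01", "My-Storage", "ab"]:
        print(candidate, is_valid_storage_account_name(candidate))
```

A local check like this catches typos early, but the create request can still fail if another customer already owns the name.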

2. Performance

Azure supports two performance tiers:

  • Standard – magnetic disk (HDD)
  • Premium – solid-state drive (SSD)

3. Type of Storage account

The storage account could be one of the following types:

  • General-purpose v2 accounts: Basic storage account type for blobs, files, queues, and tables. Guidance – Use it for most scenarios using Azure Storage.
  • General-purpose v1 accounts: Legacy account type for blobs, files, queues, and tables. Guidance – Use general-purpose v2 accounts instead when possible.
  • Blob storage accounts: Blob-only storage accounts. Guidance – Use general-purpose v2 accounts instead when possible.

4. Redundancy

In simple terms, redundancy is the number of copies of your data that Microsoft creates when you store it in Azure Storage. For everything you store, Microsoft automatically keeps at least three copies (the LRS option). You must select a redundancy option when you create the storage account; it can be changed later. The data in your storage account is always replicated to ensure durability and high availability. Azure Storage copies your data so that it is protected from planned and unplanned events, including transient hardware failures, network or power outages, and massive natural disasters. You can choose to replicate your data within the same data center, across availability zones within the same region, or across geographically separated regions.

Locally-redundant storage (LRS): A simple, low-cost replication strategy. Data is replicated within a single storage scale unit. LRS provides at least 99.999999999% (11 9’s) durability of objects over a given year.

Zone-redundant storage (ZRS): Replication for high availability and durability. Data is replicated synchronously across three availability zones. ZRS offers durability for storage objects of at least 99.9999999999% (12 9’s) over a given year.

Geo-redundant storage (GRS): Cross-regional replication to protect against region-wide unavailability. GRS is designed to provide at least 99.99999999999999% (16 9’s) durability of objects over a given year by replicating your data to a secondary region that is hundreds of miles away from the primary region.

Read-access geo-redundant storage (RA-GRS): Cross-regional replication with read access to the replica. With RA-GRS, you can read from the secondary region regardless of whether Microsoft initiates a failover from the primary to the secondary region. Durability is the same 16 9’s as GRS.
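To make the durability percentages more tangible, the sketch below converts each "number of nines" quoted above into a rough expected number of objects lost per year. It treats the durability figure as an independent per-object annual survival probability, which is a simplifying assumption for illustration only; the names `DURABILITY_NINES` and `expected_objects_lost_per_year` are hypothetical.

```python
from fractions import Fraction

# Durability figures quoted in the text, expressed as a count of nines.
DURABILITY_NINES = {"LRS": 11, "ZRS": 12, "GRS": 16, "RA-GRS": 16}


def annual_loss_probability(nines: int) -> Fraction:
    """Loss probability implied by N nines, e.g. 11 nines -> 1 in 10**11."""
    return Fraction(1, 10 ** nines)


def expected_objects_lost_per_year(option: str, object_count: int) -> float:
    """Expected yearly object loss, assuming independent per-object durability."""
    probability = annual_loss_probability(DURABILITY_NINES[option])
    return float(object_count * probability)


if __name__ == "__main__":
    one_billion = 10 ** 9
    for option in DURABILITY_NINES:
        print(option, expected_objects_lost_per_year(option, one_billion))
```

Under this simplified model, storing one billion objects with LRS corresponds to an expected loss of about 0.01 objects per year, and each additional nine reduces that figure by a factor of ten.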

Security

  • All data written to Azure Storage is automatically encrypted using Storage Service Encryption (SSE).
  • Data can be secured in transit between an application and Azure by using Client-Side Encryption, HTTPS, or SMB 3.0.
  • OS and data disks used by Azure virtual machines can be encrypted using Azure Disk Encryption.
  • Delegated access to the data objects in Azure Storage can be granted using Shared Access Signatures.

Azure Storage Account Service Use Case

Feature: Azure Files
Description: Provides an SMB interface and client libraries.
When to use:
  • You want to “lift and shift” an application to the cloud which already uses the native file system APIs to share data between it and other applications running in Azure.
  • You want to store development and debugging tools that need to be accessed from many virtual machines.

Feature: Azure Blobs
Description: Provides client libraries and a REST interface that allows unstructured data to be stored and accessed at a massive scale in block blobs.
When to use:
  • You want your application to support streaming and random-access scenarios.
  • You want to be able to access application data from anywhere.
  • You want to build an enterprise data lake on Azure and perform big data analytics.

Feature: Azure Disks
Description: Provides client libraries and a REST interface that allows data to be persistently stored and accessed from an attached virtual hard disk.
When to use:
  • You want to lift and shift applications that use native file system APIs to read and write data to persistent disks.
  • You want to store data that is not required to be accessed from outside the virtual machine to which the disk is attached.

Transfer data to and from Azure

Azure offers both offline and online options for transferring data from on-premises to the cloud and vice versa.

Offline transfer using shippable devices – Use physical shippable devices for offline, one-time bulk data transfer.

  • Microsoft sends you a disk or a secure specialized device.
  • Alternatively, you purchase and ship your own disks.
  • Copy data to the device and ship it to Azure, where Microsoft takes care of the rest.
    • The available options for this case are Data Box Disk, Data Box, Data Box Heavy, and Import/Export (use your own disks).

Network Transfer – You transfer your data to Azure over your network connection. This can be done in many ways.

Graphical interface – For occasional transfers of a few files that do not need automation, choose a graphical interface tool such as Azure Storage Explorer or the web-based exploration tool in the Azure portal.

Scripted or programmatic transfer – Use a scriptable tool or call the Azure REST APIs/SDKs directly. The available scriptable tools are AzCopy, Azure PowerShell, and Azure CLI. For a programmatic interface, use one of the SDKs for .NET, Java, Python, Node.js, C++, Go, PHP, or Ruby.
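As an illustration of a scripted transfer, the sketch below assembles an `azcopy copy` command line in Python. It only builds the argument list and does not run anything; the helper name `build_azcopy_copy_command`, the local path, and the destination URL with its `<account>`, `<container>`, and `<SAS>` placeholders are all hypothetical and must be replaced with real values.

```python
import shlex
from typing import List


def build_azcopy_copy_command(
    source: str, destination_url: str, recursive: bool = False
) -> List[str]:
    """Build the argument list for `azcopy copy <source> <destination>`."""
    command = ["azcopy", "copy", source, destination_url]
    if recursive:
        # Copy a directory tree rather than a single file.
        command.append("--recursive")
    return command


if __name__ == "__main__":
    # Placeholder values for illustration only.
    cmd = build_azcopy_copy_command(
        "/data/reports",
        "https://<account>.blob.core.windows.net/<container>?<SAS>",
        recursive=True,
    )
    print(shlex.join(cmd))
```

With AzCopy installed and a valid destination URL, the list could be passed to `subprocess.run(cmd, check=True)`. Passing arguments as a list avoids shell quoting problems with characters in the SAS token.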

On-premises devices – Microsoft can supply a physical or virtual device that resides in your datacenter and optimizes data transfer over the network. These devices also provide a local cache of frequently used files. The physical device is the Data Box Edge and the virtual device is the Data Box Gateway. Both run permanently on your premises and connect to Azure over the network.

Managed data pipeline – Set up a cloud pipeline to regularly transfer files between several Azure services, on-premises systems, or a combination of the two. Use Azure Data Factory to set up and manage data pipelines and to move and transform data for analysis. (Diagram courtesy of Microsoft.)

Interesting facts about data transfer (Volume vs Time)
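
The relationship between data volume and transfer time can be estimated with simple arithmetic, which helps when deciding between an online and an offline transfer. The sketch below is a rough estimate only: it assumes decimal units (1 TB = 10^12 bytes), a steady link, and a hypothetical `utilization` factor for the share of bandwidth actually available. The function name `transfer_time_days` is also hypothetical.

```python
def transfer_time_days(
    data_terabytes: float, bandwidth_mbps: float, utilization: float = 1.0
) -> float:
    """Estimate the days needed to move `data_terabytes` over a network link.

    bandwidth_mbps is the link speed in megabits per second;
    utilization is the fraction of that speed available (0 < utilization <= 1).
    """
    if data_terabytes < 0 or bandwidth_mbps <= 0 or not 0 < utilization <= 1:
        raise ValueError("invalid volume, bandwidth, or utilization")
    total_bits = data_terabytes * 1e12 * 8           # decimal terabytes to bits
    effective_bits_per_second = bandwidth_mbps * 1e6 * utilization
    seconds = total_bits / effective_bits_per_second
    return seconds / 86_400                          # seconds per day


if __name__ == "__main__":
    for terabytes, mbps in [(10, 100), (100, 1000), (500, 1000)]:
        days = transfer_time_days(terabytes, mbps)
        print(f"{terabytes} TB over {mbps} Mbps: {days:.1f} days")
```

For example, 100 TB over a fully used 1 Gbps link works out to roughly 9.3 days under these assumptions, which is the kind of figure that makes a shippable device worth considering for one-time bulk transfers.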