updated: 2009-02-25 (v1.2)
This page contains features and other 'wish list' items I've collected while working with Microsoft's Azure Storage. You might notice that most of the requests I list here relate to 'application protocol' level items - HTTP. That's because I am currently most interested in these issues and I have not found too many others covering the same space. I know there are lots of important database-related features that Azure Storage can include, too. I leave the work of documenting those feature requests to others.
The Azure team maintains several active forums for anyone who wishes to join. There is also an official blog. You'll find lots of the items on this page covered in these other sources. However, I had a hard time keeping track of progress on several of these items. So I started this page to try to keep an easy-to-find, linkable list of items related to Azure Storage.
Of course, this list is in no way official or comprehensive. It's not meant to be a 'sink' for Azure Storage feature requests. If you like some of the stuff here - cool. If you think some of it is just lame, I accept that. If you think I've missed some important items, you're not alone. I encourage you to publish your own feature lists for Azure Storage to consider. Feel free to post your comments on my Azure Feedback page.
Currently Azure Storage (Table, Blob, and Queue) uses a request-signing pattern. However, not all collections share the same rules. The Azure Storage Services documentation lists two rules for accessing storage:
StringToSign = VERB + "\n" + Content-MD5 + "\n" + Content-Type + "\n" + Date + "\n" + CanonicalizedHeaders + CanonicalizedResource;
StringToSign = VERB + "\n" + Content-MD5 + "\n" + Content-Type + "\n" + Date + "\n" + CanonicalizedResource;
There are also different rules on how to handle the Date portion of the request signature.
Request signatures for Blob, Table, and Queue collections should be the same. Preferably, Blob and Queue collections should adopt the (simpler) Table signature rules.
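To make the difference between the two rules concrete, here is a minimal Python sketch that builds either string from the formulas quoted above. The function names, the sample account and resource values, and the choice of HMAC-SHA256 with a base64-encoded key are my assumptions for illustration, not something taken from this page; check the Azure Storage Services documentation before relying on any of it.

```python
import base64
import hashlib
import hmac


def string_to_sign(verb, content_md5, content_type, date,
                   canonicalized_resource, canonicalized_headers=None):
    """Build the string to sign from the two formulas quoted above.

    Pass canonicalized_headers for the longer form; leave it as None
    for the shorter (Table) form.
    """
    head = "\n".join([verb, content_md5, content_type, date])
    tail = (canonicalized_headers or "") + canonicalized_resource
    return head + "\n" + tail


def sign(account_key_b64, to_sign):
    """Assumption: HMAC-SHA256 over the UTF-8 string, keyed with the
    base64-decoded account key, with the digest base64-encoded."""
    key = base64.b64decode(account_key_b64)
    digest = hmac.new(key, to_sign.encode("utf-8"), hashlib.sha256).digest()
    return base64.b64encode(digest).decode("ascii")


# Hypothetical values, for illustration only.
table_string = string_to_sign(
    "GET", "", "", "Wed, 25 Feb 2009 12:00:00 GMT", "/myaccount/customers()")
signature = sign(base64.b64encode(b"not-a-real-key").decode("ascii"), table_string)
```

If both collections used the shorter form, the canonicalized_headers argument (and the client code needed to compute it) would simply go away.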
In addition, using request-signing slows processing on both the client and the server. It also reduces the usability of the service since standard HTTP clients (common browsers, cURL, Wget, WFetch) cannot form valid requests. This means standard HTTP clients must use a proxy service (custom-coded middleware) to access even public content. It would be better if Azure Storage employed known public authentication schemes (HTTP Basic, HTTP Digest, etc.) and a shared authorization scheme (see elsewhere in this document).
Currently, Azure Storage uses the following URI patterns to address resources:
/customers()
/customers()?$filter=(PartitionKey%20eq%20'preferred')
/customers(PartitionKey="preferred",RowKey="c1234")
The first example returns all the Entities in the Customers table. The second example returns all the Entities in the Customers table assigned to the preferred partition. The last example returns a single Entity in the Customers collection that matches both the PartitionKey and RowKey values.
This URI pattern has too much 'technology leaking' and is unnecessarily 'crufty.' Instead, the following URI patterns are simpler, more intuitive, and more direct.
/customers/
/customers/preferred/
/customers/preferred/c1234
Making this change will improve the usability of Azure Storage and increase the likelihood of adoption from a wide range of User-Agents and platforms. It will also hide the details of the underlying implementation and make it less brittle and less likely to 'break' as the underlying technology changes over time.
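The two sets of patterns map onto each other mechanically, which is part of why the change seems low-risk to me. Here is a rough Python sketch of that mapping; the function name is hypothetical, and it assumes key values need no escaping beyond what is shown in the examples above.

```python
from urllib.parse import quote


def to_current_uri(simple_path):
    """Translate /table/, /table/partition/ or /table/partition/row
    into the query-style URI that Azure Storage uses today."""
    segments = [s for s in simple_path.split("/") if s]
    if not 1 <= len(segments) <= 3:
        raise ValueError("expected /table[/partition[/row]]")
    table = segments[0]
    if len(segments) == 1:
        # Whole table.
        return "/%s()" % table
    if len(segments) == 2:
        # One partition: percent-encode the spaces in the filter expression.
        flt = quote("(PartitionKey eq '%s')" % segments[1], safe="()'")
        return "/%s()?$filter=%s" % (table, flt)
    # A single Entity addressed by both keys.
    return '/%s(PartitionKey="%s",RowKey="%s")' % (table, segments[1], segments[2])


for path in ("/customers/", "/customers/preferred/", "/customers/preferred/c1234"):
    print(path, "->", to_current_uri(path))
```

A proxy could apply this translation today, but the point of the request is that clients should not need one.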
NOTE: It is also possible to create Entities that have no PartitionKey, but this is not advisable.
Currently Azure Storage supports a single user account that has full access to all operations (Create, Read, Update, Delete) for all Azure Storage objects (Tables and Entities). Azure Storage should also provide support for multiple user accounts with granular access to the same Azure Storage objects. In keeping with Azure Storage's support for REST-like interaction, it is proposed that access security for Azure Storage be based on permissions (and not roles) and that the permissions be mapped directly to HTTP methods: POST (Create), GET (Read), PUT (Update), and DELETE (Delete).
Further, it is proposed that these permissions be applied directly to the URIs that are used to request Azure Storage objects. For example, http://[account-name].table.core.windows.net [POST, GET] is a security rule that allows only Create and Read rights for, in this case, Authorities. This pattern can be expanded by using a templating notation (i.e. regular expressions). For example, the following rule defines read-only access for the selected table: http://[account-name].table.core.windows.net/my-table/(.*) [GET].
Finally, these security access rules (URI+HTTP-Method-list) can be associated with one or more user accounts to complete the access control features of Azure Storage. Also, it is possible to associate one or more security access rules with a role|group and then associate one or more user accounts with that group. In this way, Azure Storage can implement support for 'role-based' security.
NOTE: It is not recommended that access security be based on values that do not appear in the URI (i.e. cookies, custom headers, etc.) since this can break intermediaries (caching, security proxies, etc.) and could result in the exact same URI returning different data to the user-agent based on the contents of these 'hidden' values. For the same reasons, it is not recommended that access security be based on the contents of Entity objects (i.e. Kind) since this is data 'hidden' in the body of the response and not available via the URI itself.
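As a rough illustration of the URI-plus-method rules proposed above, here is a small Python sketch. Everything in it (the account name myaccount, the group and user names, the is_allowed function) is hypothetical; it only shows that the check needs nothing beyond the request URI and the HTTP method.

```python
import re

# Each rule pairs a URI pattern with the HTTP methods it permits.
READ_MY_TABLE = (
    r"http://myaccount\.table\.core\.windows\.net/my-table/(.*)", {"GET"})
CREATE_AND_READ_ACCOUNT = (
    r"http://myaccount\.table\.core\.windows\.net/?", {"POST", "GET"})

# Rules attach to groups, and user accounts attach to groups.
GROUP_RULES = {
    "readers": [READ_MY_TABLE],
    "owners": [READ_MY_TABLE, CREATE_AND_READ_ACCOUNT],
}
USER_GROUPS = {"alice": ["readers"], "bob": ["owners"]}


def is_allowed(user, method, uri):
    """Allow the request if any rule in any of the user's groups
    matches the whole URI and lists the HTTP method."""
    for group in USER_GROUPS.get(user, []):
        for pattern, methods in GROUP_RULES.get(group, []):
            if method in methods and re.fullmatch(pattern, uri):
                return True
    return False


entity = "http://myaccount.table.core.windows.net/my-table/c1234"
print(is_allowed("alice", "GET", entity))     # read is permitted
print(is_allowed("alice", "DELETE", entity))  # delete is not
```

Because the decision depends only on the URI and the method, an intermediary can apply or cache it without inspecting cookies, custom headers, or the body.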
Currently Azure Storage supports a custom request-signing authentication pattern. In addition, Azure Storage should offer user-agents the option of using Basic and/or Digest authentication in line with HTTP Authentication. While the Digest authentication algorithm is more secure than the Basic algorithm, both forms are essential for supporting automated user-agent interaction with the Azure Storage data servers.
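For comparison with request-signing, this is all a client has to do to produce an HTTP Basic Authorization header. It is a sketch of the standard scheme, not of anything Azure Storage offers today, and the function name is my own. Since Basic sends the credentials in an easily reversible encoding, it is only reasonable over an encrypted connection.

```python
import base64


def basic_auth_header(username, password):
    """Return the value of an HTTP Basic Authorization header."""
    credentials = ("%s:%s" % (username, password)).encode("utf-8")
    return "Basic " + base64.b64encode(credentials).decode("ascii")


# Example credentials taken from the HTTP Authentication specification.
print("Authorization:", basic_auth_header("Aladdin", "open sesame"))
```

Every common browser and command-line HTTP client already knows how to do this, which is the usability argument in a nutshell.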
Currently Azure Storage implements "paging" using custom "continuation" HTTP headers to provide hints for additional Entities available on the server. This offers a simple "forward-only" paging pattern, but falls short of full support for paging large amounts of data. Azure Storage needs a complete paging solution.
A good example of a pattern already 'vetted' by the community is the Feed Paging and Archiving standard developed for the Atom MIME-type. Since Azure Storage already supports Atom, using the above-mentioned RFC standard seems a good target.
For MIME-types other than Atom, it is suggested that a set of custom HTTP Headers be employed to cover the same values. Suggested names are: x-paging-first, x-paging-previous, x-paging-next, x-paging-last. In line with RFC5005 (see above), the values should be links, not scalar values. This allows the greatest flexibility for future changes to the way paging is used and/or computed (i.e. x-paging-next: http://example.com/table/entity-25).
Another possible solution would be to consider the HTTP Header Linking draft from Mark Nottingham (i.e. link:<http://example.com/table/>; rel="first"). While this draft is currently dormant, it has potential for wider acceptance over adopting custom HTTP headers.
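Here is a minimal Python sketch of what a server might emit under either paging suggestion. The ?page=N addressing scheme and both function names are assumptions made purely for illustration; the only thing taken from the text above is that every value is a link rather than a page number.

```python
def paging_headers(base_url, page, last_page):
    """Return x-paging-* headers whose values are links, not page numbers.

    Assumption: pages are addressed as <base_url>?page=N, starting at 1.
    """
    def link(n):
        return "%s?page=%d" % (base_url, n)

    headers = {"x-paging-first": link(1), "x-paging-last": link(last_page)}
    if page > 1:
        headers["x-paging-previous"] = link(page - 1)
    if page < last_page:
        headers["x-paging-next"] = link(page + 1)
    return headers


def link_header(headers):
    """Express the same links in the style of the Link header draft."""
    prefix = "x-paging-"
    return ", ".join(
        '<%s>; rel="%s"' % (url, name[len(prefix):])
        for name, url in sorted(headers.items()))


headers = paging_headers("http://example.com/table/", 2, 5)
print(headers)
print("link:", link_header(headers))
```

Because clients only follow the links they are given, the server stays free to change how pages are computed (by number, by Entity id, by continuation token) without breaking anyone.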
/table/entity-id) should not change depending on the MIME-type (i.e. /atom/table/entity-id is not allowed). Text-based media types that are desirable include HTML and CSV. Additional types that would improve the user-agent experience include PDF and SVG.
2009-01-30: I confirmed that Azure Table storage *does* support the Content-MD5 header. I missed this in my initial review of the feature set.
Using the HEAD method will allow user-agents to check for the existence of a resource (Entity) before making the actual request. This call can also be used to cut down on traffic and bandwidth, since making a HEAD call with the server-supplied ETag will allow the user-agent to determine if a new copy of the Entity exists on the server. This can improve scalability and runtime performance. Supporting this feature can reduce charges to the Azure Storage account-holder.
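A client-side sketch of that check, assuming HEAD were supported and that the server honors the standard If-None-Match header. The function names are hypothetical, and the request is only built here, not sent.

```python
import urllib.request


def head_request(url, cached_etag=None):
    """Build (but do not send) a HEAD request for an Entity URI."""
    headers = {}
    if cached_etag is not None:
        # Ask the server to answer 304 if our cached copy is still current.
        headers["If-None-Match"] = cached_etag
    return urllib.request.Request(url, headers=headers, method="HEAD")


def needs_download(status):
    """Only a 200 means a (new) copy is worth fetching with GET.

    304 means the cached copy is current; 404 means there is no Entity.
    """
    return status == 200


request = head_request("http://example.com/table/entity-25", '"abc123"')
print(request.get_method(), request.full_url)
```

In the 304 and 404 cases no Entity body ever crosses the wire, which is where the bandwidth savings come from.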
Instead of returning the resulting Entity from a POST or PUT, return the appropriate HTTP Status Code with a Location HTTP Header that points to the resulting Entity. For POST, return 201 (Created). For PUT, return 204 (No Content). This is especially helpful when handling large binary objects as it cuts down on possibly needless traffic and bandwidth. Supporting this feature can reduce charges to the Azure Storage account-holder.
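Sketched in Python, the server side of this proposal is tiny. The function name and the tuple it returns are hypothetical; the status codes and the Location header are the ones named above.

```python
def write_response(method, entity_uri):
    """Return (status, headers, body) for a successful write.

    The body is always empty; the client follows Location if it
    actually wants the resulting Entity.
    """
    if method == "POST":
        status = 201  # Created
    elif method == "PUT":
        status = 204  # No Content
    else:
        raise ValueError("only POST and PUT are covered here")
    return status, {"Location": entity_uri}, b""


print(write_response("POST", "http://example.com/table/entity-25"))
print(write_response("PUT", "http://example.com/table/entity-25"))
```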