Working with Tables
For more information on Tables in general, please see some of the many articles on MSDN or the Azure documentation. Some of the core features of the Tables provider are: -
Rapid navigation
You can easily move between tables within a storage account. Simply dotting into a Tables property will automatically retrieve the list of tables in the storage account. This allows easy exploration of your table assets, directly from within the REPL.
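As a sketch (assuming the local storage emulator and a table named employee; any valid connection string works), navigation looks like this:

```fsharp
open FSharp.Azure.StorageTypeProvider

// Connect at compile time; the account's tables are discovered automatically.
type Azure = AzureTypeProvider<"UseDevelopmentStorage=true">

// Dotting into Tables exposes one property per table in the account.
let employeeTable = Azure.Tables.employee
```

The later snippets on this page reuse the Azure alias defined here.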
Automatic schema inference
Unlike some non-relational data stores, Azure Table Storage does maintain schema information at the row level in the form of EDM metadata. This can be interrogated in order to infer a table's shape, as well as help query data. Let's look at the schema of the row fred.1 in the employee table.
| Field | EDM Data Type | Value |
|---|---|---|
| Partition Key | string | fred |
| Row Key | string | 1 |
| Years Worked | int | 10 |
| Dob | datetime | 01/05/1990 |
| Name | string | fred |
| Salary | double | 0 |
| Is Manager | bool | true |
Based on this EDM metadata, an appropriate .NET type can be generated and used within the provider.
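As an illustrative sketch (property names are assumptions based on the inferred schema above):

```fsharp
open FSharp.Azure.StorageTypeProvider.Table

// Get returns an option, since the row may not exist.
match Azure.Tables.employee.Get(Row "1", Partition "fred") with
| Some row -> printfn "%s has worked here for %d years" row.Name row.YearsWorked
| None -> printfn "No such row!"
```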
Optional type generation
The Storage Type Provider will also intelligently map fields which are not always populated in the table. For example, given the following rows, we can see that two columns are not always populated.
| | Timestamp | Dob | IsManager | ... | Salary | YearsWorking | IsAnimal |
|---|---|---|---|---|---|---|---|
| partition.1 | 09/11/2017 11:06:10 +00:00 | 01/05/1990 00:00:00 | True | ... | 0 | 10 | N/A |
| partition.2 | 09/11/2017 11:06:10 +00:00 | 01/05/1990 00:00:00 | True | ... | 0 | N/A | False |
The type provider will correctly infer that YearsWorking and IsAnimal are optional fields and will cascade this to the type system.
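A sketch of how this surfaces in the type system (property names are taken from the table above):

```fsharp
match Azure.Tables.employee.Get(Row "1", Partition "partition") with
| Some row ->
    // Always-populated columns keep their plain types...
    let manager : bool = row.IsManager
    // ...whereas sparsely-populated columns are inferred as options.
    let years : int option = row.YearsWorking
    let animal : bool option = row.IsAnimal
    ()
| None -> ()
```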
Of course, all other fields remain mandatory. This also cascades to the provided constructor for the type - optional fields become optional constructor arguments. The type provider downloads the first 10 rows of each table by default to infer schema, but this can be increased when creating the type provider. Obviously, a larger sample increases the chance of the type provider correctly inferring optional fields - but if you use a local development storage account, you can manually generate a few rows to guide the inference system.
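A sketch of raising the sample size; the static parameter name used here (schemaSize) is an assumption, so check the provider's parameter list:

```fsharp
// Hypothetical: sample 500 rows per table when inferring schema.
type LargeSampleAzure = AzureTypeProvider<"UseDevelopmentStorage=true", schemaSize = 500>
```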
This feature is especially useful when working with hot schema loading. If new rows are added within the bounds of the schema size, they will be used to re-evaluate the optimal schema automatically.
Data Frame interoperability
In addition, an extra "Values" property is available which exposes all properties on the entity as a key/value collection - this is useful for binding scenarios or, for example, mapping into Deedle frames.
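A sketch of mapping a partition into a Deedle frame; the exact shape of the Values collection (assumed here to be key/value pairs) may differ:

```fsharp
open Deedle

let frame =
    Azure.Tables.employee.GetPartition("women")
    |> Seq.map (fun row ->
        // Assumed: Values exposes one key/value pair per column on the entity.
        let series = row.Values |> Seq.map (fun kv -> kv.Key, kv.Value) |> Series.ofObservations
        sprintf "%s.%s" row.PartitionKey row.RowKey, series)
    |> Frame.ofRows
```

The resulting frame renders along these lines: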
| | Timestamp | Dob | IsManager | Name | Salary | YearsWorking |
|---|---|---|---|---|---|---|
| women.1 | 09/11/2017 11:06:10 +00:00 | 30/04/2005 00:00:00 | True | sara | 3.5 | 35 |
| women.2 | 09/11/2017 11:06:10 +00:00 | 20/08/1965 00:00:00 | False | rachel | 5.5 | 20 |
Offline development
In addition to using the Azure Storage Emulator, you can also simply provide the type provider with a JSON file containing the list of tables and their schema. This is particularly useful within the context of a CI process, or when you know a specific "known good" structure of tables within a storage account.
Overview
You can still access tables using the compile-time storage connection string if provided, or override it as normal at runtime.
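A sketch of the declaration; the tableSchema parameter name follows the provider's documented convention, but treat it as an assumption:

```fsharp
open FSharp.Azure.StorageTypeProvider

// Compile against a local JSON schema file instead of a live storage account.
type Local = AzureTypeProvider<tableSchema = "TableSchema.json">
```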
The contents of TableSchema.json look as follows.
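The original sample isn't reproduced here, so the following is purely illustrative - the type names come from the supported-types table below, but the exact file shape (including how optionality is flagged) is an assumption:

```json
{
    "employee": {
        "Name": { "Type": "string" },
        "Dob": { "Type": "datetime" },
        "Salary": { "Type": "double", "Optional": true },
        "IsManager": { "Type": "boolean", "Optional": true }
    }
}
```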
We can access the data just as normal - the only difference is that the schema is defined by the JSON file rather than inferred from a set of data.
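A sketch, reusing the Local alias from above:

```fsharp
match Local.Tables.employee.Get(Row "1", Partition "fred") with
| Some row ->
    // Salary and IsManager are options because the schema file marks them optional.
    row.Salary |> Option.iter (printfn "Salary: %f")
| None -> ()
```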
Notice that both Salary and IsManager are rendered as option types, since we've explicitly marked them as such in the schema definition file. Compare this to the earlier sample, where the type provider inferred types based on the first n rows, where they were rendered as mandatory types.
Supported data types
The list of types available is limited by those supported by Azure Tables: -
| EDM type | Value in JSON config |
|---|---|
| EdmType.Binary | binary |
| EdmType.Boolean | boolean |
| EdmType.DateTime | datetime |
| EdmType.Double | double |
| EdmType.Guid | guid |
| EdmType.Int32 | int32 |
| EdmType.Int64 | int64 |
| EdmType.String | string |
Querying data
The storage provider has an easy-to-use query API that is also flexible and powerful, and uses the inferred schema to generate the appropriate query functionality. Data can be queried in several ways, in both synchronous and asynchronous forms.
Key Lookups
These are the simplest (and best performing) queries, based on a partition / row key combination, returning an optional result. You can also retrieve an entire partition.
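A sketch of both lookup styles; the async member name is an assumption:

```fsharp
// Point lookup on partition + row key; returns an option.
let fred = Azure.Tables.employee.Get(Row "1", Partition "fred")

// Retrieve every row in a partition.
let freds = Azure.Tables.employee.GetPartition("fred")

// Assumed async variant of the point lookup.
let fredAsync = Azure.Tables.employee.GetAsync(Row "1", Partition "fred")
```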
Plain Text Queries
If you need to search for a set of entities, you can enter a plain text search, either manually or using the Azure SDK query builder.
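A sketch; the query string is the standard OData filter syntax understood by the table service:

```fsharp
open Microsoft.WindowsAzure.Storage.Table

// Hand-written filter string...
let writtenFilter = Azure.Tables.employee.Query("Name eq 'fred'")

// ...or one built with the Azure SDK's query builder.
let filter = TableQuery.GenerateFilterCondition("Name", QueryComparisons.Equal, "fred")
let builtFilter = Azure.Tables.employee.Query(filter)
```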
Query DSL
A third alternative to querying with the Azure SDK is to use the LINQ IQueryable implementation. This works in F# as well using the query { } computation expression. However, this implementation has two main limitations: -
- You need to manually create a type to handle the result
- IQueryable does not guarantee runtime safety for a query that compiles. This is particularly evident with Azure Table Storage, which allows only a very limited set of queries.
The Table provider allows an alternative that has the same sort of composability as IQueryable, yet is strongly typed at both compile time and runtime, whilst being extremely easy to use. It generates query methods appropriate for each property on the entity, including appropriate clauses e.g. Greater Than, Less Than etc. as supported by the Azure Storage Table service.
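A sketch of a single generated clause; the backtick member names mirror the entity's properties, though the exact naming is an assumption:

```fsharp
let nameQuery = Azure.Tables.employee.Query().``Where Name Is``.``Equal To``("fred")
```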
These can be composed and chained. When you have completed building, simply call Execute().
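A sketch of composition (member names again assumed from the schema):

```fsharp
let longServingManagers =
    Azure.Tables.employee.Query()
        .``Where YearsWorking Is``.``Greater Than``(10)
        // Boolean properties get explicit True()/False() members rather than Equal To.
        .``Where IsManager Is``.True()
        .Execute()
```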
Query operators are strongly typed, so Equal To on YearsWorking takes in an int, whereas on Name it takes in a string, whilst for booleans there is no such provided method.
Inserting data
Inserting data is extremely easy with the Azure Storage Type Provider. A table will always have an Insert method on it, with various overloads that appear depending on whether the table has data in it (and thus a schema could be inferred) or is empty.
Inserting single records
If the table is newly-created (and has no existing data from which to infer schema), you can insert data by calling one of the two available Insert overloads. The first one takes a single PartitionKey, RowKey and any object. It will automatically generate the appropriate Azure Storage request for all public properties. Conversely, if the table exists and has schema associated with it, you can use a strongly-typed Insert method. You can also choose whether to insert or upsert (insert or update) data.
Both mechanisms will also return a TableResponse discriminated union describing the outcome of the operation.
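A sketch of the lightweight overload; the record type here is hypothetical, standing in for any object with public properties:

```fsharp
type NewEmployee = { Name : string; Salary : float }

let response =
    Azure.Tables.employee.Insert(Partition "men", Row "100", { Name = "isaac"; Salary = 500. })
```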
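A sketch of the strongly-typed overload; the generated entity type name (Azure.Domain.employeeEntity) follows the provider's naming convention but is an assumption here:

```fsharp
let entity =
    Azure.Domain.employeeEntity(
        Partition "men", Row "100",
        Dob = System.DateTime(1980, 1, 1),
        Name = "isaac",
        Salary = 500.,
        YearsWorking = 10,
        IsManager = false)

let typedResponse = Azure.Tables.employee.Insert(entity)
```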
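A sketch of upserting, reusing the entity from the previous snippet; TableInsertMode.Upsert is an assumed name for the overwrite flag:

```fsharp
let upserted = Azure.Tables.employee.Insert(entity, TableInsertMode.Upsert)
```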
Inserting batches
The Storage provider makes it easier to insert large amounts of data by automatically grouping large datasets into the appropriate batches, which for Azure Tables must all share the same partition key and contain no more than 100 entities each. You can insert batches for tables that either already have schema or do not.
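A sketch of a bulk insert, reusing the hypothetical NewEmployee record; the overload taking an array of (Partition, Row, payload) tuples is an assumption:

```fsharp
let rows =
    [| for i in 1 .. 250 ->
         Partition "bulk", Row (string i), { Name = sprintf "employee %d" i; Salary = float i } |]

// The provider groups these into per-partition batches of up to 100 operations.
let batchResponses = Azure.Tables.employee.Insert(rows)
```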
Deleting data
Deleting data is also extremely easy - simply supply the set of Partition / Row Key combinations that you wish to delete.
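A sketch of a keyed delete:

```fsharp
// Delete the supplied partition / row key combinations.
let deleteResponse = Azure.Tables.employee.Delete([ Partition "men", Row "100" ])
```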
Alternatively, you can delete an entire partition (although due to the limitations of the Table Storage service, this will require loading all row keys for that partition into memory).
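A sketch; the member name is an assumption:

```fsharp
// Loads the partition's row keys, then deletes the rows in batches.
let deletedPartition = Azure.Tables.employee.DeletePartition("men")
```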
Error handling
Operations should not raise exceptions. Instead, every write operation returns a TableResponse. This contains the Partition and Row Keys of the affected row as well as the associated TableResult, wrapped in a discriminated union to allow easy matching.
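A sketch of handling the response, reusing the hypothetical NewEmployee record from earlier; the case names used here (SuccessfulResponse, EntityError) are assumptions about the shape of the discriminated union:

```fsharp
let printOutcome response =
    match response with
    | SuccessfulResponse (entityId, httpCode) ->
        printfn "Wrote %A (HTTP %d)" entityId httpCode
    | EntityError (entityId, httpCode, errorCode) ->
        printfn "Failed to write %A: %s (HTTP %d)" entityId errorCode httpCode
    | _ -> ()

Azure.Tables.employee.Insert(Partition "men", Row "101", { Name = "sara"; Salary = 250. })
|> printOutcome
```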