WorldBank Provider
The World Bank is an international organization that provides financial and technical assistance to developing countries around the world. As one of its activities, the World Bank also collects development indicators and other data about countries worldwide. The data catalog contains over 8,000 indicators that can be accessed programmatically.
The WorldBank Type Provider makes the WorldBank data easily accessible to F# programs and scripts in a type-safe manner. This article provides an introduction.
Introducing the provider
The following example initializes a connection to the WorldBank using the GetDataContext method and then retrieves gross capital formation as a percentage of GDP for the UK:
open FSharp.Data
let data = WorldBankData.GetDataContext()
data.Countries.``United Kingdom``.Indicators.``Gross capital formation (% of GDP)``
|> Seq.maxBy fst
When generating the data context, the WorldBank Type Provider retrieves the list of all
countries known to the WorldBank and the list of all supported indicators. Both of these
dimensions are provided as properties, so you can use autocomplete to easily discover
various data sources. Most of the indicators use longer names, so we need to wrap the name
in double backticks (``).
The result of the Gross capital formation (% of GDP) property is a sequence of (year, value) pairs, one for each year with available data. Because fst selects the year from each pair, Seq.maxBy fst returns the most recent available value.
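To see why Seq.maxBy fst picks the most recent entry, here is a minimal sketch that runs without the type provider. It assumes the indicator behaves like a sequence of (year, value) pairs. The sample list and its numbers are invented for illustration and are not World Bank data.

```fsharp
// Hypothetical (year, value) pairs standing in for an indicator's data.
// The numbers are invented for illustration only.
let sample : (int * float) list =
    [ (2009, 15.1); (2011, 16.3); (2010, 15.8) ]

// fst picks the year out of each pair, so Seq.maxBy fst returns
// the pair with the largest (most recent) year, regardless of order.
let latestYear, latestValue = sample |> Seq.maxBy fst

printfn "%d: %.1f" latestYear latestValue
```

The same expression works even if the years arrive out of order, which is why it is a convenient way to get the latest data point.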
Using World Bank data asynchronously
If you need to download large amounts of data or run the operation without
blocking the caller, then you probably want to use F# asynchronous workflows to perform
the operation. The FSharp.Data package also provides the WorldBankDataProvider type, which takes
a number of static parameters. If the Asynchronous parameter is set to true, then the
type provider generates all operations as asynchronous:
type WorldBank = WorldBankDataProvider<"World Development Indicators", Asynchronous=true>
WorldBank.GetDataContext()
The above snippet specified "World Development Indicators" as the name of the data
source (a collection of commonly available indicators) and set the optional argument
Asynchronous to true. As a result, properties such as Gross capital formation (% of GDP)
will now have a type Async<(int * int)[]>, meaning that they represent an asynchronous
computation that can be started and will eventually produce the data.
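An asynchronous value does nothing until it is started. The following sketch shows two common ways to consume one, using a stand-in computation so that it runs without the type provider. The name fetchIndicator, its element type, and its values are assumptions made for this illustration, not part of the provider.

```fsharp
// Hypothetical stand-in for an asynchronous indicator property.
// The element type and values are assumptions made for this sketch.
let fetchIndicator : Async<(int * float)[]> =
    async {
        do! Async.Sleep 10 // simulate network latency
        return [| (2010, 15.8); (2011, 16.3) |]
    }

// Option 1: block the current thread until the data arrives.
let values = fetchIndicator |> Async.RunSynchronously

// Option 2: stay asynchronous and bind the result with let!.
let latest =
    async {
        let! data = fetchIndicator
        return data |> Array.maxBy fst
    }

printfn "%d values, latest: %A" values.Length (latest |> Async.RunSynchronously)
```

Option 2 is usually preferable inside larger workflows, because the caller decides when and how the computation is run.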
Downloading data in parallel
To demonstrate the asynchronous version of the type provider, let's write code that downloads the gross capital formation data for a number of countries in parallel. We first create a data context and then define an array with some countries we want to process:
let wb = WorldBank.GetDataContext()
// Create an array of countries to process
let countries =
    [| wb.Countries.``Arab World``
       wb.Countries.``European Union``
       wb.Countries.Australia
       wb.Countries.Brazil
       wb.Countries.Canada
       wb.Countries.Chile
       wb.Countries.Czechia
       wb.Countries.Denmark
       wb.Countries.France
       wb.Countries.Greece
       wb.Countries.``Low income``
       wb.Countries.``High income``
       wb.Countries.``United Kingdom``
       wb.Countries.``United States`` |]
To download the information in parallel, we can create a list of asynchronous
computations, compose them using Async.Parallel, and then run the (single) obtained
computation to perform all the downloads:
[ for c in countries -> c.Indicators.``Gross capital formation (% of GDP)`` ]
|> Async.Parallel
|> Async.RunSynchronously
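When the list of countries is long, it may be useful to limit how many downloads run at once. The sketch below uses the optional maxDegreeOfParallelism argument of Async.Parallel (available in recent FSharp.Core versions) with a stand-in download function, so it runs without the type provider. The download function, the country names, and the returned values are hypothetical.

```fsharp
// Hypothetical stand-in for downloading one country's indicator.
// It only sleeps briefly and returns a made-up value (the name's length).
let download (name: string) : Async<string * int> =
    async {
        do! Async.Sleep 10 // simulate network latency
        return name, name.Length
    }

let names = [ "Australia"; "Brazil"; "Canada"; "Chile" ]

// Run at most two downloads at a time; results keep the input order.
let results =
    Async.Parallel([ for n in names -> download n ], maxDegreeOfParallelism = 2)
    |> Async.RunSynchronously

for (name, value) in results do
    printfn "%s -> %d" name value
```

Because the result array follows the order of the input computations, each result can be matched to its country by position.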
Related articles
- API Reference: WorldBankDataProvider type provider