Did you know that, in addition to the Excel API, you can now also extract data from Datastream using Python or MATLAB?
For Python:
PyDatastream is a Python interface to the Thomson Dataworks Enterprise (DWE) SOAP API (non-free), with some convenience functions for retrieving Datastream data specifically. This package requires valid credentials for this API.
It is a Python API for time-series data that abstracts the database used to store the data, providing a powerful and unified API. It provides an easy way to insert time-series datapoints and automatically downsample them into multiple levels of granularity, so that time-series data can be queried efficiently at various time scales.
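To illustrate what downsampling a series into several levels of granularity means, here is a minimal sketch that uses only the Python standard library, so it runs without Datastream credentials. It is not PyDatastream's API or implementation: the `downsample` helper and the sample data are hypothetical and made up for illustration.

```python
from collections import defaultdict
from datetime import date, timedelta
from statistics import mean


def downsample(points, key):
    """Average (date, value) points that fall into the same bucket.

    `key` maps a date to a bucket label, e.g. (year, month) for monthly data.
    This helper is hypothetical and only illustrates the idea.
    """
    buckets = defaultdict(list)
    for day, value in points:
        buckets[key(day)].append(value)
    return {bucket: mean(values) for bucket, values in sorted(buckets.items())}


# Made-up daily series: 60 days starting 2020-01-01 with values 0.0 .. 59.0
start = date(2020, 1, 1)
daily = [(start + timedelta(days=i), float(i)) for i in range(60)]

# The same raw points, aggregated at two coarser granularities
monthly = downsample(daily, key=lambda d: (d.year, d.month))
yearly = downsample(daily, key=lambda d: d.year)

print(monthly)
print(yearly)
```

In practice, the pandas library offers the same kind of aggregation for date-indexed data, which is usually more convenient than hand-written bucketing.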
- This package is mainly meant to access Datastream. However, basic functionality (the request method) should work for other Dataworks Enterprise sources.
- The package uses the Pandas library (GitHub repo), which I found to be the best Python library for time series manipulations. Together with the IPython notebook, it is the best open source tool for data analysis. For a quick start with pandas, have a look at the tutorial notebook and the 10-minute introduction.
Alternatives for other scientific computing languages:
- MATLAB: MATLAB datafeed toolbox
- R: RDatastream. The API is deprecated and this package no longer works as of July 1, 2019.
Please note that only PhD students and faculty can request credentials. Please email business@library.columbia.edu if you have any questions.