Author Archives: Rohit Gernapudi

App4Apis (Update)

Phase: Final stages of completion

As we head into spring break, we want to update the status of the project. Before diving into the details, here is a quick introduction to the project.

App4Apis: A one-stop solution for accessing APIs that take their parameters in the query string and return a JSON object. We support two types of configuration. The first is a list of preset APIs (Geocode, Human Rights Web Archive, Internet Archive) whose structure we know well, so we provide a form for entering the information needed to query them. The second is more generic: the user provides an example URL (including the query parameters), from which we identify the API request pattern. This pattern is then used to query the API with a larger dataset in the next step. The user can either download the results or have them sent by email (helpful for particularly large datasets).

Status: In the last blog post, our to-do list was

  1. Finish a few pending screens towards our goal.
  2. Integrate the Geocode API into the presets.
  3. Develop the functionality that lets the user upload a file in the ad hoc query case.
  4. Make cosmetic changes so that the website is visually appealing.
  5. Exhaustive end-to-end testing.
  6. Deploy!

From that list, we have finished the layout of the pending screens, completed the integration of Geocode into the preset list, and made the website more visually appealing. We are now testing the preset flow end to end before moving on to the ad hoc query case. The redesigned screens of the website are attached at the end of this post. Any feedback is welcome.

We can now email the results to the user in chunks of 1000 results per CSV file. This lets users submit a task and receive the results later, which allows the system to handle large inputs and spares users from waiting for the results.
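As a rough illustration of the chunking described above, here is a minimal Python sketch that splits a result set into CSV files of 1000 rows each. The function names and the `results_part_N.csv` file names are made up for this example and are not taken from the App4Apis code; the actual email sending is left out.

```python
import csv
import io

CHUNK_SIZE = 1000  # results per CSV file, as described in the post


def chunk_rows(rows, size=CHUNK_SIZE):
    """Yield successive lists of at most `size` rows."""
    for start in range(0, len(rows), size):
        yield rows[start:start + size]


def rows_to_csv_text(header, rows):
    """Serialize one chunk (plus a header row) as CSV text."""
    buffer = io.StringIO()
    writer = csv.writer(buffer)
    writer.writerow(header)
    writer.writerows(rows)
    return buffer.getvalue()


def build_attachments(header, rows, size=CHUNK_SIZE):
    """Return (filename, csv_text) pairs, one per chunk.

    The file naming scheme is hypothetical.
    """
    return [
        (f"results_part_{index}.csv", rows_to_csv_text(header, chunk))
        for index, chunk in enumerate(chunk_rows(rows, size), start=1)
    ]


# 2500 made-up results become three files: 1000 + 1000 + 500 rows.
demo_rows = [[i, i * 2] for i in range(2500)]
attachments = build_attachments(["id", "value"], demo_rows)
print([name for name, _ in attachments])
```

Each pair could then be attached to an outgoing message, one file per attachment or one file per email, depending on the size limits of the mail server.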

Current tasks at hand are:

  1. Make the email sending process offline: Currently, we can send an email, but we cannot schedule it for a later time and let the user leave the screen. The user can leave the application after submitting a task, but we do not yet offer this as an explicit option. I am working on accepting the input as a task and scheduling it to run later.
  2. Complete testing of the preset APIs workflow.
  3. Complete testing of the ad hoc requests workflow.
  4. Deploy.
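One possible shape for the first task, sketched in Python under the assumption that a simple in-process background worker is enough. The `TaskQueue` class and the `send_results` stand-in are hypothetical and only illustrate the idea of accepting a task and returning to the user immediately; this is not how App4Apis implements it.

```python
import queue
import threading


class TaskQueue:
    """Run submitted jobs on a background thread so the caller does not wait."""

    def __init__(self):
        self._jobs = queue.Queue()
        self.finished = []  # ids of jobs that completed, in order
        self.failed = []    # ids of jobs that raised an exception
        worker = threading.Thread(target=self._run, daemon=True)
        worker.start()

    def submit(self, job_id, func, *args):
        """Queue a job and return immediately."""
        self._jobs.put((job_id, func, args))

    def _run(self):
        # Process jobs one at a time, in the order they were submitted.
        while True:
            job_id, func, args = self._jobs.get()
            try:
                func(*args)
                self.finished.append(job_id)
            except Exception:
                self.failed.append(job_id)
            finally:
                self._jobs.task_done()

    def wait(self):
        """Block until every queued job has been processed."""
        self._jobs.join()


sent = []


def send_results(address):
    # Stand-in for "query the API, build the CSV files, email them".
    sent.append(address)


tasks = TaskQueue()
tasks.submit("job-1", send_results, "user@example.edu")
tasks.wait()  # a web request handler would return right after submit()
print(tasks.finished)
```

An in-memory queue like this loses pending jobs if the server restarts, so a deployed version would probably want the tasks stored somewhere persistent.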

I expect to finish the first 3 tasks by the end of this month so that we can turn our focus to deployment from the start of next month.

Thanks,

Rohit Bharadwaj Gernapudi

The website screens are:

Capture_1

Capture_3

Capture_4

Capture_2

App4Apis (Pre-Alpha)

Phase: Pre-Alpha.

A month after the last update, we are back with another blog post to quickly update our status. We will start with a brief summary of our work and then give the mandatory status update.

What are we trying to do?

With data resources increasing around the world, and most of them exposed through a REST API, we felt the need for an application that eases access to these datasets. To that end, we intend to develop a web application that helps users query the API of their choice without developing any new tools or scripts. More details on the project can be found at link.

Current Status:

During our last update, we were working on designing the application and building the code to access the 3 APIs of interest to us (Geocode, Internet Archive, and Human Rights Web Archive). This month, we froze the design for our initial version. (Designer alert: the design is frozen only on paper and can be, and will be, modified until the project is released, and sometimes even after the release!) The screens below provide a barebones visualization of the application (by barebones, we mean that it is not visually appealing and reflects just the functionality). The cosmetic changes will be duly applied as we finish the skeleton and flesh of the application. The screens, in the order in which they are accessed, are:

Capture1

There are 2 flows in our application.

Flow 1: The user picks one of the preset APIs, uploads a CSV file, and gives the column numbers of the parameters required to query the API. We run the preset configuration against the CSV file and return a downloadable CSV with a results column populated from the API responses. The corresponding screens are:

Selecting the preset:

Capture2-1

File upload and values to be entered:

Capture2-2

File populated with the results:

Capture2-3
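Flow 1 can be sketched in a few lines of Python. This is only an illustration under some assumptions: the uploaded file has no header row, column numbers are counted from 1, and `fake_geocode` is a made-up stand-in for the real preset API call.

```python
import csv
import io


def run_preset(csv_text, column_numbers, query_api):
    """Append one results column to every row of an uploaded CSV.

    column_numbers: 1-based positions of the columns holding the
                    query parameters, as entered by the user.
    query_api:      callable that takes the selected values and returns
                    one result string (stand-in for the preset API call).
    """
    reader = csv.reader(io.StringIO(csv_text))
    output = io.StringIO()
    writer = csv.writer(output, lineterminator="\n")
    for row in reader:
        if not row:  # skip blank lines in the upload
            continue
        values = [row[number - 1] for number in column_numbers]
        writer.writerow(row + [query_api(values)])
    return output.getvalue()


def fake_geocode(values):
    # Hypothetical placeholder; a real preset would call the API here.
    return "lookup(" + " | ".join(values) + ")"


uploaded = "314,west 100 st,manhattan\n535,west 114 st,manhattan\n"
print(run_preset(uploaded, [1, 2, 3], fake_geocode))
```

Passing the API call in as a function keeps the CSV handling the same for every preset; only the callable changes from one API to the next.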

Flow 2: The user enters a new API URL. We expect this URL to be complete, with all the parameters necessary to query the API, because it serves as the example for all subsequent requests. For example,

https://api.cityofnewyork.us/geoclient/v1/address.json?houseNumber=314&street=west 100 st&borough=manhattan&app_id=7a4e791c&app_key=2015DigitalIntern

Here the base URL is https://api.cityofnewyork.us/geoclient/v1/address.json? and the query parameters are houseNumber=314&street=west 100 st&borough=manhattan&app_id=7a4e791c&app_key=2015DigitalIntern
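The split into base URL and query parameters can be sketched with Python's standard `urllib.parse` module. The helper names below are made up for illustration, the credentials are replaced with placeholders, and the base URL is returned without the trailing `?`; this is one way to derive a request pattern, not necessarily the one the application uses.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit


def split_example_url(url):
    """Split an example URL into its base URL and (name, value) pairs."""
    parts = urlsplit(url)
    base = f"{parts.scheme}://{parts.netloc}{parts.path}"
    params = parse_qsl(parts.query, keep_blank_values=True)
    return base, params


def build_url(base, params, overrides):
    """Rebuild a request URL, swapping in new values where given."""
    merged = [(name, overrides.get(name, value)) for name, value in params]
    return base + "?" + urlencode(merged)


# Placeholder credentials; substitute real ones before querying.
example = (
    "https://api.cityofnewyork.us/geoclient/v1/address.json"
    "?houseNumber=314&street=west 100 st&borough=manhattan"
    "&app_id=YOUR_APP_ID&app_key=YOUR_APP_KEY"
)

base, params = split_example_url(example)
print(base)
print([name for name, _ in params])

# Reuse the pattern for a new address; untouched parameters keep
# the values from the example URL.
print(build_url(base, params, {"houseNumber": "535", "street": "west 114 st"}))
```

Keeping the parameters as an ordered list of pairs preserves the order of the example URL, and `urlencode` takes care of escaping spaces in values such as the street name.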

We query the API with this sample URL to get all the output parameters and let the user select the fields of interest. The user can then give new values for the query parameters, and we display the results for the selected output parameters. The corresponding screens are:

User to enter the URL:

Capture3-1

User selects the output parameters:

Capture3-2

User gets the output:

Capture3-3
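To give a feel for how the output parameters might be discovered from a sample response, here is a small Python sketch that flattens nested JSON into dotted field names and then keeps only the fields the user picked. The sample response is invented for this example and is not the actual Geoclient output; the dotted naming scheme is also just an assumption.

```python
import json


def flatten(value, prefix=""):
    """Flatten nested JSON into a dict of dotted paths to leaf values."""
    flat = {}
    if isinstance(value, dict):
        for key, child in value.items():
            path = f"{prefix}.{key}" if prefix else key
            flat.update(flatten(child, path))
    elif isinstance(value, list):
        for index, child in enumerate(value):
            flat.update(flatten(child, f"{prefix}[{index}]"))
    else:
        flat[prefix] = value
    return flat


def select_fields(response_text, wanted):
    """Keep only the output fields the user selected (None if absent)."""
    flat = flatten(json.loads(response_text))
    return {name: flat.get(name) for name in wanted}


# Invented response, for illustration only.
sample = '{"address": {"houseNumber": "314", "latitude": 40.79, "longitude": -73.97}}'

# Field names that could be offered to the user for selection.
print(sorted(flatten(json.loads(sample))))

# Values for the fields the user picked.
print(select_fields(sample, ["address.latitude", "address.longitude"]))
```

The same `select_fields` call can then be applied to every response returned for the new query values, so each output row carries only the chosen fields.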

As you can see, the screens are designed only to implement the required functionality. That is why we are still in the pre-alpha phase of project development. The tasks in the pipeline are:

  • Finish a few pending screens towards our goal.
  • Integrate the Geocode API into the presets.
  • Develop the functionality that lets the user upload a file in the ad hoc query case.
  • Make cosmetic changes so that the website is visually appealing.
  • Exhaustive end-to-end testing.
  • Deploy!

We plan to finish these tasks by the end of January 2016 and release the first version of our product.

Until the next blog post, ciao.

Thanks,

Rohit

App4APIs

Hello Reader,

I am Rohit Bharadwaj, a graduate student in the Master's in Data Science program. I worked as a software engineer at FactSet for four years, where I used machine learning and natural language processing techniques to handle information retrieval problems. I joined the Digital Social Science Center (DSSC) as a digital centers intern this fall. I work with Jeremiah and Eric at DSSC, and we are trying to combine my technical knowledge with their domain expertise to come up with optimal solutions to a few technical problems at DSSC.

Goal: To provide all the digital libraries at Columbia with a comprehensive tool that facilitates access to any API a user wants to use.

Digital centers often receive requests from users for access to a particular API (Application Programming Interface), or a set of APIs, through which they can get the data of interest. User requests can be very diverse, ranging from several attributes for a specific data point to several data points and all of their associated attributes. Normally, we have scripts catering to each API, and students are expected to run these scripts on their own machines to get the data. Though this method works, it limits access (for Python scripts, a Python interpreter needs to be installed). Furthermore, the input data must be in a specific format, and often the output of the existing scripts is unclear and cannot be controlled.

As part of this internship, we aim to build a web application that can be accessed from anywhere within the university. The application will provide a web page where a student can enter the mandatory information required to get data from an API. For example, an API that returns geo-coordinates requires the address of the location, and the address must include a few mandatory attributes such as street number, borough code, etc. We aim to accept input in one of three formats (a form, a tab-separated text file, or a comma-separated text file) and plan to support three output formats: display on the screen, a tab-separated text file, or a comma-separated text file.
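Supporting both tab-separated and comma-separated files does not need two code paths. As a minimal sketch (the `read_rows` and `write_rows` helpers and the `"csv"`/`"tsv"` format labels are names invented for this example), Python's `csv` module can handle both by switching the delimiter:

```python
import csv
import io

# Format label -> delimiter; the labels are hypothetical.
DELIMITERS = {"csv": ",", "tsv": "\t"}


def read_rows(text, fmt):
    """Parse delimited text into a list of rows, skipping blank lines."""
    reader = csv.reader(io.StringIO(text), delimiter=DELIMITERS[fmt])
    return [row for row in reader if row]


def write_rows(rows, fmt):
    """Serialize rows as delimited text in the requested format."""
    output = io.StringIO()
    writer = csv.writer(output, delimiter=DELIMITERS[fmt], lineterminator="\n")
    writer.writerows(rows)
    return output.getvalue()


# Read a tab-separated upload and hand back a comma-separated file.
rows = read_rows("314\twest 100 st\tmanhattan\n", "tsv")
print(write_rows(rows, "csv"))
```

Values entered through a form could be turned into a single row of the same shape, so the rest of the application only ever deals with lists of rows.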

Currently, we have 3 APIs that we are looking to integrate into the web application. The APIs that we are planning to support initially and their use cases are listed below:

  1. Geocode: This API provides the geospatial coordinates of any given address in New York City. Students can potentially use it for location-based data analysis and pattern recognition tasks.

  2. Human Rights Web Archive: This API enables access to the human rights database index. Currently, it is used to verify whether the citations of a journal article are valid. It also provides a link to the article in the human rights index for validation purposes.

  3. Internet Archive: This API provides access to the Internet Archive database. It is used in a similar context to (2).

Our first aim is to support these 3 APIs through our web application, after which we will focus on extending the application to support any API that is of interest to a user. We aim to make the application as flexible as possible, so that a user can configure it with a new API under very few restrictions without hurting the application’s usability.

Current Status: The backend of our web application, which supports access to the first set of APIs, is largely complete, barring a few functions for different types of I/O operations. We are currently designing the user interface for the application and refining the backend code base.

Thanks,

Rohit Bharadwaj G.