Perceptual Bases for Virtual Reality: Part 2, Video

This is Part 2 of a post about the perceptual bases for virtual reality. Part 1 deals with the perceptual cues related to the spatial perception of audio.

The chief goal of the most recent virtual reality hardware is to simulate depth perception in the viewer. Depth perception in humans arises when the brain reconciles the images from each eye, which differ slightly as a result of the separation of the eyes in space. VR headsets position either a real or virtual (a single LCD panel showing a split screen) display in front of each eye. Software sends the headset a pair of views onto the same 3D scene, rendered from the perspective of two cameras in the virtual space separated by the distance between the user's eyes. Accurate measurement and propagation of this interpupillary distance (IPD) is important for effective immersion. The optics inside the Oculus Rift, for instance, are designed to tolerate software changes to the effective IPD within a certain operational range without requiring physical calibration. With all these factors taken into consideration, when users allow themselves to focus on a point beyond the surface of the headset display, they will hopefully experience a perceptually fused view of the scene, with the appropriate sense of depth arising from stereopsis.
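As a rough illustration (a minimal Python sketch under assumed names, not any vendor's SDK code), the per-eye camera positions can be derived from the tracked head position and the IPD like this:

```python
# Minimal sketch (not vendor SDK code): derive the two virtual camera positions.
# 'head_position' and 'head_right' (the head's right-pointing unit vector) are
# assumed to come from the tracking system; 'ipd' is in metres (~0.063 m average).

def eye_positions(head_position, head_right, ipd=0.063):
    """Return (left_eye, right_eye) positions offset along the head's right
    vector by half the interpupillary distance each."""
    half = ipd / 2.0
    left_eye = tuple(p - half * r for p, r in zip(head_position, head_right))
    right_eye = tuple(p + half * r for p, r in zip(head_position, head_right))
    return left_eye, right_eye

# Each frame the scene is rendered twice, once per eye position, and the two
# images are shown on the corresponding halves of the display.
left, right = eye_positions((0.0, 1.7, 0.0), (1.0, 0.0, 0.0))
```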

However, stereoscopic cues are not the only perceptual cues that contribute to depth perception. For example, the widely understood motion parallax effect is a purely monocular depth cue: as we move our head, we expect objects closer to us to appear to move faster than those that are further away. Many of these cues are experiential truisms: objects farther away seem smaller, opaque objects occlude those behind them, and so on. Father Ted explains it best to his perennially hapless colleague Father Dougal in this short clip.

Others are less obvious, though well known to 2D artists, like the effect that texture and shading have on depth perception. Nevertheless, each of these cues needs to be activated by a convincing VR rendering, implemented either in the client application code or in the helper libraries provided by the device vendor (for instance, the Oculus Rift SDK). Here, I discuss three additional contingencies that affect the sense of VR immersion beyond the typical depth-perception cues, to show how important a careful understanding of human perception is to producing convincing virtual worlds.

Barrel distortion

As this photograph, taken from the perspective of the user of an Oculus Rift, shows, the image rendered to each eye is radially distorted.
Image from inside Oculus Rift
This kind of bulging distortion is known as barrel distortion. It is intentionally applied (using a special shader) by either the client software or the vendor SDK to increase the user's effective field of view (FOV); the lenses used in the Oculus Rift correct for this distortion. The net result is an effective FOV of about 110 degrees in the case of the Oculus Rift DK1. This approaches the effective stereoscopic FOV for humans, which is between 114 and 130 degrees. Providing visual stimuli in our remaining visual field (our peripheral vision) is important for the perception of immersion, so other VR vendors are working on solutions that increase the effective FOV further. One solution is to use high-resolution display panels that are curved or tilted so as to encompass more of the user's real FOV (e.g. StarVR). Another is to exploit Fresnel lenses (e.g. Wearality Sky), which can provide a larger effective FOV than regular lenses in a more compact package suitable for use with a smartphone. Both methods have drawbacks: larger panels raise the total cost of the 'wrap-around' approach, while Fresnel lenses produce 'milky' images and their optical effects are more difficult to model in software than those of regular lenses.
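The idea behind the distortion pass can be sketched with a simple polynomial model (the coefficients below are illustrative, not the calibrated values shipped in any SDK): for each output pixel, the shader scales its lens-centred coordinates by a factor that grows with the squared radius and samples the undistorted render at that scaled position, which compresses the image toward the centre.

```python
# Illustrative sketch of a radial (barrel) distortion lookup, as might run in a
# post-processing shader. k1 and k2 are made-up coefficients; real headsets use
# values calibrated to their lenses.

def distorted_sample_coords(u, v, k1=0.22, k2=0.24):
    """Given output coordinates (u, v) centred on the lens axis, return the
    coordinates at which to sample the undistorted render."""
    r2 = u * u + v * v
    scale = 1.0 + k1 * r2 + k2 * r2 * r2
    return u * scale, v * scale
```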


Latency and smooth motion

An exceptionally important factor in the perception of immersion in virtual reality is the smoothness of scene updates in response to both user movement in the real world and avatar movement in the virtual world. Perhaps the biggest bottleneck to tackle in this process is the rendering pipeline. For this reason, high-end gaming setups are the norm for recommended virtual reality system specifications. Builds with 8GB or more of system RAM, a processor at least as capable as an Intel Core i5, and mid- to high-range PCIe graphics cards with at least 4GB of on-board VRAM are de rigueur. NVIDIA has partnered with component and system manufacturers to develop a commerce-led set of informal standards known as 'VR Ready'.

Even if the rendering pipeline can deliver frames to the display at a rate and reliability sufficient for the perception of fluid motion, the user motion detection subsystem must also feed the game at a sufficiently high rate, so that motions in the real world can be translated into motions in the virtual scene in good time. The Oculus Rift has an innovative, very high resolution head-tracking system that fuses accelerometer, gyroscope, and magnetometer data with computer vision data from a head-tracking camera, which infers the position of an array of infrared markers in real space. Interestingly, even very smooth motions in the virtual world can induce nausea and break the perception of immersion if those motions cannot be reconciled with normal human behavior. In cutscene animations, for instance, care must be taken not to move the virtual viewpoint in ways that do not correspond to the constraints of human body motion: rotating the viewpoint around the axis of the neck by more than 360 degrees causes disorientation and confusion.
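To give a flavour of what sensor fusion means here, the toy complementary filter below blends a gyroscope estimate (smooth but drifting) with an accelerometer estimate (noisy but drift-free); it is only a sketch of the general idea, not the Rift's actual tracking algorithm.

```python
# Toy complementary filter for a single rotation angle. This only illustrates the
# idea of fusing sensors; the Rift combines gyroscope, accelerometer, magnetometer,
# and optical data with far more sophisticated filtering.

def fuse_pitch(prev_pitch, gyro_rate, accel_pitch, dt, alpha=0.98):
    """Blend the integrated gyroscope rate with the accelerometer-derived angle."""
    gyro_estimate = prev_pitch + gyro_rate * dt   # smooth, but drifts over time
    return alpha * gyro_estimate + (1.0 - alpha) * accel_pitch
```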

Contextual depth cues can remedy confounding aspects of common game mechanics

Apart from those aspects of rendering that are application-invariant, VR game programming poses special problems for maintaining user immersion and comfort, because of the visual conventions of video game user interfaces. This excellent talk by video game developer and UI designer Riho Kroll indicates some solutions to potentially problematic representations of certain popular game mechanics.

Kroll gives the example of objective markers in a first-person game, which are designed to guide the player to the location on the map corresponding to the current game objective. Normally, objective markers are scale-invariant and unshaded, and therefore lack some of the important cues that allow the player to locate them in the virtual z-plane. Furthermore, objective markers tend to be excluded from occlusion calculations. The consequence is that if the player's view of the spatial context of an objective marker is completely occluded by another game object, almost all of the depth-perception cues for the location of the marker are unavailable. Kroll describes an inelegant but well-implemented solution: under such conditions of extreme occlusion, an orthogonal grid describing the z-plane is superimposed (via blending) over the viewport. This grid recedes into the distance, behaving as expected according to the conventions of perspective, and thereby provides a crucial, and sufficient, depth-perception cue in otherwise adversarial circumstances.

Automating the Boring Stuff!


I am Harsh Vardhan Tiwari, a first-year Master's student in Financial Engineering. I am working on web scraping, a technique of writing code to extract data from the internet. There are several packages available for this purpose in various programming languages; I am primarily using the Beautiful Soup 4 package in Python. There are various resources available online for exploring the functionality within Beautiful Soup, but the two resources I found the most helpful are:

  2. Web Scraping with Python: Collecting Data from the Modern Web by Ryan Mitchell

My project involves writing a fully automated program to download and archive data, mostly in PDF format, from about 80 webpages containing about 1,000 PDF documents in total. Imagine how boring it would be to download them manually, and more so if these webpages are updated regularly and you need to repeat the task every month. It would take hours and hours of work, visiting each webpage and clicking on every PDF attachment. And it's even worse if you have to repeat this regularly! But do we actually need to do this? The answer is NO!

We have this powerful tool called Beautiful Soup in Python that can help us automate the task with ease. About 100 lines of code can accomplish it. I will now give you an overall outline of what the code could look like.

Step 1: Import the Modules

So this program parses the webpage and downloads all the PDFs in it. I used Beautiful Soup, but you can use mechanize or whatever you want.
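The original snippet isn't reproduced here, so below is a minimal sketch of what the imports might look like, assuming the requests library alongside Beautiful Soup 4 (the original may well have used urllib or mechanize instead):

```python
# Sketch of the imports such a script needs.
import os
from urllib.parse import urljoin

import requests                 # fetches the page and the PDF files
from bs4 import BeautifulSoup   # parses the page and finds the links
```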





Step 2: Input Data

Now you enter your data: the URL that contains the PDFs and the download path where the PDFs will be saved. I also added headers to make the request look a bit more legitimate; you can add your own, though it's not really necessary. Beautiful Soup is then used to parse the webpage for links.
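Continuing the sketch, this step might look something like the following; the URL, download path, and headers are placeholders:

```python
url = "http://example.com/reports"        # placeholder: page containing the PDF links
download_path = "pdfs"                    # placeholder: folder where PDFs will be saved
headers = {"User-Agent": "Mozilla/5.0"}   # makes the request look like a browser

os.makedirs(download_path, exist_ok=True)

response = requests.get(url, headers=headers)
soup = BeautifulSoup(response.text, "html.parser")   # parse the webpage for links
```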





Step 3: The Main Program

This part of the program parses the webpage for links, checks whether each one has a .pdf extension, and then downloads it. I also added a counter so you know how many PDFs have been downloaded.
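A sketch of what that loop could look like, continuing from the previous steps:

```python
# Find every link, keep the ones ending in .pdf, download each, and count them.
count = 0
for link in soup.find_all("a", href=True):
    href = link["href"]
    if not href.lower().endswith(".pdf"):
        continue
    pdf_url = urljoin(url, href)                      # handles relative links
    filename = os.path.join(download_path, os.path.basename(href))
    pdf = requests.get(pdf_url, headers=headers)
    with open(filename, "wb") as f:
        f.write(pdf.content)
    count += 1
    print(f"Downloaded {count}: {filename}")

print(f"Done. {count} PDFs downloaded.")
```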




Step 4: Now Just to Take Care of Exceptions

Nothing much to say here; this just makes your program a bit prettier: that is, it crashes prettily instead of falling over on the first bad link.
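Continuing the sketch, one way to do this is to wrap the per-file download inside the loop, so that a single bad link or network error is reported and skipped instead of stopping the whole run:

```python
# Inside the loop from Step 3, the download can be wrapped like this:
try:
    pdf = requests.get(pdf_url, headers=headers, timeout=30)
    pdf.raise_for_status()
    with open(filename, "wb") as f:
        f.write(pdf.content)
    count += 1
except (requests.RequestException, OSError) as err:
    print(f"Skipping {pdf_url}: {err}")
```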



This post covers the case where you have to download all the PDFs on a given webpage. You can easily extend it to multiple webpages. In reality, different webpages have different formats and it may not be as easy to identify the PDFs, so in the next post I will cover the different webpage formats that I encountered and what I needed to do to identify all the PDFs in them.

Thanks for reading till the end and hope you found this helpful!

Blog Post 2.0

Well, I am naming this Blog Post 2.0, as there has been a serious revamping of my project goals. Firstly, the work on the 3D-modelling software is done; I will not be talking about that any more. My whole focus will be on the data collection software, i.e. Suma. Oh wait, that's off the table as well.

Will (my advisor) and I have come to the conclusion that, after numerous failed attempts at implementing Suma, we will work on our own platform for data collection, one molded to the needs of the Science and Engineering Library but scalable enough that other libraries can, with some amount of work, implement the platform as well.

Currently, I have created an array of six library computers, and the librarians can click on each computer to indicate that it is being used, or double-click to indicate that the seat is occupied but the PC is not being used and a personal laptop is being used instead.

The next things I am working on are adding the ability to indicate that a certain PC is non-functional, and storing that information across different data collection sessions.

The website is up and running (alpha-version) on

Lastly, let me introduce you to an amazing website that I used for the initial development of this site. It allows you to type in code for HTML5, CSS, and JS on the same portal, and it renders the code as soon as you finish typing. So, for the sake of testing, it is quite a good platform.

Hoping to complete a lot more by Blog Post 2.1

App4Apis (Update)

Phase: Final stages of completion

As we head into spring break, we want to update the status of the project. Before diving into the details, a quick introduction to the project.

App4Apis: a one-stop solution for accessing APIs that take parameters in the query and return a JSON object. We have two types of configurations. The first is a list of preset APIs (Geocode, Human Resources Archive, Internet Archive) where we are well aware of the structure of the API and provide a form for entering the information needed to query it. The second is more generic: the user provides an example URL (including the query parameters) from which we identify the API request pattern. This pattern is then used to query the API with a larger dataset in the next step. We let the user download the results, or the results can be sent by email (helpful for particularly large datasets).

Status: In the last blog post, our to-do list was

  1. Finish the few pending screens.
  2. Integrate the Geocode API into the presets.
  3. Develop the functionality to let the user upload a file in the ad hoc query case.
  4. Make cosmetic changes so the website is visually appealing.
  5. Do end-to-end exhaustive testing.
  6. Deploy!

From that list, we have finished the layout of the pending screens, completed the integration of Geocode into the preset list, and made the website more visually appealing. We are in the process of end-to-end testing of the preset flow before moving on to the ad hoc query case. The redesigned screens of the website are attached at the end of this post. Any feedback is welcome.

We are now able to send the results to the user by email in chunks of 1,000 results per CSV file. This lets users submit a task to the system and receive the results at a later point in time, enabling the system to handle large inputs and giving users the option not to wait for the results.
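As a sketch of the chunking step (the function and field names here are illustrative, not our actual code), the results can be split into CSV files of at most 1,000 rows each before being attached to the email:

```python
import csv

def write_chunks(rows, fieldnames, chunk_size=1000, prefix="results"):
    """Split a list of result rows (dicts) into CSV files of at most
    chunk_size rows each; return the file names so they can be attached."""
    filenames = []
    for i in range(0, len(rows), chunk_size):
        name = f"{prefix}_{i // chunk_size + 1}.csv"
        with open(name, "w", newline="") as f:
            writer = csv.DictWriter(f, fieldnames=fieldnames)
            writer.writeheader()
            writer.writerows(rows[i:i + chunk_size])
        filenames.append(name)
    return filenames
```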

Current tasks at hand are:

  1. Make the email sending process run in the background: currently we can send an email, but we cannot schedule it for a later point in time and let the user leave the screen. The user can leave the application after submitting the tasks, but we are not yet able to offer that as an option. I am working on accepting the input as a task and scheduling it to run later.
  2. Complete testing of the preset APIs workflow.
  3. Complete testing of the ad-hoc requests workflow. 
  4. Deploy.

I am expecting to finish the first three tasks by the end of this month so that we can turn our focus to deployment from the start of next month.


Rohit Bharadwaj Gernapudi

The website screens are:





Motivated Object-oriented programming: Build something from scratch!

This semester, I am putting together a 3D data visualization tool using Processing, to demonstrate the usefulness of this language for rapid prototyping of ideas for 3D graphics applications, including virtual reality. The code, documentation, and issue tracking are, as of recently, hosted on GitHub here.

If you’ve ever taken a programming course or encountered an intermediate online tutorial in a language that encourages object-oriented programming (OOP), you will have probably studied an example that describes the language feature in terms of a motivating example (or two). You might have designed Dogs that subclass Animals, which emit ‘woof!’s; Cars with max_speeds; Persons with boolean genders, and so on…

But these are toy examples: underwhelming, too straightforward to help in practice, and not all that interesting. Once you've completed the typical OOP introduction, you often end up with code that performs a function already implemented elsewhere a hundred times over, and with greater efficiency. That in itself is not so bad: we all need pedagogical examples simple enough to introduce in a 75-minute lecture. What's worse is that you've written code that you simply don't care about, and that's a recipe for demoralization. For the beginning programmer (yours truly), OOP becomes a convenient abstraction that bored you to death once or twice (and went 'woof!').

Nothing motivates design like the task of modeling a system with which you are otherwise quite familiar. Daniel Shiffman's free online book, The Nature of Code, focuses on the simulation of natural (physical) systems and gently introduces OOP as a means of modeling "the real world" (rather than a toy example of a Car with an Engine). Of course, Shiffman's models are simplistic too, relying on basic mechanics and vector math to animate their construction. Nevertheless, his examples and exercises leverage your best guesses about how the world works and challenge you to implement them in code, which is the nature of (many kinds of) programming: to take the big world and, in code, make a small world that (invariably imperfectly) reflects the large.

My advice, then, is to dive right in with a project that you care about, preferably one that requires many "moving parts", such as interdependent entities (nodes that talk to and consume others), user extensibility, and large amounts of object reuse: a model of a mini-universe of sorts. The model doesn't have to be physical: it can be of social relationships, knowledge, or data. All it has to do is matter.

In the remainder of this post I will sketch the design of the project I am currently working on.

Project design

The goal of the software is to generate 3D data visualizations from quantitative and qualitative data ingested from a CSV file. The display of the visualization should be separate from its construction, so that ultimately different display ‘engines’ can be swapped in and out to allow for the presentation of the visualization on, for example, the computer screen, a VR headset, a smartphone, or even in the form of a 3D-printed model.

As it stands, the engine has the structure depicted below. Incidentally, there is a well-documented and 'popular' domain-specific language for describing the relationships between objects, which can generate similar-looking diagrams from code: the Unified Modeling Language (UML). Taking it on would require its own post, so the picture below is a rough approximation of the design of the engine rather than a reproducible blueprint (such as that provided by UML and the like).

There is exactly one Scene in the application, which contains a list of PrimitiveGroups, which themselves contain Primitives. A Primitive corresponds to a single data point: one row of the CSV input. A Primitive has a location in 3D space as well as a velocity, which allows the Primitives to be restructured on the fly with animation. Some simple primitives are included: a sphere (PrimitiveSphere) and a cube (PrimitiveCube). New Primitives must subclass the Primitive class (which should never be instantiated: it is an abstract class). Primitives must have display() and update() methods; the display() method contains the calls to Processing's draw functions (e.g. box()). At this point, you realize that Primitive should (and can) be implemented as a Java interface. After all, Processing is Java at base. The Scene also contains an Axis object, which can be switched on or off.
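A compressed sketch of that hierarchy, in Processing-style Java, might look like the following (this is an illustration of the design, not the actual source; see the GitHub repository for that):

```java
// Sketch of the class hierarchy described above (illustrative, not the real code).
abstract class Primitive {
  PVector location;          // position of the data point in 3D space
  PVector velocity;          // allows animated restructuring on the fly
  abstract void display();   // calls Processing draw functions, e.g. box()
  abstract void update();    // e.g. advance location by velocity
}

class PrimitiveCube extends Primitive {
  float size;
  PrimitiveCube(PVector loc, float s) {
    location = loc;
    velocity = new PVector(0, 0, 0);
    size = s;
  }
  void display() {
    pushMatrix();
    translate(location.x, location.y, location.z);
    box(size);
    popMatrix();
  }
  void update() {
    location.add(velocity);
  }
}

class PrimitiveGroup {
  ArrayList<Primitive> primitives = new ArrayList<Primitive>();
}

class Axis {
  boolean visible = true;    // can be switched on or off
}

class Scene {
  ArrayList<PrimitiveGroup> groups = new ArrayList<PrimitiveGroup>();
  Axis axis = new Axis();
}
```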

How does the engine generate the Primitives per the contents of the data file? And according to what rules? In many ways, this is the heart of any visualization engine. The concept of a DataBinding is introduced.

A DataBinding realizes a one-way mapping from the columns of a data source (i.e. kinds of data) to the properties of a Primitive, by returning a PrimitiveGroup that contains one Primitive for every row in the data source (read by the DataHandler, which is a very thin wrapper around Processing's Table object).

The mapping is specified by the contents of a DataBindingSchema, which is a hashmap (read in from a YAML file, see examples 1 2) in which the keys are properties of Primitives and the values are column names in the data source. As a consequence, the DataBindingSchema specifies how the visual properties of Primitives respond to the data in the CSV file being read in. The DataBinding also has a validation method which throws a custom exception when the DataBindingSchema refers to column names and/or primitive properties that do not exist. It will ultimately also do type checking.
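As an illustration only (these property and column names are hypothetical, not taken from the repository's example schemas), a DataBindingSchema could look like this:

```yaml
# Hypothetical schema: each key is a Primitive property, each value the name of
# a column in the CSV data source that drives it.
x: longitude
y: latitude
z: elevation
```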


Papal Documents Project Part #3 by Yanchen Liu

This post focuses on my work to digitize and transcribe Western MS 82, a major canon law text. I describe some of the goals of this project and indicate why this particular text should be of interest to the scholarly community, and hence why its introduction into broader circulation is particularly warranted. I then conclude with a brief note updating my work to produce a new, expanded version of the Libraries' webpage Papal Documents: A Finding Aid.

Western MS 82, currently preserved in the Rare Book & Manuscript Library of Columbia University, is the most deluxe of the six surviving manuscripts of the Collectio Sinemuriensis, or Semur Collection. Modern scholars generally consider the second recension of this collection to be the earliest "Gregorian Reform" canonical collection, embodying the spirit of the great Church reform movement of the eleventh and twelfth centuries. According to Linda Fowler-Magerl, the initial version of this canonical collection was composed at Reims at the close of the tenth century. While we know something of this first version, many of its canons have not survived. The second version, however, is better attested: six manuscripts survive, dating from the second half of the eleventh century and the early twelfth century. The earliest and most complete of these, MS Semur-en-Auxois, BM M. 13, has given the collection its name. Columbia's Western MS 82 seems to have been copied during the first decades of the twelfth century in northern France. To date there has been no critical edition or systematic study of either the Collectio Sinemuriensis or Western MS 82.

Columbia acquired Western MS 82 in 2004. At the end of last year (2015) the manuscript was scanned into high-resolution images by the Preservation and Digital Conversion Division of Columbia. My project on this manuscript aims at producing a digital transcription that preserves nearly all the scribal and spelling features of the manuscript to provide historians, paleographers and medievalists not only with the contents of the full text but also textual clues that can provide valuable data for paleographical, linguistic, and historical investigations.

Western MS 82 is a parchment manuscript comprising 15 quires and 119 folios, with the first folio and the last two folios of the last extant quire missing. On the verso of the first flyleaf, an eighteenth-century annotation indicates that the manuscript contains "Notitia provinciarum ecclesiasticarum Galliae", also known as the Notitia Galliarum, and a "Collectio canonum". The texts on the first nine folios, comprising the Notitia Galliarum and the capitulatio (a list of the rubrics of the canons in the collection), are laid out in two columns. The remaining text on ff. 10-119, i.e. the canons themselves, is written in a single column. The canons are grouped into three books. Each book begins with an enlarged and decorated initial (on folios 10r, 49v, and 99v) nine to twenty-one lines high. The bodies of the canons are copied in a neat book hand. The rubrics of the canons, however, which are written in red ink, are very probably later additions, as they are often written in uneven lines at the end or even on the edge of paragraphs, and are set off by a curled line drawn to their left to distinguish them from the canons. The script has the appearance of early Proto-Gothic.

Compared with other manuscripts of the Collectio Sinemuriensis, one of the most significant and intriguing features of Western MS 82 is that it opens with the Notitia provinciarum ecclesiasticarum Galliae, a list of the metropolitan cities and provinces of Gaul.

MS 82 picture 2

We do not have the first half of the list, as a result of the missing first folio. Nevertheless, the surviving second half of the document, as well as the very fact that the compiler(s) of Western MS 82 chose to incorporate it, raises many questions and invites further examination. This document, also known as the Notitia Galliarum, was originally composed between the late fourth and the very early fifth century, before the massive Germanic invasions at the end of the first decade of the fifth century that isolated much of Gaul from the remainder of the fracturing Western Empire. There is still wide debate as to whether the Notitia Galliarum was initially created out of administrative interest by a political institution or by a local church. Nevertheless, the earliest manuscript of this document, dating to the seventh century, contains in its rubric "ut ordo exposcit pontificum", suggesting that at least since the early Middle Ages the Notitia Galliarum has been regarded and employed as a religious document mapping an ecclesiastical space.

MS 82 picture 3

The Notitia appears in several medieval canonical collections. There are, nevertheless, several peculiar facts about the specific version included in Western MS 82. In the first place, of all the surviving manuscripts of the Collectio Sinemuriensis, only Western MS 82 incorporates this document. Further, while the versions that appear in other texts have been updated, this version appears to have retained its late antique form almost entirely. In other words, it was not updated to represent the actual provincial configuration of early twelfth-century France. Only two new cities, civitas Nivernensium and civitas Nundunum, were added to what is found in the oldest extant version, included in a seventh-century manuscript. Some cities included in the list were in fact not bishoprics during the eleventh and twelfth centuries, e.g. civitas Oscismorum, civitas Diablintum, civitas Bononensium and civitas Tungrorum. Why did the patron of Western MS 82 want such a "dated" text to stand at the beginning of the manuscript? Why did he invoke the ancient divisions of Gaul? What kind of a "conceptual territory" did he envision this canonical manuscript to represent? There may never be precise answers, but two hypotheses can be offered.

The first possibility is that this text is a product of territorial conflicts between ecclesiastical institutions in northern France, which were certainly common in the high Middle Ages. The patron of Western MS 82 might have requested the incorporation of old Roman texts, such as the Notitia Galliarum, to buttress his territorial claims. The second possibility is that this ancient document, together with the lands of Gaul that it delineates, may have nothing to do with real space: the patron of Western MS 82, by incorporating in the manuscript an ancient text that depicts the administrative system of the area, may have sought to invoke a sense of authority.

Both of these possibilities point to an emphasis on the authority and legitimacy manifested through antiquitas. That emphasis is further accentuated in this manuscript through canons like the one that opens the second book (ff. 49v-50r), where the initial letter P is greatly enlarged and decorated (twenty-one lines high, in the form of a bird or dragon vomiting tendrils and flowers). The red-ink rubric of the canon reads "Quod non liceat apostolicis successoribus constituta predecessorum infringere." This canon, possibly drafted by Hincmar of Reims, is ascribed to Pope Symmachus (r. 498-514) and prohibits the successors of the ancient popes from abrogating the administrative and legal decisions of their predecessors.

MS 82 picture 1

At the same time, Western MS 82, through this specific canon, appears to employ antiquitas to suppress, rather than buttress, the legislative power and ecclesiastical rights of the contemporary papacy. Such a feature would seem to distinguish this canon law manuscript from the kind of canonical collection likely to be favored by the reform papacy of the eleventh and twelfth centuries, which generally asserted the intrinsic juridical power of the popes. Hence, the widely shared view among modern scholars that this canon law manuscript is essentially a product of the Gregorian Reform may oversimplify the character of Western MS 82. Last but not least, this canon, together with other canons in the manuscript, appears to indicate a connection between Western MS 82 and Reims. These facts seem to point us to the historical political tension between the archiepiscopal see of Reims and the papacy, and the power relations between Rome and Reims during the Middle Ages.

The digital transcription of Western MS 82 will hopefully make this manuscript more accessible to such investigation. Framing the transcription with XML tags in accordance with the rules of the Text Encoding Initiative will also, hopefully, enable users of the project to navigate and position themselves easily within the canons, to search for specific terms and variants within the codex and retrieve data in a more orderly fashion, and eventually to conduct comparisons between Western MS 82 and the other surviving manuscripts of the Collectio Sinemuriensis more easily.

In closing, I should note that my project of updating the Papal Documents finding aid is under final review. Most of the entries now have an annotation that introduces and summarizes the work. In addition, the structure of the whole document has been adjusted to enhance its navigability. The "Background Bibliography" section has been considerably augmented, in order to help researchers and students grasp the significance of individual works, and the history and terminology relevant to the study of papal documents.

Knight Lab Timeline JS3, an easy to use multimedia timeline tool

TimelineJS is an easy-to-use timeline creator developed by Northwestern University's Knight Lab. The timelines created with TimelineJS can include a variety of multimedia, are easily published and embedded in websites, and are generated on demand for the user. This means that a timeline can be updated after being published, a very useful feature for projects that continue past a publication date. Further, TimelineJS is open source, highly customizable (for those technically inclined), and available under the Mozilla Public License (MPL), version 2.0. Most importantly, TimelineJS allows you to create beautiful projects with relative ease. You can see some examples of timelines created with TimelineJS by clicking here.

The documentation for TimelineJS is extensive and covers most of the basic topics, so there is no point in writing a basic tutorial in this blog post; a timeline can be created by following the four basic steps listed here. Instead of giving a crude tutorial, I will use this post to highlight some potential uses for this tool that are not immediately evident. With a little bit of work, it can be a fantastic tool for film education and film production planning.

Two other uses for this tool come to mind for filmmakers and film students. The first is creating timelines that break down movies shot by shot in order to explain them more easily. I have prepared a timeline for just this purpose as an example; it can be seen below. Click here to see it in full screen.

The second use is production planning. With a little help, a director could easily upload their storyboards and visual inspirations for a scene into a timeline like this, making it easy for the crew to get a chronological list of inspirations for any given scene. A crew with iPads could have easy access to the entire archive of visual references for a scene without much trouble.

Timeline JS is a wonderful tool that, with a little bit of creativity, can be used for a variety of different purposes.

App4Apis (Pre-Alpha)

Phase: Pre-Alpha.

A month on from the last update, we are back with another blog post to quickly report our status. We will start with a brief summary of our work and then the mandatory status update.

What are we trying to do?

With the increase in data resources around the world, most of which are exposed through REST APIs, we felt the need to develop an application that eases access to these datasets. To that end, we intend to develop a web application which can help users query the API of their choice without developing any new tools or scripts. More details on the project can be found at link.

Current Status:

During our last update, we were working on designing the application and building the code to access the 3 APIs of our interest (Geocode, Internet Archive, and Human Resources). This month, we froze the design for our initial version. (Designer alert: the design is frozen only on paper and can be, and will be, modified until the project is released, and sometimes even after the release!) The screens below provide a barebones visualization of the application (by barebones, I mean not visually appealing, reflecting just the functionality). Cosmetic changes will be duly applied as we finish the skeleton and flesh of the application. The screens, in the order of their access, are:

Capture1


There are 2 flows in our application.

Flow 1: The user picks one of the preset APIs, uploads a CSV file, and gives the column numbers of the parameters required to query the API. We run the preset configuration over the CSV file and return a downloadable CSV with a results column populated from the API responses. The corresponding screens are:

Selecting the preset:

Capture2-1

File upload and values to be entered:


File populated with the results:







Flow 2: The user enters a new API URL. We expect this URL to be complete, with all the parameters necessary to query the API, since it is used as the example for all subsequent requests. For example, 100 st&borough=manhattan&app_id=7a4e791c&app_key=2015DigitalIntern

Here the base URL is and the query parameters are houseNumber=314&street=west 100 st&borough=manhattan&app_id=7a4e791c&app_key=2015DigitalIntern
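As a sketch of how such a pattern can be extracted (using Python's standard urllib.parse; the URL below is a placeholder rather than the real endpoint or credentials):

```python
from urllib.parse import urlparse, parse_qs, urlencode, urlunparse

# Placeholder example URL; the real endpoint, app_id, and app_key are omitted here.
example = ("https://api.example.com/address.json"
           "?houseNumber=314&street=west 100 st&borough=manhattan&app_id=XXX&app_key=XXX")

parts = urlparse(example)
base_url = urlunparse((parts.scheme, parts.netloc, parts.path, "", "", ""))
params = {key: values[0] for key, values in parse_qs(parts.query).items()}

# Subsequent requests substitute fresh values for the query parameters and
# re-append them to the base URL.
params.update({"houseNumber": "100", "street": "broadway"})
new_url = base_url + "?" + urlencode(params)
```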

We query the API with this sample URL to get all the output parameters and allow the user to select the fields of interest. We then let the user give new values for the query parameters and display the results for the specified output parameters. The corresponding screens are:

User to enter the URL:

Capture3-1

User selects the output parameters:

Capture3-2

User gets the output:



As you can see, the screens are designed only to implement the required functionality; that is why we are still in the pre-alpha phase of development. The tasks in the pipeline are:

  • Finish the few pending screens.
  • Integrate the Geocode API into the presets.
  • Develop the functionality to let the user upload a file in the ad hoc query case.
  • Make cosmetic changes so the website is visually appealing.
  • Do end-to-end exhaustive testing.
  • Deploy!

We are planning to finish these tasks by the end of January 2016 and release the first version of our product.

Until next blog post, ciao.



OpenSCAD & SUMA updates (DCIP Blog Post 1.1)

As promised in the last blog post, here is a write-up about the (amazing) code-based 3D modelling software. OpenSCAD is a 3D modeling program based on constructive solid geometry (CSG), i.e. a complex surface or object can be created by using Boolean operators to combine simpler objects. There are some basic commands one should know how to use before working with the software, and once you do, 3D modelling becomes a "relative" piece of cake!

These 10 commands fall into three categories: shapes (cube, sphere, cylinder), transforms (translate, scale, rotate, mirror), and CSG (Boolean) operations (union, difference, intersection). Most of what you need can be done with these 10 commands. So, to summarize, 3D modelling in OpenSCAD is as simple as mastering the usage of 10 commands and then decomposing your problem (the 3D model) into a combination of the basic shapes using CSG.
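As a quick illustration of how these commands combine (a made-up snippet, not the model pictured below), a plate with a bump on top and a hole through it can be written as:

```openscad
// Combine shapes, transforms, and Boolean operations:
difference() {
    union() {
        cube([40, 30, 5], center = true);               // base plate
        translate([0, 0, 5]) sphere(r = 8);             // a bump on top
    }
    rotate([0, 0, 45])
        translate([10, 0, 0])
            cylinder(h = 30, r = 3, center = true);     // hole cut through both
}
```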

A use of the above commands, showing just how powerful they can be, appears in the image below.

power of OpenSCAD commands

The other project that I mentioned in my last post is Suma. There are two possible ways to go about it. A trial version can be developed using Vagrant and VirtualBox (they have put a version on their machine, and it can (theoretically) be accessed from your own PC). I have been in constant touch with the group at NCSU Libraries who developed this system, and they have been helping me try to debug it, but there has been only limited success in this direction. I am able to run the client part of the system, but there are issues with the Ansible (automation platform) file, and I am waiting for them to release version 2.0 of it. It has been quite an informative interaction, and they have been very helpful in this regard.

The other way is to go all in and develop a complete system, coding all the server and client pages and keeping everything on the same machine (instead of logging into a virtual machine as in the trial version). I have done a fair bit of coding, along with SQL integration for creating the necessary databases for the program and managing the correct permissions for admin and user accounts. I am still working on the server side of things, and since this is the first time I am working on something like this, there is a lot of reading involved as well. The plan is to first develop the framework on a PC, and then repeat it on the more restricted mobile/tablet platforms once the system works seamlessly on the PC.


Perceptual Bases for Virtual Reality: Part 1, Audio

An important part of creating a truly immersive VR experience is the accurate representation of sounds in space to the user. If a sound source is in motion in virtual space, it stands to reason that we ought to hear the sound source moving.

One solution to this problem is to use an array of loudspeakers arranged in space around the user. This technique, so-called 'ambisonics', is not only expensive but also requires space far in excess of the footprint of the average user seated at a consumer-grade computer. For example, Tod Machover's (MIT) setup is shown below, and is typical of some ambisonic setups. The 5.1 standard for surround sound in home theatres (or related extensions, such as 7.1, meaning 7 speakers plus a subwoofer) is consumer-grade technology that operates on a similar principle. Clever mixing and editing of movie soundtracks aims to trick the listener into perceiving tighter sound-image associations by cueing sounds, the sources of which are apparent from the visual content of the media being displayed or projected, in a location in the sonic field corresponding to their virtual source.

Ambisonic sound setup with a circular array of Bowers and Wilkins loudspeakers surrounding a listener

Tod Machover’s Ambisonic Setup (Source:

It might seem counterintuitive, but most of the psycho-acoustical cues that humans use to localize sounds in space can be replicated using headphones. This follows from the unsubtle observation that we have only two ears, and from the slightly more subtle reflection on the results of experiments designed to establish precisely which sources of information our brains depend on in determining the perceived location of a sound source. This behavior is known in the psychological literature as acoustic (or sound) source localization.

Jobbing programmers, however, don't have to wade through the reams of scientific research that substantiate the details of the various mechanisms of acoustic source localization, as well as their limitations and contingencies. The 3D Audio Rendering and Evaluation Guidelines (Level 1) spec provides baseline requirements for a minimally convincing 3D audio rendering, along with physiological and psychological justifications for those requirements. Whilst it is exceptionally outdated and outmoded, it still provides a useful overview of the important perceptual bases for VR audio simulation. In particular, this specification is one of the motivating documents in the design of the (erstwhile) open source OpenAL 3D audio API and its descendants. In the remainder of this post, I briefly describe the most important binaural (i.e. stereo) audio cues thought to facilitate acoustic source localization in the human brain.

Interaural Intensity Difference

In plain terms: the intensity of the sound entering your ears will be different for each ear, depending on the location of the sound source with respect to your head. This is due to two factors:

  1. sound attenuates in intensity over time as it passes through a medium, your ears are a non-zero distance apart, and sound propagates at a finite speed
  2. (more significantly) your head may ‘shadow’ the source of the sound when the source is off-center

You might think that you don’t have a big head, but it’s big enough to make a difference!

Interaural Time Difference

Since sound travels at a relatively unchanging velocity through the most common media we may wish to model virtually, the time that it takes for sound to propagate from the source to each ear differs very slightly. Our minds are sensitive to these differences, perhaps owing to the evolutionary utility of knowing the location of noisy predators (or prey). Knowing that the speed of sound is roughly constant, the mind performs a rudimentary triangulation in order to locate the sound source in the relevant plane, relative to the listener.
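Under a simple model that ignores the curvature of the head, the interaural time difference for a distant source is roughly (ear separation x sin(azimuth)) / speed of sound; a quick sketch, using an assumed typical ear separation:

```python
import math

def interaural_time_difference(azimuth_deg, ear_distance=0.18, speed_of_sound=343.0):
    """Rough ITD in seconds for a far-away source at the given azimuth
    (0 degrees = straight ahead), ignoring the curvature of the head."""
    return ear_distance * math.sin(math.radians(azimuth_deg)) / speed_of_sound

# A source 90 degrees to one side arrives at the nearer ear roughly half a
# millisecond before it arrives at the farther ear.
print(interaural_time_difference(90))   # ~0.00052 s
```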

Audio-Visual Synergy

Finally, a less physiological cue: the coincidence of aural and visual stimuli tricks the brain into attributing the contemporaneous sound to the source denoted or signified by the visual stimulus. By keeping latency between aural and visual stimuli low, we improve the likelihood of the perception of audio-visual synergy. This, in combination with careful modeling of the phenomena above (amongst many others), contributes towards a more immersive aural experience. In turn, this improves the credibility of VR simulations that have an aural component.