On June 13, 2025, ProQuest announced that the popular text-mining environment TDM Studio now includes a beta feature that lets users integrate GPT models into their R or Python workbench notebooks. TDM Studio, available at no cost to researchers at Columbia, opens up the ProQuest databases to large-scale analyses of the full-text corpora. It comes […]
Libraries Acquire Full-Text Corpus Data
In December, the Libraries acquired twelve full-text corpus datasets, compiled by Mark Davies, a retired professor of linguistics from Brigham Young University. The corpora will help Columbia researchers across many disciplines to understand how language is and has been used around the world, and they serve as another mark in the Libraries’ commitment to supporting […]
Data Engineering in Python with Polars 1
Today, we begin learning Polars, a Python data analysis library that offers an alternative to pandas. We’ll learn how Polars is similar to and different from pandas and why it is an appealing choice in 2025 for ETL (extract-transform-load) operations. […]
SQL and NoSQL Databases in Python with Pandas
Today we looked at using databases in Python. […]
Git and Gitting Organized (Also, Text Editing)
Today we talk a bit about project management and see how to use Git with VS Code. […]
Resource Spotlight: newly purchased Dave Leip election datasets
Research Data Services (RDS) just purchased a few new election datasets from Dave Leip: “United States Presidential Results” and “US Presidential Primary Election Results for Republican Party and Democratic Party”. All the RDS-licensed Dave Leip datasets can be found in CLIO. This resource is available only to current Columbia affiliates. Please […]
Day One Exploratory Data Analysis with JavaScript
Today we return to our Observable notebooks to learn how to do lightning-fast exploratory data analysis! […]
Resource Spotlight: University of Florida Election Lab
The University of Florida Election Lab Data Resources is a new resource that presents precinct-level data for US national, state, and local elections for recent election years (as far back as 2010). […]
Day One Generating Jamstack Websites
Today we ported our knowledge of the Observable workflow into making our own bespoke Jamstack websites. This was a rocky road, but everyone won in the end! […]
Installing Observable Framework from Zero
On November 7, we’ll be deconstructing websites built with Observable’s “Framework” framework for making data-driven web apps like dashboards. But before we can deconstruct, we have to construct. This short video shows you how to get an Observable Framework site running in four steps. […]