Python Open Lab Week 3: Feb 26

For week three we decided to tackle strings, string methods, and functions:

String Methods

Strings are amongst the most popular types in Python. We can create them simply by enclosing characters in quotes. Python treats single quotes the same as double quotes. Creating strings is as simple as assigning a value to a variable.

The string data type has many methods. Here are some of the most commonly used methods of string objects:

Examples:

str.upper() – returns a copy of the string in uppercase

str.lower() – returns a copy of the string in lowercase

str.join() – concatenates the strings in a list (or other iterable), placing the string it is called on between each pair of elements.

str.split() – returns a list of the substrings that are separated by whitespace if no other separator is given.

str.replace() – returns a new string in which occurrences of a given substring are replaced with another; the original string is unchanged.
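
As a quick illustration of the five methods above, here is a minimal sketch. The sample text and variable names are made up for this example and are not from the lab itself.

```python
# Hypothetical sample text, chosen only for illustration.
greeting = "Hello, world"

upper_text = greeting.upper()                    # copy in uppercase
lower_text = greeting.lower()                    # copy in lowercase
words = greeting.split()                         # split on whitespace
joined = "-".join(["a", "b", "c"])               # "-" goes between the items
replaced = greeting.replace("world", "Python")   # new string, original unchanged

print(upper_text)   # HELLO, WORLD
print(lower_text)   # hello, world
print(words)        # ['Hello,', 'world']
print(joined)       # a-b-c
print(replaced)     # Hello, Python
```

Note that none of these calls changes greeting itself, because strings are immutable; each method returns a new value.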

Boolean methods:

Method             True if
str.isalnum()      the string consists only of alphanumeric characters
str.isalpha()      the string consists only of alphabetic characters
str.islower()      the string’s alphabetic characters are all lower case
str.isnumeric()    the string consists only of numeric characters
str.isspace()      the string consists only of whitespace characters
str.istitle()      the string is in title case
str.isupper()      the string’s alphabetic characters are all upper case
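
The checks in the table can be tried out with a short sketch like the one below. The sample strings are hypothetical and picked only so that each check has an obvious answer.

```python
# Hypothetical sample strings, one per check in the table above.
results = {
    "isalnum": "abc123".isalnum(),     # letters and digits only
    "isalpha": "abc".isalpha(),        # letters only
    "islower": "hello".islower(),      # all cased characters are lowercase
    "isnumeric": "2018".isnumeric(),   # numeric characters only
    "isspace": "   ".isspace(),        # whitespace only
    "istitle": "Open Lab".istitle(),   # each word starts with a capital
    "isupper": "PYTHON 3".isupper(),   # digits and spaces are ignored
}

for name, value in results.items():
    print(name, value)                 # every line prints True

# A counterexample: digits make isalpha() fail.
mixed_is_alpha = "abc123".isalpha()
print(mixed_is_alpha)                  # False
```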

String comparison

You can use ( > , < , >= , <= , == , != ) to compare two strings. Python compares strings lexicographically, i.e. character by character, using the numeric code (Unicode code point, which matches ASCII for English letters) of each character.

Suppose you have str1 as “Mary” and str2 as “Mac”. The first characters of str1 and str2 (M and M) are compared. As they are equal, the second characters (a and a) are compared. Because they are also equal, the third characters (r and c) are compared. And because ‘r’ has a greater ASCII value than ‘c’, str1 is greater than str2.
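
The comparison described above can be checked directly. This is a small sketch; the last comparison is an extra example (not from the lab) showing that uppercase letters sort before lowercase ones.

```python
str1 = "Mary"
str2 = "Mac"

# ord() gives the numeric code of a character.
print(ord("r"), ord("c"))      # 114 99

is_greater = str1 > str2
print(is_greater)              # True, because 'r' (114) > 'c' (99)

# Extra example: 'a' is 97 and 'B' is 66, so "apple" is NOT less than "Banana".
apple_first = "apple" < "Banana"
print(apple_first)             # False
```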

Functions:

You use functions in programming to bundle a set of instructions that you want to use repeatedly or that, because of their complexity, are better self-contained in a sub-program and called when needed. In other words, a function is a piece of code written to carry out a specific task. To carry out that task, the function may or may not need one or more inputs. When the task is carried out, the function may or may not return one or more values.

There are three types of functions in Python:

  • Built-in functions, such as help() to ask for help, min() to get the minimum value, print() to print an object to the terminal
  • User-Defined Functions (UDFs), which are functions that users create to help them out
  • Anonymous functions, which are also called lambda functions because they are declared with the lambda keyword rather than the standard def keyword.
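
Here is a minimal sketch with one example of each of the three types listed above. The names rectangle_area and double are hypothetical, chosen just for this illustration.

```python
# 1. Built-in function: min() ships with Python.
smallest = min(4, 2, 9)
print(smallest)                  # 2

# 2. User-defined function: declared with the def keyword.
def rectangle_area(width, height):
    """Return the area of a rectangle."""
    return width * height

print(rectangle_area(3, 5))      # 15

# 3. Anonymous (lambda) function: declared with the lambda keyword.
double = lambda x: x * 2
print(double(4))                 # 8
```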

Functions vs Methods:

A method refers to a function which is part of a class. You access it with an instance or object of the class. A function doesn’t have this restriction: it just refers to a standalone function. This means that all methods are functions but not all functions are methods.
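
To make the distinction concrete, here is a small sketch. The Greeter class and the greet function are hypothetical names used only for this comparison.

```python
word = "python"

print(len(word))        # len() is a function: called on its own -> 6
print(word.upper())     # upper() is a method: called through a str object -> PYTHON

# A standalone function...
def greet(name):
    return "Hello, " + name

# ...and the same logic written as a method of a class.
class Greeter:
    def greet(self, name):
        return "Hello, " + name

greeter = Greeter()
print(greet("Ana"))             # Hello, Ana
print(greeter.greet("Ana"))     # Hello, Ana (needs an instance to be called)
```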

 

Spring 2018 R Open Lab: More Fundamentals

Last week, we walked through the R starter kit, which introduced most of the useful basic concepts in R such as vectors, matrices, and loops. This week, we continued with more R basics and demonstrated examples. The goal of this lab is to give attendees a better understanding of how the R language works so that they can translate their specific real-life problems into R code smoothly.

Here is the link to the script for this open lab:

https://drive.google.com/file/d/1SePSSVF980EJfCxv4eZ4AQlCP7vlTbe7/view?usp=sharing

The script also has comments and explanations. You can open it with RStudio and run it step by step.

Thank you all for showing up. If you have further questions regarding topics covered in the material, please feel free to drop by during next week’s lab, email me, or leave a comment. See you all next week!

Python Open Lab Feb 19

In week two of the open labs, we continued building on the starter kit and worked on Python data structures. We discussed loops (both for and while), dictionaries, and dictionary methods.

A while loop executes a piece of code over and over again for as long as a given condition remains true.

When constructing a while loop, you have to ensure that it has all three of the following elements:

  1. The while keyword
  2. A condition that evaluates to either True or False
  3. A block of code you want to repeat

Example:

# Pick a starting number
number = 2

# Set the condition of the while loop
while number < 5:
    print("Thank you")
    # Increment the value of the variable "number" by 1
    number = number + 1

When you run this code you produce the following result:

Thank you

Thank you

Thank you

 

For loop:

The for loop is used to iterate over the elements of a sequence. It is often used when you have a piece of code that you want to repeat a set number of times:

Example:

Let’s say you have a list:

fresh_fruits = ["Apple", "Banana", "Peach", "Strawberry"]

for fruits in fresh_fruits:
    print(fruits)

That is, each element of the list fresh_fruits is assigned in turn to the variable fruits, and the loop body prints out the variable fruits.

Dictionaries:

Dictionaries are similar to what their name suggests: a dictionary. In a dictionary, you have an ‘index’ of words, and for each of them a definition. In Python, the word is called a ‘key’, and the definition a ‘value’. The values in a dictionary aren’t numbered, and you don’t look them up by position; the key does that job instead. You can add, remove, and modify the values in dictionaries. Example: telephone book.

In the Python dictionary, each key is separated from its value by a colon (:), the items are separated by commas, and the whole thing is enclosed in curly braces.

An empty dictionary without any items is written with just two curly braces, like this: {}. Keys within a dictionary must be of an immutable data type such as strings, numbers, or tuples.

Example:

student = {'Name': 'Michael', 'Age': 7, 'Class': 'First'}

print("student['Name']:", student['Name'])
print("student['Age']:", student['Age'])

When we run this code, it produces the following result:

student['Name']: Michael
student['Age']: 7

Dictionary Methods:

  1. clear(): Removes all items from the dictionary in place
  2. copy(): Returns a new dictionary with the same contents (a shallow copy)
  3. fromkeys(): Creates a new dictionary from the given keys
  4. get(): A way to access dictionary elements. Returns None (or a default value you supply) when the key is not present in the dictionary.
  5. has_key(): Boolean method to check if a key is present in the dictionary. It exists only in Python 2; in Python 3, use the in operator instead (key in dictionary).
  6. pop(): Returns the value corresponding to the key and also removes the pair from the dictionary
  7. popitem(): Removes and returns a (key, value) pair. In Python 3.7 and later this is the most recently inserted pair; in earlier versions the pair is arbitrary.
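
The methods above can be tried out with a short sketch like this one, written for Python 3 (3.7 or later for the popitem() result shown). The variable names are hypothetical; the sample values are the ones from the example earlier in this post.

```python
student = {"Name": "Michael", "Age": 7, "Class": "First"}

backup = student.copy()                   # independent shallow copy
blank = dict.fromkeys(["Name", "Age"])    # new dict, every value is None
missing = student.get("School")           # key absent, so None (no error)
has_age = "Age" in student                # Python 3 replacement for has_key()

age = student.pop("Age")                  # removes the pair, returns the value
last_pair = student.popitem()             # removes the last inserted pair

print(blank)        # {'Name': None, 'Age': None}
print(missing)      # None
print(has_age)      # True
print(age)          # 7
print(last_pair)    # ('Class', 'First')

student.clear()                           # empties the dictionary in place
print(student)      # {}
print(backup)       # {'Name': 'Michael', 'Age': 7, 'Class': 'First'}
```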

 

Highlights from the Data Collection: U.S. Election Data

The Libraries Numeric Data Catalog Holdings has some interesting data on the United States election results starting from 1912 to the most recent election year. Included are data on not only the presidential elections, but also gubernatorial, senatorial, congressional, and special senatorial elections.

The source of the data is Dave Leip’s Atlas of U.S. Presidential Elections, which you may already be familiar with. If you’re not, it’s a source that major news outlets such as The Atlantic, The Wall Street Journal, and The New York Times have used for election reporting.

There are some fascinating things you can discover with the data sets. The following are some findings on the most recent presidential election in 2016.

Here is a visual representation of the total number of votes cast by state:

Total Votes by State – 2016 Presidential Election. Note: Alaska and Hawaii had 318,608 and 428,937 total votes cast, respectively.

 

And here, we can see the total number of votes cast by county across the country.  

Total Votes Cast by County – 2016 Presidential Election. Note: Alaska did not have any data on total votes cast by county.

 

Below, we have separated the Republican and Democratic votes and have displayed them by county as a percentage of the total votes per county.

2016 Presidential Election – Republican and Democrat Percentage of Total Votes, by County.

 

The data also has information about other parties that received more than 5% of the total votes. We can see here that several counties around Salt Lake City, Utah, had around 20% or more of their total votes go to a candidate other than those supported by the Republican or Democratic parties.

2016 Presidential Election Votes to Other Party With Greater Than 5% of Total Votes, by County.

 

While total counts are useful and necessary information, we can do even more insightful things using this data set. One thing we can do with the data is see the difference in vote percentage between the top two candidates, also called “election competitiveness”, by state. For example, the Democratic candidate won California with a margin of over 20%, and in Pennsylvania the Republican candidate won by a slim margin of 0-5% of the votes.
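
For readers who want to compute this kind of margin themselves, here is a minimal Python sketch. It assumes the margin is the gap between the top two candidates as a share of all votes cast, which may differ from how the maps in this post were built. The function name and the vote counts are hypothetical and are not taken from the data set.

```python
def margin_percent(votes_by_party):
    """Gap between the top two vote counts, as a percentage of all votes."""
    total = sum(votes_by_party.values())
    top_two = sorted(votes_by_party.values(), reverse=True)[:2]
    return 100 * (top_two[0] - top_two[1]) / total

# Made-up counts for illustration only (NOT real election results).
sample_votes = {"Party A": 520, "Party B": 450, "Other": 30}

margin = margin_percent(sample_votes)
print(margin)    # 7.0
```

A margin near 0 means a very competitive race; a large margin means one candidate won comfortably.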

Here is the election competitiveness by state across the 48 contiguous states:

Election Competitiveness, by State – 2016 Presidential Election.

 

and again, by county:

Election Competitiveness, by County. – 2016 Presidential Election. Note: All of Hawaii’s counties, Kauai, Honolulu, Maui, and Hawaii, had a Democratic win of a 20% plus margin. Alaska had no county data reported.

 

Since the data goes all the way back to 1912, let’s compare this information with a couple of other recent presidential elections.

Here’s the election competitiveness by state in the 2012 presidential elections compared to  2016:

Election Competitiveness in the 2012 vs. 2016 U.S. Presidential Elections, by State.

 

And 2008 compared to 2016:

Election Competitiveness in the 2008 vs. 2016 U.S. Presidential Elections, by State.

 

Just for fun, let’s take a look at the election competitiveness of the states from the election 20 years prior to the last one, in 1996.

Election Competitiveness in the 1996 vs. 2016 U.S. Presidential Elections, by State.

 

As you can see, this data set is very relevant and has plenty of intriguing information for those who are looking to do compelling analysis on U.S. elections.

The data includes detailed information such as the candidate names and party ballot listing per state, a national summary of vote totals by state for each candidate, and even data for New England towns (ME, MA, CT, RI, VT, NH) for those focused on election data of a particular town in the Northeast. You can read more about what is included in this study in its README file here.

Python Open Lab Nov 28: Blog Style!

Hello all,

Due to some complex scheduling issues, I am posting here the material we would have covered in lab tomorrow. Please feel free to contact me with any questions (data@library.columbia.edu). Enjoy!

Python Objects and Classes (cont’d!)

Self: 

What is the self variable in Python?

The self variable represents the instance of the object itself. Most object-oriented languages pass the instance of an object as a hidden parameter to the methods defined on it; Python does not. It must be explicitly declared. All instance methods in Python, including special methods like the initializer, have self as their first parameter.

In other words, the self variable refers to the object which invokes the method. When you create a new object, the self parameter in the __init__ method is automatically set to reference the object you have just created.

 

More theory…do’s and don’ts of Self:

You use self when:

  1. Defining an instance method. It is passed automatically as the first parameter when you call a method on an instance, and it is the instance on which the method was called.
  2. Referencing a class or instance attribute from inside an instance method. Use it when you want to call a method or access a name (variable) on the instance the method was called on, from inside that method.

You don’t use self when:

  1. You call an instance method normally. For example, if you create [instance = MyClass()], you call [MyClass.my_method] as [instance.my_method(some_var)], not as [instance.my_method(self, some_var)].
  2. You reference a class attribute from outside an instance method but inside the class definition.
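
The rules above can be sketched with a small class. The class definition used in lab is not reproduced in this post, so the Person class below is an assumption: a minimal version with a name attribute and a whoami() method.

```python
class Person:
    # Hypothetical class written for this post, not the exact one from lab.
    def __init__(self, name):
        # self is the new object being created; store the name on it.
        self.name = name

    def whoami(self):
        # Use self to reach the instance's own attribute.
        return "You are " + self.name

p1 = Person("anna")

# Normal call: Python passes p1 as self automatically.
print(p1.whoami())           # You are anna

# Equivalent explicit call, shown only to make self visible.
print(Person.whoami(p1))     # You are anna
```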

Let’s try an example!

First, we must create an object from class:

Input:

p1 = Person('anna')   # here we have created a new Person object called p1
print(p1.whoami())
print(p1.name)

Output:

You are anna
anna

As discussed in lab, it is bad practice to give access to your data fields outside of the class itself. Let’s see how we can hide data fields. To hide data fields, first you have to define private data fields. In Python, this can be done by using two leading underscores (__). A private method can also be defined using two leading underscores.

Here’s an example I created in a Jupyter notebook. If you’d like, I can email you the notebook, as the lines are colour formatted and the spaces match up to the correct inputs (as you know, your code will be affected by the spacing).

 

Expected Output:

 

 

Now, I’d like to show you whether it’s possible to access the __balance data field outside of the class.

Input:

 

As you can see, now __balance  is not accessible outside the class.
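
Since the notebook itself is not reproduced here, the sketch below is a stand-in rather than the original example. It assumes a hypothetical BankAccount class with a private __balance field, and shows that reading the field from outside the class fails.

```python
class BankAccount:
    # Hypothetical class; only the __balance name comes from the post.
    def __init__(self, opening_balance):
        self.__balance = opening_balance    # two leading underscores: private

    def deposit(self, amount):
        self.__balance += amount

    def get_balance(self):
        # Inside the class, the private field is accessible through self.
        return self.__balance

account = BankAccount(100)
account.deposit(50)
print(account.get_balance())    # 150

# Outside the class, the same name is not found.
try:
    print(account.__balance)
    blocked = False
except AttributeError:
    blocked = True
    print("__balance is not accessible outside the class")
```

Behind the scenes, Python stores the field under a mangled name (here _BankAccount__balance), so this is a convention that discourages outside access rather than strict protection.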

What questions do you have at this point? Does this match up with the aforementioned theoretical concept of self/hidden self data? Would you like more practice?

Fall 2017 Python Open Lab Week 3

October 10, 2017

Week 3’s lab was intense! We started with list methods, picking up where we left off last week, and went through the following:

list.append(x): Add an item to the end of the list.

list.extend(L): Extend the list by appending all the items in the given list.

list.insert(i, x): Insert an item at a given position.

list.remove(x): Remove the first item from the list whose value is x (it will come up as an error if there is no such item).

list.pop([i]): Remove the item at the given position in the list, and return it. If no index is specified, a.pop() removes and returns the last item in the list.

list.index(x): Return the index in the list of the first item whose value is x (it will come up as an error if there is no such item).

list.count(x): Return the number of times x appears in the list.

list.sort(key=None, reverse=False): Sort the items of the list in place. (Python 2 also accepted a cmp argument, which Python 3 removed.)

list.reverse(): Reverse the elements of the list, in place.

 

Here is an example that uses most of the list methods 

>>> a = [66.25, 333, 333, 1, 1234.5]
>>> print(a.count(333), a.count(66.25), a.count('x'))
2 1 0
>>> a.insert(2, -1)
>>> a.append(333)
>>> a
[66.25, 333, -1, 333, 1, 1234.5, 333]
>>> a.index(333)
1
>>> a.remove(333)
>>> a
[66.25, -1, 333, 1, 1234.5, 333]
>>> a.reverse()
>>> a
[333, 1234.5, 1, 333, -1, 66.25]
>>> a.sort()
>>> a
[-1, 1, 66.25, 333, 333, 1234.5]
>>> a.pop()
1234.5
>>> a
[-1, 1, 66.25, 333, 333]

 

We then introduced the Python Dictionary:

Python Dictionary

In the Python dictionary, each key is separated from its value by a colon (:), the items are separated by commas, and the whole thing is enclosed in curly braces.

An empty dictionary without any items is written with just two curly braces, like this: {}. Keys within a dictionary must be of an immutable data type such as strings, numbers, or tuples.

Example:

student = {'Name': 'Michael', 'Age': 7, 'Class': 'First'}

print("student['Name']:", student['Name'])
print("student['Age']:", student['Age'])

When we run this code, it produces the following result:

student['Name']: Michael
student['Age']: 7

 

And finally, we very briefly touched upon string methods and boolean methods. By no means did we cover all the material we intended (as you see below), but we will pick up in Week 4 with string methods first!

 

String Methods

The string data type has many methods. Here are some of the most commonly used methods of string objects:

str.upper() – returns a copy of the string in uppercase

str.lower() – returns a copy of the string in lowercase

str.join() – concatenates the strings in a list (or other iterable), placing the string it is called on between each pair of elements.

str.split() – returns a list of the substrings that are separated by whitespace if no other separator is given.

str.replace() – returns a new string in which occurrences of a given substring are replaced with another; the original string is unchanged.

Boolean methods:

Method             True if
str.isalnum()      the string consists only of alphanumeric characters
str.isalpha()      the string consists only of alphabetic characters
str.islower()      the string’s alphabetic characters are all lower case
str.isnumeric()    the string consists only of numeric characters
str.isspace()      the string consists only of whitespace characters
str.istitle()      the string is in title case
str.isupper()      the string’s alphabetic characters are all upper case

Example:

>>> string = "Hello"
>>> string.upper()
'HELLO'
>>> string.lower()
'hello'
>>> string = "Hello,world"
>>> string.split(",")
['Hello', 'world']
>>> string[::-1]
'dlrow,olleH'
>>> len(string)
11
>>> string1 = "Hello"
>>> string2 = "World"
>>> string1+string2
'HelloWorld'

Hope this material is helpful and inspiring. Please don’t hesitate to contact us for any clarifications. And as always, if you have any requests for material to cover in our labs, please email or contact us!

 

Fall 2017 Python Open Lab Week 2

Week 2: October 3rd

This week we started with a brief review of the basics from week 1 and the starter kit. We continued on with Data Structures and worked through Lists, Tuples and Dictionaries. These concepts were easy to approach and we went over many practice examples on the way. Towards the end of the session, we introduced the different types of lists that one can use in Python as well as list methods. List methods, however, are quite complex and we only got through .append and .extend in great detail.

Next week we will continue with list methods!

Please comment or email us with any questions. Below is the worksheet we used for this week:

 

Python Open Lab Week II

Outline and reference:

  1. Summary from Open Lab I – Starter Kit
  2. Data Structures:

Lists:

Lists are what they seem – a list of values. Each one of them is numbered, starting from zero – the first one is numbered zero, the second 1, the third 2, etc. You can remove values from the list, and add new values to the end. Example: Your many cats’ names.

Tuples:

Tuples are just like lists, but you can’t change their values. The values that you give a tuple when you create it are the values that you are stuck with for the rest of the program. Again, each value is numbered starting from zero, for easy reference. Example: the names of the months of the year.

Dictionaries:

Dictionaries are similar to what their name suggests: a dictionary. In a dictionary, you have an ‘index’ of words, and for each of them a definition. In Python, the word is called a ‘key’, and the definition a ‘value’. The values in a dictionary aren’t numbered, and you don’t look them up by position; the key does that job instead. You can add, remove, and modify the values in dictionaries. Example: telephone book.
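
The three data structures described above can be compared side by side in a short sketch. The cat names, months, and phone numbers are made-up sample values.

```python
# List: ordered, numbered from zero, and changeable.
cats = ["Tom", "Snappy", "Kitty"]
cats.append("Jessie")            # add a value to the end
cats.remove("Snappy")            # remove a value
first_cat = cats[0]
print(cats)                      # ['Tom', 'Kitty', 'Jessie']

# Tuple: numbered from zero, but its values cannot be changed.
months = ("January", "February", "March")
try:
    months[0] = "Jan"
    tuple_is_fixed = False
except TypeError:
    tuple_is_fixed = True
print(months[0], tuple_is_fixed)  # January True

# Dictionary: values are looked up by key, like a telephone book.
phone_book = {"Ana": "555-0100"}
phone_book["Ben"] = "555-0101"   # add
phone_book["Ana"] = "555-0199"   # modify
del phone_book["Ben"]            # remove
print(phone_book)                # {'Ana': '555-0199'}
```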

Lists:

  • Indexing
  • Slicing
  • List operations (concat, etc.)
  • Multiplying
  • “In” operator
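
Each topic in the outline above can be shown in a line or two. This is a small sketch with a made-up list of numbers, not the worksheet's own examples.

```python
numbers = [10, 20, 30, 40, 50]

first = numbers[0]               # indexing: first item
last = numbers[-1]               # indexing from the end
middle = numbers[1:4]            # slicing: items 1, 2 and 3
combined = numbers + [60, 70]    # concatenation makes a new list
repeated = [0] * 3               # multiplying repeats the items
has_thirty = 30 in numbers       # "in" tests membership

print(first, last)     # 10 50
print(middle)          # [20, 30, 40]
print(combined)        # [10, 20, 30, 40, 50, 60, 70]
print(repeated)        # [0, 0, 0]
print(has_thirty)      # True
```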

*Question: Given a year, month, day, print as follows:

21st July, 1991
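
One possible approach to the question is sketched below; it is not necessarily the solution worked through in lab. It keeps the month names in a tuple and picks them out by index. The names MONTHS, ordinal_suffix, and format_date are hypothetical.

```python
MONTHS = ("January", "February", "March", "April", "May", "June",
          "July", "August", "September", "October", "November", "December")

def ordinal_suffix(day):
    """Return 'st', 'nd', 'rd' or 'th' for a day of the month."""
    if 11 <= day % 100 <= 13:        # 11th, 12th, 13th are special cases
        return "th"
    return {1: "st", 2: "nd", 3: "rd"}.get(day % 10, "th")

def format_date(year, month, day):
    # Months are numbered from 1, but tuple indexes start at 0.
    return "{}{} {}, {}".format(day, ordinal_suffix(day), MONTHS[month - 1], year)

print(format_date(1991, 7, 21))      # 21st July, 1991
```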

Methods:

  • Append
  • Extend
  • Count
  • Index
  • Insert
  • Pop
  • Remove
  • Reverse
  • Sort/Sorted

Fall 2017 Python Open Lab Week 1

Week 1 September 26
In week 1 of our Python Open Lab, we introduced the Python Starter Kit and went over Python basics such as expressions, variables, statements, integers, floats, strings, booleans, and control flow statements. The class was a mix of students with little to no experience with Python, a few advanced users, and those with some knowledge of basic concepts looking to strengthen their skillset.
We moved at a slower pace in this first session to ensure students grasped the basic concepts needed to continue on their programming journey with confidence.
We hope that anyone with outstanding questions will contact us, and that anyone thinking of attending will sign up through the DSC website. All are welcome and we will ensure you are caught up with the relevant material we covered in the past!
Next week, we will start with a review of the starter kit and continue with lists.

New Additions to the Collection, May 22, 2017

The following titles were recently added or updated.

CU Numeric Data Catalog Holdings

New

CU Spatial Data Catalog Holdings

New: