Author Archives: ks3611

Python Open Lab November 9

This week we learned pandas, a package built on top of NumPy. Its core data structure is the DataFrame, which is very useful for working with tabular data. A DataFrame is a two-dimensional table with labeled rows and columns. It supports heterogeneous column types and missing data, which is a great feature.

Pandas is great for loading files. It has separate functions for reading CSV, Excel, HTML and SQL sources. For example, to load a CSV file, just use the function read_csv().

import pandas as pd

taxi_data = pd.read_csv('./files/green_tripdata_2018-02.csv')

After this, the file is loaded as a DataFrame. We can use the function head() to check the first five rows of the DataFrame; it also shows the column labels and the row index. Another useful function is describe(), which shows statistics such as count, mean, std, min, 25%, 50%, 75% and max for each numeric column.
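As a minimal sketch of these two calls, the example below builds a tiny CSV in memory instead of reading the taxi file. The column names and values are made up for illustration and are not taken from the real dataset.

```python
import io
import pandas as pd

# Hypothetical stand-in for a CSV file on disk; the columns are invented.
csv_text = """trip_distance,fare_amount
1.0,5.0
2.0,9.0
3.0,13.0
"""

# read_csv accepts a file path or any file-like object such as StringIO.
trips = pd.read_csv(io.StringIO(csv_text))

first_rows = trips.head()     # up to five rows (only three exist here)
summary = trips.describe()    # count, mean, std, min, quartiles, max per numeric column

print(first_rows)
print(summary)
```

The row labels of the describe() result are the statistic names, so a single statistic can be read with summary.loc['mean'].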

To learn DataFrame, we should first have a good understanding of Series. A Series is very similar to a list. The main difference is that a Series has an index label for each element.


series_data = pd.Series([0.25, 0.5, 0.75, 1.0])



0    0.25
1    0.50
2    0.75
3    1.00
dtype: float64

In the above example, the index for value 0.25 is 0, the index for value 0.5 is 1, and so on. That is what the default index of a Series looks like.

The index of a Series can also be made of strings.


series_data = pd.Series([0.25, 0.5, 0.75, 1.0],index=['a', 'b', 'c', 'd'])



a    0.25
b    0.50
c    0.75
d    1.00
dtype: float64

To access the value 0.50, we can use series_data['b'].

A DataFrame is similar to a Series; the major difference is that it is two-dimensional, with both rows and columns.


import numpy as np

df = pd.DataFrame(np.random.rand(3, 2), columns=['foo', 'bar'], index=['a', 'b', 'c'])



      foo   bar
a    0.77  0.52
b    0.34  0.27
c    0.69  0.51

If we want to get the second column, we can use df['bar']. To access a specific cell, there are two ways. The first is loc, which accesses cells by row label and column label. The second is iloc, which accesses cells by row position and column position. Both are used with square brackets rather than called like functions. For example, to get the value 0.27 in the above DataFrame, we can use df.loc['b', 'bar'] or df.iloc[1, 1].
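To make the difference between label-based and position-based access concrete, here is a small sketch. It rebuilds the dataframe with fixed values (the rounded numbers from the table above) instead of random ones, an assumption made only so that the result is predictable.

```python
import pandas as pd

# Same layout as the dataframe above, with fixed values instead of random ones.
df = pd.DataFrame(
    {"foo": [0.77, 0.34, 0.69], "bar": [0.52, 0.27, 0.51]},
    index=["a", "b", "c"],
)

bar_column = df["bar"]          # a Series holding the second column
by_label = df.loc["b", "bar"]   # row label 'b', column label 'bar'
by_position = df.iloc[1, 1]     # second row, second column (positions start at 0)

print(by_label, by_position)
```

Both lookups reach the same cell; loc keeps working if the rows are reordered, while iloc always follows the current position.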

Other pandas features were also introduced, such as unique() and handling missing values with fillna(). Overall, pandas is a very useful tool for working with tabular data and analyzing it.
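A short sketch of unique() and fillna() on made-up data follows. The column names and the choice of 0.0 as the fill value are assumptions for illustration only.

```python
import pandas as pd

# Invented data with one missing score (None is stored as NaN in a float column).
scores = pd.DataFrame({"city": ["NY", "LA", "NY"], "score": [1.0, None, 3.0]})

cities = scores["city"].unique()       # distinct values, in order of first appearance
filled = scores["score"].fillna(0.0)   # a new Series with missing values replaced

print(list(cities))
print(list(filled))
```

fillna() returns a new Series here, so the original column still contains the missing value.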

Python Open Lab November 2

This week we learned about file I/O. I/O means input and output, so the content is basically about reading and writing files.

Before doing any operation on a file, we need to open it. The command is open(filename, mode). 'filename' needs to include the path of the file. There are two ways to write the path: an absolute path and a relative path. An absolute path locates the file starting from the root of the file system, while a relative path locates it starting from the current working directory, which is usually the folder the script is run from. In a relative path, we use '.' for the current directory and '..' for the parent of the current directory.
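The sketch below shows how a relative path turns into an absolute one. The folder and file names are hypothetical, and nothing is opened, so the files do not need to exist.

```python
import os

# Hypothetical paths, used only to show the notation.
relative_path = os.path.join(".", "files", "a.txt")   # under the current directory
parent_path = os.path.join("..", "a.txt")             # in the parent directory

# abspath resolves a relative path against the current working directory.
absolute_path = os.path.abspath(relative_path)

print(os.getcwd())
print(absolute_path)
print(os.path.isabs(relative_path), os.path.isabs(absolute_path))
```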

The mode describes how we open the file. Common modes are 'r', 'w' and 'a'. 'r' means we open the file for reading only. 'w' means we open the file for writing. 'a' is similar to 'w'. The difference between 'a' and 'w' is that 'w' erases the previous content of the file before writing, while 'a' appends to the end of the previous content.

An example of opening a file is: afile = open('a.txt', 'r'). Here afile is the file object we get from opening the file a.txt in read mode.
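The difference between 'w' and 'a' is easiest to see by writing to the same file several times. This is a sketch that works in a temporary folder; the file name is arbitrary, and the 'with' statement (not covered in the lab) is used only so that each file is closed automatically.

```python
import os
import tempfile

# Work in a temporary folder so no real file is touched.
path = os.path.join(tempfile.mkdtemp(), "a.txt")

with open(path, "w") as f:      # 'w' creates the file, or empties it if it exists
    f.write("first\n")

with open(path, "a") as f:      # 'a' keeps what is there and writes at the end
    f.write("second\n")

with open(path, "r") as f:
    after_append = f.read()

with open(path, "w") as f:      # opening with 'w' again erases both lines
    f.write("third\n")

with open(path, "r") as f:
    after_overwrite = f.read()

print(repr(after_append))       # 'first\nsecond\n'
print(repr(after_overwrite))    # 'third\n'
```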

After opening the file, we can begin to do operations on it. We learned how to read a file first. To read a file, we must open it in 'r' mode. The function read() returns the whole content of the file as one string. The function readline() reads the file one line at a time. The function readlines() reads all lines of the file and returns them as a list.

To write to a file, we use the function write(astring). Pass a string to write(), and the string will be written to the file.
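Here is a small sketch comparing the three reading functions on the same two-line file. The file is created in a temporary folder first, and its name and content are made up for the demonstration.

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "a.txt")   # arbitrary file name

with open(path, "w") as f:             # 'with' closes the file automatically
    f.write("line one\nline two\n")

with open(path, "r") as f:
    whole_text = f.read()              # the entire file as one string

with open(path, "r") as f:
    first_line = f.readline()          # one line, including its newline character
    second_line = f.readline()         # the next call continues where the first stopped

with open(path, "r") as f:
    all_lines = f.readlines()          # a list with one string per line

print(repr(whole_text))
print(repr(first_line), repr(second_line))
print(all_lines)
```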

An example of reading file and writing file:

     afile = open('a.txt', 'r')
     # read the entire content of a.txt into content
     content = afile.read()
     afile.close()

     afile = open('a.txt', 'w')
     # erase all content in a.txt, and write "hello world" to it
     afile.write("hello world")
     afile.close()


Python Open Lab October 26

This week we learned functions, which are very important for programmers. Functions are useful for procedural decomposition; they maximize code reuse and minimize redundancy.

A function must be defined before it is used, just as a variable must be assigned before it is used.

def function(parameter1, parameter2, …):
    do something
    return value

‘def’ is the keyword to show that we are defining a function. ‘function’ can be replaced by the function name.

An example :

def printHelloWorld():
    print("hello world")
printHelloWorld()


After defining a function, we call it whenever we want to use it. In the above example, we define a function called 'printHelloWorld' in lines 1 and 2. In line 3, we call it by its name.

Function parameters are values passed into the function when we call it. By using parameters, we can pass values from outside into the function.


The return value hands the result of the function back to the main program. The main program assigns a task to the function, and the function executes it. After the execution, the function gives the result to the main program.

An example of using function parameters and a return value (getting the bigger number from two sums):

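def sum(x, y):  # note: this name shadows Python's built-in sum()
    return float(x) + float(y)

num1 = sum(1.0, 2.5)  # num1 = 3.5
num2 = sum(2.4, 1.6)  # num2 = 4.0
if num1 > num2:
    bigger = num1
else:
    bigger = num2
print(bigger)  # the bigger of the two sums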




Python Open Lab October 19

This week we mainly learned about conditional statements.

First we learned how to read user input from the console with the function input(). input() brings what the user types into our program, so the user can supply values and the program can use them.

Then we looked at the definition of a conditional statement: when the condition is met, its code is executed; otherwise the program moves on to the next statement. The 'if' statement was introduced first. If the condition of the 'if' statement is satisfied, the code inside the 'if' block is executed. An 'else' statement can appear below the 'if' statement. When the condition of the 'if' statement is not met, the 'else' block is executed instead. The 'elif' statement helps the program choose between several conditions. It is similar to 'else', but it carries its own condition. The structure is like this:

                        if <some condition>:
                               do A
                        elif <some condition>:
                               do B
                        else:
                               do C
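A concrete version of this structure might look like the sketch below. The function name, the temperature thresholds and the labels are all invented for the example.

```python
def describe_temperature(degrees):
    # 'degrees' is assumed to be a number in Celsius; the thresholds are arbitrary.
    if degrees < 10:
        return "cold"       # first condition met
    elif degrees < 25:
        return "mild"       # checked only when the 'if' condition was false
    else:
        return "hot"        # runs when none of the conditions above was met

cold_label = describe_temperature(5)
mild_label = describe_temperature(18)
hot_label = describe_temperature(30)

print(cold_label, mild_label, hot_label)
```

Only one branch runs for each call, and the conditions are tested from top to bottom.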

We learned how to apply the 'if-elif-else' statement inside a loop to do more complicated tasks. Then we got an idea of what the 'continue' and 'break' statements are. 'continue' skips the rest of the code in the current iteration of the loop. 'break' jumps out of the current loop and ends it. Nested loops were introduced so students could better understand 'continue' and 'break'.
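The following sketch shows both statements inside a nested loop. The loop sizes and the column numbers that trigger 'continue' and 'break' are arbitrary choices for illustration.

```python
visited = []

for row in range(3):
    for col in range(4):
        if col == 1:
            continue      # skip the rest of this iteration and move on to col == 2
        if col == 3:
            break         # leave the inner loop only; the outer loop keeps going
        visited.append((row, col))

print(visited)
# [(0, 0), (0, 2), (1, 0), (1, 2), (2, 0), (2, 2)]
```

Note that 'break' ends only the loop it sits in, which is why every row still appears in the result.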

Python Open Lab October 12

This week, we continued to learn about strings, which are very important. Loops were also introduced. Examples such as looping over a list, a dictionary or a string were taught.

We learned some useful string functions.

  • len(str) — find the length of the string
  • str.find("ab") — search for a substring and return the position of the first match, or -1 if it is not found
  • str.rstrip() — remove trailing whitespace
  • str.replace("red", "green") — replace every occurrence of one substring with another
  • str.split(",") — split the string into a list at each separator
  • str.isdigit() — decide whether the string consists only of digits
  • lower(), upper() — change the string to lowercase or uppercase
  • str.endswith("hello") — test whether the string ends with another string
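Here is a quick sketch that tries each of these functions. The sample strings are made up.

```python
text = "red apple, red cherry  "            # note the two trailing spaces

length = len(text)                           # 23
position = text.find("apple")                # 4
missing = text.find("pear")                  # -1, because "pear" is not in the string
trimmed = text.rstrip()                      # trailing spaces removed
replaced = trimmed.replace("red", "green")   # every "red" is replaced
parts = trimmed.split(",")                   # a list of two pieces
is_number = "2018".isdigit()                 # True
lowered = "Hello".lower()                    # "hello"
raised = "Hello".upper()                     # "HELLO"
ends = "say hello".endswith("hello")         # True

print(trimmed)
print(replaced)
print(parts)
```

None of these functions changes the original string; each returns a new value.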

We talked about type conversion, such as converting a value from int to string or from string to int.

Then we learned loops, which repeat steps or statements. We looked at the 'while' loop. A 'while' loop can use an iteration variable to control the loop. There are generally two kinds of loop: finite and infinite. A finite loop stops when its termination condition is satisfied. An infinite loop never stops because the termination condition is never met. We then learned the 'for' loop, which is very useful for iterating over a sequence. A 'for' loop over a list iterates over all elements of the list. A 'for' loop over a dictionary iterates over all keys of the dictionary. A 'for' loop over a string iterates over all characters of the string.
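A minimal sketch of these conversions, with arbitrary values:

```python
count = int("42")        # string to int
price = float("3.5")     # string to float
label = str(7)           # int to string

# Numbers must be converted before they can be joined to a string with +.
total_text = "total: " + str(count + 1)

print(count, price, label, total_text)
```

int() raises a ValueError when the string is not a valid whole number, so checking with isdigit() first is one way to guard the conversion.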
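The loops described above can be sketched as follows. The variable names and sample data are invented for the example.

```python
# A finite 'while' loop: n is the iteration variable that controls it.
countdown = []
n = 3
while n > 0:
    countdown.append(n)
    n = n - 1               # without this line the loop would never end

# 'for' over a list visits every element.
fruits = []
for fruit in ["apple", "pear"]:
    fruits.append(fruit)

# 'for' over a dictionary visits every key.
ages = {"ann": 20, "bob": 31}
names = []
for name in ages:
    names.append(name)

# 'for' over a string visits every character.
letters = []
for ch in "abc":
    letters.append(ch)

print(countdown, fruits, names, letters)
```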

With loops, we have a tool to go through data structures such as lists, dictionaries and strings without writing duplicate code.

Python Open Lab, October 5

In this week's Python Open Lab, we learned about lists, dictionaries and strings.

A list can store multiple elements, and many useful list functions were introduced.

  • append(x) —   add an element to the end of a list
  • insert(i, x) —   insert an element at a specific position i of a list
  • count(x) —   count how many times an element appears in the list
  • remove(x) —   remove the first x in the list
  • sort()      —   sort a list
  • extend(b) — add all elements of another list b to the end of the list
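The sketch below applies these functions one after another to the same list, with the state of the list noted after each step. The starting values are arbitrary.

```python
numbers = [3, 1, 2]

numbers.append(1)          # [3, 1, 2, 1]
numbers.insert(0, 5)       # [5, 3, 1, 2, 1]  (position first, then the element)
ones = numbers.count(1)    # 2
numbers.remove(1)          # [5, 3, 2, 1]     (only the first 1 is removed)
numbers.sort()             # [1, 2, 3, 5]
numbers.extend([8, 9])     # [1, 2, 3, 5, 8, 9]

print(numbers, ones)
```

Apart from count(), these functions change the list in place and return None.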

Slicing of lists and two-dimensional lists were also covered.
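As a small sketch of both ideas, with made-up data:

```python
letters = ["a", "b", "c", "d", "e"]

middle = letters[1:4]      # ['b', 'c', 'd']: start is included, end is excluded
first_two = letters[:2]    # ['a', 'b']
last = letters[-1]         # 'e': negative positions count from the end

# A two-dimensional list is a list whose elements are lists.
grid = [[1, 2, 3],
        [4, 5, 6]]
cell = grid[1][2]          # second row, third column: 6

print(middle, first_two, last, cell)
```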


A dictionary is different from a list because it stores "key: value" pairs. The keys need to be unique. A dictionary is declared with "{}". Storing into a dictionary is straightforward: dict["key"] = value stores one pair. To fetch a value from the dictionary, we just use its key, as in print(dict["key"]). Dictionaries are very flexible, and any value in a dictionary can itself be a list or a dictionary, so complex operations can be done with them.
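A minimal sketch of storing and fetching, including values that are a list and another dictionary. All keys and values are invented for the example.

```python
student = {}                                   # an empty dictionary

student["name"] = "Ann"                        # store one key: value pair
student["grades"] = [90, 85]                   # a value can be a list
student["address"] = {"city": "New York"}      # or another dictionary

name = student["name"]
first_grade = student["grades"][0]
city = student["address"]["city"]

# Keys are unique: assigning to an existing key replaces its value.
student["name"] = "Bob"

print(name, first_grade, city, student["name"], len(student))
```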


A string is anything written between double quotes or between single quotes. String concatenation is very simple: just use the operator "+". Fetching an element and slicing a string work the same way as in a list.
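These string operations can be sketched in a few lines, using an arbitrary sample string:

```python
# Double and single quotes create the same kind of string.
greeting = "hello" + " " + 'world'

first_char = greeting[0]      # 'h', indexing works as in a list
first_word = greeting[0:5]    # 'hello', slicing works as in a list
last_word = greeting[-5:]     # 'world'

print(greeting, first_char, first_word, last_word)
```

One difference from a list is that a string cannot be changed in place, so greeting[0] = "H" would raise an error.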


Python has great built-in types such as list, dictionary and string. They are powerful and save programmers time. A good command of them can improve coding efficiency and help avoid mistakes.



Python Open Lab, September 28

This week is the first week of Python Open Lab in fall 2018. We talked about fundamental concepts of Python and its basic types and operations.

We first looked at the basic concept of programming languages and introduced Python and its uses. The installation of Python for Windows users was included so students can use Python in their terminal to write basic commands.

We learned about variables in Python and their different types. Types include int, float and bool. The function type() can be used to check the type of a variable. int and float support the basic operations of addition, subtraction, multiplication and division. bool supports the operations 'and' and 'or'.

Finally, the list was introduced as a more advanced data structure that can handle things int, float and bool cannot.
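A short sketch of the types and operations mentioned above, with arbitrary values:

```python
count = 7          # int
ratio = 2.5        # float
ready = True       # bool

kinds = [type(count), type(ratio), type(ready)]

total = count + ratio          # 9.5 (int + float gives a float)
difference = count - 2         # 5
product = count * 2            # 14
quotient = count / 2           # 3.5 (in Python 3, / always gives a float)

both = ready and (count > 10)      # False: both sides must be True
either = ready or (count > 10)     # True: one True side is enough

print(kinds)
print(total, difference, product, quotient, both, either)
```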


Basic types like int, float and bool are very important to programmers because they are used at a very high frequency. Understanding their concepts and operations can give students a good foundation for learning other advanced types.


The GitHub link of Python Open Lab is After installing Git, students can open a terminal or Git Bash, enter a new folder and run the command "git clone"; after this operation, all files are downloaded automatically. When we post new files, students can enter their git folder in the terminal and run the command "git pull" to get the new files.