As mentioned in the instructions, all materials can be opened in Colab as Jupyter notebooks, so that users can run the code in the cloud. It is highly recommended to follow the tutorials in the right order.


Background

A bug occurs when things do not work the way you want them to, even though Python only gives you what you asked for. This happens because our understanding of the code differs from how the programming language interprets it. To resolve this, we need to find the sources of errors and adjust our code accordingly. In this notebook you will learn different approaches to resolving potential issues and some tips to prevent them from happening.


Prerequisites:

https://www.w3schools.com/python/python_try_except.asp

https://www.tutorialsteacher.com/python/error-types-in-python



1) Understand Error Messages

As you have seen from the links above, there are many different types of errors in Python, which are meant to help by telling the user what is wrong. Some errors are more common than others, so let's look at a few of them.

Let's say you want to add a second collection to the first one (collection), but instead of "collection" you wrote "Collection".

In this case Python tells you:

NameError: name 'Collection' is not defined

This is because Python names are case-sensitive: Python searches for "Collection" and cannot find it.

Name Error: Raised when a variable is not found in the local or global scope.

All you need to do is correct the name.

collection = ["Chinese Posters in Harvard-Yenching Manchukuo Collection"]

Collection + ["Linked Archive of Asian Postcards"]
---------------------------------------------------------------------------
NameError                                 Traceback (most recent call last)
<ipython-input-3-cba7c442d3f4> in <module>()
      1 collection = ["Chinese Posters in Harvard-Yenching Manchukuo Collection"]
      2 
----> 3 Collection + ["Linked Archive of Asian Postcards"]

NameError: name 'Collection' is not defined
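
A minimal sketch of the corrected cell, assuming the lowercase name `collection` defined on the first line was the one intended. The name `combined` is introduced here only for illustration.

```python
collection = ["Chinese Posters in Harvard-Yenching Manchukuo Collection"]

# Python names are case-sensitive: use exactly the name that was defined above
combined = collection + ["Linked Archive of Asian Postcards"]
print(combined)
```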

The same error also occurs when a library is not imported before use.

np.arange(1,5)
---------------------------------------------------------------------------
NameError                                 Traceback (most recent call last)
<ipython-input-6-9be6f90ccf89> in <module>()
----> 1 np.arange(1,5)

NameError: name 'np' is not defined

Or when you have one item in a list but ask for the second one (remember that Python starts counting at 0), you get:

IndexError: list index out of range

Index Error: Raised when the index of a sequence is out of range.

Then you need to correct the index.

collection = ["Chinese Posters in Harvard-Yenching Manchukuo Collection"]

collection[1]
---------------------------------------------------------------------------
IndexError                                Traceback (most recent call last)
<ipython-input-4-e10b1e230939> in <module>()
      1 collection = ["Chinese Posters in Harvard-Yenching Manchukuo Collection"]
      2 
----> 3 collection[1]

IndexError: list index out of range
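
One possible way to avoid this error is to check the length of the list before indexing. The sketch below uses only built-in Python; `wanted_index`, `first_item`, and `last_item` are hypothetical names chosen for this example.

```python
collection = ["Chinese Posters in Harvard-Yenching Manchukuo Collection"]

# len() tells us how many items exist; valid indices run from 0 to len - 1
print(len(collection))  # 1

first_item = collection[0]   # the first (and here the only) item
last_item = collection[-1]   # a negative index counts from the end

# Guard an index before using it
wanted_index = 1
if wanted_index < len(collection):
    print(collection[wanted_index])
else:
    print("There is no item at index", wanted_index)
```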

There is also Type Error: Raised when a function or operation is applied to an object of an incorrect type.

In the example below, it occurs because np.max() expects numbers and we passed in strings.

import numpy as np
np.max([3]) # this works
3
import numpy as np
np.max([collection])
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-11-9da63d4b2b0e> in <module>()
      1 import numpy as np
----> 2 np.max([collection])

<__array_function__ internals> in amax(*args, **kwargs)

/usr/local/lib/python3.7/dist-packages/numpy/core/fromnumeric.py in amax(a, axis, out, keepdims, initial, where)
   2704     """
   2705     return _wrapreduction(a, np.maximum, 'max', axis, None, out,
-> 2706                           keepdims=keepdims, initial=initial, where=where)
   2707 
   2708 

/usr/local/lib/python3.7/dist-packages/numpy/core/fromnumeric.py in _wrapreduction(obj, ufunc, method, axis, dtype, out, **kwargs)
     85                 return reduction(axis=axis, out=out, **passkwargs)
     86 
---> 87     return ufunc.reduce(obj, axis, dtype, out, **passkwargs)
     88 
     89 

TypeError: cannot perform reduce with flexible type
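
When a Type Error appears, it often helps to inspect what kind of data you are passing in. The following sketch uses only the built-in functions type() and isinstance(); the list `values` and the helper `all_numbers` are hypothetical and not part of the original notebook.

```python
collection = ["Chinese Posters in Harvard-Yenching Manchukuo Collection"]
values = [3, 7, 5]

# type() shows what kind of object we are dealing with
print(type(values[0]))      # <class 'int'>
print(type(collection[0]))  # <class 'str'>

def all_numbers(items):
    """Return True if every item is an int or a float."""
    return all(isinstance(item, (int, float)) for item in items)

print(all_numbers(values))      # True
print(all_numbers(collection))  # False
```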

Another really common error occurs when you type something that is grammatically wrong in the Python sense.

Syntax Error: Raised by the parser when a syntax error is encountered.

collection = ["Chinese Posters in Harvard-Yenching Manchukuo Collection"]

collection[0) # it should be [0], not [0)
  File "<ipython-input-18-011172c21475>", line 3
    collection[0)
                ^
SyntaxError: invalid syntax
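
A Syntax Error is reported before any line of the cell runs, because Python first has to parse the whole cell. As a rough illustration, the standard-library ast module can parse a piece of code without running it. The helper `has_valid_syntax` is a hypothetical name, and it relies on try and except, which is explained in section 4.

```python
import ast

def has_valid_syntax(source):
    """Return True if the text can be parsed as Python code."""
    try:
        ast.parse(source)  # parses the text only; nothing is executed
        return True
    except SyntaxError as error:
        print("Problem on line", error.lineno, ":", error.msg)
        return False

print(has_valid_syntax("collection[0)"))  # False
print(has_valid_syntax("collection[0]"))  # True
```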

2) Spot the Sources of Errors

To make errors easier to spot, it is good practice to run only a small chunk of code at a time. For example, you can separate a long chunk of code into different cells.

Every cell can be used for one main action, for example:
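
A minimal sketch of such a split, shown here in a single block. The `# Cell N` comments stand in for separate notebook cells, and the particular division is only an assumption for illustration.

```python
# Cell 1: define the data
ad = ["商務印書館發行書目介紹"]

# Cell 2: pick the item to work on
first_ad = ad[0]

# Cell 3: do one calculation and look at the result
length = len(first_ad)
print(length)  # 11
```

If one of these cells fails, you know right away which step caused the problem.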


Another way to spot errors is to reduce your code to smaller pieces. For example:

Let's say you want to find out the length of the first advertisement in the list (商務印書館發行書目介紹), and you built a function for it. But instead of 11 characters, you get 2.

Although this code runs without any error message, it does not give you what you want, so it is still a bug.

What you can do is reduce the code and check, step by step, whether each part works the way you want.

ad = ["商務印書館發行書目介紹","女界寶、非洲樹皮丸、助肺呼吸香膠、家普魚肝油、清血解毒海波藥、納佛補天汁、良丹(五洲大藥房)"]

import re # you will learn about this in the web scraping notebook, so you do not need to understand everything for now

def number_of_character(text):
  list_ = list(text) # list() is used to split words into a list of characters
  length = len(list_) # len() is to check the length 
  return length

number_of_character(ad)
2

Let's check the first step: can we use list() to split the characters as we want?

list(ad)
['商務印書館發行書目介紹', '女界寶、非洲樹皮丸、助肺呼吸香膠、家普魚肝油、清血解毒海波藥、納佛補天汁、良丹(五洲大藥房)']

And we realize that we wanted to split 商務印書館發行書目介紹 into characters, but that did not happen: list() only copied the two items of the list.

list("advertisement") # this is what we want, to split characters
['a', 'd', 'v', 'e', 'r', 't', 'i', 's', 'e', 'm', 'e', 'n', 't']

And then we find out that it only works if we pass ad[0], the first string, instead of the whole list.

list(ad[0])
['商', '務', '印', '書', '館', '發', '行', '書', '目', '介', '紹']

With this in mind, we pass ad[0] to the function:

ad = ["商務印書館發行書目介紹","女界寶、非洲樹皮丸、助肺呼吸香膠、家普魚肝油、清血解毒海波藥、納佛補天汁、良丹(五洲大藥房)"]

import re # you will learn about this in the web scraping notebook, so you do not need to understand everything for now

def number_of_character(text):
  list_ = list(text)
  length = len(list_)
  return length

number_of_character(ad[0])
11
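
Another common way to check intermediate steps is to add temporary print() calls inside the function and an assert on the result. This is only a sketch of that idea: the print lines are debugging aids to be removed afterwards, and the expected value 11 comes from the example above.

```python
ad = ["商務印書館發行書目介紹"]

def number_of_character(text):
    # Temporary check: show what the function actually received
    print("input type:", type(text))
    list_ = list(text)
    # Temporary check: show the result of the first step
    print("after list():", list_)
    return len(list_)

result = number_of_character(ad[0])

# assert stops the program with an AssertionError if the expectation is not met
assert result == 11
```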

3) Google and Stack Overflow

Sometimes an error message is not clear to you. What you can do is copy the error message and Google it. Stack Overflow is another website that can be really helpful.

collection = ["Chinese Posters in Harvard-Yenching Manchukuo Collection", "Linked Archive of Asian Postcards"]
typ = [1,0]

collection[typ]
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-13-9eca3a76ef67> in <module>()
      2 typ = [1,0]
      3 
----> 4 collection[typ]

TypeError: list indices must be integers or slices, not list

Then you might find out that, to select items from collection using typ, you can use NumPy arrays instead of lists.

collection = np.array(["Chinese Posters in Harvard-Yenching Manchukuo Collection", "Linked Archive of Asian Postcards"])
typ = np.array([1,0])

collection[typ]
array(['Linked Archive of Asian Postcards',
       'Chinese Posters in Harvard-Yenching Manchukuo Collection'],
      dtype='<U56')
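
A search may also turn up a solution that keeps plain Python lists. The sketch below is one such alternative, using a list comprehension to look up each index in turn; the name `selected` is hypothetical.

```python
collection = ["Chinese Posters in Harvard-Yenching Manchukuo Collection",
              "Linked Archive of Asian Postcards"]
typ = [1, 0]

# Look up the indices stored in typ one at a time
selected = [collection[i] for i in typ]
print(selected)
```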

4) Use Try and Except

To handle errors, we can also use try and except. This asks Python to try something out and, if it does not work, to do Plan B instead of stopping with an error.

Be careful: this method only applies when the errors come from unexpected outliers in the inputs. It will not give you any meaningful results if the errors lie in your code itself.


It works as follows:

try:
    (do plan A) # <- indented block
except:
    (do plan B) # <- indented block

try:
  print(1+1)
except:
  print(0)
2
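
A plain except: catches every kind of error, including mistakes in your own code such as a misspelled name. One way to reduce that risk, sketched here as a suggestion rather than as part of the original notebook, is to name the error type you expect. The function `safe_divide` is a hypothetical example.

```python
def safe_divide(a, b):
    try:
        return a / b
    except ZeroDivisionError:
        # Plan B runs only for this specific kind of error
        return 0

print(safe_divide(4, 2))  # 2.0
print(safe_divide(4, 0))  # 0
```

Any other error, for example a Name Error caused by a typo, would still be shown, so it does not go unnoticed.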


For example, the same function works well for the first and second items, but there is a typo in the third item, so it is not a string but an integer.

In this case, the function will not work as it expects a string.

collection = ["Chinese Posters in Harvard-Yenching Manchukuo Collection", "Linked Archive of Asian Postcards", 4]
def number_of_character(text):
  list_ = list(text)
  length = len(list_)
  return length

number_of_character(collection[2])
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-50-b4200bdffa93> in <module>()
      4   return length
      5 
----> 6 number_of_character(collection[2])

<ipython-input-50-b4200bdffa93> in number_of_character(text)
      1 def number_of_character(text):
----> 2   list_ = list(text)
      3   length = len(list_)
      4   return length
      5 

TypeError: 'int' object is not iterable

What we can do is to set up a Plan B:

If the words cannot be split, then use an empty list ([]).


Our Plan B:

except:
    list_ = []

def number_of_character(text):
  try:
    list_ = list(text)
  except:
    list_ = []
  length = len(list_)
  return length

number_of_character(collection[2])
0

Now we do not get the same error anymore. This is particularly useful when we are automating a task: if we are going through thousands of documents, we do not want the code to crash because of one little typo.
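
A minimal sketch of such an automated run, assuming we also want to remember which items needed Plan B so that they can be checked by hand later. The lists `lengths` and `problems` are hypothetical names, and the sketch catches Type Error specifically instead of using a plain except.

```python
collection = ["Chinese Posters in Harvard-Yenching Manchukuo Collection",
              "Linked Archive of Asian Postcards", 4]

lengths = []   # one length per item
problems = []  # positions of the items that could not be split

for position, item in enumerate(collection):
    try:
        lengths.append(len(list(item)))
    except TypeError:
        # Plan B: count the item as empty and note where it was
        lengths.append(0)
        problems.append(position)

print(lengths)
print(problems)
```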

5) Check Documentation

If the errors come from your code itself, the easiest way to inspect the problem is to check the documentation of the function you used. If you are using Jupyter Notebook, you can highlight the function you typed and its documentation will appear, showing what inputs are expected.

You can also Google the function directly and check the examples. For example, if the following error occurs, searching for "python reverse()" shows from the documentation that the list is expected before the function, as in ad.reverse(), not inside the parentheses.

reverse(ad)
ad
---------------------------------------------------------------------------
NameError                                 Traceback (most recent call last)
<ipython-input-20-e88da2471207> in <module>()
      1 # error
----> 2 reverse(ad)
      3 ad

NameError: name 'reverse' is not defined
ad.reverse()
ad
['女界寶、非洲樹皮丸、助肺呼吸香膠、家普魚肝油、清血解毒海波藥、納佛補天汁、良丹(五洲大藥房)', '商務印書館發行書目介紹']
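
The documentation can also be read from inside Python itself. The sketch below prints the built-in description of list.reverse and shows one detail that is easy to miss: reverse() changes the list in place and returns None. The second advertisement is shortened here for readability, and `result` and `ad_copy` are hypothetical names.

```python
ad = ["商務印書館發行書目介紹", "女界寶"]

# Every built-in method carries a short description;
# help(list.reverse) shows a longer version of the same text
print(list.reverse.__doc__)

# reverse() changes the list in place and returns None
result = ad.reverse()
print(result)  # None
print(ad)

# reversed() is a built-in function that leaves the original list unchanged
ad_copy = list(reversed(ad))
print(ad_copy)
```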



Previous Lesson: Coding Practice Basics

Next Lesson: Functions and Loops Basics


Additional information

This notebook is provided for educational purposes. Feel free to report any issues on GitHub.


Author: Ka Hei, Chow

License: The code in this notebook is licensed under the Creative Commons Attribution 4.0 (CC BY 4.0) license.

Last modified: December 2021




References:

https://www.tutorialsteacher.com/python/error-types-in-python