As mentioned in the instructions, all materials can be opened in Colab as Jupyter notebooks, so users can run the code in the cloud. It is highly recommended to follow the tutorials in the right order.

This tutorial follows on from the previous Pandas tutorial, which introduced functions for working with text. Numerical operations in Pandas are equally essential when it comes to computing statistics on the data. Here we use UNESCO World Heritage sites as an example to demonstrate how to work with numbers and datetimes in Pandas. The tutorial also covers some basics of plotting with Matplotlib and Pandas.


Prerequisites: Same as the previous notebook


import io
import pandas as pd
import requests

# read data
url = 'https://examples.opendatasoft.com/explore/dataset/world-heritage-unesco-list/download/?format=csv&timezone=Europe/Berlin&lang=en&use_labels_for_header=true&csv_separator=%3B'

df = pd.read_csv(url, sep=";")
df.head(2) # head() is used for viewing the first few rows of data
Name (EN) Name (FR) Short description (EN) Short Description (FR) Justification (EN) Justification (FR) Date inscribed Danger list Longitude Latitude Area hectares Category Country (EN) Country (FR) Continent (EN) Continent (FR) Geographical coordinates
0 Architectural, Residential and Cultural Comple... Ensemble architectural, résidentiel et culture... The Architectural, Residential and Cultural Co... L’ensemble architectural, résidentiel et cultu... Criterion (ii): The architectural, residential... Critère (ii) : L’ensemble architectural, résid... 2005-01-01 NaN 26.69139 53.22278 0.0 Cultural Belarus Bélarus Europe and North America Europe et Amérique du nord 53.22278,26.69139
1 Rock Paintings of the Sierra de San Francisco Peintures rupestres de la Sierra de San Francisco From c. 100 B.C. to A.D. 1300, the Sierra de S... Dans la réserve d'El Vizcaíno, en Basse-Califo... NaN NaN 1993-01-01 NaN -112.91611 27.65556 182600.0 Cultural Mexico Mexique Latin America and the Caribbean Amérique latine et Caraïbes 27.65556,-112.91611
df["Country (EN)"] # selecting one column
0                                                 Belarus
1                                                  Mexico
2                                                 Romania
3                                                   Italy
4                                          Belgium,France
                              ...                        
1047     Bosnia and Herzegovina,Croatia,Serbia,Montenegro
1048                                                China
1049    United Kingdom of Great Britain and Northern I...
1050                                                 Chad
1051                                               France
Name: Country (EN), Length: 1052, dtype: object

By using .values, we can convert one column of the Pandas data frame to a NumPy array.

country_arr = df["Country (EN)"].values # to numpy
country_arr
array(['Belarus', 'Mexico', 'Romania', ...,
       'United Kingdom of Great Britain and Northern Ireland', 'Chad',
       'France'], dtype=object)
import numpy as np

unique_name = np.unique(country_arr) # sorted list of countries; np.unique() returns each value only once, no matter how many times it appears
unique_name[:3] # check the first three countries only
array(['Afghanistan', 'Albania', 'Algeria'], dtype=object)
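
As a small illustration of the step above, the sketch below uses a made-up column (not the UNESCO table) to compare np.unique() with the Pandas methods unique() and nunique(). It uses to_numpy(), which the Pandas documentation recommends over .values; the variable names are hypothetical.

```python
import numpy as np
import pandas as pd

# Toy column with repeated values (hypothetical data, not the UNESCO table)
countries = pd.Series(["Mexico", "China", "Mexico", "Italy", "China"])

# np.unique() returns each distinct value once, in sorted order
unique_sorted = np.unique(countries.to_numpy())

# Series.unique() also removes duplicates but keeps the order of first appearance
unique_in_order = countries.unique()

# Series.nunique() only counts the distinct values
n_unique = countries.nunique()

print(unique_sorted)
print(unique_in_order)
print(n_unique)
```

If the sorted order matters (for example for an alphabetical list of countries), np.unique() is convenient; if the original order matters, unique() may be the better fit.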

Data Inspection

We can inspect our data frame by filtering, querying, and subsetting. For example, we can check all the entries from China. Let's first filter the relevant columns (Name, Category and Country) from our data frame.

df.filter(items=["Name (EN)","Category","Country (EN)"])
Name (EN) Category Country (EN)
0 Architectural, Residential and Cultural Comple... Cultural Belarus
1 Rock Paintings of the Sierra de San Francisco Cultural Mexico
2 Monastery of Horezu Cultural Romania
3 Mount Etna Natural Italy
4 Belfries of Belgium and France Cultural Belgium,France
... ... ... ...
1047 Stećci Medieval Tombstones Graveyards Cultural Bosnia and Herzegovina,Croatia,Serbia,Montenegro
1048 Jiuzhaigou Valley Scenic and Historic Interest... Natural China
1049 Blenheim Palace Cultural United Kingdom of Great Britain and Northern I...
1050 Lakes of Ounianga Natural Chad
1051 Mont-Saint-Michel and its Bay Cultural France

1052 rows × 3 columns
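
To make the behaviour of filter() more concrete, here is a minimal sketch on a made-up data frame (site names and the extra column are hypothetical). It compares filter(items=...) with plain bracket selection and shows the like= option.

```python
import pandas as pd

# Toy data frame with hypothetical sites
sites = pd.DataFrame({
    "Name (EN)": ["Site A", "Site B"],
    "Category": ["Natural", "Cultural"],
    "Country (EN)": ["Italy", "France"],
})

# Both lines select the same two columns
by_filter = sites.filter(items=["Name (EN)", "Category"])
by_brackets = sites[["Name (EN)", "Category"]]

# filter() silently skips labels that do not exist,
# whereas sites[["Name (EN)", "Not a column"]] would raise a KeyError
tolerant = sites.filter(items=["Name (EN)", "Not a column"])

# like= keeps every column whose label contains the given substring
en_columns = sites.filter(like="(EN)")

print(by_filter)
print(list(tolerant.columns))
print(list(en_columns.columns))
```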

We can also query the data. For example, we can query heritage sites that are in China and belong to the Cultural category. Remember that an operation only changes the data frame permanently if its result is assigned back to the data frame (or if inplace=True is used); otherwise the result is temporary. query() takes a query string. Column names that contain spaces must be enclosed in backticks (`), and single and double quotation marks must be mixed so that the quotes inside the query do not end the string early.

For example, "`Country (EN)` == 'China' & Category == 'Cultural'" will be okay, but "`Country (EN)` == "China" & Category == "Cultural"" will run into errors, because the inner double quotes terminate the string.

df.query("`Country (EN)` == 'China' & Category == 'Cultural'")
Name (EN) Name (FR) Short description (EN) Short Description (FR) Justification (EN) Justification (FR) Date inscribed Danger list Longitude Latitude Area hectares Category Country (EN) Country (FR) Continent (EN) Continent (FR) Geographical coordinates
32 Tusi Sites Sites du tusi Located in the mountainous areas of south-west... Situé dans les régions montagneuses du sud-oue... NaN NaN 2015-01-01 NaN 109.966944 28.998611 781.2800 Cultural China Chine Asia and the Pacific Asie et pacifique 28.9986111111,109.966944444
38 The Great Wall La Grande Muraille In c. 220 B.C., under Qin Shi Huang, sections ... Vers 220 av. J.-C., Qin Shin Huang entreprit d... NaN NaN 1987-01-01 NaN 116.083330 40.416670 2151.5500 Cultural China Chine Asia and the Pacific Asie et pacifique 40.41667,116.08333
68 Mausoleum of the First Qin Emperor Mausolée du premier empereur Qin No doubt thousands of statues still remain to ... Sur ce site archéologique qui ne fut découvert... NaN NaN 1987-01-01 NaN 109.100000 34.383333 0.0000 Cultural China Chine Asia and the Pacific Asie et pacifique 34.38333333,109.1
81 Site of Xanadu Site de Xanadu North of the Great Wall, the Site of Xanadu en... Situé au nord de la Grande Muraille, ce site d... NaN NaN 2012-01-01 NaN 116.185128 42.358000 25131.2700 Cultural China Chine Asia and the Pacific Asie et pacifique 42.358,116.185127778
127 Temple and Cemetery of Confucius and the Kong ... Temple et cimetière de Confucius et résidence ... The temple, cemetery and family mansion of Con... Le temple, le cimetière et la demeure de famil... NaN NaN 1994-01-01 NaN 116.975000 35.611670 0.0000 Cultural China Chine Asia and the Pacific Asie et pacifique 35.61167,116.975
134 Fujian <em>Tulou</em> <em>Tulou</em> du Fujian Fujian Tulou is a property of 46 buildings con... Le site des Tulou du Fujian, comprend 46 maiso... NaN NaN 2008-01-01 NaN 117.685833 25.023056 152.6500 Cultural China Chine Asia and the Pacific Asie et pacifique 25.0230555556,117.685833333
165 Dazu Rock Carvings Sculptures rupestres de Dazu The steep hillsides of the Dazu area contain a... Les montagnes abruptes de la région de Dazu ab... Criterion (i): The Dazu carvings represent the... Critère (i) : De par leur grande qualité esthé... 1999-01-01 NaN 105.705000 29.701110 20.4100 Cultural China Chine Asia and the Pacific Asie et pacifique 29.70111,105.705
219 Lushan National Park Parc national de Lushan Mount Lushan, in Jiangxi, is one of the spirit... Le site du mont Lushan, dans le Jiangxi, const... The Committee decided to inscribe this propert... Le Comité a décidé d'inscrire ce bien sur la b... 1996-01-01 NaN 115.866667 29.433333 0.0000 Cultural China Chine Asia and the Pacific Asie et pacifique 29.43333333,115.8666667
220 Imperial Tombs of the Ming and Qing Dynasties Tombes impériales des dynasties Ming et Qing It represents the addition of three Imperial T... L’extension ajoute trois tombes impériales de ... Criterion (i): The harmonious integration of r... Critère (i) : l'intégration harmonieuse d'ense... 2000-01-01 NaN 124.793889 41.707222 3434.9399 Cultural China Chine Asia and the Pacific Asie et pacifique 41.70722222,124.7938889
449 Imperial Palaces of the Ming and Qing Dynastie... Palais impériaux des dynasties Ming et Qing à ... Seat of supreme power for over five centuries ... Siège du pouvoir suprême pendant plus de cinq ... Criterion (i): The Imperial Palaces represent ... Critère (i) : Les Palais impériaux représenten... 1987-01-01 NaN 123.446944 41.794167 12.9600 Cultural China Chine Asia and the Pacific Asie et pacifique 41.79416667,123.4469444
457 Mountain Resort and its Outlying Temples, Chengde Résidence de montagne et temples avoisinants à... The Mountain Resort (the Qing dynasty's summer... La résidence de montagne, palais d'été de la d... NaN NaN 1994-01-01 NaN 117.938330 40.986940 0.0000 Cultural China Chine Asia and the Pacific Asie et pacifique 40.98694,117.93833
459 West Lake Cultural Landscape of Hangzhou Paysage culturel du lac de l’Ouest de Hangzhou The West Lake Cultural Landscape of Hangzhou, ... Le paysage inscrit a inspiré des poètes, artis... NaN NaN 2011-01-01 NaN 120.140833 30.237500 3322.8800 Cultural China Chine Asia and the Pacific Asie et pacifique 30.2375,120.140833333
545 Cultural Landscape of Honghe Hani Rice Terraces Paysage culturel des rizières en terrasse des ... The Cultural Landscape of Honghe Hani Rice Ter... Ce site de 16 603 hectares est situé dans le s... NaN NaN 2013-01-01 NaN 102.779981 23.093278 16603.2200 Cultural China Chine Asia and the Pacific Asie et pacifique 23.0932777778,102.779980556
590 Ancient Building Complex in the Wudang Mountains Ensemble de bâtiments anciens des montagnes de... The palaces and temples which form the nucleus... Les palais et temples qui constituent le noyau... NaN NaN 1994-01-01 NaN 111.000000 32.466670 0.0000 Cultural China Chine Asia and the Pacific Asie et pacifique 32.46667,111.0
601 Historic Centre of Macao Centre historique de Macao Macao, a lucrative port of strategic importanc... Macao, riche port marchand d’une grande import... Criterion (ii): The strategic location of Maca... Critère (ii) : L’emplacement stratégique de Ma... 2005-01-01 NaN 113.536461 22.191292 16.1678 Cultural China Chine Asia and the Pacific Asie et pacifique 22.1912919444,113.536461111
605 Yin Xu Yin Xu The archaeological site of Yin Xu, close to An... Le site archéologique de Yin Xu, proche de la ... NaN NaN 2006-01-01 NaN 114.313889 36.126667 414.0000 Cultural China Chine Asia and the Pacific Asie et pacifique 36.1266666666,114.313888889
628 Yungang Grottoes Grottes de Yungang The Yungang Grottoes, in Datong city, Shanxi P... Les grottes de Yungang, à Datong, province du ... Criterion (i): The assemblage of statuary of t... Critère (i) : L’ensemble de la statuaire des g... 2001-01-01 NaN 113.122220 40.109720 348.7500 Cultural China Chine Asia and the Pacific Asie et pacifique 40.10972,113.12222
653 Peking Man Site at Zhoukoudian Site de l'homme de Pékin à Zhoukoudian Scientific work at the site, which lies 42 km ... À 42 km au sud-ouest de Pékin, le site, dont l... NaN NaN 1987-01-01 NaN 115.916667 39.733333 480.0000 Cultural China Chine Asia and the Pacific Asie et pacifique 39.73333333,115.9166667
671 Historic Ensemble of the Potala Palace, Lhasa Ensemble historique du Palais du Potala, Lhasa The Potala Palace, winter palace of the Dalai ... Le palais du Potala, palais d'hiver du dalaï-l... NaN NaN 1994-01-01 NaN 91.117170 29.657920 60.5000 Cultural China Chine Asia and the Pacific Asie et pacifique 29.65792,91.11717
739 Summer Palace, an Imperial Garden in Beijing Palais d'Été, Jardin impérial de Beijing The Summer Palace in Beijing – first built in ... Le palais d'Été de Beijing, créé en 1750, détr... Criterion i: The Summer Palace in Beijing is a... Critère i : le Palais d'Eté de Beijing est une... 1998-01-01 NaN 116.141111 39.910556 297.0000 Cultural China Chine Asia and the Pacific Asie et pacifique 39.91055556,116.1411111
743 Mount Qingcheng and the Dujiangyan Irrigation ... Mont Qingcheng et système d’irrigation de Duji... Construction of the Dujiangyan irrigation syst... La construction du système d'irrigation de Duj... Criterion (ii): The Dujiangyan Irrigation Syst... Critère (ii) : Le système d’irrigation de Duji... 2000-01-01 NaN 103.605280 31.001670 0.0000 Cultural China Chine Asia and the Pacific Asie et pacifique 31.00167,103.60528
746 Historic Monuments of Dengfeng in “The Centre ... Monuments historiques de Dengfeng au « centre ... Mount Songshang is considered to be the centra... Songshang est considéré comme le mont sacré ce... NaN NaN 2010-01-01 NaN 113.067719 34.458747 825.0000 Cultural China Chine Asia and the Pacific Asie et pacifique 34.4587472222,113.067719444
769 The Grand Canal Le Grand Canal The Grand Canal is a vast waterway system in t... Ce vaste système de navigation intérieure au s... NaN NaN 2014-01-01 NaN 112.468333 34.693889 20819.1100 Cultural China Chine Asia and the Pacific Asie et pacifique 34.6938888889,112.468333333
772 Old Town of Lijiang Vieille ville de Lijiang The Old Town of Lijiang, which is perfectly ad... La vieille ville de Lijiang, harmonieusement a... The Committee decided to inscribe this site on... Le Comité a décidé d’inscrire ce site sur la b... 1997-01-01 NaN 100.233330 26.866670 145.6000 Cultural China Chine Asia and the Pacific Asie et pacifique 26.86667,100.23333
808 Kaiping Diaolou and Villages Diaolou et villages de Kaiping Kaiping Diaolou and Villages feature the Diaol... Les diaolou, maisons fortifiées de village de ... NaN NaN 2007-01-01 NaN 112.565861 22.285519 371.9480 Cultural China Chine Asia and the Pacific Asie et pacifique 22.2855194444,112.565861111
839 Ancient Villages in Southern Anhui – Xidi and ... Anciens villages du sud du Anhui – Xidi et Hon... The two traditional villages of Xidi and Hongc... Les deux villages traditionnels de Xidi et de ... Criterion (iii): The villages of Xidi and Hong... Critère (iii) : Les villages de Xidi et de Hon... 2000-01-01 NaN 117.987500 29.904444 52.0000 Cultural China Chine Asia and the Pacific Asie et pacifique 29.90444444,117.9875
851 Zuojiang Huashan Rock Art Cultural Landscape Paysage culturel de l’art rupestre de Zuojiang... Located on the steep cliffs in the border regi... Situés sur des falaises abruptes dans les régi... NaN NaN 2016-01-01 NaN 107.023056 22.255556 6621.6000 Cultural China Chine Asia and the Pacific Asie et pacifique 22.2555555556,107.023055556
883 Classical Gardens of Suzhou Jardins classiques de Suzhou Classical Chinese garden design, which seeks t... Le paysagisme classique chinois, qui cherche à... The Committee decided to inscribe this propert... Le Comité a décidé d'inscrire ce bien sur la b... 1997-01-01 NaN 120.450000 31.316667 11.9220 Cultural China Chine Asia and the Pacific Asie et pacifique 31.31666667,120.45
903 Ancient City of Ping Yao Vieille ville de Ping Yao Ping Yao is an exceptionally well-preserved ex... Ping Yao est un exemple exceptionnellement bie... The Committee decided to inscribe this propert... Le Comité a décidé d'inscrire ce bien sur la b... 1997-01-01 NaN 112.154440 37.201390 245.6200 Cultural China Chine Asia and the Pacific Asie et pacifique 37.20139,112.15444
921 Longmen Grottoes Grottes de Longmen The grottoes and niches of Longmen contain the... Les grottes et niches de Longmen abritent le p... Criterion (i): The sculptures of the Longmen G... Critère (i) : Les sculptures des grottes de Lo... 2000-01-01 NaN 112.466667 34.466667 331.0000 Cultural China Chine Asia and the Pacific Asie et pacifique 34.46666667,112.4666667
930 Temple of Heaven: an Imperial Sacrificial Alta... Temple du Ciel, autel sacrificiel impérial à B... The Temple of Heaven, founded in the first hal... Fondé dans la première moitié du XVe siècle, l... Criterion i: The Temple of Heaven is a masterp... Critère i : Le Temple du Ciel est un chef-d'œu... 1998-01-01 NaN 116.444722 39.845556 215.0000 Cultural China Chine Asia and the Pacific Asie et pacifique 39.84555556,116.4447222
971 Mount Wutai Mont Wutai With its five flat peaks, Mount Wutai is a sac... Avec ses cinq plateaux, le Mont Wutai est une ... NaN NaN 2009-01-01 NaN 113.563333 39.030556 18415.0000 Cultural China Chine Asia and the Pacific Asie et pacifique 39.0305555556,113.563333333
1008 Mogao Caves Grottes de Mogao Situated at a strategic point along the Silk R... Situées en un point stratégique de la Route de... NaN NaN 1987-01-01 NaN 94.816670 40.133330 0.0000 Cultural China Chine Asia and the Pacific Asie et pacifique 40.13333,94.81667
1010 Capital Cities and Tombs of the Ancient Kogury... Capitales et tombes de l’ancien royaume de Kog... The site includes archaeological remains of th... Ce site comprend les vestiges archéologiques d... Criterion (i): The tombs represent a masterpie... Critère (i) : Les tombes représentent un chef ... 2004-01-01 NaN 126.187222 41.156944 4164.8599 Cultural China Chine Asia and the Pacific Asie et pacifique 41.15694444,126.1872222
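
The quoting rules for query() can be tried out on a small scale. The sketch below uses a made-up data frame (hypothetical site names) and also shows the @ prefix, which lets a query string refer to a Python variable so that no nested quotes are needed for that value.

```python
import pandas as pd

# Toy data frame with hypothetical sites
sites = pd.DataFrame({
    "Name (EN)": ["Site A", "Site B", "Site C"],
    "Category": ["Cultural", "Natural", "Cultural"],
    "Country (EN)": ["China", "China", "Italy"],
})

# Double quotes outside, single quotes inside, backticks around the
# column name that contains a space
result = sites.query("`Country (EN)` == 'China' & Category == 'Cultural'")

# @country refers to the Python variable defined below
country = "China"
result_var = sites.query("`Country (EN)` == @country and Category == 'Cultural'")

print(result)
```

Inside a query string, & and the keyword and can be used interchangeably to combine conditions.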

To avoid this quoting confusion, another option is to subset the data frame without a query string, simply using []. However, we then need parentheses () to group the conditions so that they are evaluated in the right order.

For example:

df["Country (EN)"] == "China" & df["Category"] == "Cultural"

In the above line, nothing groups the two comparisons, and in Python & binds more tightly than ==. The expression is therefore interpreted as:

df["Country (EN)"] == ("China" & df["Category"]) == "Cultural"

which will run into errors.

What we need to do is to group them:

(df["Country (EN)"] == "China") & (df["Category"] == "Cultural")

so we make sure it will be interpreted as:

(col-A == a) & (col-B == b)

df[(df["Country (EN)"] == "China") & (df["Category"] == "Cultural")]
Name (EN) Name (FR) Short description (EN) Short Description (FR) Justification (EN) Justification (FR) Date inscribed Danger list Longitude Latitude Area hectares Category Country (EN) Country (FR) Continent (EN) Continent (FR) Geographical coordinates
32 Tusi Sites Sites du tusi Located in the mountainous areas of south-west... Situé dans les régions montagneuses du sud-oue... NaN NaN 2015-01-01 NaN 109.966944 28.998611 781.2800 Cultural China Chine Asia and the Pacific Asie et pacifique 28.9986111111,109.966944444
38 The Great Wall La Grande Muraille In c. 220 B.C., under Qin Shi Huang, sections ... Vers 220 av. J.-C., Qin Shin Huang entreprit d... NaN NaN 1987-01-01 NaN 116.083330 40.416670 2151.5500 Cultural China Chine Asia and the Pacific Asie et pacifique 40.41667,116.08333
68 Mausoleum of the First Qin Emperor Mausolée du premier empereur Qin No doubt thousands of statues still remain to ... Sur ce site archéologique qui ne fut découvert... NaN NaN 1987-01-01 NaN 109.100000 34.383333 0.0000 Cultural China Chine Asia and the Pacific Asie et pacifique 34.38333333,109.1
81 Site of Xanadu Site de Xanadu North of the Great Wall, the Site of Xanadu en... Situé au nord de la Grande Muraille, ce site d... NaN NaN 2012-01-01 NaN 116.185128 42.358000 25131.2700 Cultural China Chine Asia and the Pacific Asie et pacifique 42.358,116.185127778
127 Temple and Cemetery of Confucius and the Kong ... Temple et cimetière de Confucius et résidence ... The temple, cemetery and family mansion of Con... Le temple, le cimetière et la demeure de famil... NaN NaN 1994-01-01 NaN 116.975000 35.611670 0.0000 Cultural China Chine Asia and the Pacific Asie et pacifique 35.61167,116.975
134 Fujian <em>Tulou</em> <em>Tulou</em> du Fujian Fujian Tulou is a property of 46 buildings con... Le site des Tulou du Fujian, comprend 46 maiso... NaN NaN 2008-01-01 NaN 117.685833 25.023056 152.6500 Cultural China Chine Asia and the Pacific Asie et pacifique 25.0230555556,117.685833333
165 Dazu Rock Carvings Sculptures rupestres de Dazu The steep hillsides of the Dazu area contain a... Les montagnes abruptes de la région de Dazu ab... Criterion (i): The Dazu carvings represent the... Critère (i) : De par leur grande qualité esthé... 1999-01-01 NaN 105.705000 29.701110 20.4100 Cultural China Chine Asia and the Pacific Asie et pacifique 29.70111,105.705
219 Lushan National Park Parc national de Lushan Mount Lushan, in Jiangxi, is one of the spirit... Le site du mont Lushan, dans le Jiangxi, const... The Committee decided to inscribe this propert... Le Comité a décidé d'inscrire ce bien sur la b... 1996-01-01 NaN 115.866667 29.433333 0.0000 Cultural China Chine Asia and the Pacific Asie et pacifique 29.43333333,115.8666667
220 Imperial Tombs of the Ming and Qing Dynasties Tombes impériales des dynasties Ming et Qing It represents the addition of three Imperial T... L’extension ajoute trois tombes impériales de ... Criterion (i): The harmonious integration of r... Critère (i) : l'intégration harmonieuse d'ense... 2000-01-01 NaN 124.793889 41.707222 3434.9399 Cultural China Chine Asia and the Pacific Asie et pacifique 41.70722222,124.7938889
449 Imperial Palaces of the Ming and Qing Dynastie... Palais impériaux des dynasties Ming et Qing à ... Seat of supreme power for over five centuries ... Siège du pouvoir suprême pendant plus de cinq ... Criterion (i): The Imperial Palaces represent ... Critère (i) : Les Palais impériaux représenten... 1987-01-01 NaN 123.446944 41.794167 12.9600 Cultural China Chine Asia and the Pacific Asie et pacifique 41.79416667,123.4469444
457 Mountain Resort and its Outlying Temples, Chengde Résidence de montagne et temples avoisinants à... The Mountain Resort (the Qing dynasty's summer... La résidence de montagne, palais d'été de la d... NaN NaN 1994-01-01 NaN 117.938330 40.986940 0.0000 Cultural China Chine Asia and the Pacific Asie et pacifique 40.98694,117.93833
459 West Lake Cultural Landscape of Hangzhou Paysage culturel du lac de l’Ouest de Hangzhou The West Lake Cultural Landscape of Hangzhou, ... Le paysage inscrit a inspiré des poètes, artis... NaN NaN 2011-01-01 NaN 120.140833 30.237500 3322.8800 Cultural China Chine Asia and the Pacific Asie et pacifique 30.2375,120.140833333
545 Cultural Landscape of Honghe Hani Rice Terraces Paysage culturel des rizières en terrasse des ... The Cultural Landscape of Honghe Hani Rice Ter... Ce site de 16 603 hectares est situé dans le s... NaN NaN 2013-01-01 NaN 102.779981 23.093278 16603.2200 Cultural China Chine Asia and the Pacific Asie et pacifique 23.0932777778,102.779980556
590 Ancient Building Complex in the Wudang Mountains Ensemble de bâtiments anciens des montagnes de... The palaces and temples which form the nucleus... Les palais et temples qui constituent le noyau... NaN NaN 1994-01-01 NaN 111.000000 32.466670 0.0000 Cultural China Chine Asia and the Pacific Asie et pacifique 32.46667,111.0
601 Historic Centre of Macao Centre historique de Macao Macao, a lucrative port of strategic importanc... Macao, riche port marchand d’une grande import... Criterion (ii): The strategic location of Maca... Critère (ii) : L’emplacement stratégique de Ma... 2005-01-01 NaN 113.536461 22.191292 16.1678 Cultural China Chine Asia and the Pacific Asie et pacifique 22.1912919444,113.536461111
605 Yin Xu Yin Xu The archaeological site of Yin Xu, close to An... Le site archéologique de Yin Xu, proche de la ... NaN NaN 2006-01-01 NaN 114.313889 36.126667 414.0000 Cultural China Chine Asia and the Pacific Asie et pacifique 36.1266666666,114.313888889
628 Yungang Grottoes Grottes de Yungang The Yungang Grottoes, in Datong city, Shanxi P... Les grottes de Yungang, à Datong, province du ... Criterion (i): The assemblage of statuary of t... Critère (i) : L’ensemble de la statuaire des g... 2001-01-01 NaN 113.122220 40.109720 348.7500 Cultural China Chine Asia and the Pacific Asie et pacifique 40.10972,113.12222
653 Peking Man Site at Zhoukoudian Site de l'homme de Pékin à Zhoukoudian Scientific work at the site, which lies 42 km ... À 42 km au sud-ouest de Pékin, le site, dont l... NaN NaN 1987-01-01 NaN 115.916667 39.733333 480.0000 Cultural China Chine Asia and the Pacific Asie et pacifique 39.73333333,115.9166667
671 Historic Ensemble of the Potala Palace, Lhasa Ensemble historique du Palais du Potala, Lhasa The Potala Palace, winter palace of the Dalai ... Le palais du Potala, palais d'hiver du dalaï-l... NaN NaN 1994-01-01 NaN 91.117170 29.657920 60.5000 Cultural China Chine Asia and the Pacific Asie et pacifique 29.65792,91.11717
739 Summer Palace, an Imperial Garden in Beijing Palais d'Été, Jardin impérial de Beijing The Summer Palace in Beijing – first built in ... Le palais d'Été de Beijing, créé en 1750, détr... Criterion i: The Summer Palace in Beijing is a... Critère i : le Palais d'Eté de Beijing est une... 1998-01-01 NaN 116.141111 39.910556 297.0000 Cultural China Chine Asia and the Pacific Asie et pacifique 39.91055556,116.1411111
743 Mount Qingcheng and the Dujiangyan Irrigation ... Mont Qingcheng et système d’irrigation de Duji... Construction of the Dujiangyan irrigation syst... La construction du système d'irrigation de Duj... Criterion (ii): The Dujiangyan Irrigation Syst... Critère (ii) : Le système d’irrigation de Duji... 2000-01-01 NaN 103.605280 31.001670 0.0000 Cultural China Chine Asia and the Pacific Asie et pacifique 31.00167,103.60528
746 Historic Monuments of Dengfeng in “The Centre ... Monuments historiques de Dengfeng au « centre ... Mount Songshang is considered to be the centra... Songshang est considéré comme le mont sacré ce... NaN NaN 2010-01-01 NaN 113.067719 34.458747 825.0000 Cultural China Chine Asia and the Pacific Asie et pacifique 34.4587472222,113.067719444
769 The Grand Canal Le Grand Canal The Grand Canal is a vast waterway system in t... Ce vaste système de navigation intérieure au s... NaN NaN 2014-01-01 NaN 112.468333 34.693889 20819.1100 Cultural China Chine Asia and the Pacific Asie et pacifique 34.6938888889,112.468333333
772 Old Town of Lijiang Vieille ville de Lijiang The Old Town of Lijiang, which is perfectly ad... La vieille ville de Lijiang, harmonieusement a... The Committee decided to inscribe this site on... Le Comité a décidé d’inscrire ce site sur la b... 1997-01-01 NaN 100.233330 26.866670 145.6000 Cultural China Chine Asia and the Pacific Asie et pacifique 26.86667,100.23333
808 Kaiping Diaolou and Villages Diaolou et villages de Kaiping Kaiping Diaolou and Villages feature the Diaol... Les diaolou, maisons fortifiées de village de ... NaN NaN 2007-01-01 NaN 112.565861 22.285519 371.9480 Cultural China Chine Asia and the Pacific Asie et pacifique 22.2855194444,112.565861111
839 Ancient Villages in Southern Anhui – Xidi and ... Anciens villages du sud du Anhui – Xidi et Hon... The two traditional villages of Xidi and Hongc... Les deux villages traditionnels de Xidi et de ... Criterion (iii): The villages of Xidi and Hong... Critère (iii) : Les villages de Xidi et de Hon... 2000-01-01 NaN 117.987500 29.904444 52.0000 Cultural China Chine Asia and the Pacific Asie et pacifique 29.90444444,117.9875
851 Zuojiang Huashan Rock Art Cultural Landscape Paysage culturel de l’art rupestre de Zuojiang... Located on the steep cliffs in the border regi... Situés sur des falaises abruptes dans les régi... NaN NaN 2016-01-01 NaN 107.023056 22.255556 6621.6000 Cultural China Chine Asia and the Pacific Asie et pacifique 22.2555555556,107.023055556
883 Classical Gardens of Suzhou Jardins classiques de Suzhou Classical Chinese garden design, which seeks t... Le paysagisme classique chinois, qui cherche à... The Committee decided to inscribe this propert... Le Comité a décidé d'inscrire ce bien sur la b... 1997-01-01 NaN 120.450000 31.316667 11.9220 Cultural China Chine Asia and the Pacific Asie et pacifique 31.31666667,120.45
903 Ancient City of Ping Yao Vieille ville de Ping Yao Ping Yao is an exceptionally well-preserved ex... Ping Yao est un exemple exceptionnellement bie... The Committee decided to inscribe this propert... Le Comité a décidé d'inscrire ce bien sur la b... 1997-01-01 NaN 112.154440 37.201390 245.6200 Cultural China Chine Asia and the Pacific Asie et pacifique 37.20139,112.15444
921 Longmen Grottoes Grottes de Longmen The grottoes and niches of Longmen contain the... Les grottes et niches de Longmen abritent le p... Criterion (i): The sculptures of the Longmen G... Critère (i) : Les sculptures des grottes de Lo... 2000-01-01 NaN 112.466667 34.466667 331.0000 Cultural China Chine Asia and the Pacific Asie et pacifique 34.46666667,112.4666667
930 Temple of Heaven: an Imperial Sacrificial Alta... Temple du Ciel, autel sacrificiel impérial à B... The Temple of Heaven, founded in the first hal... Fondé dans la première moitié du XVe siècle, l... Criterion i: The Temple of Heaven is a masterp... Critère i : Le Temple du Ciel est un chef-d'œu... 1998-01-01 NaN 116.444722 39.845556 215.0000 Cultural China Chine Asia and the Pacific Asie et pacifique 39.84555556,116.4447222
971 Mount Wutai Mont Wutai With its five flat peaks, Mount Wutai is a sac... Avec ses cinq plateaux, le Mont Wutai est une ... NaN NaN 2009-01-01 NaN 113.563333 39.030556 18415.0000 Cultural China Chine Asia and the Pacific Asie et pacifique 39.0305555556,113.563333333
1008 Mogao Caves Grottes de Mogao Situated at a strategic point along the Silk R... Situées en un point stratégique de la Route de... NaN NaN 1987-01-01 NaN 94.816670 40.133330 0.0000 Cultural China Chine Asia and the Pacific Asie et pacifique 40.13333,94.81667
1010 Capital Cities and Tombs of the Ancient Kogury... Capitales et tombes de l’ancien royaume de Kog... The site includes archaeological remains of th... Ce site comprend les vestiges archéologiques d... Criterion (i): The tombs represent a masterpie... Critère (i) : Les tombes représentent un chef ... 2004-01-01 NaN 126.187222 41.156944 4164.8599 Cultural China Chine Asia and the Pacific Asie et pacifique 41.15694444,126.1872222
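
One way to keep such conditions readable is to store each comparison in its own variable first. The sketch below is only an illustration on made-up data (hypothetical site names); because each mask is built on a separate line, the precedence of & over == cannot cause trouble.

```python
import pandas as pd

# Toy data frame with hypothetical sites
sites = pd.DataFrame({
    "Name (EN)": ["Site A", "Site B", "Site C"],
    "Category": ["Cultural", "Natural", "Cultural"],
    "Country (EN)": ["China", "China", "Italy"],
})

# Each comparison yields a boolean Series (a "mask")
is_china = sites["Country (EN)"] == "China"
is_cultural = sites["Category"] == "Cultural"

both = sites[is_china & is_cultural]     # rows matching both conditions
either = sites[is_china | is_cultural]   # rows matching at least one condition
not_china = sites[~is_china]             # ~ negates a mask

print(both)
```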

Now, let's select all Chinese sites regardless of heritage type.

china_site = df[df["Country (EN)"] == "China"]
china_site.head(1)
Name (EN) Name (FR) Short description (EN) Short Description (FR) Justification (EN) Justification (FR) Date inscribed Danger list Longitude Latitude Area hectares Category Country (EN) Country (FR) Continent (EN) Continent (FR) Geographical coordinates
5 Sichuan Giant Panda Sanctuaries - Wolong, Mt S... Sanctuaires du grand panda du Sichuan - Wolong... Sichuan Giant Panda Sanctuaries, home to more ... Les Sanctuaires du grand panda du Sichuan abri... NaN NaN 2006-01-01 NaN 103.0 30.833333 924500.0 Natural China Chine Asia and the Pacific Asie et pacifique 30.8333333333,103.0

Check the number of sites we have in the data frame:

china_site['Name (EN)'].count()
49
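
Note that count() counts only non-missing values, so it can differ from the number of rows when a column contains NaN. A minimal sketch with made-up data (column names and values are hypothetical):

```python
import numpy as np
import pandas as pd

# Toy data frame with one missing name and one missing area
sites = pd.DataFrame({
    "name": ["Site A", "Site B", None],
    "area": [10.5, np.nan, 3.0],
})

n_rows = len(sites)               # number of rows, missing values included
n_names = sites["name"].count()   # number of non-missing names
per_column = sites.count()        # non-missing count for every column

print(n_rows, n_names)
print(per_column)
```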

As the column names are a bit confusing with their spaces and capital letters, we will rename the columns:

china_site = china_site[["Name (EN)","Date inscribed","Category"]] # select multiple columns in a list []
china_site = china_site.rename(columns={"Name (EN)": "name", "Date inscribed": "date", "Category": "type"}) # rename the columns for easy reading

china_site.head(1) # check the updates
name date type
5 Sichuan Giant Panda Sanctuaries - Wolong, Mt S... 2006-01-01 Natural

Date Time

Pandas data frames also support datetimes for time series analysis. But first, we need to parse the column as datetime using to_datetime(). This tells Pandas that the date column holds datetime objects, not just strings.

china_site['date'] =  pd.to_datetime(china_site["date"])
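
The effect of to_datetime() is easiest to see on a few strings. The sketch below uses made-up input, including one deliberately invalid entry; format= and errors="coerce" are optional arguments that are not used in this tutorial's own code.

```python
import pandas as pd

# Toy strings; the last entry is deliberately not a date
dates = pd.Series(["2006-01-01", "1987-01-01", "not a date"])

# format= states the expected layout explicitly;
# errors="coerce" turns unparseable entries into NaT instead of raising an error
parsed = pd.to_datetime(dates, format="%Y-%m-%d", errors="coerce")

print(parsed.dtype)         # a datetime64 dtype instead of plain strings
print(parsed.isna().sum())  # number of entries that could not be parsed
```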

By converting the column to datetime objects, we can do multiple operations inside the data frame, such as extracting only the year. This can be done with .year. As the year is more relevant for us than the month and day, we will add a year column and remove the original date column.

china_site['year'] = pd.DatetimeIndex(china_site['date']).year # set up a new year column
china_site = china_site.drop(columns=['date']) # remove the original date column

china_site.head(1) # check the first row
name type year
5 Sichuan Giant Panda Sanctuaries - Wolong, Mt S... Natural 2006
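
As an alternative sketch, the .dt accessor of a datetime column gives the same year values without building a DatetimeIndex first. The data below is made up for illustration.

```python
import pandas as pd

# Toy data frame with a datetime column
sites = pd.DataFrame({
    "name": ["Site A", "Site B"],
    "date": pd.to_datetime(["2006-01-01", "1987-01-01"]),
})

# .dt exposes datetime attributes such as year, month and day
sites["year"] = sites["date"].dt.year

# Same values via DatetimeIndex, the approach used in this tutorial
same = pd.DatetimeIndex(sites["date"]).year

print(sites)
```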

UNESCO Sites


Using the china_site variable, we can check the number of sites inscribed each year using groupby(). After groupby(), we keep only the name column and count the number of rows using count(). This returns a Pandas series (pandas.core.series.Series), not a data frame. We convert it back to a data frame using reset_index(), passing name="count" to tell Pandas that the new column will be called "count".

count = china_site.groupby("year")["name"].count()
count
year
1987    6
1990    1
1992    3
1994    4
1996    2
1997    3
1998    2
1999    2
2000    4
2001    1
2003    1
2004    1
2005    1
2006    2
2007    2
2008    2
2009    1
2010    2
2011    1
2012    2
2013    2
2014    1
2015    1
2016    2
Name: name, dtype: int64
count_df = count.reset_index(name="count")
count_df
year count
0 1987 6
1 1990 1
2 1992 3
3 1994 4
4 1996 2
5 1997 3
6 1998 2
7 1999 2
8 2000 4
9 2001 1
10 2003 1
11 2004 1
12 2005 1
13 2006 2
14 2007 2
15 2008 2
16 2009 1
17 2010 2
18 2011 1
19 2012 2
20 2013 2
21 2014 1
22 2015 1
23 2016 2
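To make the Series-to-DataFrame step easier to follow, here is a small sketch on a hypothetical four-row frame. It also shows size(), an alternative that counts rows per group without selecting a column first:

```python
import pandas as pd

# Hypothetical miniature version of china_site
sites = pd.DataFrame({
    "name": ["Site A", "Site B", "Site C", "Site D"],
    "year": [1987, 1987, 1990, 1992],
})

# groupby + count returns a Series indexed by year
per_year = sites.groupby("year")["name"].count()

# reset_index(name=...) turns it into a DataFrame with a named count column
per_year_df = per_year.reset_index(name="count")

# size() counts rows per group without selecting a column first
per_year_size = sites.groupby("year").size().reset_index(name="count")

print(per_year_df)
```

Note that count() ignores missing values in the selected column, while size() counts every row, so the two can differ when names are missing.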
Cumulative totals of the heritage sites

The table above shows the number of sites inscribed in China each year. But what if we want to know the total number of heritage sites China had by each year? We can use cumsum(), which stands for cumulative sum. Let's put the running total into a new column called "total".

count_df["total"] = count_df["count"].cumsum()
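As a small worked example, the sketch below applies cumsum() to the first five yearly counts from the table above and checks that the last running total equals the plain sum:

```python
import pandas as pd

# First five yearly counts from the table above
counts = pd.Series([6, 1, 3, 4, 2], index=[1987, 1990, 1992, 1994, 1996])

running_total = counts.cumsum()
print(running_total.tolist())  # [6, 7, 10, 14, 16]

# The last running total equals the plain sum of all counts
assert running_total.iloc[-1] == counts.sum()
```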
Set Index

Let's set the year column as the index.

count_df = count_df.set_index("year")
count_df.head()
count total
year
1987 6 6
1990 1 7
1992 3 10
1994 4 14
1996 2 16
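One benefit of using the year as the index is that rows can be looked up by year with .loc. The following sketch rebuilds the five rows shown above (values copied from the output) to illustrate this:

```python
import pandas as pd

# Values copied from the first rows of count_df shown above
count_df = pd.DataFrame({
    "year": [1987, 1990, 1992, 1994, 1996],
    "count": [6, 1, 3, 4, 2],
    "total": [6, 7, 10, 14, 16],
}).set_index("year")

# .loc selects by index label, here a year
print(count_df.loc[1992, "total"])  # 10

# Slicing by label includes both end points
print(count_df.loc[1990:1994, "count"].tolist())  # [1, 3, 4]
```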

Visualization

Plotting from a Pandas data frame is fairly easy. We only need to call .plot() after selecting the column(s) we need (multiple column names must be put in a list using []). To customize the layout, we also need to import the matplotlib library.

import matplotlib.pyplot as plt # import library

plt.figure(figsize=(15,5)) # optional, define figure size
count_df.total.plot(color="red") # plot, add color argument
plt.xlabel("Year") # x label
plt.ylabel("Number of UNESCO sites") # y label
plt.title("Increasing number of UNESCO sites in China") # title
Text(0.5, 1.0, 'Increasing number of UNESCO sites in China')

We can also plot a table instead.

data = {"Year": count_df.index.values, "Total Sites": count_df.total.values}
table_df = pd.DataFrame(data) # use a new name so that the original df is not overwritten

fig, ax = plt.subplots(1, 1)

# Hide axes
ax.xaxis.set_visible(False)
ax.yaxis.set_visible(False)
ax.axis('tight')
ax.axis('off')

ax.table(cellText=table_df.values, colLabels=table_df.columns, loc='center')
plt.show()

To export the data for further use, we can save it as a CSV file.

from google.colab import files # only available inside Google Colab
table_df.to_csv('UNESCO.csv', encoding='utf_8_sig', index=False) # utf_8_sig writes a byte order mark (BOM) at the start of the file
files.download('UNESCO.csv') # download the file from Colab to the local machine

Or print it as LaTeX.

print(table_df.to_latex(index=False))
\begin{tabular}{rr}
\toprule
 Year &  Total Sites \\
\midrule
 1987 &            6 \\
 1990 &            7 \\
 1992 &           10 \\
 1994 &           14 \\
 1996 &           16 \\
 1997 &           19 \\
 1998 &           21 \\
 1999 &           23 \\
 2000 &           27 \\
 2001 &           28 \\
 2003 &           29 \\
 2004 &           30 \\
 2005 &           31 \\
 2006 &           33 \\
 2007 &           35 \\
 2008 &           37 \\
 2009 &           38 \\
 2010 &           40 \\
 2011 &           41 \\
 2012 &           43 \\
 2013 &           45 \\
 2014 &           46 \\
 2015 &           47 \\
 2016 &           49 \\
\bottomrule
\end{tabular}
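Outside Colab, the files.download() helper is not available, but to_csv() on its own is enough to write the file. As a minimal sketch (using the first three rows of the table and an in-memory buffer instead of a file on disk), we can check that an exported table can be read back unchanged:

```python
import io
import pandas as pd

# First three rows of the table above
table = pd.DataFrame({"Year": [1987, 1990, 1992], "Total Sites": [6, 7, 10]})

# Write the CSV into an in-memory text buffer instead of a file on disk
buffer = io.StringIO()
table.to_csv(buffer, index=False)

# Rewind the buffer and read the CSV back into a new data frame
buffer.seek(0)
restored = pd.read_csv(buffer)

print(restored)
```

Replacing the buffer with a file name such as 'UNESCO.csv' writes to the current working directory instead.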

Plotting

Do some plotting using our data.

Let's say we are interested in how UNESCO site inscriptions have progressed in different countries. We want to create a plot to see which countries have the largest share of sites and how the trend develops over time. Creating the plot is typically the final step of data visualization. Before that, some steps need to be done.


Typical Workflow for Visualization:

1) Get data: grab them online or offline

2) Clean data: get rid of missing data, remove irrelevant information, group parameters, etc.

3) Decide on the type of visualization: what type of chart (depends on your message, purpose and the nature of the data)? By what parameters (e.g. by year, by country, or focusing on one country only)?

4) Prepare the data in the format fitting your visualization type (wide or long format?)
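To illustrate the difference between long and wide format mentioned in step 4, here is a small sketch with hypothetical numbers. It assumes pivot() for long to wide and melt() for wide to long, which is one common way to switch between the two:

```python
import pandas as pd

# Hypothetical yearly counts in long format: one row per (year, country) pair
long_df = pd.DataFrame({
    "year": [1987, 1987, 1990, 1990],
    "country": ["China", "Italy", "China", "Italy"],
    "sites": [6, 2, 1, 3],
})

# Long -> wide: one row per year, one column per country
wide_df = long_df.pivot(index="year", columns="country", values="sites")
print(wide_df)

# Wide -> long again: country names go back into a single column
back_to_long = wide_df.reset_index().melt(
    id_vars="year", var_name="country", value_name="sites"
)
print(back_to_long)
```

Line charts drawn with Pandas usually expect the wide format (one line per column), while many seaborn functions also accept the long format.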


Data Preparation

The first thing we need to do is to collect the 10 countries with the most UNESCO sites.

df = df[['Country (EN)',"Name (EN)","Date inscribed","Category"]] # select multiple columns in a list []
df = df.rename(columns={"Country (EN)": "country","Name (EN)": "name", "Date inscribed": "date", "Category": "type"}) # rename the columns for easy reading
top_10 = df.groupby(df["country"]).count().sort_values(by=['name'], ascending=False).head(10)
top_10
name date type
country
China 49 49 49
Italy 47 47 47
Spain 41 41 41
France 38 38 38
Germany 35 35 35
Mexico 34 34 34
India 33 33 33
United Kingdom of Great Britain and Northern Ireland 27 27 27
Russian Federation 21 21 21
Iran (Islamic Republic of) 21 21 21

Then we convert the countries to a NumPy array and save it to a variable sub_cnty.

sub_cnty = top_10.index.values
sub_cnty
array(['China', 'Italy', 'Spain', 'France', 'Germany', 'Mexico', 'India',
       'United Kingdom of Great Britain and Northern Ireland',
       'Russian Federation', 'Iran (Islamic Republic of)'], dtype=object)

top_df is the data frame that includes the top 10 countries only. As we aim to plot the number of heritage sites inscribed each year for each country, we first filter the rows to the top 10 countries using df['country'].isin(sub_cnty) and then use groupby() on both country and date.

After groupby() we apply the count() method and select only the name column (country and date are kept as well, because they are the keys used to group by).

In order to get a data frame as output, we need to use reset_index().

top_df = df[df['country'].isin(sub_cnty)].groupby(['country','date']).count()['name'].reset_index()
top_df.head(5)
country date name
0 China 1987-01-01 6
1 China 1990-01-01 1
2 China 1992-01-01 3
3 China 1994-01-01 4
4 China 1996-01-01 2
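The one-line expression above chains several steps. The sketch below splits them up on a hypothetical five-row frame, so that each intermediate result can be inspected:

```python
import pandas as pd

# Hypothetical rows; "Peru" stands in for a country outside the selection
sites = pd.DataFrame({
    "country": ["China", "China", "Italy", "Peru", "Italy"],
    "date": ["1987-01-01", "1987-01-01", "1987-01-01", "1983-01-01", "1990-01-01"],
    "name": ["Site A", "Site B", "Site C", "Site D", "Site E"],
})
keep = ["China", "Italy"]

# Step 1: isin() returns a boolean Series, True where the country is selected
mask = sites["country"].isin(keep)

# Step 2: keep only the rows where the mask is True
filtered = sites[mask]

# Step 3: count the names per (country, date) pair and return a data frame
counts = filtered.groupby(["country", "date"])["name"].count().reset_index()
print(counts)
```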

pivot() from Pandas converts a data frame from long to wide format. In this case, it turns every unique item in the country column into a separate column. Combinations of date and country without any inscription become NaN, so we fill them with 0 using fillna(0).

pivot = top_df.pivot(index='date', columns='country', values='name')
pivot = pivot.fillna(0)
pivot.head(10)
country China France Germany India Iran (Islamic Republic of) Italy Mexico Russian Federation Spain United Kingdom of Great Britain and Northern Ireland
date
1978-01-01 0.0 0.0 1.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0
1979-01-01 0.0 5.0 0.0 0.0 3.0 1.0 0.0 0.0 0.0 0.0
1980-01-01 0.0 0.0 0.0 0.0 0.0 1.0 0.0 0.0 0.0 0.0
1981-01-01 0.0 5.0 2.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0
1982-01-01 0.0 1.0 0.0 0.0 0.0 1.0 0.0 0.0 0.0 0.0
1983-01-01 0.0 3.0 1.0 4.0 0.0 0.0 0.0 0.0 0.0 0.0
1984-01-01 0.0 0.0 1.0 2.0 0.0 0.0 0.0 0.0 5.0 0.0
1985-01-01 0.0 1.0 1.0 3.0 0.0 0.0 0.0 0.0 5.0 0.0
1986-01-01 0.0 0.0 1.0 4.0 0.0 0.0 0.0 0.0 4.0 7.0
1987-01-01 6.0 0.0 1.0 3.0 0.0 2.0 6.0 0.0 1.0 2.0
pivot = pivot.cumsum()
pivot.head(10)
country China France Germany India Iran (Islamic Republic of) Italy Mexico Russian Federation Spain United Kingdom of Great Britain and Northern Ireland
date
1978-01-01 0.0 0.0 1.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0
1979-01-01 0.0 5.0 1.0 0.0 3.0 1.0 0.0 0.0 0.0 0.0
1980-01-01 0.0 5.0 1.0 0.0 3.0 2.0 0.0 0.0 0.0 0.0
1981-01-01 0.0 10.0 3.0 0.0 3.0 2.0 0.0 0.0 0.0 0.0
1982-01-01 0.0 11.0 3.0 0.0 3.0 3.0 0.0 0.0 0.0 0.0
1983-01-01 0.0 14.0 4.0 4.0 3.0 3.0 0.0 0.0 0.0 0.0
1984-01-01 0.0 14.0 5.0 6.0 3.0 3.0 0.0 0.0 5.0 0.0
1985-01-01 0.0 15.0 6.0 9.0 3.0 3.0 0.0 0.0 10.0 0.0
1986-01-01 0.0 15.0 7.0 13.0 3.0 3.0 0.0 0.0 14.0 7.0
1987-01-01 6.0 15.0 8.0 16.0 3.0 5.0 6.0 0.0 15.0 9.0
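To see why fillna(0) matters before cumsum(), consider this small sketch with hypothetical counts, in which one country has no entry for one of the dates:

```python
import pandas as pd

# Hypothetical counts; Italy has no row for 1990
long_df = pd.DataFrame({
    "date": ["1987-01-01", "1987-01-01", "1990-01-01"],
    "country": ["China", "Italy", "China"],
    "name": [6, 2, 1],
})

wide = long_df.pivot(index="date", columns="country", values="name")
print(wide.isna().sum().sum())  # 1 missing cell (Italy in 1990)

# Replace the missing cell with 0, then accumulate down the rows
cumulative = wide.fillna(0).cumsum()
print(cumulative)
```

Without fillna(0), cumsum() would leave the missing cell as NaN, which would show up as a gap in the line for that country.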

Visualization

Now we can start plotting. We first plot a chart showing the temporal development of heritage inscriptions for the 10 countries.

import matplotlib.pyplot as plt # import library

pivot.plot(figsize=(20,6)) # define plot size

plt.xlabel("Year", fontsize=14) # x label
plt.ylabel("Count", fontsize=14) # y label
plt.title("UNESCO Sites", fontsize=20) # title
Text(0.5, 1.0, 'UNESCO Sites')

Style Use

We can also change the style of our plot using the seaborn library. We need to import the library and set the parameters before we call the plotting functions on our Pandas data frame.

import seaborn as sns # import library

custom_params = {"axes.spines.right": False, "axes.spines.top": False} # define axes parameters
sns.set_theme(style="ticks", rc=custom_params, context="talk") # define theme
plt.style.use("dark_background") # define background color (here we try with a dark theme)
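Note that plt.style.use() changes the style for every plot created afterwards. If the dark theme is wanted for one figure only, one option is the plt.style.context() context manager. The sketch below uses three values from the China totals and an off-screen backend as assumptions for illustration:

```python
import matplotlib
matplotlib.use("Agg")  # off-screen backend for this sketch; not needed in a notebook
import matplotlib.pyplot as plt

before = plt.rcParams["axes.facecolor"]

# The style only applies inside the with-block
with plt.style.context("dark_background"):
    inside = plt.rcParams["axes.facecolor"]
    fig, ax = plt.subplots()
    ax.plot([1987, 1990, 1992], [6, 7, 10])

after = plt.rcParams["axes.facecolor"]
print(before, inside, after)
```

After the with-block, the previous settings are restored, so later figures are not affected.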

Linechart

import matplotlib.dates as mdates # import mdates to plot date time data

plt.figure(figsize=(20, 6))
sns.lineplot(data=pivot) # call lineplot using seaborn library
plt.title("UNESCO Sites", fontsize=28)
plt.xlabel("Year")
plt.ylabel("Count")
plt.legend(bbox_to_anchor=(1.02, 0.9), loc=2, borderaxespad=0.) # display a legend at the position out of our plot

plt.tight_layout() # adjust spacings of elements
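If the dates are still stored as strings, matplotlib treats them as categories and draws one tick label per row. A possible refinement, sketched below with three values from the China totals, is to convert them to datetime first and let matplotlib.dates control the ticks. The two-year tick interval and the off-screen backend are assumptions chosen for illustration:

```python
import matplotlib
matplotlib.use("Agg")  # off-screen backend for this sketch; not needed in a notebook
import matplotlib.dates as mdates
import matplotlib.pyplot as plt
import pandas as pd

# Three cumulative totals for China, indexed by real datetime values
idx = pd.to_datetime(["1987-01-01", "1990-01-01", "1992-01-01"])
totals = pd.Series([6, 7, 10], index=idx)

fig, ax = plt.subplots(figsize=(8, 3))
ax.plot(totals.index, totals.values)

# One tick every two years, labelled with the year only
ax.xaxis.set_major_locator(mdates.YearLocator(2))
ax.xaxis.set_major_formatter(mdates.DateFormatter("%Y"))
ax.set_xlabel("Year")
ax.set_ylabel("Count")
```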

Stacked Area Chart

Apart from a line chart, we can also draw a stacked area plot. The principle is similar, but this time we use stackplot() together with the "sym" baseline (symmetric around zero, sometimes called 'ThemeRiver'). If you are particularly interested in different plot types, stick with the tutorials, as we will discuss more in the next chapters.

Also, feel free to check out this for a catalogue of data visualization using Python.

palette = sns.color_palette("tab10", n_colors=len(pivot.columns)) # one color per country; palette is not defined in this section, so this choice is an assumption
fig, ax = plt.subplots(figsize=(15, 6)) # figure size
ax.stackplot(pivot.index.values, [pivot[name].values for name in pivot], baseline="sym", colors=palette, labels=pivot.columns) # stacked area; labels follow the column order of pivot
ax.axhline(0, color="red", ls="--", linewidth=.8) # red line in the middle
plt.title("UNESCO Sites", fontsize=25) # title
plt.xlabel("Year") # labels
plt.ylabel("Count")
plt.legend(bbox_to_anchor=(1.02, 0.9), loc=2, borderaxespad=0.) # legend

plt.tight_layout() # adjust spacings
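To see what the baseline argument changes, the following sketch draws the same two hypothetical series with baseline="zero" (the default) and baseline="sym" side by side. The data, labels and off-screen backend are assumptions for illustration:

```python
import matplotlib
matplotlib.use("Agg")  # off-screen backend for this sketch; not needed in a notebook
import matplotlib.pyplot as plt
import numpy as np

# Hypothetical cumulative counts for two countries
years = np.array([1987, 1990, 1992])
series = [np.array([6, 7, 10]), np.array([2, 5, 6])]
labels = ["Country A", "Country B"]

fig, (ax_zero, ax_sym) = plt.subplots(1, 2, figsize=(10, 3))

# "zero" stacks the layers upwards from y = 0
layers_zero = ax_zero.stackplot(years, series, baseline="zero", labels=labels)
ax_zero.set_title('baseline="zero"')

# "sym" centers the whole stack around y = 0
layers_sym = ax_sym.stackplot(years, series, baseline="sym", labels=labels)
ax_sym.set_title('baseline="sym"')
ax_sym.legend(loc="upper left")
```

With "sym", the y-axis no longer reads as an absolute count, so this baseline is better suited to showing relative shares than exact totals.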

Lollipop Chart

As another example, we can draw a lollipop chart. Here we focus only on the temporal development of sites in China.

plt.figure(figsize=(24,6))
plt.hlines(y=pivot.index, xmin=0, xmax=pivot['China'], color="#FCC700")
plt.plot(pivot['China'], pivot.index, "o", markersize=8, color="#FC4900")
plt.title("UNESCO Sites in China", fontsize=26)
plt.xlabel("Count")
plt.ylabel("Year")
Text(0, 0.5, 'Year')





Additional information

This notebook is provided for educational purposes. Feel free to report any issue on GitHub.


Author: Ka Hei, Chow

License: The code in this notebook is licensed under the Creative Commons Attribution 4.0 license.

Last modified: December 2021




References:

https://examples.opendatasoft.com/explore/dataset/world-heritage-unesco-list/table/