2. Introduction to Python: Lists, Tuples, Dictionaries, Arrays#

Variables#

Variables are containers for storing data values. Each variable has a name (label) that refers to an object in memory.

# define a variable 
x = 50 

# prints the specified message to the screen
print(x) 

# In a Jupyter Notebook, the result of the last expression in a cell is displayed automatically, without calling print
x

#return the type of data stored in the object
type(x) 

int

Python Data Types#

There are four basic Python data types: int, float, str (string), and bool (boolean).

Integers (int): Positive or negative whole numbers, without a fractional part.
Float (float): Real numbers in floating-point notation, written with a decimal point.
String (str): A sequence of zero or more characters enclosed in single, double, or triple quotes.
Boolean (bool): One of the two truth values, True or False.

a = 30
type(a)

int

b = 30.3
type(b)

float

c = "30"
type(c)

str

d = False
type(d)

bool
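Values can be converted between these types with the built-in constructors int(), float(), str() and bool(). The following is a minimal sketch that reuses the values above; the names ending in _as_int, _as_float and _as_str are illustrative and do not appear elsewhere in this chapter.

```python
# Convert between the basic types with the built-in constructors.
c = "30"                 # a string that happens to contain digits
c_as_int = int(c)        # parse the string as an integer
c_as_float = float(c)    # parse the string as a float

b = 30.3
b_as_int = int(b)        # int() truncates the fractional part

a_as_str = str(30)       # turn a number back into a string

print(c_as_int + 1)                    # arithmetic works on the converted value
print(type(c_as_float))
print(bool(0), bool(""), bool("30"))   # zero and empty values are falsy
```

Note that "30" + 1 would raise a TypeError, which is why the conversion step is needed before doing arithmetic on text input.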

String #

A string is a sequence of characters.

String Exploration#

name = 'Jose Felipe'
print(name)

Jose Felipe

# Function len() returns the length of a string, including all spaces
len(name) 

# Indexing operation: returns the character at the specified position in the string
name[3]

'e'

# First character (in python, indexing starts at 0)
name[0]

'J'

# Last character (in python, negative indexing is used to access elements from the end of a sequence)
name[-1]

'e'

# Slicing the 1st to 3rd characters (extracts a portion of the string)
# In Python, slicing includes the starting index (0 in this case) but excludes the ending index (3 in this case).
print(name[0:3])

Jos
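Slices can also omit the start or the end, and accept an optional step. A short sketch using the same name string (the variable names first_name, last_name, every_other and reversed_name are illustrative):

```python
name = 'Jose Felipe'

first_name = name[:4]        # from the start up to (not including) index 4
last_name = name[5:]         # from index 5 to the end of the string
every_other = name[::2]      # every second character, starting at index 0
reversed_name = name[::-1]   # a negative step walks the string backwards

print(first_name)
print(last_name)
print(every_other)
print(reversed_name)
```

Strings are immutable, so each slice returns a new string and leaves name unchanged.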

Other String Operations#

# Double quotes are useful when the string itself contains single quotes, and vice versa
my_string1 = " la casa rosada 'SBA' " 
print(my_string1)

my_string2 = ' la casa rosada "SBA" '
print(my_string2)

 la casa rosada 'SBA' 
 la casa rosada "SBA" 

my_string = ''' Hello$%&15 '''
print(my_string , "\n") # "\n" is used to create a new line

 Hello$%&15

str_1 = "The name"
str_2 = "of the course"
str_3 = "is Introduction to Python"
print(str_1, str_2, str_3) # print() does not concatenate the strings; it separates its arguments with spaces

The name of the course is Introduction to Python

# The + operator concatenates the strings exactly as they are, without adding spaces
course_name = str_1 + str_2 + str_3
course_name

'The nameof the courseis Introduction to Python'

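# The variable is not called dir, because that name would shadow the built-in dir() function
data_dir = "C:/Users/Alexander/Dropbox/Pakistan_ID_DIME/Data/Data_requests/"
print(data_dir + "data_1")
print(data_dir + "data_2")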

C:/Users/Alexander/Dropbox/Pakistan_ID_DIME/Data/Data_requests/data_1
C:/Users/Alexander/Dropbox/Pakistan_ID_DIME/Data/Data_requests/data_2
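Besides the + operator, strings are often combined with str.join() or f-strings, and paths are often built with the standard pathlib module. The sketch below is only an illustration: the folder "data/requests" is a hypothetical path, and PurePosixPath is used so the printed result does not depend on the operating system.

```python
from pathlib import PurePosixPath

str_1 = "The name"
str_2 = "of the course"
str_3 = "is Introduction to Python"

# join() inserts the separator between the pieces
sentence = " ".join([str_1, str_2, str_3])

# an f-string embeds the variables directly in the text
sentence_f = f"{str_1} {str_2} {str_3}"

# the / operator of pathlib adds the path separator for us
base_dir = PurePosixPath("data/requests")   # hypothetical folder
file_path = base_dir / "data_1"

print(sentence)
print(sentence_f)
print(file_path)
```

With pathlib there is no need to remember a trailing slash at the end of the folder name.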

Bool #

The Python Boolean type is one of Python’s built-in data types. It’s used to represent the truth value of an expression. For example, the expression 1 <= 2 is True, while the expression 0 == 1 is False.

print(10 == 9)

False

response = 10 > 9
response

True

# True compares equal to the integer 1, because bool is a subclass of int
print(True == 1)
print(True == 0)

True
False

# False compares equal to the integer 0
print(False == 0)
print(False == 1)

True
False

# True / True evaluates to 1.0 and False counts as 0, so the expression evaluates to 1.0
(True / True) + False 

1.0
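Because True and False behave like 1 and 0 in arithmetic, summing a list of booleans counts how many of them are True. A minimal sketch, assuming a hypothetical passing threshold of 11 (the threshold and the names passed, n_passed and share are illustrative):

```python
grades = [15, 18, 16, 5, 8]

# One boolean per grade: True if the grade reaches the (hypothetical) threshold
passed = [g >= 11 for g in grades]

n_passed = sum(passed)             # True counts as 1, False as 0
share = n_passed / len(grades)     # fraction of passing grades

print(passed)
print(n_passed)
print(share)
print(isinstance(True, int))       # bool is a subclass of int
```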

Lists #

A list is an ordered and mutable Python container. Its items are changeable, may contain duplicate values, and may be objects of different types. Because lists have a defined order, every item has an index.

# We use brackets to create a list
# A list can contain elements of various data types
my_list = [ 18, 20 , 30, "alex" ]
my_list 

[18, 20, 30, 'alex']

Method	Definition
type()	Built-in function (not a list method) that returns the class of an object.
copy()	Returns a shallow copy of the list.
sort()	Sorts the list in place, in ascending order by default.
append()	Adds a single element to the end of the list.
extend()	Adds every element of an iterable to the end of the list.
index()	Returns the index of the first occurrence of the specified value.

grades = [15, 18, 16, 5, 8 ]
grades

[15, 18, 16, 5, 8]

Copy#

# This is useful if you want to perform operations on the new list without affecting the original list
new_grades = grades.copy()
new_grades

[15, 18, 16, 5, 8]
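The difference between copying a list and merely assigning it to a second name can be sketched as follows (the names alias and independent are illustrative):

```python
grades = [15, 18, 16, 5, 8]

alias = grades                # a second name for the SAME list object
independent = grades.copy()   # a new list with the same items

alias.append(20)              # also changes grades
independent.append(0)         # does not change grades

print(grades)
print(independent)
print(alias is grades, independent is grades)
```

copy() makes a shallow copy: the new list is independent, but any mutable objects stored inside it (for example nested lists) are still shared with the original.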

Sort#

new_grades.sort()
new_grades

[5, 8, 15, 16, 18]
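The sort() method changes the list in place. When the original order should be kept, the built-in sorted() function returns a new sorted list instead. A short sketch (the names ascending, descending and result are illustrative):

```python
grades = [15, 18, 16, 5, 8]

ascending = sorted(grades)                  # new list, grades is untouched
descending = sorted(grades, reverse=True)   # largest value first

print(grades)
print(ascending)
print(descending)

# list.sort() works in place and returns None, so its result should not be assigned
result = grades.sort()
print(result)
print(grades)
```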

Append#

new_grades.append( 20 )
new_grades

[5, 8, 15, 16, 18, 20]

Extend#

other_grades = [ 4, 14, 15 ]
other_grades

[4, 14, 15]

new_grades.extend( other_grades )
new_grades

[5, 8, 15, 16, 18, 20, 4, 14, 15]
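A common source of confusion is passing a list to append() when extend() was intended. The sketch below contrasts the two on small illustrative lists (with_append and with_extend are hypothetical names):

```python
with_append = [5, 8, 15]
with_extend = [5, 8, 15]

with_append.append([4, 14])   # the whole list becomes ONE new element
with_extend.extend([4, 14])   # each item is added separately

print(with_append, len(with_append))
print(with_extend, len(with_extend))
```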

Index#

# if the value appears more than once in the list, return the first occurrence
new_grades.index(15)

Function	Definition
max(list)	Returns the item of the list with the largest value.
min(list)	Returns the item of the list with the smallest value.
len(list)	Returns the number of items in the list.
list(seq)	Converts a sequence, such as a tuple, into a list.

Max#

max( new_grades )

Min#

min( new_grades )

Len#

len (new_grades )

List#

my_tuple  = ( 1, 3, 5, 7 )
type(my_tuple)

tuple

my_list = list( my_tuple )
my_list

[1, 3, 5, 7]

Tuple #

A tuple is an ordered and immutable Python container: items cannot be changed, added, or removed after the tuple has been created. Like lists, tuples allow duplicate values.

# A tuple can contain elements of various data types
new_tuple = ('alex', 5, True)
new_tuple

('alex', 5, True)

tuple1 = (1, 3, 3, 5, 10)
tuple1[1]

# I want to change the value in the index 1
tuple1[1] = 4
# It is not possible to change values

---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
Input In [119], in <cell line: 2>()
      1 # I want to change the value in the index 1
----> 2 tuple1[1] = 4

TypeError: 'tuple' object does not support item assignment
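Assigning to an item of a tuple raises a TypeError. When a changed version is needed, the usual approach is to build a new tuple, either from slices or by converting to a list and back. A minimal sketch (the names message, updated and updated_via_list are illustrative):

```python
tuple1 = (1, 3, 3, 5, 10)

# Item assignment is rejected because tuples are immutable
try:
    tuple1[1] = 4
except TypeError as err:
    message = str(err)
    print("TypeError:", message)

# Option 1: build a new tuple from slices of the old one
updated = tuple1[:1] + (4,) + tuple1[2:]

# Option 2: convert to a list, modify it, and convert back
as_list = list(tuple1)
as_list[1] = 4
updated_via_list = tuple(as_list)

print(updated)
print(updated_via_list)
print(tuple1)   # the original tuple is unchanged
```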

Method	Definition
count()	Returns the number of times a specified value occurs in the tuple
index()	Searches the tuple for a specified value and returns the index of its first occurrence

tuple1 = (1, 3, 3, 5, 10, 5, 5)

Count#

tuple1.count( 5 )

Index#

tuple1.index( 3 )

Function	Definition
max(tuple)	It returns an item from the tuple with max value.
min(tuple)	It returns an item from the tuple with min value.
len(tuple)	It gives the total length of the tuple.
tuple( list )	Converts a list into a tuple.

my_tuple = ( 1, 2, 3, 4, 5, 10 )

Len#

# Length
len( my_tuple ) 

Tuple#

# Tuple
my_list = [ 1, 3, 5, 7]
my_tuple = tuple( my_list )
my_tuple

(1, 3, 5, 7)

Nested Tuple#

# Creates a tuple new_tuple with elements 1, 3, 3, 5, 10, and a nested tuple (5, 6, 7)
# Nested tuple: a tuple written inside another tuple
new_tuple = ( 1, 3, 3, 5, 10, (5, 6, 7) )
print(new_tuple)
print(new_tuple[-1])

(1, 3, 3, 5, 10, (5, 6, 7))
(5, 6, 7)
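Items of a nested tuple can be reached by chaining indexes, and a tuple can be unpacked into separate names. A short sketch using the same new_tuple (the names inner, first_inner, a, b, c, head and tail are illustrative):

```python
new_tuple = (1, 3, 3, 5, 10, (5, 6, 7))

inner = new_tuple[-1]             # the nested tuple
first_inner = new_tuple[-1][0]    # chain two indexes to reach inside it

# Unpacking assigns each item of a tuple to its own name
a, b, c = inner

# A starred name collects the remaining items into a list
*head, tail = new_tuple

print(first_inner)
print(a, b, c)
print(head)
print(tail)
```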

Dictionaries #

A dictionary is an ordered (Python >= 3.7) and mutable Python container that maps keys to values. Keys must be unique: duplicate keys are not allowed.

alexander = { 'lastname': "Quispe", 'age': 28, 'birth_place': "SJL", 'male' : True }
maria = { 'lastname': "Rojas", 'age': 27, 'birth_place': "SMP", 'male' : False }

print(alexander)
print(maria)

{'lastname': 'Quispe', 'age': 28, 'birth_place': 'SJL', 'male': True}
{'lastname': 'Rojas', 'age': 27, 'birth_place': 'SMP', 'male': False}

From list to dictionary#

# Defines two lists, lastname and ages, containing strings and integers, respectively
lastname = ["Quispe", "Rojas", "Rodriguez"]
ages = [28, 29, 30]

# A dictionary dict_1 is created with keys "lastname" and "ages", 
# where each key is associated with its respective list
dict_1 = { "lastname" : lastname, "ages": ages}
dict_1

{'lastname': ['Quispe', 'Rojas', 'Rodriguez'], 'ages': [28, 29, 30]}

type( dict_1 )

dict

# Using the key "lastname" to access the corresponding value in the dictionary
dict_1["lastname"]

['Quispe', 'Rojas', 'Rodriguez']

Method	Definition
clear()	Removes all the elements from the dictionary
copy()	Returns a shallow copy of the dictionary
fromkeys()	Returns a dictionary with the specified keys and value
get()	Returns the value of the specified key, or None (or a given default) if the key is missing
items()	Returns a view of the dictionary’s key-value pairs, each as a tuple
keys()	Returns a view of the dictionary’s keys
pop()	Removes the element with the specified key and returns its value
popitem()	Removes and returns the last inserted key-value pair
setdefault()	Returns the value of the specified key. If the key does not exist: insert the key, with the specified value
update()	Updates the dictionary with the specified key-value pairs
values()	Returns a view of all the values in the dictionary

# Dictionary containing the population of the 5 largest German cities
population = {'Berlin': 3748148, 'Hamburg': 1822445, 'Munich': 1471508, 'Cologne': 1085664, 'Frankfurt': 753056 }
population

{'Berlin': 3748148,
 'Hamburg': 1822445,
 'Munich': 1471508,
 'Cologne': 1085664,
 'Frankfurt': 753056}

Copy#

pop_2 = population.copy()
pop_2

{'Berlin': 3748148,
 'Hamburg': 1822445,
 'Munich': 1471508,
 'Cologne': 1085664,
 'Frankfurt': 753056}

Clear#

pop_2.clear()
pop_2

{}

Get, items, keys#

# Get information from key
population.get('Munich')

# Get all key-value pairs
population.items()

dict_items([('Berlin', 3748148), ('Hamburg', 1822445), ('Munich', 1471508), ('Cologne', 1085664), ('Frankfurt', 753056)])

population.keys()

dict_keys(['Berlin', 'Hamburg', 'Munich', 'Cologne', 'Frankfurt'])
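The get() method is mostly useful for keys that may be missing, and items() is the usual way to loop over a dictionary. The sketch below rebuilds the population dictionary so it can run on its own; the names missing, missing_default, largest_city and total are illustrative.

```python
population = {'Berlin': 3748148, 'Hamburg': 1822445, 'Munich': 1471508,
              'Cologne': 1085664, 'Frankfurt': 753056}

munich = population.get('Munich')             # value stored under an existing key
missing = population.get('Bonn')              # missing key: returns None, no error
missing_default = population.get('Bonn', 0)   # missing key: returns the given default

# items() yields (key, value) pairs that can be unpacked in a loop
for city, inhabitants in population.items():
    print(f"{city}: {inhabitants:,}")

largest_city = max(population, key=population.get)   # key with the largest value
total = sum(population.values())                     # sum over all values

print(largest_city)
print(total)
```

By contrast, population['Bonn'] would raise a KeyError, because square-bracket access requires the key to exist.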

Pop#

# Drop a key
population.pop("Frankfurt")
population

{'Berlin': 3748148, 'Hamburg': 1822445, 'Munich': 1471508, 'Cologne': 1085664}

Update#

stadiums = { "munich":"Alianz Arena","dormundt": "SIP", "ulm": "VFLULM", "shalke": "GAZPROM"}

population.update( {"stadiums": stadiums} )
population

{'Berlin': 3748148,
 'Hamburg': 1822445,
 'Munich': 1471508,
 'Cologne': 1085664,
 'stadiums': {'munich': 'Alianz Arena',
  'dormundt': 'SIP',
  'ulm': 'VFLULM',
  'shalke': 'GAZPROM'}}

Pop item#

# Drop an item (the last inserted key-value pair: stadiums)
population.popitem( )
population

{'Berlin': 3748148, 'Hamburg': 1822445, 'Munich': 1471508, 'Cologne': 1085664}

Add new items#

population.update( { "Bonn" :  327258 } )
population.update( { "Ulm" : 100000  } )
population

{'Berlin': 3748148,
 'Hamburg': 1822445,
 'Munich': 1471508,
 'Cologne': 1085664,
 'Bonn': 327258,
 'Ulm': 100000}

population.update( { "Bonn" :  {"population":100 , "km2" : 500, "president" : "Anzony"} } )
print( population )

{'Berlin': 3748148, 'Hamburg': 1822445, 'Munich': 1471508, 'Cologne': 1085664, 'Bonn': {'population': 100, 'km2': 500, 'president': 'Anzony'}, 'Ulm': 100000}

# Get all keys
population.keys()

dict_keys(['Berlin', 'Hamburg', 'Munich', 'Cologne', 'Bonn', 'Ulm'])

# Get all values from all keys
population.values()

dict_values([3748148, 1822445, 1471508, 1085664, {'population': 100, 'km2': 500, 'president': 'Anzony'}, 100000])

From lists to dictionaries#

# keys
cities = ['Fray Martin','Santa Rosa de Puquio','Cuchicorral','Santiago de Punchauca',
          'La Cruz (11 Amigos)','Cerro Cañon','Cabaña Suche','San Lorenzo',
          'Jose Carlos Mariategui','Pascal','La Esperanza','Fundo Pancha Paula','Olfa',
          'Rio Seco','Paraiso','El Rosario','Cerro Puquio','La Campana','Las Animas',
          'Vetancio','Roma Alta','San Jose','San Pedro de Carabayllo','Huacoy',
          'Fundo Pampa Libre','Ex Fundo Santa Ines','Reposo','Carmelito','Santa Elena','Don Luis',
          'Santa Ines Parcela','Asociacion Santa Ines','Roma Baja','Residencial Santa Lucia',
          'San Francisco','Santa Margarita - Molinos','Sipan Peru','Fundo Cuadros','Bello Horizonte',
          'El Hueco','Ex Fundo Mariategui','Naranjito','Vista Hermosa',
          'El Sabroso de Jose Carlos Mariategui','Granja Carabayllo','Agropecuario Valle el Chillon',
          'Camino Real','Copacabana','El Trebol','Tablada la Virgen','San Fernando de Carabayllo',
          'San Fernando de Copacabana','La Manzana','Chacra Grande','Torres de Copacabana',
          'San Pedro de Carabayllo','San Lorenzo','Chaclacayo','Chorrillos','Cieneguilla','Lindero',
          'Pichicato','San Isidro','San Vicente','Piedra Liza','Santa Rosa de Chontay (Chontay)',
          'La Libertad','El Agustino','Independencia','Jesus Maria','La Molina','La Victoria','Lince',
          'Las Palmeras','Chosica','Lurin','Los Almacigos','Rinconada del Puruhuay',
          'Fundo Santa Genoveva','Los Maderos','Casco Viejo','Vista Alegre','Buena Vista Alta',
          'Lomas Pucara','Fundo la Querencia','Magdalena del Mar','Pueblo Libre','Miraflores',
          'Pachacamac','Puente Manchay','Tambo Inga','Pampa Flores','Manchay Alto Lote B',
          'Invasion Cementerio','Manchay Bajo','Santa Rosa de Mal Paso','Cardal','Jatosisa','Tomina',
          'Pucusana','Honda','Quipa','Los Pelicanos','Playa Puerto Bello','Ñaves','Granja Santa Elena',
          'Alvatroz II','Poseidon - Lobo Varado','Playa Minka Mar','Playa Acantilado','Puente Piedra',
          'Punta Hermosa','Capilla Lucumo','Cucuya','Pampapacta',
          'Avicola San Cirilo de Loma Negra - 03','Avicola San Cirilo de Loma Negra - 02',
          'Avicola San Cirilo de Loma Negra - 01','Pampa Mamay','Cerro Botija',
          'Agricultores y Ganaderos','Pampa Malanche Avicola Puma','Punta Negra','Chancheria','Rimac',
          'San Bartolo','Plantel 41','Granja 4','Granja 5','Granja 07','Granja 44','Granja 47',
          'Santa Maria I','Las Torres Santa Fe','San Francisco de Borja','San Isidro',
          'San Juan de Lurigancho','Ciudad de Dios','San Luis','Barrio Obrero Industrial','San Miguel',
          'Santa Anita - los Ficus','Santa Maria del Mar','Don Bruno','Santa Rosa','Santiago de Surco',
          'Surquillo','Villa el Salvador','Villa Maria del Triunfo', 'Pueblo libre']
# values
postal_code = [15001,15003,15004,15006,15018,15019,15046,15072,15079,15081,15082,15083,15088,15123,15004,15011,15012,15019,15022,15023,15026,15476,15479,15483,15487,15491,15494,15498,15047,15049,15063,15082,15083,15121,15122,15313,15316,15318,15319,15320,15321,15324,15320,15320,15320,15320,15320,15320,15121,15320,15320,15121,15320,15320,15121,15121,15122,15122,15121,15121,15121,15320,15320,15320,15320,15320,15320,15121,15121,15121,15320,15121,15319,15121,15121,15121,15320,15320,15121,15121,15121,15121,15320,15320,15320,15122,15122,15122,15122,15122,15122,15122,15122,15121,15121,15122,15122,15121,15121,15122,15122,15121,15122,15122,15122,15472,15476,15054,15056,15057,15058,15063,15064,15066,15067,15593,15594,15593,15593,15593,15593,15593,15593,15593,15311,15312,15313,15314,15316,15324,15326,15327,15328,15332,15003,15004,15006,15007,15008,15009,15011,15018,15022,15311,15328,15331,15332,15333,15046, 15001]

len(cities)

len(postal_code)

list(zip( cities , postal_code ))

[('Fray Martin', 15001),
 ('Santa Rosa de Puquio', 15003),
 ('Cuchicorral', 15004),
 ('Santiago de Punchauca', 15006),
 ('La Cruz (11 Amigos)', 15018),
 ('Cerro Cañon', 15019),
 ('Cabaña Suche', 15046),
 ('San Lorenzo', 15072),
 ('Jose Carlos Mariategui', 15079),
 ('Pascal', 15081),
 ('La Esperanza', 15082),
 ('Fundo Pancha Paula', 15083),
 ('Olfa', 15088),
 ('Rio Seco', 15123),
 ('Paraiso', 15004),
 ('El Rosario', 15011),
 ('Cerro Puquio', 15012),
 ('La Campana', 15019),
 ('Las Animas', 15022),
 ('Vetancio', 15023),
 ('Roma Alta', 15026),
 ('San Jose', 15476),
 ('San Pedro de Carabayllo', 15479),
 ('Huacoy', 15483),
 ('Fundo Pampa Libre', 15487),
 ('Ex Fundo Santa Ines', 15491),
 ('Reposo', 15494),
 ('Carmelito', 15498),
 ('Santa Elena', 15047),
 ('Don Luis', 15049),
 ('Santa Ines Parcela', 15063),
 ('Asociacion Santa Ines', 15082),
 ('Roma Baja', 15083),
 ('Residencial Santa Lucia', 15121),
 ('San Francisco', 15122),
 ('Santa Margarita - Molinos', 15313),
 ('Sipan Peru', 15316),
 ('Fundo Cuadros', 15318),
 ('Bello Horizonte', 15319),
 ('El Hueco', 15320),
 ('Ex Fundo Mariategui', 15321),
 ('Naranjito', 15324),
 ('Vista Hermosa', 15320),
 ('El Sabroso de Jose Carlos Mariategui', 15320),
 ('Granja Carabayllo', 15320),
 ('Agropecuario Valle el Chillon', 15320),
 ('Camino Real', 15320),
 ('Copacabana', 15320),
 ('El Trebol', 15121),
 ('Tablada la Virgen', 15320),
 ('San Fernando de Carabayllo', 15320),
 ('San Fernando de Copacabana', 15121),
 ('La Manzana', 15320),
 ('Chacra Grande', 15320),
 ('Torres de Copacabana', 15121),
 ('San Pedro de Carabayllo', 15121),
 ('San Lorenzo', 15122),
 ('Chaclacayo', 15122),
 ('Chorrillos', 15121),
 ('Cieneguilla', 15121),
 ('Lindero', 15121),
 ('Pichicato', 15320),
 ('San Isidro', 15320),
 ('San Vicente', 15320),
 ('Piedra Liza', 15320),
 ('Santa Rosa de Chontay (Chontay)', 15320),
 ('La Libertad', 15320),
 ('El Agustino', 15121),
 ('Independencia', 15121),
 ('Jesus Maria', 15121),
 ('La Molina', 15320),
 ('La Victoria', 15121),
 ('Lince', 15319),
 ('Las Palmeras', 15121),
 ('Chosica', 15121),
 ('Lurin', 15121),
 ('Los Almacigos', 15320),
 ('Rinconada del Puruhuay', 15320),
 ('Fundo Santa Genoveva', 15121),
 ('Los Maderos', 15121),
 ('Casco Viejo', 15121),
 ('Vista Alegre', 15121),
 ('Buena Vista Alta', 15320),
 ('Lomas Pucara', 15320),
 ('Fundo la Querencia', 15320),
 ('Magdalena del Mar', 15122),
 ('Pueblo Libre', 15122),
 ('Miraflores', 15122),
 ('Pachacamac', 15122),
 ('Puente Manchay', 15122),
 ('Tambo Inga', 15122),
 ('Pampa Flores', 15122),
 ('Manchay Alto Lote B', 15122),
 ('Invasion Cementerio', 15121),
 ('Manchay Bajo', 15121),
 ('Santa Rosa de Mal Paso', 15122),
 ('Cardal', 15122),
 ('Jatosisa', 15121),
 ('Tomina', 15121),
 ('Pucusana', 15122),
 ('Honda', 15122),
 ('Quipa', 15121),
 ('Los Pelicanos', 15122),
 ('Playa Puerto Bello', 15122),
 ('Ñaves', 15122),
 ('Granja Santa Elena', 15472),
 ('Alvatroz II', 15476),
 ('Poseidon - Lobo Varado', 15054),
 ('Playa Minka Mar', 15056),
 ('Playa Acantilado', 15057),
 ('Puente Piedra', 15058),
 ('Punta Hermosa', 15063),
 ('Capilla Lucumo', 15064),
 ('Cucuya', 15066),
 ('Pampapacta', 15067),
 ('Avicola San Cirilo de Loma Negra - 03', 15593),
 ('Avicola San Cirilo de Loma Negra - 02', 15594),
 ('Avicola San Cirilo de Loma Negra - 01', 15593),
 ('Pampa Mamay', 15593),
 ('Cerro Botija', 15593),
 ('Agricultores y Ganaderos', 15593),
 ('Pampa Malanche Avicola Puma', 15593),
 ('Punta Negra', 15593),
 ('Chancheria', 15593),
 ('Rimac', 15311),
 ('San Bartolo', 15312),
 ('Plantel 41', 15313),
 ('Granja 4', 15314),
 ('Granja 5', 15316),
 ('Granja 07', 15324),
 ('Granja 44', 15326),
 ('Granja 47', 15327),
 ('Santa Maria I', 15328),
 ('Las Torres Santa Fe', 15332),
 ('San Francisco de Borja', 15003),
 ('San Isidro', 15004),
 ('San Juan de Lurigancho', 15006),
 ('Ciudad de Dios', 15007),
 ('San Luis', 15008),
 ('Barrio Obrero Industrial', 15009),
 ('San Miguel', 15011),
 ('Santa Anita - los Ficus', 15018),
 ('Santa Maria del Mar', 15022),
 ('Don Bruno', 15311),
 ('Santa Rosa', 15328),
 ('Santiago de Surco', 15331),
 ('Surquillo', 15332),
 ('Villa el Salvador', 15333),
 ('Villa Maria del Triunfo', 15046),
 ('Pueblo libre', 15001)]

# Build a dictionary from the (city, postal code) pairs
ct_pc = dict( zip( cities , postal_code ) )

ct_pc

{'Fray Martin': 15001,
 'Santa Rosa de Puquio': 15003,
 'Cuchicorral': 15004,
 'Santiago de Punchauca': 15006,
 'La Cruz (11 Amigos)': 15018,
 'Cerro Cañon': 15019,
 'Cabaña Suche': 15046,
 'San Lorenzo': 15122,
 'Jose Carlos Mariategui': 15079,
 'Pascal': 15081,
 'La Esperanza': 15082,
 'Fundo Pancha Paula': 15083,
 'Olfa': 15088,
 'Rio Seco': 15123,
 'Paraiso': 15004,
 'El Rosario': 15011,
 'Cerro Puquio': 15012,
 'La Campana': 15019,
 'Las Animas': 15022,
 'Vetancio': 15023,
 'Roma Alta': 15026,
 'San Jose': 15476,
 'San Pedro de Carabayllo': 15121,
 'Huacoy': 15483,
 'Fundo Pampa Libre': 15487,
 'Ex Fundo Santa Ines': 15491,
 'Reposo': 15494,
 'Carmelito': 15498,
 'Santa Elena': 15047,
 'Don Luis': 15049,
 'Santa Ines Parcela': 15063,
 'Asociacion Santa Ines': 15082,
 'Roma Baja': 15083,
 'Residencial Santa Lucia': 15121,
 'San Francisco': 15122,
 'Santa Margarita - Molinos': 15313,
 'Sipan Peru': 15316,
 'Fundo Cuadros': 15318,
 'Bello Horizonte': 15319,
 'El Hueco': 15320,
 'Ex Fundo Mariategui': 15321,
 'Naranjito': 15324,
 'Vista Hermosa': 15320,
 'El Sabroso de Jose Carlos Mariategui': 15320,
 'Granja Carabayllo': 15320,
 'Agropecuario Valle el Chillon': 15320,
 'Camino Real': 15320,
 'Copacabana': 15320,
 'El Trebol': 15121,
 'Tablada la Virgen': 15320,
 'San Fernando de Carabayllo': 15320,
 'San Fernando de Copacabana': 15121,
 'La Manzana': 15320,
 'Chacra Grande': 15320,
 'Torres de Copacabana': 15121,
 'Chaclacayo': 15122,
 'Chorrillos': 15121,
 'Cieneguilla': 15121,
 'Lindero': 15121,
 'Pichicato': 15320,
 'San Isidro': 15004,
 'San Vicente': 15320,
 'Piedra Liza': 15320,
 'Santa Rosa de Chontay (Chontay)': 15320,
 'La Libertad': 15320,
 'El Agustino': 15121,
 'Independencia': 15121,
 'Jesus Maria': 15121,
 'La Molina': 15320,
 'La Victoria': 15121,
 'Lince': 15319,
 'Las Palmeras': 15121,
 'Chosica': 15121,
 'Lurin': 15121,
 'Los Almacigos': 15320,
 'Rinconada del Puruhuay': 15320,
 'Fundo Santa Genoveva': 15121,
 'Los Maderos': 15121,
 'Casco Viejo': 15121,
 'Vista Alegre': 15121,
 'Buena Vista Alta': 15320,
 'Lomas Pucara': 15320,
 'Fundo la Querencia': 15320,
 'Magdalena del Mar': 15122,
 'Pueblo Libre': 15122,
 'Miraflores': 15122,
 'Pachacamac': 15122,
 'Puente Manchay': 15122,
 'Tambo Inga': 15122,
 'Pampa Flores': 15122,
 'Manchay Alto Lote B': 15122,
 'Invasion Cementerio': 15121,
 'Manchay Bajo': 15121,
 'Santa Rosa de Mal Paso': 15122,
 'Cardal': 15122,
 'Jatosisa': 15121,
 'Tomina': 15121,
 'Pucusana': 15122,
 'Honda': 15122,
 'Quipa': 15121,
 'Los Pelicanos': 15122,
 'Playa Puerto Bello': 15122,
 'Ñaves': 15122,
 'Granja Santa Elena': 15472,
 'Alvatroz II': 15476,
 'Poseidon - Lobo Varado': 15054,
 'Playa Minka Mar': 15056,
 'Playa Acantilado': 15057,
 'Puente Piedra': 15058,
 'Punta Hermosa': 15063,
 'Capilla Lucumo': 15064,
 'Cucuya': 15066,
 'Pampapacta': 15067,
 'Avicola San Cirilo de Loma Negra - 03': 15593,
 'Avicola San Cirilo de Loma Negra - 02': 15594,
 'Avicola San Cirilo de Loma Negra - 01': 15593,
 'Pampa Mamay': 15593,
 'Cerro Botija': 15593,
 'Agricultores y Ganaderos': 15593,
 'Pampa Malanche Avicola Puma': 15593,
 'Punta Negra': 15593,
 'Chancheria': 15593,
 'Rimac': 15311,
 'San Bartolo': 15312,
 'Plantel 41': 15313,
 'Granja 4': 15314,
 'Granja 5': 15316,
 'Granja 07': 15324,
 'Granja 44': 15326,
 'Granja 47': 15327,
 'Santa Maria I': 15328,
 'Las Torres Santa Fe': 15332,
 'San Francisco de Borja': 15003,
 'San Juan de Lurigancho': 15006,
 'Ciudad de Dios': 15007,
 'San Luis': 15008,
 'Barrio Obrero Industrial': 15009,
 'San Miguel': 15011,
 'Santa Anita - los Ficus': 15018,
 'Santa Maria del Mar': 15022,
 'Don Bruno': 15311,
 'Santa Rosa': 15328,
 'Santiago de Surco': 15331,
 'Surquillo': 15332,
 'Villa el Salvador': 15333,
 'Villa Maria del Triunfo': 15046,
 'Pueblo libre': 15001}
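Because dictionary keys are unique, a name that appears more than once in the list of keys keeps only its last value, so the dictionary can end up shorter than the zipped list. 'San Lorenzo', for example, is paired with both 15072 and 15122 above, and only 15122 survives. The sketch below reproduces this on a three-item excerpt of the data; the grouped dictionary, which keeps every code per name by using setdefault(), is an illustrative addition and not part of the original notebook.

```python
cities = ['San Lorenzo', 'Pascal', 'San Lorenzo']
postal_code = [15072, 15081, 15122]

pairs = list(zip(cities, postal_code))
ct_pc = dict(pairs)

print(len(pairs), len(ct_pc))   # the dictionary has fewer entries than the list
print(ct_pc)                    # the later value overwrites the earlier one

# Keep every postal code by collecting them in a list per city
grouped = {}
for city, code in pairs:
    grouped.setdefault(city, []).append(code)

print(grouped)
```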

Exercises#

Write a Python script to check whether Lima is a key of ct_pc.
Write a Python script to join two Python dictionaries.
Write a Python script to add a key to a dictionary.

Numpy #

NumPy is the core library for scientific computing in Python. It provides a high-performance multidimensional array object and tools for working with these arrays. If you are already familiar with MATLAB, you might find this tutorial useful to get started with NumPy.

Arrays#

A numpy array is a grid of values, all of the same type, and is indexed by a tuple of nonnegative integers. The number of dimensions is the rank of the array; the shape of an array is a tuple of integers giving the size of the array along each dimension.

import numpy as np

# 1D array
a = np.array( [1, 2, 3, 4, 5] )
print(a)

# 2D array
M = np.array( [ [1, 2, 3], [4, 5, 6] ] )

print(M)

X = np.array( [ [1, 2, 3, 4], [4, 5, 6, 7] ] )
X
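The rank and shape described above are available as attributes of the array, and elements are selected with one index per dimension. A minimal sketch using the same X (the names element, first_column, row_slice and mixed are illustrative):

```python
import numpy as np

X = np.array([[1, 2, 3, 4], [4, 5, 6, 7]])

print(X.ndim)    # number of dimensions (the rank)
print(X.shape)   # size along each dimension: (rows, columns)
print(X.size)    # total number of elements
print(X.dtype)   # common type of all elements (an integer type here)

element = X[0, 1]        # row 0, column 1
first_column = X[:, 0]   # every row, column 0
row_slice = X[1, 1:3]    # row 1, columns 1 and 2

# All elements share one type, so mixing ints and floats gives a float array
mixed = np.array([1, 2.5])
print(mixed.dtype)
```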

Function	Description
np.array(a)	Create an N-dimensional NumPy array from sequence a
np.linspace(a,b,N)	Create a 1D NumPy array with N equally spaced values from a to b (inclusive)
np.arange(a,b,step)	Create a 1D NumPy array with values from a to b (exclusive), incremented by step
np.zeros(N)	Create a 1D NumPy array of zeros of length N
np.zeros((n,m))	Create a 2D NumPy array of zeros with n rows and m columns
np.ones(N)	Create a 1D NumPy array of ones of length N
np.ones((n,m))	Create a 2D NumPy array of ones with n rows and m columns
np.eye(N)	Create a 2D NumPy array with N rows and N columns and ones on the diagonal (i.e. the identity matrix of size N)
np.concatenate( )	Join a sequence of arrays along an existing axis
np.hstack( )	Stack arrays in sequence horizontally (column-wise)
np.vstack( )	Stack arrays in sequence vertically (row-wise)
np.column_stack( )	Stack 1-D arrays as columns into a 2-D array
np.random.normal()	Draw random samples from a normal (Gaussian) distribution
np.linalg.inv()	Compute the (multiplicative) inverse of a matrix
np.dot() / @	Matrix multiplication

# Create a 1D NumPy array with 11 equally spaced values from 0 to 1:
x = np.linspace( 0, 1, 11 )
print(x)

# Create a 1D NumPy array with values from 0 to 20 (exclusively) incremented by 5:
y = np.arange( 0, 20, 5 )
print(y)

# Create a 1D NumPy array of zeros of length 5:
z = np.zeros(5)
print(z)

# Create a 2D NumPy array of zeros of shape ( 5, 10 ) :
M = np.zeros( (5, 10) )
print(M)

# Create a 1D NumPy array of ones of length 7:
w = np.ones(7)
print(w)

# Create a 2D NumPy array of ones with 5 rows and 5 columns:
N = np.ones( (5, 5) )
print(N)

np.eye(5)

# Create the identity matrix of size 10:
I = np.eye(10)
print(I)

# Shape
print( I.shape )

# Size
print(I.size)

# Concatenate
g = np.array([[5,6],[7,8]])
g

h = np.array([[1,2]])
h

print(g, "\n")
print(h , "\n")

g.shape

h.shape

h_2 = h.reshape(2, 1)
h_2

g_h = np.concatenate((g, h_2), axis = 1)
g_h

# hstack gives the same result as concatenate with axis = 1
g_h_stacked = np.hstack((g, h_2))
g_h_stacked

# vstack 
x = np.array([1,1,1])
y = np.array([2,2,2])
z = np.array([3,3,3])

vstacked = np.vstack( (x, y, z) )
vstacked

vstacked = np.vstack((x,y,z))
print(vstacked)

# hstack 
hstacked = np.hstack((x,y,z))
print(hstacked)
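The table above also lists np.column_stack, which is not used in the cells so far. The sketch below compares it with vstack and concatenate on the same three vectors (the names as_rows, as_columns and flat are illustrative):

```python
import numpy as np

x = np.array([1, 1, 1])
y = np.array([2, 2, 2])
z = np.array([3, 3, 3])

as_rows = np.vstack((x, y, z))            # shape (3, 3): each input becomes a row
as_columns = np.column_stack((x, y, z))   # shape (3, 3): each input becomes a column
flat = np.concatenate((x, y, z))          # shape (9,): same as np.hstack for 1D inputs

print(as_rows)
print(as_columns)
print(flat)
```

For 1D inputs, column_stack gives the transpose of vstack, which is convenient when each vector is one variable and each row should be one observation.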

OLS with Numpy#

# X data generation
n_data = 200
x1 = np.linspace(200, 500, n_data)
x0 = np.ones(n_data)

# reshape(-1, 1) turns a 1D array of length n_data into a column vector of shape (n_data, 1)
x0.reshape(-1, 1).shape

X = np.hstack(( x0.reshape(-1, 1 ) , x1.reshape(-1, 1 ) ))
X.shape

# select parameters
beta = np.array([5, -2]).reshape(-1, 1 )
beta.shape

# true y (without noise)
y_true = X @ beta
y_true.shape

y_true

y_true + (np.random.normal(0, 1, n_data) * 20).reshape(-1, 1)

#   add random normal noise
sigma = 20
y_actual = y_true + (np.random.normal(0, 1, n_data) * sigma).reshape(-1, 1)
print(y_actual[0:4, :])

The matrix equation for the estimated linear parameters is as below: $${\hat {\beta }}=(X^{T}X)^{-1}X^{T}y.$$

# estimations
beta_estimated = np.linalg.inv(X.T @ X) @ X.T @ y_actual
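Forming the explicit inverse follows the formula literally. In practice the same estimate is often obtained by solving the normal equations with np.linalg.solve, or with np.linalg.lstsq, which avoids computing the inverse. The sketch below regenerates comparable data so it runs on its own; the seed, the use of np.random.default_rng, and the names beta_inv, beta_solve and beta_lstsq are assumptions made for this illustration.

```python
import numpy as np

rng = np.random.default_rng(0)   # arbitrary seed, only for reproducibility

# Same design as above: an intercept column and one regressor
n_data = 200
x1 = np.linspace(200, 500, n_data)
X = np.column_stack((np.ones(n_data), x1))
beta = np.array([[5.0], [-2.0]])
y_actual = X @ beta + rng.normal(0, 20, size=(n_data, 1))

# 1) Literal formula with the explicit inverse
beta_inv = np.linalg.inv(X.T @ X) @ X.T @ y_actual

# 2) Solve the normal equations (X'X) b = X'y without forming the inverse
beta_solve = np.linalg.solve(X.T @ X, X.T @ y_actual)

# 3) Least-squares solver applied directly to X and y
beta_lstsq, *_ = np.linalg.lstsq(X, y_actual, rcond=None)

print(beta_inv.ravel())
print(beta_solve.ravel())
print(beta_lstsq.ravel())
```

All three approaches should agree up to floating-point error on a well-conditioned problem like this one.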

import matplotlib.pyplot as plt

plt.plot(x1, y_actual, 'o')
plt.plot(x1, y_true, '-', color = 'black')

Calculate the sum of squared residual errors $$ RSS=y^{T}y-y^{T}X(X^{T}X)^{{-1}}X^{T}y $$

y_actual

RSS = ( y_actual.T @ y_actual - y_actual.T @ X @ np.linalg.inv(X.T @ X) @ X.T @ y_actual )

Calculate the Total Sum of Squares of the spread of the actual (noisy) values around their mean $$ TSS=(y-{\bar y})^{T}(y-{\bar y})=y^{T}y-2y^{T}{\bar y}+{\bar y}^{T}{\bar y} $$

y_mean = ( np.ones(n_data) * np.mean(y_actual) ).reshape( -1 , 1 )
TSS = (y_actual - y_mean).T @ (y_actual - y_mean)
TSS

# get predictions
y_pred = X @ beta_estimated

Calculate the Sum of Squares of the spread of the predictions around their mean. $$ ESS=({\hat y}-{\bar y})^{T}({\hat y}-{\bar y})={\hat y}^{T}{\hat y}-2{\hat y}^{T}{\bar y}+{\bar y}^{T}{\bar y} $$

ESS = (y_pred - y_mean).T @ (y_pred - y_mean)

ESS

TSS, ESS + RSS

Get $R^2$: $$ R^{2}=1-RSS/TSS $$

1 - RSS / TSS
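When the model includes an intercept, the three sums of squares satisfy TSS = ESS + RSS, so $R^2$ can equivalently be written as ESS / TSS. The sketch below checks this numerically on freshly simulated data; the seed, the flat (1D) y vector and the name beta_hat are assumptions made so that the block runs on its own.

```python
import numpy as np

rng = np.random.default_rng(1)   # arbitrary seed, only for reproducibility

n_data = 200
x1 = np.linspace(200, 500, n_data)
X = np.column_stack((np.ones(n_data), x1))            # intercept + one regressor
y = X @ np.array([5.0, -2.0]) + rng.normal(0, 20, n_data)

beta_hat = np.linalg.solve(X.T @ X, X.T @ y)          # OLS estimate
y_pred = X @ beta_hat
y_bar = y.mean()

RSS = float(((y - y_pred) ** 2).sum())       # residual sum of squares
ESS = float(((y_pred - y_bar) ** 2).sum())   # explained sum of squares
TSS = float(((y - y_bar) ** 2).sum())        # total sum of squares

r2 = 1 - RSS / TSS

print(TSS, ESS + RSS)
print(r2, ESS / TSS)
```

Working with a 1D y makes RSS, ESS and TSS plain numbers, whereas the matrix expressions above return arrays of shape (1, 1).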

Standard error of regression#

Calculate the standard error of the regression. We divide by (n-2) because two parameters are estimated, so the expectation of the sum of squared residuals is (n-2)*sigma^2.

sr2 = ( (1 / (n_data - 2)) * (y_pred - y_actual).T  @ (y_pred - y_actual))
sr = np.sqrt(sr2)
sr

Get variance and covariance Matrix#

In order to get the standard errors for our linear parameters, we use the matrix formula below: $$ Var({\hat {\beta }})=\sigma ^{2}(X^{T}X)^{-1} $$

var_beta = sr2 * np.linalg.inv(X.T @ X)
var_beta

print(
    f'Std Error for b0 {np.sqrt(var_beta[0, 0])}, \nStd Error for b1 {np.sqrt(var_beta[1, 1])}'
)
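The steps of this section can be collected into one small helper. The function ols_standard_errors below is a hypothetical name introduced for this sketch, not part of NumPy; it divides by (n - k), where k is the number of columns of X, which reduces to the (n - 2) used above when there is an intercept and one regressor. The seed and simulated data are likewise assumptions for illustration.

```python
import numpy as np


def ols_standard_errors(X, y):
    """Return (beta_hat, std_errors, sr2) for the model y = X beta + noise."""
    y = np.asarray(y).reshape(-1)            # work with a flat vector
    n, k = X.shape
    XtX_inv = np.linalg.inv(X.T @ X)
    beta_hat = XtX_inv @ X.T @ y             # OLS estimate, shape (k,)
    residuals = y - X @ beta_hat
    sr2 = residuals @ residuals / (n - k)    # estimate of sigma^2
    var_beta = sr2 * XtX_inv                 # variance-covariance matrix
    std_errors = np.sqrt(np.diag(var_beta))  # one standard error per coefficient
    return beta_hat, std_errors, sr2


# Usage on simulated data with true sigma = 20
rng = np.random.default_rng(2)   # arbitrary seed
n_data = 200
x1 = np.linspace(200, 500, n_data)
X = np.column_stack((np.ones(n_data), x1))
y = X @ np.array([5.0, -2.0]) + rng.normal(0, 20, n_data)

beta_hat, std_errors, sr2 = ols_standard_errors(X, y)
print("coefficients:   ", beta_hat)
print("standard errors:", std_errors)
print("std. error of regression:", np.sqrt(sr2))
```

The standard error of the regression should come out reasonably close to the true sigma of 20 used to simulate the noise.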
