1. Reference

Table of Contents

Little Things

Files

Lists

Sets

Dictionaries

Iterators

Little Things

Numerics

1e2 is short-hand notation for 100.0. Similarly, 1e3 is 1000.0, 1e-4 is 0.0001 and 8e4 is 80000.0.

Enhance readability of large numbers by using underscores in numeric literals (Python 3.6+). 1_000 is 1000 and 1_000_000 is the same as int(1e6). To render a number as a comma-separated string, use format(n, ','), e.g. format(1000000, ',') gives '1,000,000'.

/ is true division and // is floor division.

38 / 5 # = 7.6     '/' is true division operator and returns float
38 // 5 # = 7      '//' is floor division operator and returns int when both operands are int
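
As a small sketch of the edge cases (the operand values are chosen purely for illustration): floor division rounds toward negative infinity rather than toward zero, and it returns a float as soon as one operand is a float.

```python
# True division always returns a float, even when the result is whole.
true_div = 10 / 5                      # 2.0

# Floor division rounds toward negative infinity, not toward zero.
floor_pos = 38 // 5                    # 7
floor_neg = -38 // 5                   # -8

# With a float operand, floor division returns a float.
floor_float = 38.0 // 5                # 7.0

# divmod returns the floor quotient and the remainder together.
quotient, remainder = divmod(38, 5)    # (7, 3)
```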

Data types

In Python, there is no char type like in C/C++/Java. A single character is just a string of length 1, so ' can be used interchangeably with ". As a result, string methods like isdigit work on strings of one or more characters.

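Assignment never copies data. After a = b = [], calling b.append(3) makes a equal to [3] too, because both names refer to the same list. a = b = [] is unlike a, b = [], [], which creates two separate lists. Use copy.copy or copy.deepcopy (from the copy module) for copying.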
We can exchange variables using tuple unpacking: (b, a) = (a, b). A tuple (a, b) can also be written as a, b, hence, simplest way to exchange variables is b, a = a, b. 👏
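
The difference between aliasing, a shallow copy and a deep copy can be sketched as follows; the nested list is just an illustrative value.

```python
import copy

original = [[1, 2], [3, 4]]

alias = original                  # same object, nothing is copied
shallow = copy.copy(original)     # new outer list, inner lists are shared
deep = copy.deepcopy(original)    # fully independent copy

original[0].append(99)            # mutate an inner list
original.append([5, 6])           # mutate the outer list

print(alias)    # [[1, 2, 99], [3, 4], [5, 6]]
print(shallow)  # [[1, 2, 99], [3, 4]]
print(deep)     # [[1, 2], [3, 4]]
```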

An iterator is an iterable, but an iterable may or may not be an iterator. A list is an iterable, not an iterator. Looping over an iterator exhausts it and leaves it empty. As an analogy, a book full of pages is an iterable and the bookmark is an iterator.

Tricks

Given a list, nums = [1, 2, 3, 4, 5], nums[::-1] returns a reversed copy of the list. The way slicing works is [start:stop:step].
Remove duplicates from a list: list(set([1, 8, 4, 5, 5, 8, 1])) gives the unique elements in arbitrary order, e.g. [8, 1, 4, 5]. Similarly, to count the number of unique elements in a list: len(set(some_list))
Remove duplicates maintaining the insertion order: list(dict.fromkeys([1, 8, 4, 5, 5, 8, 1])) gives [1, 8, 4, 5]. Since python-3.7, dictionary retains the insertion order.
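
A few more slices and the two de-duplication approaches, as a quick sketch. The set-based result is sorted here only to make the output predictable, since set ordering is not guaranteed.

```python
nums = [1, 2, 3, 4, 5]

reversed_nums = nums[::-1]    # [5, 4, 3, 2, 1]
every_other = nums[::2]       # [1, 3, 5]
middle = nums[1:4]            # [2, 3, 4]
last_two = nums[-2:]          # [4, 5]

items = [1, 8, 4, 5, 5, 8, 1]
unique_in_order = list(dict.fromkeys(items))   # [1, 8, 4, 5]
unique_sorted = sorted(set(items))             # [1, 4, 5, 8]
unique_count = len(set(items))                 # 4
```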

Syntax Quirks

Python's libraries like json and pickle usually have two functions for loading, load and loads. load reads from an open file object, while loads reads from an in-memory string (json) or bytes object (pickle). The s in loads stands for string.
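
A minimal sketch of the load/loads pair using json. Here io.StringIO stands in for an open file, which is an assumption made only to keep the example self-contained.

```python
import io
import json

text = '{"a": 1, "b": [2, 3]}'

# loads: parse from a string that is already in memory
from_string = json.loads(text)

# load: parse from a file-like object (StringIO plays the role of a file)
from_file = json.load(io.StringIO(text))

# dumps / dump are the matching pair for writing
as_string = json.dumps(from_string)
buffer = io.StringIO()
json.dump(from_string, buffer)
```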

== checks equality while is operator compares identities.

a = b = [1, 2, 3] # a and b refer to the same object
a == b # True
a is b # True

c, d = [1, 2, 3], [1, 2, 3] # c and d refer to different objects whose value is same
c == d # True
c is d # False; to inspect further, print id(c) and id(d)

f-strings (formatted string literals) are awesome! ⭐️ _python-3.6

age = 30
age_fstr = f'age = {age}' 
# age_fstr is 'age = 30'

first_name, last_name = 'James', 'Bond'
print(f'The name is {last_name}, {first_name} {last_name}.')
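
f-strings also accept a format specification after a colon. The sketch below shows a few common ones; the variable names and values are illustrative.

```python
population = 1234567
ratio = 0.8567
name = 'Bond'

with_commas = f'{population:,}'    # '1,234,567'
two_decimals = f'{ratio:.2f}'      # '0.86'
as_percent = f'{ratio:.1%}'        # '85.7%'
right_aligned = f'{name:>8}'       # '    Bond'
with_repr = f'{name!r}'            # "'Bond'"
```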

Extras

Add a progress bar to a long-running loop using the third-party module tqdm (pip install tqdm)

from tqdm import tqdm

for _ in tqdm(range(int(1e8))):
  pass

Easter eggs: import this (prints The Zen of Python); import antigravity
Use a code formatter, e.g., Black. pip install black and use as black mycode.py
Try an autocomplete plugin, e.g., Jedi

Timeline of major features

3.0

zip, map and filter return iterators instead of lists
print is a function, no longer a statement

3.7

Dictionaries maintain insertion order

📁 Files

Read a file line-by-line into list

with open(filepath) as handle: 
  mylist = handle.read().splitlines()

splitlines returns a list of the lines in the string, with the trailing '\n' removed from each.

Read few (N) lines of a file

with open(filepath) as handle:
  head = [next(handle) for _ in range(N)]

Another way using itertools' islice
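
A minimal sketch of the islice variant. The helper name read_head and the temporary demo file are assumptions for illustration. Unlike the next()-based version, islice simply stops early when the file has fewer than N lines.

```python
import os
import tempfile
from itertools import islice

def read_head(filepath, n):
    """Return up to the first n lines of a file, without trailing newlines."""
    with open(filepath) as handle:
        return [line.rstrip('\n') for line in islice(handle, n)]

# Demo on a throwaway file with three lines.
with tempfile.NamedTemporaryFile('w', suffix='.txt', delete=False) as tmp:
    tmp.write('one\ntwo\nthree\n')

head_two = read_head(tmp.name, 2)    # ['one', 'two']
head_ten = read_head(tmp.name, 10)   # ['one', 'two', 'three']
os.remove(tmp.name)
```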

Iterate over file

If a file is too large to fit into memory, or would hog the RAM, consider iterating over it line by line

with open(filepath) as handle:
  for line in handle:
    print(line) # do something

File mode codes

| File mode | Code |
|---|---|
| Read (default) | `r` |
| Write | `w` |
| Append | `a` |

Lists

Get the last element: some_list[-1]
[1, 5, 3, 2][-1] returns 2
Check if an element exists in the list: in operator
4 in [1, 5, 3, 5, 2] returns False. Has linear-time complexity
Count occurrences of an item in list: count(item)
[1, 5, 3, 5, 2].count(5) returns 2
To count occurrences of multiple items in a list, use collections.Counter since count will perform multiple passes and would degrade performance.
Find the index of an item: index(item)
[1, 5, 3, 5, 2].index(5) returns 1
An index call has linear-time complexity 🤔, returns only the index of the first match, and throws ValueError ⛔️ if element is not present in list.

To add an element, use append, and to add all elements of a list, use extend.

x = [1, 3, 5]
x.append(4) 
# x is [1, 3, 5, 4]

x.extend([6, 9])
# x is [1, 3, 5, 4, 6, 9]

Flattening a nested list; the syntax is like nested for loops.

flat_list = [item for sublist in original_list for item in sublist]

is equivalent to

flat_list = []

for sublist in original_list:
    for item in sublist:
        flat_list.append(item)

List comprehension with an if condition has the form

vals = [expression 
    for value in collection 
    if condition]

is equivalent to

vals = []
for value in collection:
    if condition:
        vals.append(expression)
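
A concrete instance of the filtering and flattening patterns, with illustrative data. Note that a filter goes after the for clause, while a conditional expression (value if cond else other) goes before it.

```python
numbers = [1, 2, 3, 4, 5, 6]

# filter: keep only even numbers, then square them
even_squares = [n * n for n in numbers if n % 2 == 0]    # [4, 16, 36]

# flatten: the for clauses read in the same order as nested loops
nested = [[1, 2], [3], [4, 5, 6]]
flat = [item for sublist in nested for item in sublist]  # [1, 2, 3, 4, 5, 6]

# conditional expression: every element is kept, but mapped differently
labels = ['even' if n % 2 == 0 else 'odd' for n in numbers[:3]]  # ['odd', 'even', 'odd']
```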

Sets

A set is an unordered collection of unique elements.

Initialization

colors = set() # initialise an empty set
colors = {'red', 'blue', 'green'} # initialise non-empty set; is called set-literal notation

colors.add('yellow') # add method adds an element to set
colors.update(['magenta', 'violet']) # update method adds multiple elements

⚠️ colors = {} initialises an empty dictionary not a set

Operations on sets

a = s1 | s2     # Union of s1 and s2 
b = s1 & s2     # Intersection of s1 and s2 
c = s1 - s2     # Set difference (items in s1, but not in s2)
d = s1 ^ s2     # Symmetric difference (items in s1 or s2, but not both)
e = s1 <= s2    # True if s1 is a subset of s2

If, however, one needs to work with many sets, the set methods can be called on the set class directly

list_of_sets = [s1, s2, ..., sn]
a = set.union(*list_of_sets)                # s1 | s2 | ... | sn
b = set.intersection(*list_of_sets)         # s1 & s2 & ... & sn 
c = set.difference(*list_of_sets)           # s1 - s2 - ... - sn
d = set.symmetric_difference(s1, s2)        # symmetric difference is not defined for a list of sets
e = set.issubset(s1, s2)                    # True if s1 is a subset of s2

To summarize, s1 & s2 is same as s1.intersection(s2) and set.intersection(s1, s2).
Sets are only partially ordered: {1, 2} <= {1, 3} is False and so is {1, 3} <= {1, 2}, so there's no order between {1, 2} and {1, 3}. On the other hand, {1} <= {1, 2} and {1, 3} <= {1, 2, 3} are both True. (Here <= means ⊆.)

Set comprehension

prime_pairs = {(x, x+2) for x in range(2,100) if is_prime(x) and is_prime(x+2)}
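
The comprehension above relies on an is_prime function that is not shown. A runnable sketch, assuming a simple trial-division helper (hypothetical, adequate only for small numbers) and a smaller range:

```python
def is_prime(n):
    """Trial-division primality check; a hypothetical helper for small n."""
    if n < 2:
        return False
    for divisor in range(2, int(n ** 0.5) + 1):
        if n % divisor == 0:
            return False
    return True

# Twin-prime pairs (x, x + 2) with x below 20.
prime_pairs = {(x, x + 2) for x in range(2, 20) if is_prime(x) and is_prime(x + 2)}
# {(3, 5), (5, 7), (11, 13), (17, 19)}
```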

Dictionaries

Dictionary comprehension

stock_prices = {'ACME': 45.2, 'AAPL': 612.7, 'IBM': 205.5, 'HPQ': 37.2, 'FB': 10.7}
highvalue_stocks = { key:value for key, value in stock_prices.items() if value > 200 }

Construct a dictionary from pairs

names = ['raymond', 'rachel', 'matthew']
colors = ['red', 'green', 'blue']

d = dict(zip(names, colors))
# d is {'raymond': 'red', 'rachel': 'green', 'matthew': 'blue'}

Looping over dictionary

Looping over keys

for key in some_dict:
  pass  # do something

Looping over key-value pairs

for key, value in some_dict.items():
  pass  # do something

Inverting a dictionary: 2 ways (both assume the values are unique and hashable)

inverted1 = {v: k for k, v in d.items()}      # dictionary comprehension
inverted2 = dict(zip(d.values(), d.keys()))   # dictionary constructor

Booleans

`any` and `all`

any will return True when at least one of the elements is Truthy. all will return True when all the elements are Truthy.

any(l == 't' for l in 'python') # True
all(l == 't' for l in 'python') # False

shortcircuit

| Iterable | any | all |
|---|---|---|
| All Truthy values | True | True |
| All Falsy values | False | False |
| At least one Truthy value and at least one Falsy | True | False |
| Empty Iterable | False | True |
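
The short-circuit behaviour can be observed with a generator that records what it hands out. The helper name noisy is an assumption for illustration.

```python
def noisy(values, seen):
    """Yield values one by one, recording each value that is actually consumed."""
    for value in values:
        seen.append(value)
        yield value

seen_any = []
result_any = any(noisy([0, 3, 0, 5], seen_any))
# any stops at the first truthy value, so seen_any is [0, 3]

seen_all = []
result_all = all(noisy([1, 0, 2], seen_all))
# all stops at the first falsy value, so seen_all is [1, 0]

empty_any = any([])   # False
empty_all = all([])   # True
```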

`*` and `**`

Commonly referred to as *args and **kwargs (args for arguments and kwargs for keyword arguments):

* is the iterable unpacking operator
** is the dictionary unpacking operator

>>> *[1]
SyntaxError: can't use starred expression here
>>> *[1],
(1,)
>>> *[1], 2
(1, 2)
>>> [*[1, 2, 3]]
[1, 2, 3]
>>> [*(1, 2, 3)]
[1, 2, 3]
>>> {*range(4), 4, *(5, 6, 7)}
{0, 1, 2, 3, 4, 5, 6, 7}
>>> {**{'a': 1, 'c': 3}, **{'b': 2, 'd': 4}}
{'a': 1, 'c': 3, 'b': 2, 'd': 4}

As shown above, {**d1, **d2} is a trick to merge two dictionaries d1 and d2 (Python 3.5+). If a key appears in both, the value from d2 wins.

* helps us write a function that accepts any number of function arguments

def avg(first, *rest):
    return (first + sum(rest))/(1 + len(rest))
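
A short usage sketch for avg, plus a hypothetical describe function that shows * and ** on both the defining and the calling side.

```python
def avg(first, *rest):
    return (first + sum(rest)) / (1 + len(rest))

def describe(*args, **kwargs):
    """Return the positional and keyword arguments exactly as received."""
    return args, kwargs

mean_direct = avg(1, 2, 3, 4)        # 2.5
scores = [10, 20, 30]
mean_unpacked = avg(*scores)         # 20.0, the list is unpacked into arguments

args, kwargs = describe(1, 2, unit='cm', scale=3)
# args is (1, 2) and kwargs is {'unit': 'cm', 'scale': 3}

options = {'unit': 'cm'}
_, from_dict = describe(**options)   # the dict is unpacked into keyword arguments
```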

More

Functions

Default arguments

def func(l=5):
    return l**2

Collections

Named tuples

from collections import namedtuple
Point = namedtuple('Point', 'x y')
pt1 = Point(1.0, 5.0)
pt2 = Point(2.5, 1.5)

line_length_squared = (pt1.x-pt2.x)**2 + (pt1.y-pt2.y)**2

Counter
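
A minimal sketch of collections.Counter, which tallies all items in a single pass instead of calling list.count once per item. The sample data is illustrative.

```python
from collections import Counter

letters = ['a', 'b', 'a', 'c', 'b', 'a']
counts = Counter(letters)

count_a = counts['a']                # 3
count_missing = counts['z']          # 0, missing keys count as zero
top_two = counts.most_common(2)      # [('a', 3), ('b', 2)]

counts.update(['c', 'c'])            # add more observations
count_c = counts['c']                # 3
```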

Strings

import string
string.ascii_lowercase # 'abcdefghijklmnopqrstuvwxyz'
string.punctuation # '!"#$%&\'()*+,-./:;<=>?@[\\]^_`{|}~'

Split strings

>>> fruits = ' apple banana   mango'
>>> fruits.split()
['apple', 'banana', 'mango']

>>> fruits = 'apple; banana,  mango'
>>> fruits.split()
['apple;', 'banana,', 'mango']

split() splits on whitespace by default and treats a run of whitespace as a single separator. If we have multiple delimiters, we could use regular expressions.

>>> import re
>>> fruits = 'apple; banana,  mango'
>>> re.split(r'[,;]\s*', fruits)
['apple', 'banana', 'mango']

`startswith`

>>> url = 'http://google.com'
>>> url[:4] == 'http'
True
>>> url.startswith('http') # more readable
True
>>> filename = 'spam.txt'
>>> filename.endswith('.txt')
True

Profiling

python -m cProfile -o output.prof myscript.py
snakeviz output.prof

Sampling

import random
random.sample(population, SAMPLE_SIZE)

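Returns a new SAMPLE_SIZE-length list of unique elements chosen from the population sequence. Used for random sampling without replacement.

random.sample(list(my_set), 5) returns a list of 5 unique elements randomly sampled from my_set. Useful for inspecting a set. Since Python 3.11 the population must be a sequence, so a set has to be converted to a list first; passing a set directly raises TypeError.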
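
A small sketch of sampling from a set in a way that also works on Python 3.11+, where the population must be a sequence. The seed is set only so repeated runs give the same draw, and random.choices is shown for contrast as sampling with replacement.

```python
import random

random.seed(42)   # only for reproducibility of this sketch

my_set = set(range(100))
population = sorted(my_set)          # convert the set to a sequence first

without_replacement = random.sample(population, 5)    # 5 distinct elements
with_replacement = random.choices(population, k=5)    # elements may repeat
```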

🙈 Ignore Stuff: The Underscore _

If you need to run a loop n times and there's no need of a loop variable

for _ in range(10):
  pass  # do something

The Go programming language likes this idea so much that it refuses to compile code that has an unused variable

Ignore a value while unpacking an iterable

name, _, phone_number = ('riyaz', 'bangalore', '9393939393')

Unpacking iterables: The *

user_record = ('Dave', '[email protected]', '773-555-1212', '847-555-1212')
name, email, *phone_numbers = user_record
# name is 'Dave', email is '[email protected]', phone_numbers is ['773-555-1212', '847-555-1212']

To ignore last elements (here, phone_numbers)

name, email, *_ = user_record

Iterators

An iterator is any object whose class has a __next__ method and an __iter__ method that returns self.
It has no __prev__, no __len__ and so on; only __next__.
Every generator is an iterator, but not vice versa. A generator is built by calling a function that has one or more yield expressions.
An iterator doesn't support len. len(i for i in range(10)) gives

TypeError: object of type 'generator' has no len()

Length of an iterator is unknown until you iterate through it.
It is possibly infinite, e.g., itertools.count() is an infinite iterator over the integers 0, 1, 2, and so on.
You can only iterate once, because iterating an iterator empties it.
One trick for finding the length is sum(1 for _ in iterator). The obvious way is to make it a list first: len(list(iterator)). Either way, counting consumes the iterator.

import random

def gen(n):
    for i in range(n):
        if random.randint(0, 1) == 0:
            yield i

iterator = gen(10) # its length is unknown until iterated
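
The protocol described above can be sketched with a hand-written iterator class. Countdown is a hypothetical example, not something from the standard library.

```python
class Countdown:
    """Iterator that counts down from start to 1."""

    def __init__(self, start):
        self.current = start

    def __iter__(self):
        return self                    # an iterator returns itself

    def __next__(self):
        if self.current <= 0:
            raise StopIteration        # signals exhaustion
        value = self.current
        self.current -= 1
        return value

countdown = Countdown(3)
first_pass = list(countdown)           # [3, 2, 1]
second_pass = list(countdown)          # [], the iterator is exhausted

numbers = [1, 2, 3]                    # a list is iterable but not an iterator
list_has_next = hasattr(numbers, '__next__')   # False
it = iter(numbers)                     # iter() produces an iterator from it
iter_returns_self = iter(it) is it     # True
```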

Manually iterating an iterator 👇

r = iter([1, 2, 5, 9])
try:
    while True:
        print(next(r), end=' ')
except StopIteration:
    pass

prints 1 2 5 9

Generator Expressions

nums = [1, 2, 3, 4, 5]
s = sum(x * x for x in nums)

No intermediate list is created, so it's great for very large lists of numbers 🏃

Catching Exceptions

try:
  doSomething()
except: 
  pass

try:
  doSomething()
except Exception as e: 
  print(e)

The first one will also catch KeyboardInterrupt, SystemExit etc., which derive directly from BaseException, not from Exception.
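
The hierarchy can be checked directly. In this sketch, classify is a hypothetical helper that reports which handler ends up catching a given exception.

```python
def classify(exc):
    """Report which handler catches the given exception instance."""
    try:
        raise exc
    except Exception:
        return 'Exception'
    except BaseException:
        return 'BaseException only'

value_error = classify(ValueError('bad value'))     # 'Exception'
keyboard = classify(KeyboardInterrupt())            # 'BaseException only'
system_exit = classify(SystemExit(0))               # 'BaseException only'

interrupt_is_exception = issubclass(KeyboardInterrupt, Exception)   # False
```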

Regex

Decorators

The function decorator syntax:

@decorator
def F():
  ...

is equivalent to

def F():
  ...

F = decorator(F)

Example usage:

def uppercase(func):
  def wrapper(*args, **kwargs):
    return func(*args, **kwargs).upper()
  return wrapper

@uppercase
def greet(name):
  return f'Hello {name}'

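Here, the name greet is rebound to wrapper, which behaves like the following (greet inside the body being the original, undecorated function)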

def wrapper(name):
  return greet(name).upper()
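
Because the decorated name now points at wrapper, metadata such as __name__ and the docstring would normally be lost. A common remedy, sketched here as a variation on the example above, is functools.wraps from the standard library.

```python
import functools

def uppercase(func):
    @functools.wraps(func)             # copy __name__, __doc__, etc. onto wrapper
    def wrapper(*args, **kwargs):
        return func(*args, **kwargs).upper()
    return wrapper

@uppercase
def greet(name):
    """Return a greeting."""
    return f'Hello {name}'

shouted = greet('James')                   # 'HELLO JAMES'
kept_name = greet.__name__                 # 'greet' (would be 'wrapper' without wraps)
original = greet.__wrapped__('James')      # 'Hello James', the undecorated function
```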

Save objects to disk/Serialization

Pickle is Python's native binary serialization tool; it converts in-memory Python objects to/from bytestreams

import pickle

obj = ('serialization & de-serialization', [1, 4, 16], {1: 'hello', 2: 'world'})

# save to disk / serialize
with open('filename.pickle', 'wb') as handle:
    pickle.dump(obj, handle)

# read from disk / de-serialize
with open('filename.pickle', 'rb') as handle:
    obj2 = pickle.load(handle)

print(obj == obj2)

Pickle gives us a trick to perform a deepcopy of nested objects:

deepcopy = lambda x: pickle.loads(pickle.dumps(x))

dumps and loads work on an in-memory bytes object instead of a file; as stated above, s stands for string. Note that this trick only works for objects that can be pickled.
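
A quick check that the pickle round trip yields an independent copy, shown next to copy.deepcopy for comparison. The nested value is illustrative.

```python
import copy
import pickle

nested = {'nums': [1, 2, 3], 'pair': (4, [5, 6])}

via_pickle = pickle.loads(pickle.dumps(nested))   # round trip through bytes
via_copy = copy.deepcopy(nested)                  # the standard-library way

nested['nums'].append(99)                         # mutate the original only

print(via_pickle['nums'])   # [1, 2, 3]
print(via_copy['nums'])     # [1, 2, 3]
```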

Pretty print

import pprint as pp
animals = [{'animal': 'dog', 'legs': 4, 'breeds': ['Border Collie', 'Pit Bull', 'Huskie']}, {'animal': 'cat', 'legs': 4, 'breeds': ['Siamese', 'Persian', 'Sphynx']}]
pp.pprint(animals, width=1)

Python Modules

Command line utilities

python -m json.tool
Pretty print JSON in terminal. Try echo '{"a": 1, "b": 2}' | python -m json.tool
python -m http.server
Starts a simple web server
python -m timeit
Measure time taken for a snippet to run. Try

$ python -m timeit "1 + 2"
100000000 loops, best of 3: 0.0162 usec per loop
$ python -m timeit "sum(x*x for x in range(int(1e5)))"
100 loops, best of 3: 9.79 msec per loop

python3.4 -m pip install
Install a library in a version-specific Python environment, e.g. python3.4 -m pip install requests

Reserved keywords

False class finally is return
None continue for lambda try
True def from nonlocal while
and del global not with
as elif if or yield
assert else import pass
break except in raise

Since Python 3.7, async and await are reserved keywords as well.
