2. Idioms

Dataflow

1. Chained comparison operators

if x <= y and y <= z:
  print('ok')

Better

if x <= y <= z:
  print('ok')

Examples

>>> 2.999 < 3 == 3.0
True
>>> 1e5 >= 100 == 1e2 <= 1000
True
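
A chained comparison also evaluates each operand at most once and short-circuits like and. The sketch below illustrates this with a hypothetical helper, middle(), that records how often it is called; the helper and variable names are made up for the illustration.

```python
calls = []

def middle():
    # Hypothetical helper: records each call so evaluations can be counted.
    calls.append('called')
    return 5

# Chained form: middle() is evaluated once.
chained = 1 <= middle() <= 10
chained_calls = len(calls)

calls.clear()

# Expanded form: middle() is evaluated twice.
expanded = 1 <= middle() and middle() <= 10
expanded_calls = len(calls)

print(chained, chained_calls)    # True 1
print(expanded, expanded_calls)  # True 2
```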

2. Ternary operator

value = 0
if cond:
  value = 1

Better

value = 1 if cond else 0

This reads much like a piecewise definition in maths: f(x) = |x| = x if x >= 0 else -x

3. `or` operator

if x:
  y = x
else:
  y = 'fallback'

Better: use or

y = x or 'fallback'

or returns its first operand if that operand is truthy, and its second operand otherwise. It resembles the null-coalescing operator of other languages, except that it falls back on any falsy value (0, '', [], False), not only on None. Examples:

'' or 'default' # 'default'
0 or 1 # 1
None or 0 # 0
[] or [3] # [3]
None or [] # []
False or 0 # 0

x or 'fallback' is the same as x if x else 'fallback'
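
One caveat worth a sketch: because or falls back on every falsy value, it can discard a legitimate 0 or ''. When only None should trigger the fallback, an explicit is None test is safer. The function names and the default of 50 below are hypothetical, chosen only for illustration.

```python
def page_size_or(requested):
    # `or` treats every falsy value (0, '', [], None) as "missing".
    return requested or 50

def page_size_none(requested):
    # Falls back only when the value is actually None.
    return 50 if requested is None else requested

print(page_size_or(0))       # 50, the caller's 0 was silently replaced
print(page_size_none(0))     # 0
print(page_size_none(None))  # 50
```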

Check existence in a collection

if city == 'Nairobi' or city == 'Kampala' or city == 'Lagos':
  found = True

Better: use in keyword

city = 'Nairobi'
found = city in {'Nairobi', 'Kampala', 'Lagos'}

Here we used a set of cities, though we could also have used

a tuple: ('Nairobi', 'Kampala', 'Lagos')
a list: ['Nairobi', 'Kampala', 'Lagos']

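A set is advantageous when the number of cities is very large: a membership test on a set takes constant time on average, whereas a tuple or list is scanned element by element. In summary, use in where possible: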

Contains: if x in items
Iteration: for x in items
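
As a rough illustration, in works on other built-in containers too, with slightly different meanings. The dictionary below is example data invented for this sketch.

```python
city_country = {'Nairobi': 'Kenya', 'Kampala': 'Uganda', 'Lagos': 'Nigeria'}

has_key = 'Nairobi' in city_country             # dict: tests the keys
has_kenya = 'Kenya' in city_country             # values are not searched
has_value = 'Kenya' in city_country.values()    # search the values explicitly
has_substring = 'rob' in 'Nairobi'              # str: substring test
in_range = 7 in range(0, 10, 2)                 # range: only 0, 2, 4, 6, 8

print(has_key, has_kenya, has_value, has_substring, in_range)
# True False True True False
```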

Concatenating strings

strings = ['ab', 'cd', 'ef']
cs = ''
for s in strings:
  cs += s
# cs is 'abcdef'

The code above follows Shlemiel the painter’s algorithm: each += may copy everything built so far, so it can be accidentally quadratic 👎. Instead use join:

''.join(strings)
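
A short sketch of how the separator works: join is called on the separator string, and every item must already be a string. The variable names are illustrative.

```python
strings = ['ab', 'cd', 'ef']

joined = ''.join(strings)    # empty separator
spaced = ' '.join(strings)   # single-space separator

# Non-string items must be converted first, e.g. with a generator expression.
numbers = [1, 2, 3]
csv_row = ','.join(str(n) for n in numbers)

print(joined)   # abcdef
print(spaced)   # ab cd ef
print(csv_row)  # 1,2,3
```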

Looping

Simple Looping

for i in range(len(my_list)):
  print(my_list[i])

Better 👇

for elem in my_list:
  print(elem)

Looping over a collection with indices

for i in range(len(my_list)):
  print(i, my_list[i])

Better: use enumerate

for idx, element in enumerate(my_list):
  print(idx, element)

enumerate returns an iterator
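
enumerate also accepts a start argument, which helps when the numbering should begin at 1. A minimal sketch with example data:

```python
colors = ['red', 'green', 'blue']

# Number the items from 1 instead of the default 0.
numbered = [f'{n}. {color}' for n, color in enumerate(colors, start=1)]

print(numbered)  # ['1. red', '2. green', '3. blue']
```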

Looping backwards

colors = ['red', 'green', 'blue', 'yellow']

for i in range(len(colors)-1, -1, -1):
  print(colors[i])

Better: use slicing [::-1]

for color in colors[::-1]:
  print(color)

Even Better: use reversed 👌. It returns an iterator, so unlike slicing it does not copy the list.

for color in reversed(colors):
  print(color)

Example Problem: Suppose the polynomial 4x^5 + 2x^2 - x + 3 is represented by its coefficients (4, 0, 0, 2, -1, 3), highest degree first. Given such a tuple, we need to calculate the value of the polynomial at a given x.

sum(c * x ** idx for idx, c in enumerate(reversed(coefficients)))
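
To make this concrete, here is the expression evaluated at x = 2, where 4*32 + 2*4 - 2 + 3 = 137. As a cross-check, the sketch also uses Horner's method, a different technique that avoids computing powers; it is shown only for comparison.

```python
coefficients = (4, 0, 0, 2, -1, 3)  # 4x^5 + 2x^2 - x + 3
x = 2

# The one-liner from above: pair each coefficient with its power of x.
value = sum(c * x ** idx for idx, c in enumerate(reversed(coefficients)))

# Horner's method: fold the coefficients from the highest degree down.
horner = 0
for c in coefficients:
    horner = horner * x + c

print(value, horner)  # 137 137
```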

Looping over two collections

names = ['raymond', 'rachel', 'matthew']
colors = ['red', 'green', 'blue', 'yellow']

n = min(len(names), len(colors))
for i in range(n):
  print(names[i], '--->', colors[i])

Better: use zip

for name, color in zip(names, colors):
  print(name, '--->', color)

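zip also returns an iterator, and it stops as soon as the shortest input is exhausted, which matches the min(...) in the manual version.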
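
Since names and colors have different lengths here, it may help to see what happens to the extra item. The sketch compares zip with itertools.zip_longest, which pads the shorter input; the fill value '?' is an arbitrary choice for the example.

```python
from itertools import zip_longest

names = ['raymond', 'rachel', 'matthew']
colors = ['red', 'green', 'blue', 'yellow']

# zip drops 'yellow' because names runs out first.
short = list(zip(names, colors))

# zip_longest keeps going and pads the missing name.
padded = list(zip_longest(names, colors, fillvalue='?'))

print(len(short))   # 3
print(padded[-1])   # ('?', 'yellow')
```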

Make (an iterable of) bigrams of items in iterable: zip(mylist, mylist[1:])

words = 'A girl has no name'.split()
bigrams = list(zip(words, words[1:]))
# bigrams is [('A', 'girl'), ('girl', 'has'), ('has', 'no'), ('no', 'name')]

Transpose/Unzip an iterable of tuples: zip(*data)
(Prerequisite: *args and **kwargs)

data = [(1, 2, 3), (4, 5, 6)]
transposed = list(zip(*data))
# transposed is [(1, 4), (2, 5), (3, 6)]

Applying zip(*...) to the output of zip undoes it, so it acts as unzip/transpose:

ls1 = [1, 2, 3, 4, 5]
ls2 = list('abcde')
c = list(zip(ls1, ls2)) 
# c is [(1, 'a'), (2, 'b'), (3, 'c'), (4, 'd'), (5, 'e')]
d = list(zip(*c)) # equivalent to unzip
# d is [(1, 2, 3, 4, 5), ('a', 'b', 'c', 'd', 'e')]

Summary: enumerate, zip and reversed are built-ins that return iterators. They cover many common looping cases and make code more readable and pretty.

Dict's default value: `get`

1. Default value for item not in dictionary

color_weights = {'blue': 1, 'green': 2, 'red': 3}
yellow_weight = color_weights['yellow'] if 'yellow' in color_weights else -1

Better: use get

yellow_weight = color_weights.get('yellow', -1)

2. Counting with dictionaries

colors = ['red', 'green', 'red', 'blue', 'green', 'red']

d = {}
for color in colors:
    if color not in d:
        d[color] = 0
    d[color] += 1

# {'blue': 1, 'green': 2, 'red': 3}

Better

d = {}
for color in colors:
    d[color] = d.get(color, 0) + 1

Even better: use Counter from collections 💪

from collections import Counter

Counter(colors)
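
A brief sketch of what the resulting Counter offers beyond a plain dict, using the same colors list:

```python
from collections import Counter

colors = ['red', 'green', 'red', 'blue', 'green', 'red']
counts = Counter(colors)

top_two = counts.most_common(2)  # most frequent items first
missing = counts['yellow']       # missing keys count as 0, no KeyError

print(top_two)  # [('red', 3), ('green', 2)]
print(missing)  # 0
```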

Grouping

Use defaultdict
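
A minimal sketch of grouping with defaultdict, assuming we want to group names by their length; the names list is example data. defaultdict(list) creates an empty list the first time a key is touched, so no "is the key there yet?" check is needed.

```python
from collections import defaultdict

names = ['raymond', 'rachel', 'matthew', 'roger',
         'betty', 'melissa', 'judith', 'charlie']

by_length = defaultdict(list)
for name in names:
    # A missing key is created with an empty list automatically.
    by_length[len(name)].append(name)

print(dict(by_length))
# {7: ['raymond', 'matthew', 'melissa', 'charlie'],
#  6: ['rachel', 'judith'],
#  5: ['roger', 'betty']}
```

The same loop with a plain dict would need d.setdefault(key, []).append(name) or an explicit membership check.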

`any` function

Let's simulate an experiment: shuffle n cards, each with a unique label in 0...n-1, and then check whether any card sits at the position equal to its label (that is, whether the k-th card has label k).

We will use the sample function from the random module for that. sample draws without replacement, so sample(range(n), n) is equivalent to shuffling the list 0...n-1.

from random import sample

n = 10  # number of cards (example value)
idx_labels = list(enumerate(sample(range(n), n)))  # a list, so it can be traversed more than once

To proceed with the experiment:

for idx, label in idx_labels:
  if idx == label:
    print(True)
    break
else:
  print(False)

Better: use any

outcome = any(idx == label for idx, label in idx_labels)
print(outcome)

We could also have passed a list instead of a generator: any([idx == label for idx, label in idx_labels]). The generator expression used above is more memory-efficient, though, because it never builds the whole list, and it lets any stop at the first True.
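
The short-circuiting can be observed directly. The sketch below replaces the random shuffle with a fixed arrangement so the result is reproducible, and uses a hypothetical helper, is_fixed_point, that records which positions were actually checked.

```python
checked = []

def is_fixed_point(idx, label):
    # Hypothetical helper: remembers every position it was asked about.
    checked.append(idx)
    return idx == label

labels = [2, 1, 0, 3]  # fixed arrangement instead of a random shuffle

outcome = any(is_fixed_point(idx, label) for idx, label in enumerate(labels))

print(outcome)  # True
print(checked)  # [0, 1], positions 2 and 3 were never examined
```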

Generators

def squares(n):
    return [i*i for i in range(n)]

def even(iterable):
    return [i for i in iterable if i % 2 == 0]

even_squares = even(squares(10))

Better: use yield

def squares(n):
    for i in range(n):
        yield i*i

def even(iterable):
    for i in iterable:
        if i % 2 == 0:
            yield i

list(even(squares(10)))

Using yield instead of return turns an otherwise normal function into a generator function. It is an easier way to create an iterator than defining a class with __iter__ and __next__ methods. Because control passes back and forth between the generator and its caller, generators are also described as a simple form of coroutine.

Resuming a generator is typically cheaper than making a fresh function call, because the generator's stack frame is kept alive between yields instead of being created anew each time.
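
Because generators produce values only on demand, a pipeline can even draw from an endless source. The sketch below is one way to show that, assuming an unbounded variant of squares (named all_squares here, a hypothetical name) and itertools.islice to take the first few results.

```python
from itertools import count, islice

def all_squares():
    # Endless generator: 0, 1, 4, 9, ...
    for i in count():
        yield i * i

def even(iterable):
    for i in iterable:
        if i % 2 == 0:
            yield i

# Only as many squares are computed as are needed for five even ones.
first_five = list(islice(even(all_squares()), 5))

print(first_five)  # [0, 4, 16, 36, 64]
```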

Using yield from

def countdown(n):
    yield from range(n, 0, -1)

>>> list(countdown(5))
[5, 4, 3, 2, 1]

The `with` statement

foo = open('/tmp/foo', 'w')
try:
  foo.write('sometext')
finally:
  foo.close()

👆 This code is equivalent to 👇. Use with, which closes the file even if an exception is raised.

with open('/tmp/foo', 'w') as handle:
  handle.write('sometext')
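
The same try/finally pattern can be packaged for our own resources. Here is a minimal sketch using contextlib.contextmanager; the tracked function and the events list are hypothetical names used only to show the order in which things run.

```python
from contextlib import contextmanager

events = []

@contextmanager
def tracked(name):
    events.append(f'enter {name}')      # setup, runs on entering the block
    try:
        yield name                      # value bound by "as"
    finally:
        events.append(f'exit {name}')   # cleanup, runs even on error

try:
    with tracked('job') as job:
        events.append(f'working on {job}')
        raise ValueError('boom')
except ValueError:
    events.append('handled')

print(events)
# ['enter job', 'working on job', 'exit job', 'handled']
```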

Comprehensions

squares = list(map(lambda x: x**2, range(1,10)))
even_squares = list(map(lambda x: x**2, filter(lambda x: x % 2 == 0, range(1,10))))

List comprehensions 👇 are more readable and Pythonic! 🤘

squares = [x**2 for x in range(1,10)]
even_squares = [x**2 for x in range(1,10) if x % 2 == 0]
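
The same syntax extends to dicts, sets and generator expressions. A short sketch with made-up example data:

```python
words = ['apple', 'bob', 'cat', 'apple']

lengths = {w: len(w) for w in words}       # dict comprehension
unique_lengths = {len(w) for w in words}   # set comprehension

# Generator expression: no intermediate list is built.
total = sum(x**2 for x in range(1, 10))

print(lengths)         # {'apple': 5, 'bob': 3, 'cat': 3}
print(unique_lengths)  # {3, 5}
print(total)           # 285
```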

Specialized tool beats a general purpose tool

Specialized tools usually outperform general purpose tools or are more accurate than them.

math.sqrt(x) is generally more accurate than x ** 0.5

math.log2(x) is exact for powers of two, whereas log(x, 2) is not always:

from math import log, log2
all(log(2 ** x, 2) == x for x in range(100)) # False
all(log2(2 ** x) == x for x in range(100)) # True
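
Another instance of the same idea, offered as a sketch: math.fsum tracks intermediate rounding errors when adding floats, while the general purpose sum does not.

```python
from math import fsum

values = [0.1] * 10

naive = sum(values)      # accumulates rounding error
precise = fsum(values)   # compensates for it

print(naive)    # 0.9999999999999999
print(precise)  # 1.0
```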

In PySpark, key_value_rdd.countByKey() is much faster than key_value_rdd.groupByKey().mapValues(len).collect() because it involves less shuffling.

Direct links

Iterators

Bell Curve
