Sets vs. list comprehension: which is faster?

Hello there! This is my first post on Hashnode, so I've decided to keep it simple. In future posts I will be writing about machine learning and computer vision, so stay tuned!

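Sets are arguably the most underrated data structure, as most Python developers use them only to remove duplicates. Well, it would please you to know that they are actually faster than a list comprehension when comparing two or more arrays. The reason is that checking whether a value is in a set takes roughly constant time on average, while checking a list or array means scanning it element by element.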

I once had a task where I needed to extract the values unique to one list by comparing it against another. Since I've been warned several times not to use explicit 'for loops' (those things really do slow down your code, I must confess), I looked for the most efficient way to do this and ran a simple test to compare the speed of sets and list comprehensions.

Let's move! By the way, I used Python's default interpreter.

# First, let's import numpy to easily create large arrays, along with the time module.
import numpy as np
import time

# Let's define two arrays, 'a' and 'b' (I know, weird variable names).
# Note that np.arange returns NumPy arrays rather than plain Python lists.

a = np.arange(0,10000)
b = np.arange(12,10000)

Now, let's define our list comprehension function.

def comp_time():
    start = time.time()
    # Keep every element of 'a' that does not appear in 'b'
    result = [i for i in a if i not in b]
    print(result)
    end = time.time()
    # Report the elapsed time to four decimal places
    return f'{end - start:.4f} seconds'

#output
comp_time()
[0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11]
'0.1199 seconds'

Time for the man of the hour: sets.

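def sets_time():
    start = time.time()
    # Convert both arrays to sets and take the elements found only in 'a'
    result = set.difference(set(a),set(b))
    print(result)
    end = time.time()
    # Report the elapsed time to four decimal places
    return f'{end - start:.4f} seconds'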

#output
sets_time()
{0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11}
'0.0159 seconds'
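
As a side note, the set difference can be spelled in a few equivalent ways. Here is a minimal sketch using small plain Python lists rather than NumPy arrays; the names `small_a` and `small_b` are made up for illustration.

```python
# Small illustrative inputs (hypothetical names, plain Python lists)
small_a = list(range(0, 20))
small_b = list(range(12, 20))

# Three spellings of the same operation
via_function = set.difference(set(small_a), set(small_b))
via_operator = set(small_a) - set(small_b)
via_method = set(small_a).difference(small_b)  # the method accepts any iterable

# Sets are unordered, so sort before printing for a stable display
print(sorted(via_operator))
```

The operator form needs a set on both sides, while the method form converts its argument for you.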

Note that these times will vary on your machine. On mine, after running each function a few times, the minimum time was '0.1199 seconds' for the list comprehension and '0.0159 seconds' for sets, which makes sets roughly 7.5 times faster in this test.
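
If you would like to repeat the comparison more carefully, the standard library's `timeit` module can handle the "run it a few times and take the minimum" part. This is only a sketch under a few assumptions: it uses plain Python lists instead of NumPy arrays, smaller inputs so it finishes quickly, and the function names are made up. It also leaves `print` out of the timed code, so the numbers will not match mine.

```python
import timeit

# Assumption: plain lists and a smaller size than in the test above
list_a = list(range(0, 2000))
list_b = list(range(12, 2000))

def with_comprehension():
    # Each 'not in' check scans list_b from the start
    return [i for i in list_a if i not in list_b]

def with_sets():
    # Building the sets is included in the measured time
    return set(list_a) - set(list_b)

# Run each function three times and keep the fastest run
comp_best = min(timeit.repeat(with_comprehension, number=1, repeat=3))
sets_best = min(timeit.repeat(with_sets, number=1, repeat=3))

print(f'list comprehension: {comp_best:.4f} seconds')
print(f'sets: {sets_best:.4f} seconds')
```

Taking the minimum of several runs reduces the noise from whatever else the machine is doing at the time.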

Clearly, the next time you come across a scenario where sets can be used instead of a list comprehension, please do use them. :)

Note: Sets are faster in scenarios like this, but iterating over a set's contents is slower than iterating over a list.
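
If the order of the result matters, or duplicates in the first list should be kept, the two approaches can be combined: build a set once for the lookups and keep the list comprehension for the iteration. A minimal sketch with plain Python lists (the names `items`, `excluded`, and `excluded_set` are made up for illustration):

```python
# Hypothetical inputs: 'items' has a duplicate and is not sorted
items = [5, 3, 0, 3, 11, 7, 42]
excluded = [7, 42, 100]

# Build the set once; each 'not in' check is then a hash lookup
excluded_set = set(excluded)

# The comprehension walks the list in order, so order and duplicates survive
kept = [x for x in items if x not in excluded_set]
print(kept)  # [5, 3, 0, 3, 11]
```

A plain set difference would return the same values here, but without the repeated 3 and with no guaranteed order.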