When you work with data in Python, it’s essential to understand the available data structures. One of the most helpful of these is the set.
In this blog post, we will explore sets in Python in-depth. We’ll learn about their properties, what operations you can do with them, and how you can use them effectively in your Python programming. By the end of this post, you’ll have a strong grasp of sets and how to use them in your Python projects.
What is a set in Python
In Python, a set is an unordered collection of unique elements. It is a built-in data structure that allows you to store a collection of items, such as numbers, strings, or other hashable objects (mutable objects such as lists cannot be set elements). The elements in a set are unique, meaning that duplicates are not allowed.
Sets are defined by enclosing a comma-separated list of elements within curly braces {}. Empty braces on their own create an empty dictionary, not a set; use set() to create an empty set.
For example:
my_set = {1, 2, 3, 4, 5}
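One detail about the curly-brace syntax is easy to trip over: empty braces produce a dictionary, so an empty set has to be created with set(). The short sketch below illustrates this; the variable names are only for illustration.

```python
# Empty braces create a dict, not a set
empty_braces = {}
empty_set = set()

print(type(empty_braces).__name__)  # dict
print(type(empty_set).__name__)     # set
print(len(empty_set))               # 0
```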
Key features of sets in Python
Sets in Python have some important features that make them useful for working with data:
- Uniqueness: Sets in Python ensure that each element is unique. This means you can’t have duplicate elements in a set. If you try to add a duplicate element, it will be ignored. For example, if you have a set of numbers, you won’t have any repeated numbers in it.
- Unordered: Sets don’t have a specific order for their elements. Unlike lists or tuples, you can’t access elements in a set using indexes or positions. Sets are hash-based, so checking whether an element exists in a set is fast on average, even when the set is large.
- Mutable: Sets are mutable, which means you can modify them. You can add new elements to a set using the add() method, remove elements using the remove() or discard() methods, or even clear the entire set using the clear() method. This flexibility allows you to update a set as your needs change.
- Mathematical Set Operations: Sets in Python support various mathematical operations. These operations include union, intersection, difference, and symmetric difference. For example, you can combine two sets to create a new set that contains all unique elements from both sets (union), or find the common elements between two sets (intersection). These operations are similar to the ones you learn in mathematics.
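The features listed above can be seen in a few lines of code. This is a minimal sketch with made-up values; sorted() is used when printing only so that the displayed order is predictable.

```python
numbers = {3, 1, 2, 3, 1}

# Uniqueness: duplicates collapse to a single element
print(len(numbers))  # 3

# Unordered: indexing is not supported and raises TypeError
try:
    numbers[0]
    indexing_error = None
except TypeError as exc:
    indexing_error = exc
print(type(indexing_error).__name__)  # TypeError

# Membership tests use the "in" operator
print(2 in numbers)   # True
print(10 in numbers)  # False

# Mutable: elements can be added after creation
numbers.add(4)
print(sorted(numbers))  # [1, 2, 3, 4]

# Mathematical operations: intersection with another set
print(sorted(numbers & {2, 4, 6}))  # [2, 4]
```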
How to create sets in Python
Creating sets in Python is simple. You can define a set by enclosing a collection of unique elements within curly braces {}.
Example 1:
fruits = {'apple', 'banana', 'orange', 'apple'}
print(fruits)
In this example, we define a set called fruits that contains the elements ‘apple’, ‘banana’, and ‘orange’. Notice that the duplicate element ‘apple’ is ignored because sets cannot contain duplicate elements. When you print the set, you will see that it only includes the unique elements.
Output (the order may differ between runs, because sets are unordered):
{'orange', 'banana', 'apple'}
Example 2:
numbers = {1, 2, 3, 4, 5, 5, 4, 3, 2, 1}
print(numbers)
In this case, we define a set called numbers that contains the elements 1, 2, 3, 4, and 5. Notice that even though every value is written twice, the set automatically removes the duplicates, ensuring that each element is unique.
Output:
{1, 2, 3, 4, 5}
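Besides curly braces, a set can also be built from any existing iterable with the built-in set() constructor, which is a common way to remove duplicates from a list. The sketch below uses a hypothetical list called votes; sorted() is applied only to get a stable display order.

```python
# A list with repeated entries (hypothetical sample data)
votes = ['apple', 'banana', 'apple', 'orange', 'banana']

# set() keeps one copy of each distinct item
unique_votes = set(votes)
print(len(unique_votes))     # 3
print(sorted(unique_votes))  # ['apple', 'banana', 'orange']

# A string is an iterable of characters, so set() splits it up
letters = set('hello')
print(sorted(letters))  # ['e', 'h', 'l', 'o']
```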
Python set methods
Operation | Method | Description |
---|---|---|
Union | union() | Combines unique elements from two or more sets to create a new set. |
Intersection | intersection() | Returns a new set with common elements between two or more sets. |
Difference | difference() | Creates a new set with elements from the first set that are not in the other sets. |
Symmetric Difference | symmetric_difference() | Generates a set with elements unique to each set, excluding the common elements. |
Subset | issubset() | Checks if one set is a subset of another set. |
Superset | issuperset() | Checks if one set is a superset of another set. |
Disjoint | isdisjoint() | Determines if two sets have no common elements. |
Add | add() | Adds an element to a set. |
Remove | remove() | Removes a specified element from a set. |
Discard | discard() | Removes a specified element from a set if it exists. |
Clear | clear() | Removes all elements from a set, making it empty. |
Copy | copy() | Creates a shallow copy of a set. |
Update | update() | Updates a set by adding elements from another set or iterable. |
Intersection Update | intersection_update() | Updates a set with common elements between itself and another set or iterable. |
Difference Update | difference_update() | Updates a set by removing elements found in another set or iterable. |
Symmetric Difference Update | symmetric_difference_update() | Updates a set to keep only the elements found in either itself or another set or iterable, but not in both. |
Pop | pop() | Removes and returns an arbitrary element from a set. |
Frozen Set | frozenset() | Built-in constructor (not a set method) that returns an immutable frozenset object, which is a set that cannot be modified. |
Length | len() | Built-in function (not a set method) that returns the number of elements in a set. |
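Several entries in the table are not demonstrated elsewhere in this post, so here is a small sketch of the comparison methods, copy(), and frozenset(). The set names and values are made up for illustration.

```python
small = {1, 2}
big = {1, 2, 3, 4}
other = {7, 8}

# Relationship checks return True or False
print(small.issubset(big))      # True
print(big.issuperset(small))    # True
print(small.isdisjoint(other))  # True
print(big.isdisjoint(small))    # False

# copy() makes an independent set; later changes do not affect it
snapshot = big.copy()
big.add(5)
print(sorted(snapshot))  # [1, 2, 3, 4]
print(len(big))          # 5

# frozenset is immutable and hashable, so it can be stored inside a set
frozen = frozenset(small)
groups = {frozen, frozenset({3, 4})}
print(len(groups))             # 2
print(hasattr(frozen, 'add'))  # False
```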
Adding Elements in Python sets
add(element)
Adds a single element to the set. If the element is already present in the set, it is ignored.
Example:
my_set = {1, 2, 3, 4, 5}
my_set.add(6)
print(my_set)
Output:
{1, 2, 3, 4, 5, 6}
update(iterable)
Adds multiple elements to the set from an iterable (e.g., list, tuple, set).
Example:
my_set = {1, 2, 3, 5, 4}
my_set.update([6, 7, 9, 8])
print(my_set)
Output:
{1, 2, 3, 4, 5, 6, 7, 8, 9}
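Two behaviours of update() are worth knowing: it accepts several iterables at once, and a string passed to it is treated as an iterable of characters. The following sketch, with illustrative names, contrasts this with add().

```python
tags = {'python'}

# update() can take more than one iterable in a single call
tags.update(['sets', 'data'], ('python', 'tips'))
print(sorted(tags))  # ['data', 'python', 'sets', 'tips']

# A string is iterated character by character
letters = set()
letters.update('abc')
print(sorted(letters))  # ['a', 'b', 'c']

# add() stores the whole string as one element
words = set()
words.add('abc')
print(words)  # {'abc'}
```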
Removing Elements in Python sets
remove(element)
Removes a specified element from the set. If the element is not present, it raises a KeyError.
Example:
my_set = {1, 2, 3, 5, 4}
my_set.remove(2)
print(my_set)
Output:
{1, 3, 4, 5}
discard(element)
Removes a specified element from the set if it exists. If the element is not present, it does nothing.
Example:
my_set = {1, 2, 3, 4, 5}
my_set.discard(4)
print(my_set)
Output:
{1, 2, 3, 5}
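The difference between remove() and discard() only shows up when the element is missing. Here is a minimal sketch of that case; the flag name removed is just for illustration.

```python
my_set = {1, 2, 3}

# discard() silently does nothing when the element is absent
my_set.discard(10)

# remove() raises KeyError when the element is absent
try:
    my_set.remove(10)
    removed = True
except KeyError:
    removed = False

print(removed)         # False
print(sorted(my_set))  # [1, 2, 3]
```

As a rule of thumb, remove() fits cases where a missing element signals a bug, while discard() fits cases where absence is acceptable.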
pop()
Removes and returns an arbitrary element from the set. As sets are unordered, the popped element is not guaranteed to be the first or last element. If the set is empty, it raises a KeyError.
Example:
my_set = {1, 2, 3, 4, 5, 6, 7, 8, 9}
popped_element = my_set.pop()
print(my_set)
print(popped_element)
Output (the removed element is arbitrary, so your result may differ):
{2, 3, 4, 5, 6, 7, 8, 9}
1
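Because pop() removes an arbitrary element, it is often used to drain a set when the processing order does not matter. The sketch below, using a hypothetical set called pending, also shows the KeyError raised on an empty set.

```python
pending = {'a', 'b', 'c'}
processed = []

# An empty set is falsy, so the loop stops when nothing is left
while pending:
    processed.append(pending.pop())

print(sorted(processed))  # ['a', 'b', 'c']
print(len(pending))       # 0

# pop() on an empty set raises KeyError
try:
    pending.pop()
    pop_failed = False
except KeyError:
    pop_failed = True
print(pop_failed)  # True
```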
clear()
Removes all elements from the set, making it empty.
Example:
my_set = {1, 2, 3, 4, 5, 6, 7, 8, 9}
my_set.clear()
print(my_set)
Output:
set()
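Note that clear() empties the existing set object in place, which is different from assigning a new empty set to the variable. A small sketch, assuming a second name (alias) refers to the same set:

```python
# clear() empties the set that both names refer to
original = {1, 2, 3}
alias = original
original.clear()
print(alias)  # set()

# Rebinding the name leaves the old set, and the alias, untouched
original = {1, 2, 3}
alias = original
original = set()
print(sorted(alias))  # [1, 2, 3]
```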
Union (|) in Python sets
The union of two sets contains all the unique elements from both sets.
fruits = {'apple', 'banana', 'orange'}
more_fruits = {'grape', 'kiwi', 'apple'}
all_fruits = fruits | more_fruits
print(all_fruits)
Output:
{'apple', 'banana', 'orange', 'grape', 'kiwi'}
Intersection (&) in Python sets
The intersection of two sets contains only the common elements between them.
fruits = {'apple', 'banana', 'orange'}
more_fruits = {'grape', 'kiwi', 'apple'}
common_fruits = fruits & more_fruits
print(common_fruits)
Output:
{'apple'}
Difference (-) in Python sets
The difference of two sets contains the elements that are present in the first set but not in the second set.
fruits = {'apple', 'banana', 'orange'}
more_fruits = {'grape', 'kiwi', 'apple'}
unique_fruits = fruits - more_fruits
print(unique_fruits)
Output:
{'banana', 'orange'}
Symmetric Difference (^) in Python sets
The symmetric difference between two sets contains elements that are present in either set, but not both.
fruits = {'apple', 'banana', 'orange'}
more_fruits = {'grape', 'kiwi', 'apple'}
symmetric_diff = fruits ^ more_fruits
print(symmetric_diff)
Output:
{'banana', 'orange', 'grape', 'kiwi'}
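The four operators shown above also have method forms (union(), intersection(), difference(), symmetric_difference()) and in-place forms ending in _update. One practical difference is that the methods accept any iterable, while the operators require sets on both sides. The sketch below reuses the fruit data, with more_fruits deliberately written as a list; the name basket is illustrative.

```python
fruits = {'apple', 'banana', 'orange'}
more_fruits = ['grape', 'kiwi', 'apple']  # a list, not a set

# Methods accept any iterable
print(sorted(fruits.union(more_fruits)))
# ['apple', 'banana', 'grape', 'kiwi', 'orange']
print(sorted(fruits.intersection(more_fruits)))  # ['apple']

# Operators require both operands to be sets
try:
    fruits | more_fruits
    operator_ok = True
except TypeError:
    operator_ok = False
print(operator_ok)  # False

# In-place variants modify the set instead of returning a new one
basket = set(fruits)
basket.difference_update(more_fruits)
print(sorted(basket))  # ['banana', 'orange']
basket.symmetric_difference_update({'banana', 'mango'})
print(sorted(basket))  # ['mango', 'orange']
```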
Example of Movie Recommendation System (Bollywood and Hollywood)
To see sets in a more realistic setting, let’s build a small movie recommendation system that includes both Bollywood and Hollywood movies. The system recommends movies to a user based on how much their preferences overlap with those of other users, regardless of the movie’s origin.
Step 1: Creating the Movie Recommendation System
In this step, we build a system for recommending movies to users. We create a class called “MovieRecommendationSystem” that represents this system. Its initializer method, “__init__”, prepares an empty dictionary called “user_movies” to keep track of each user’s movie preferences.
Step 2: Adding User Movie Preferences
In this step, we enable users to tell us about their movie preferences. We have a function called “add_user_movies” that takes the user’s ID and a list of movies they like as input.
We convert the list of movies into a set, which ensures that there are no repeated movies. Then, we store this set of movies in the “user_movies” dictionary with the user’s ID as the reference.
Step 3: Finding Similar Users
To recommend movies, we need to find users who have similar movie preferences. We use a method called “find_similar_users” to calculate the similarity between users. This method applies a mathematical concept called the “Jaccard similarity coefficient” that measures how much two users’ movie preferences overlap.
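For each pair of users, we divide the number of movies they have in common (the intersection) by the total number of distinct movies either of them likes (the union). The result ranges from 0 (no overlap) to 1 (identical preferences). Then, we sort the users based on their similarity scores.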
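As a worked example of the Jaccard coefficient, here is a minimal standalone sketch. The function name jaccard_similarity and the two sample users are assumptions made for illustration; returning 0.0 for two empty sets is also a choice made here to avoid dividing by zero.

```python
def jaccard_similarity(a, b):
    """Return |a & b| / |a | b|, or 0.0 when both sets are empty."""
    union = a | b
    if not union:
        return 0.0
    return len(a & b) / len(union)

# Hypothetical preferences: 2 shared movies out of 4 distinct ones
alice = {'Inception', 'Sholay', 'The Godfather'}
bob = {'Inception', 'Sholay', 'Dil Chahta Hai'}

print(jaccard_similarity(alice, bob))  # 0.5
```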
Step 4: Recommending Movies
In this step, we generate movie recommendations for a specific user. We use the “recommend_movies” method, which takes a user’s ID as input. Inside this method, we find the most similar users using the “find_similar_users” method we discussed earlier.
We create an empty set called “recommended_movies” to store the recommended movies. We iterate through the similar users and add the movies they like (which the current user hasn’t watched) to the “recommended_movies” set.
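The recommendation step can be sketched on its own with a plain dictionary. The sample data below is hypothetical, and similar_users is assumed to be already sorted by similarity; sorted() is used when printing only to give a stable order.

```python
# Hypothetical preferences, keyed by user ID
user_movies = {
    'user1': {'Inception', 'Sholay'},
    'user2': {'Inception', 'Sholay', 'The Godfather'},
    'user3': {'Dil Chahta Hai'},
}

# Assumed to be sorted from most to least similar to user1
similar_users = ['user2', 'user3']
watched = user_movies['user1']

recommended_movies = set()
for other in similar_users:
    # Set difference keeps only movies user1 has not watched
    recommended_movies.update(user_movies[other] - watched)

print(sorted(recommended_movies))  # ['Dil Chahta Hai', 'The Godfather']
```

Because the results are collected in a set, a movie liked by several similar users appears only once.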
Step 5: Using the Movie Recommendation System
Here, we put the movie recommendation system into action. We create an instance of the “MovieRecommendationSystem” class called “recommendation_system”. Then, we add movie preferences for three users using the “add_user_movies” method.
After that, we specify the user for whom we want to get movie recommendations by providing their ID. We call the “recommend_movies” method of the “recommendation_system” instance, passing in the user’s ID. This method returns a set of recommended movies. Finally, we print the recommended movies for the user to see.
# Step 1: Creating the Movie Recommendation System
class MovieRecommendationSystem:
    def __init__(self):
        self.user_movies = {}  # Dictionary mapping each user ID to a set of movies

    # Step 2: Adding User Movie Preferences
    def add_user_movies(self, user_id, movies):
        self.user_movies[user_id] = set(movies)  # set() drops repeated titles

    # Step 3: Finding Similar Users
    def find_similar_users(self, user_id):
        similarity_scores = {}
        user_movies = self.user_movies[user_id]
        # Calculate Jaccard similarity coefficient for each user
        for other_user_id, other_user_movies in self.user_movies.items():
            if other_user_id != user_id:
                intersection = user_movies.intersection(other_user_movies)
                union = user_movies.union(other_user_movies)
                # Avoid division by zero when neither user has any movies
                score = len(intersection) / len(union) if union else 0.0
                similarity_scores[other_user_id] = score
        # Sort users by similarity score in descending order
        similar_users = sorted(similarity_scores, key=similarity_scores.get, reverse=True)
        return similar_users

    # Step 4: Recommending Movies
    def recommend_movies(self, user_id):
        similar_users = self.find_similar_users(user_id)
        recommended_movies = set()
        # Find movies from similar users that the user hasn't watched
        for similar_user in similar_users:
            recommended_movies.update(self.user_movies[similar_user] - self.user_movies[user_id])
        return recommended_movies


# Step 5: Using the Movie Recommendation System
# Create an instance of the MovieRecommendationSystem
recommendation_system = MovieRecommendationSystem()

# Add user movie preferences
recommendation_system.add_user_movies("user1", ['Dilwale Dulhania Le Jayenge', 'Inception', 'Kabhi Khushi Kabhie Gham'])
recommendation_system.add_user_movies("user2", ['Kabhi Alvida Naa Kehna', 'Dil Chahta Hai', 'The Dark Knight'])
recommendation_system.add_user_movies("user3", ['Gadar: Ek Prem Katha', 'Sholay', 'The Godfather'])

# Get movie recommendations for a user
user_id = "user1"
recommended_movies = recommendation_system.recommend_movies(user_id)

# Print the recommended movies (the order may vary, because a set is unordered)
print(f"Recommended movies for {user_id}:")
for movie in recommended_movies:
    print(movie)