Welcome to this Python tutorial. Today, we’ll dive into serializing Python objects using the pickle module. Serialization is a common concept, and almost all programming languages provide some way to do it. We’ll walk through the process of serializing Python objects step by step.
Serialization in Python Using Pickle
Serialization is the process of converting data structures or objects into a format that can be stored or transmitted. Once the data is serialized, you can offload it to:
- A file, or
- A memory cache, or
- A network connection.
Moreover, you can reconstruct the object later in the same or different environment.
Another way to understand it is that serialization converts an object into a stream of bytes, which is also called marshalling the object. The reverse process of rebuilding the object from the stream of bytes is deserialization, or unmarshalling.
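To make the idea concrete, here is a minimal sketch of a round trip through a stream of bytes using pickle.dumps() and pickle.loads(). The `settings` dictionary is a made-up sample, not data from this tutorial.

```python
import pickle

# Hypothetical sample data, used only for illustration.
settings = {"theme": "dark", "font_size": 12, "plugins": ["lint", "format"]}

# Serialize (marshal) the object into a stream of bytes.
payload = pickle.dumps(settings)

# Deserialize (unmarshal) the bytes back into a new Python object.
restored = pickle.loads(payload)

print(type(payload).__name__)  # bytes
print(restored == settings)    # True
```

The bytes in `payload` could just as well be written to a file, stored in a cache, or sent over a socket before being passed to pickle.loads().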
Serialization in Python
In Python, serialization and deserialization are achieved through the built-in “pickle” module.
What is Pickle?
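The pickle module ships with Python’s standard library. In CPython, it is implemented in Python with an optional C accelerator for speed. It can save arbitrarily complex Python data structures. Pickle is extensible, and its protocols are backward compatible across Python versions. However, it is not secure: unpickling erroneous or maliciously constructed data can execute arbitrary code, so only load pickle data from sources you trust.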
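One way to reduce (not eliminate) this risk is to restrict which globals the unpickler may load by overriding Unpickler.find_class(). The sketch below follows that pattern; the names `RestrictedUnpickler`, `restricted_loads`, and the `SAFE_BUILTINS` allow-list are our own illustrative choices, so adapt them to your needs.

```python
import builtins
import io
import pickle

# Assumption: a small allow-list chosen for this example only.
SAFE_BUILTINS = {"range", "complex", "set", "frozenset", "slice"}

class RestrictedUnpickler(pickle.Unpickler):
    def find_class(self, module, name):
        # Allow only a handful of harmless builtins; reject everything else.
        if module == "builtins" and name in SAFE_BUILTINS:
            return getattr(builtins, name)
        raise pickle.UnpicklingError(f"global '{module}.{name}' is forbidden")

def restricted_loads(data):
    """Like pickle.loads(), but refuses globals outside the allow-list."""
    return RestrictedUnpickler(io.BytesIO(data)).load()

# Plain containers need no globals, so they load normally.
print(restricted_loads(pickle.dumps({"a": [1, 2, 3]})))

# A pickled reference to a function is rejected.
try:
    restricted_loads(pickle.dumps(len))
except pickle.UnpicklingError as exc:
    print("rejected:", exc)
```

Even with such a filter in place, it is safer to treat pickle data from untrusted sources as off-limits.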
What data does Pickle store?
The pickle module stores the following data types:
- All the native data types that Python supports: Booleans, integers, floating-point numbers, complex numbers, strings, bytes objects, byte arrays, and None.
- Lists, tuples, dictionaries, and sets containing any combination of native data types.
- Lists, tuples, dictionaries, and sets containing any combination of the containers above (and so on, to the maximum nesting level that Python allows), as long as set elements and dictionary keys remain hashable.
- Functions, classes, and instances of classes (with limitations: functions and classes are pickled by reference, so they must be importable from the top level of a module when the data is loaded).
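The list above can be checked with a short experiment. This is a sketch with sample values we picked ourselves; it uses `datetime.date` and the built-in `len` as stand-ins for an importable class and function.

```python
import datetime
import pickle

# Sample values covering native types, nested containers, and a class instance.
samples = [
    True, 42, 3.14, 2 + 3j, "text", b"raw", bytearray(b"buf"), None,
    [1, (2, 3), {"k": {4, 5}}],
    {"nested": [{"deep": (1, [2, {3}])}]},
    datetime.date(2015, 9, 1),
]

for sample in samples:
    # Each value survives the round trip unchanged.
    assert pickle.loads(pickle.dumps(sample)) == sample

# Functions and classes are pickled by reference (module plus name),
# so loading them gives back the very same object.
assert pickle.loads(pickle.dumps(len)) is len
assert pickle.loads(pickle.dumps(datetime.date)) is datetime.date

# A lambda has no importable name, so pickling it fails.
try:
    pickle.dumps(lambda x: x + 1)
    lambda_pickled = True
except (pickle.PicklingError, AttributeError):
    lambda_pickled = False

print(lambda_pickled)  # False
```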
Pickle has two primary functions. The first is dump(), which writes a serialized object to an open file object. The second is load(), which reads a serialized object back from a file object.
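As a quick illustration of how dump() and load() pair up, the sketch below writes two objects to the same stream and reads them back in order. It uses io.BytesIO as an in-memory stand-in for a file opened in binary mode; the sample objects are arbitrary.

```python
import io
import pickle

buffer = io.BytesIO()  # in-memory stand-in for a binary file

# Each dump() call appends one pickled object to the stream.
pickle.dump({"title": "Techbeamers"}, buffer)
pickle.dump([1, 2, 3], buffer)

# Rewind, then read the objects back in the order they were written.
buffer.seek(0)
first = pickle.load(buffer)
second = pickle.load(buffer)
print(first, second)

# Once the stream is exhausted, load() raises EOFError.
try:
    pickle.load(buffer)
except EOFError:
    print("no more pickled objects")
```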
How to use Pickle in Python?
Step#1 Construct Pickle data
We will use dictionary-type data for pickling that contains the information related to our website:
website = {
    'title': 'Techbeamers',
    'site_link': '/',
    'site_type': 'technology blog',
    'owner': 'Python Serialization tutorial',
    'established_date': 'Sep2015',
}
Step#2 Saving data as a pickle file
Now, we have a dictionary that holds all the information about the website. Let’s save it as a pickle file:
import pickle

with open('website.pickle', 'wb') as f:
    pickle.dump(website, f)
- We’ve used the “wb” file mode to open the file in binary mode for the write operation.
- We’ve wrapped it in a “with” statement so that the file is closed automatically once we are done with it.
- The dump() function in the pickle module takes a serializable Python data structure, in this case the dictionary we created, and performs the following operations:
  - Serializes it into a binary format using the default pickle protocol (pickle.DEFAULT_PROTOCOL), unless you pass a different protocol argument.
  - Saves it to the open file.
- Note that the pickle protocol is Python-specific. There is no guarantee of cross-language compatibility.
- Pickle data is binary, so make sure to open pickle files only in binary mode. In Python 3, passing a file opened in text mode to dump() raises a TypeError.
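The points about the protocol and binary mode can be tried out with the sketch below. It writes to a temporary directory so nothing is left behind; the shortened `website` dictionary and the variable names are ours, chosen for illustration.

```python
import os
import pickle
import tempfile

website = {"title": "Techbeamers", "site_type": "technology blog"}

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "website.pickle")

    # Binary mode, with the protocol chosen explicitly instead of the default.
    with open(path, "wb") as f:
        pickle.dump(website, f, protocol=pickle.HIGHEST_PROTOCOL)
    size_in_bytes = os.path.getsize(path)

    # Text mode fails: pickle writes bytes, but a text file expects str.
    try:
        with open(path, "w") as f:
            pickle.dump(website, f)
        text_mode_error = None
    except TypeError as exc:
        text_mode_error = exc

# The protocol numbers depend on the Python version in use.
print(pickle.DEFAULT_PROTOCOL, pickle.HIGHEST_PROTOCOL)
print(size_in_bytes > 0)               # True
print(type(text_mode_error).__name__)  # TypeError
```

A higher protocol is usually more compact, but older Python versions may not be able to read it, so pick the protocol based on who needs to load the file.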
Step#3 Loading data from the Pickle file
The following piece of code loads the data from the pickle file.
import pickle

with open('website.pickle', 'rb') as f:
    data = pickle.load(f)

print(data)
Output (Python 3.7 and later keep the keys in insertion order): {'title': 'Techbeamers', 'site_link': '/', 'site_type': 'technology blog', 'owner': 'Python Serialization tutorial', 'established_date': 'Sep2015'}
- The code above opens the ‘website.pickle’ file, which we created earlier by pickling the Python dictionary.
- Since pickle data is binary, we’ve opened the pickle file in binary mode (“rb”).
- The pickle.load() function accepts the stream object as a parameter and performs the following operations:
  - Reads the serialized data from the stream.
  - Instantiates a brand-new Python object.
  - Rebuilds the new Python object from the serialized data and returns it.
- The pickle.dump() and pickle.load() cycle produces a new data structure that is equal to the original but is a separate object.
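The last point is easy to verify. In this small sketch, the `tags` key is a hypothetical addition so that we have a nested list to mutate; it uses dumps() and loads() to keep the example free of files.

```python
import pickle

website = {"title": "Techbeamers", "tags": ["python", "pickle"]}

# Round trip through bytes, as dump() and load() would do through a file.
data = pickle.loads(pickle.dumps(website))

print(data == website)  # True
print(data is website)  # False

# Changing the copy leaves the original untouched.
data["tags"].append("serialization")
print(website["tags"])  # ['python', 'pickle']
```

In other words, a pickle round trip behaves much like a deep copy of the original object.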
Finally, we’ve consolidated all the pieces of code mentioned above into the single script below.
import pickle

website = {
    'title': 'Techbeamers',
    'site_link': '/',
    'site_type': 'technology blog',
    'owner': 'Python Serialization tutorial',
    'established_date': 'Sep2015',
}

# Save the dictionary to a pickle file (binary write mode).
with open('website.pickle', 'wb') as f:
    pickle.dump(website, f)

# Load the dictionary back from the pickle file (binary read mode).
with open('website.pickle', 'rb') as f:
    data = pickle.load(f)

print(data)
So, that was all we wanted to convey about serialization in Python. We hope you liked it.
We also have several Python tutorials, quizzes, and interview questions on this blog. If you’d like to try them, please go ahead.
You might also want to check out our tutorial on Python Class Methods.
Final word
We always try to cover concepts that serve two purposes: first, they are useful for learning and second, they help in your interview prep. That’s why we delivered this post on serializing Python objects.
However, if you want us to cover a topic of your choice, please send us your request using the comment box section. We’ll research it and try to publish it as soon as possible.
Enjoy coding,
TechBeamers.