Installing a Python package is easy. You just need to write “pip install module_name”. This command automatically downloads the package and installs it for you. All of this is possible because the module has been packaged properly. Along with Python packaging itself, credit goes to the PyPI repository, which contains 128468 packages (as of the date this article was published). This is not the only way to install Python packages: you may also use the conda package manager, and there are a few more options. Don’t worry, you will find everything in this article. I have tried to put all my learning and my personal programming experience related to Python packaging into it. So if you want to be an expert on this topic, you just need to spend 5 minutes with this informative article.
Actually, you need to understand this topic from two ends. The first, which I have already mentioned, is the best way to install Python packages. The second is how to distribute your Python project to others in an efficient way. Every programmer faces this issue at some point in their programming journey. Suppose you have developed a Python module or package that you need to deliver to your client, or that you want to make open source.
Why Python Packaging is a must?-
I will try to explain with a personal example. When I was a beginner in Python, I started a project that involved many external Python modules. Usually, when I needed a Python module, I simply used pip and installed the package globally. In most cases these modules depend on other packages or modules; that is, they require other Python modules at specific versions in order to install properly. So I had to install the dependencies manually, working through the missing-module errors in the console one by one. In the beginning I was not aware of requirement.txt and its uses, so I wasted most of my time installing every package by hand.
Manual installation is not the only concern; dependency resolution is one of the most important ones. Let’s look at dependency resolution in some detail. Suppose you are working on a project that needs TensorFlow (a Python package) version 1.3. At the same time, you start a different project that needs TensorFlow version 1.5, so you need to upgrade TensorFlow. If you do that to fix your second project, the first project may stop working. How do you resolve this problem?
The answer is very simple: a virtual environment. By now you should understand the issues that arise in the absence of proper packaging. They are always a big challenge, especially for Python beginners. This article is a complete guide to the problems that may arise while packaging your project. You only have to read it patiently; I promise it will not take long to understand the complete article.
What you will learn in this article –
1. Virtual environment creation and its uses.
2. How to install Python packages from various sources.
3. Proper packaging for your Python project.
4. Distributing your Python project to the community.
Additional Learning resources –
If you ask me to suggest the best book on this topic, I will recommend The Hitchhiker’s Guide to Python: Best Practices for Development. This book covers Python basics along with more advanced development practices.
How to Create Python Virtual Environment-
Before creating a Python virtual environment, please make sure Python is already installed. To check whether Python is installed, go to the console or terminal and type –
python --version
If Python is installed, the command prints its version number.
If it is not installed, install Python first. Follow the article Python Installation Tutorial : Step by Step complete Guide. Most of the time, when you download and install the latest version of Python, the package manager pip comes with it by default. To verify that pip is present on your system, use the command –
pip --version
If it is not there, you can install it using the command –
python -m ensurepip --default-pip
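The same two checks (is Python there, is pip there) can also be done from inside Python. Here is a minimal sketch that uses only the standard library. The helper name describe_environment is made up for this illustration.

```python
import importlib.util
import sys


def describe_environment():
    """Return the running Python version and whether pip can be imported."""
    version = "{}.{}.{}".format(*sys.version_info[:3])
    # find_spec returns None when the module cannot be imported.
    pip_available = importlib.util.find_spec("pip") is not None
    return {"python_version": version, "pip_available": pip_available}


if __name__ == "__main__":
    info = describe_environment()
    print("Python version:", info["python_version"])
    print("pip available:", info["pip_available"])
```

If "pip available" comes back False, the ensurepip command above is the one to run.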
Now let’s create a Python virtual environment. In this tutorial, I will mention three different ways to create one. At the end of this section I will give my personal recommendation for the best way to create a virtual environment.
Virtual Environment using Virtualenv :
You can create a virtual environment with virtualenv using the commands below, but first you need to install virtualenv separately.
pip install virtualenv # installing virtualenv
virtualenv <Directory where you want to make virtual environment>
source <Directory where you want to make virtual environment>/bin/activate
Virtual Environment using venv :
Here is the list of commands for creating a virtual environment using venv. If you are using Python 3, venv comes by default.
python3 -m venv <Directory where you want to make virtual environment>
source <Directory where you want to make virtual environment>/bin/activate
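The venv module can also be called from Python code, which is handy in setup scripts. Below is a small sketch under a few assumptions: the function name and the "demo-env" directory are made up for this example, and the environment is created inside a temporary folder so nothing is left behind.

```python
import tempfile
import venv
from pathlib import Path


def create_virtual_environment(target_dir):
    """Create a virtual environment in target_dir and return its config file path."""
    # with_pip=False keeps the sketch fast and offline; pass True to bootstrap pip.
    venv.create(target_dir, with_pip=False)
    return Path(target_dir) / "pyvenv.cfg"


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        cfg = create_virtual_environment(Path(tmp) / "demo-env")
        print("Environment created:", cfg.exists())
```

The pyvenv.cfg file at the root of the folder is what marks it as a virtual environment.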
Virtual Environment using Pipenv :
Pipenv combines pip with virtual environment management. Before using Pipenv, you need to install it and make sure it is on your PATH. Once that is done, the next step is to run Pipenv inside the folder where you want to keep the complete project.
pip install --user pipenv
cd <Directory where you want to make project>
pipenv install module_name
Why Pipenv is the best way to create the virtual environment?
While working on a project, you can never be sure how many dependencies you will need to complete it. You just install them as the need arises. Later on, when you need to deliver the project to somebody else, how will you provide the list of required dependencies? There are two ways: either track them manually, or use something that automates the process. If you are using Pipenv, it keeps track of every dependency you install in the Pipfile.
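To get a feeling for what tracking dependencies by hand involves, here is a rough sketch that lists everything installed in the current environment as pinned "name==version" lines, similar in spirit to pip freeze. It is not how Pipenv works internally; the function name is made up, and it only uses importlib.metadata from the standard library (Python 3.8 or newer).

```python
from importlib import metadata


def pinned_requirements():
    """Return sorted 'name==version' lines for the installed distributions."""
    lines = set()
    for dist in metadata.distributions():
        name = dist.metadata.get("Name")
        if name:  # skip entries whose metadata has no name
            lines.add("{}=={}".format(name, dist.version))
    return sorted(lines, key=str.lower)


if __name__ == "__main__":
    for line in pinned_requirements():
        print(line)
```

Run this inside a virtual environment and the list stays short, because only the packages of that one project show up.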
How to install Python Packages-
While writing code in Python, if you need an external Python distribution, you can use pip. Pip is the installer we use to install Python packages from PyPI (Python Package Index). Installing from PyPI is one of the best ways to get a Python package, but there are many other ways too. For data scientists especially, not every package they need is necessarily available on PyPI. So in this section we will also cover how to install Python packages from different sources.
Using Pip to install from PyPI –
1. Suppose you need to install the pandas project, which is available on PyPI. Just use the command –
pip install pandas
2. Let’s make the above step more specific. For example, if you need to install pandas at a particular version (0.21.1), you can do it this way –
pip install 'pandas==0.21.1'
3. If you are not sure about the exact version of the package, you can specify a version range this way –
pip install 'pandas>=0.13.1,<0.21.1'
4. One more interesting scenario: suppose you want to upgrade a currently installed Python package to its latest version, and you do not know the version number of the latest release. You can do it this way (pandas is used here as an example package) –
pip install --upgrade pandas
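5. In the usual scenario, you will use multiple Python packages. Installing them one by one manually is not a good development practice, so you can list them in “requirement.txt” (conventionally named requirements.txt, although pip accepts any file name) and install them all at once.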
pip install -r requirement.txt
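To make the version range from step 3 concrete, here is a simplified sketch of what a range like ">=0.13.1,<0.21.1" means. It is only an illustration, not pip's real logic: it assumes purely numeric versions with the same number of parts, while pip follows the full version rules. Both function names are made up.

```python
def parse_version(text):
    """Turn a purely numeric version such as '0.21.1' into a tuple of ints."""
    return tuple(int(part) for part in text.split("."))


def in_range(version, minimum, maximum):
    """True when minimum <= version < maximum, mirroring '>=minimum,<maximum'."""
    return parse_version(minimum) <= parse_version(version) < parse_version(maximum)


if __name__ == "__main__":
    for candidate in ["0.13.0", "0.13.1", "0.20.3", "0.21.1"]:
        print(candidate, in_range(candidate, "0.13.1", "0.21.1"))
```

The lower bound is included and the upper bound is excluded, so 0.13.1 is accepted while 0.21.1 is not.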
Apart from these installation methods, you can also install directly from version control systems such as SVN and Git.
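If you want to drive pip from a script, the usual approach is to run it as "python -m pip" with the same interpreter. The sketch below only builds the command lists and prints them; nothing gets installed. The helper name is made up, and the Git URL is a placeholder rather than a real repository. For Git, pip expects the URL to start with a prefix such as git+https://.

```python
import sys


def pip_install_command(requirement, upgrade=False):
    """Build the argument list for installing one requirement with pip."""
    command = [sys.executable, "-m", "pip", "install"]
    if upgrade:
        command.append("--upgrade")
    command.append(requirement)
    return command


if __name__ == "__main__":
    examples = (
        pip_install_command("pandas==0.21.1"),
        pip_install_command("pandas", upgrade=True),
        # Placeholder URL, used only to show the shape of a Git requirement.
        pip_install_command("git+https://example.com/someuser/someproject.git"),
    )
    for args in examples:
        print(" ".join(args))
    # To really run one of them: subprocess.run(args, check=True)
```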
Proper Packaging for your python Project-
Once you complete your project, you may want to distribute it to others (submit it to the community) on PyPI. Before you upload your code, you must follow these steps-
Create Initial Files – Even if you already have all the code that a user or developer needs to run your project locally, you should add these initial files at the root of your project’s distribution archive.
setup.py – This file contains the A-Z configuration for your project. It contains a setup() function that takes a list of different arguments. All you need to do is understand them and put them into the setup.py file. There are two types of arguments in setup.py: required and optional.
Required Argument in setup() function –
name='project_name',
version='1.2.0', # here you have to mention the version of the project
description='A sample Python project', # a short description of the project is mandatory
optional Argument in setup() function –
long_description="A longer description of the project, typically the contents of the README",
author="Here you have to write the owner of the project, for example the name of the company",
author_email="firstname.lastname@example.org",
classifiers=[
    'Development Status :: <how stable your code is, e.g. Alpha, Beta>',
    'License :: <the license you prefer>',
    'Programming Language :: Python :: 2',
],
install_requires=['<name of a project that needs to be installed>'],
Apart from the arguments mentioned above, there are a few more that you may pass to the setup() function. I think you should see an example of a setup.py file.
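Here is a small sketch of how the arguments described above could be put together. Every value is a placeholder chosen for this illustration (the project name, the company, the license and the pandas dependency are all assumptions), and the little check only covers the three arguments this article lists as required.

```python
# Metadata that a setup.py would pass to setuptools.setup().
# Every value here is a placeholder chosen for illustration.
PROJECT_METADATA = {
    "name": "project_name",
    "version": "1.2.0",
    "description": "A sample Python project",
    "author": "Example Company",
    "author_email": "firstname.lastname@example.org",
    "classifiers": [
        "Development Status :: 3 - Alpha",
        "License :: OSI Approved :: MIT License",
        "Programming Language :: Python :: 3",
    ],
    "install_requires": ["pandas>=0.21.1"],
}

# The three arguments this article treats as required.
REQUIRED_KEYS = ("name", "version", "description")


def missing_required(metadata):
    """Return the required keys that are absent or empty in metadata."""
    return [key for key in REQUIRED_KEYS if not metadata.get(key)]


if __name__ == "__main__":
    print("Missing required arguments:", missing_required(PROJECT_METADATA))
    # In a real setup.py the file would end with:
    #   from setuptools import setup
    #   setup(**PROJECT_METADATA)
```

Keeping the metadata in a plain dictionary is only a choice made for this sketch; most setup.py files pass the arguments to setup() directly.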
Other Files –
I think describing each file in complete detail would be lengthy, and it may also bore you. So I have an interesting alternative for you. Please see the image below, which shows a sample Python project with proper Python packaging.
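For readers who prefer code to pictures, the sketch below creates an empty skeleton of one commonly used layout inside a temporary folder. This layout is an assumption on my side and not necessarily the one in the image; all file and folder names are placeholders.

```python
import tempfile
from pathlib import Path

# A commonly used layout; the names are placeholders for this sketch.
LAYOUT = [
    "setup.py",
    "README.md",
    "LICENSE",
    "project_name/__init__.py",
    "tests/test_basic.py",
]


def create_skeleton(root):
    """Create an empty file for each LAYOUT entry under root; return the relative paths."""
    root = Path(root)
    for relative in LAYOUT:
        path = root / relative
        path.parent.mkdir(parents=True, exist_ok=True)
        path.touch()
    return sorted(p.relative_to(root).as_posix() for p in root.rglob("*") if p.is_file())


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        for name in create_skeleton(tmp):
            print(name)
```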
How to Distribute your python Package –
Source Distribution Vs Built Distribution :
You can distribute your Python code either in source format or in built distribution format. Let’s understand the difference. A source distribution of a project is a code archive that also contains the data files (for example, it may contain .py files, C/C++ files and so on). Anything that needs compiling is compiled when you install it. This gives you complete control over submodules and functionality. A built distribution, on the other hand, contains the already-built files (such as .pyc files or compiled extensions). Because it can be pre-compiled, it may be platform specific. As far as installation is concerned, a built distribution can be installed by simply extracting it into the root directory (binary files go into usr/bin, data files go into usr/share, and so on). As it is pre-built, it reduces the work for others, so it is the most popular option.
Under the umbrella of built distributions, there are two well-known Python packaging formats that you should know: Egg and Wheel.
The Egg packaging format was released in 2004, while Wheel is newer; it was introduced in 2012. If you need to understand the difference between them, I refer you to Egg Vs Wheel here.
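In practice you can usually tell these formats apart just by the file extension. The sketch below is a rough guess based on the file name only; the function name and the example file names are made up for this illustration.

```python
def distribution_kind(filename):
    """Guess the distribution format from the extension of a file name."""
    if filename.endswith(".whl"):
        return "wheel (built distribution)"
    if filename.endswith(".egg"):
        return "egg (built distribution)"
    if filename.endswith((".tar.gz", ".zip")):
        return "source distribution"
    return "unknown"


if __name__ == "__main__":
    for name in ["package-1.0.1.tar.gz", "package-1.0.1-py3-none-any.whl"]:
        print(name, "->", distribution_kind(name))
```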
How to create a source distribution for your project/code –
python setup.py sdist
How to create a wheel distribution for your project/code –
First of all, you need to install the wheel package. To install it, use the command below –
pip install wheel
1. Universal Wheels
python setup.py bdist_wheel --universal
2. Pure Python Wheels
python setup.py bdist_wheel
3. Platform Wheels –
This type of wheel may contain C extensions along with Python code, which makes it platform specific.
python setup.py bdist_wheel
Whether you choose a built distribution or a source distribution, all of the above commands usually create the distribution files in the dist directory.
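Once the wheel is built, its file name tells you which of the three types you got. A wheel file name ends with a Python tag, an ABI tag and a platform tag. The sketch below reads those three tags and is meant as a simplified illustration; the function name and the example file names are made up.

```python
def wheel_kind(filename):
    """Classify a wheel by the three tags at the end of its file name."""
    if not filename.endswith(".whl"):
        raise ValueError("not a wheel file name: {}".format(filename))
    stem = filename[: -len(".whl")]
    # The last three dash-separated fields are the python, ABI and platform tags.
    python_tag, abi_tag, platform_tag = stem.split("-")[-3:]
    if abi_tag == "none" and platform_tag == "any":
        if {"py2", "py3"} <= set(python_tag.split(".")):
            return "universal"
        return "pure Python"
    return "platform"


if __name__ == "__main__":
    for name in [
        "package-1.0.1-py2.py3-none-any.whl",
        "package-1.0.1-py3-none-any.whl",
        "package-1.0.1-cp39-cp39-manylinux1_x86_64.whl",
    ]:
        print(name, "->", wheel_kind(name))
```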
How to upload your code to PyPI-
twine upload dist/*
gpg --detach-sign -a dist/package-1.0.1.tar.gz # pre-sign your distribution
twine upload dist/package-1.0.1.tar.gz dist/package-1.0.1.tar.gz.asc # command to upload the package with its signature
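If you script your releases, it can help to list exactly which files will be uploaded instead of relying on dist/*. The sketch below only builds the twine command as a list and does not upload anything. The function name is made up, and the empty files in the demo just stand in for real build output.

```python
import tempfile
from pathlib import Path


def twine_upload_command(dist_dir):
    """Build the 'twine upload' argument list for the archives found in dist_dir."""
    files = sorted(
        str(path)
        for path in Path(dist_dir).iterdir()
        if path.name.endswith((".tar.gz", ".whl"))
    )
    if not files:
        raise FileNotFoundError("no distribution files in {}".format(dist_dir))
    return ["twine", "upload"] + files


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        # Empty placeholder files stand in for real build output.
        (Path(tmp) / "package-1.0.1.tar.gz").touch()
        (Path(tmp) / "package-1.0.1-py3-none-any.whl").touch()
        print(twine_upload_command(tmp))
```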
As programmers, we know the pain and the difficulties that can arise when package dependencies are not handled properly. Proper packaging of a project is not limited to Python; it is a must for every programming language. We have now reached the point where you can develop your Python project and distribute it to others. So friends, how did you find this article? Is it enough to solve your problem? In case you need to know anything else, please leave a comment in the comment box. And if you want to contribute to making this article, “Python Packaging Complete Guide for every Programmer”, better, you are always welcome. You can reach us via email at email@example.com.
Data Science Learner Team