Python Packaging Complete Guide for every Programmer


Installing a Python package is very easy. You just need to write “pip install module_name”. This command automatically downloads the package and installs it for you. All of this is possible because the respective module was packaged properly. Along with Python packaging, the credit goes to the PyPI repository, which contains 128468 packages (as of the date this article was published). pip is not the only way to install Python packages: you may also use the conda package manager, and there are a few more options. Don't worry, you will get everything in this article. I have tried to put all my learning here, along with my personal programming experience related to Python packaging. If you want a solid grip on this topic, you just need to spend 5 minutes with this article.

You need to understand this topic from two ends. The first, which I touched on above, is how to install Python packages in the best way. The second is how to distribute your own Python project to others efficiently. Every programmer faces this issue at some point: suppose you have developed a Python module or package that you need to deliver to a client, or that you want to make open source.


Why Python Packaging is a must –

I will try to explain with a personal example. When I was a beginner in Python, I started a project that involved many external Python modules. Whenever I needed a module, I simply used pip and installed the package globally. In most cases, these modules depend on other packages, often at specific versions, and will not install or run properly without them. So I had to install the dependencies manually, one by one, by following the "module missing" errors in the console. At the beginning I was not aware of requirements.txt and its uses, so I wasted most of my time installing every package by hand.

Manual installation is not the only concern; dependency resolution is one of the most important ones. Let's look at it in some detail. Suppose you are working on a project that needs TensorFlow (a Python package) version 1.3. At the same time, you start a different project that needs TensorFlow version 1.5, so you upgrade TensorFlow. If you do that to fix your second project, the first project will stop working. How do you resolve this problem?

The answer is very simple: a virtual environment. Each project gets its own isolated set of installed packages, so the two TensorFlow versions never collide. By now you should understand the issues that arise in the absence of proper packaging, which are always a big challenge for Python beginners. This article is a complete guide to the problems that may come up while packaging your project. You only have to read it patiently. I promise it will not take long to understand the complete article.
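To see which environment a project is really running in, you can ask the interpreter itself. Below is a minimal sketch, assuming Python 3.3 or newer; the function name in_virtual_environment is my own and not part of any library. Very old virtualenv releases used a different marker, so treat this as an illustration rather than a complete check.

```python
import sys


def in_virtual_environment() -> bool:
    """Return True when the running interpreter belongs to a virtual environment."""
    # Inside a virtual environment, sys.prefix points at the environment
    # directory, while sys.base_prefix still points at the base installation.
    return sys.prefix != sys.base_prefix


if __name__ == "__main__":
    print("Interpreter:", sys.executable)
    print("Inside a virtual environment:", in_virtual_environment())
```

If you run this once with your global Python and once after activating an environment, the second line of output should differ.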

What you will learn in this article –

1. Virtual environment creation and its uses.

2. How to install Python packages from various sources.

3. Proper packaging for your Python project.

4. Distributing your Python project to the community.

Additional learning resources –

If you ask me to suggest the best book on this topic, I will recommend The Hitchhiker’s Guide to Python: Best Practices for Development. This book covers Python basics along with more advanced development practices.

 

Python Packaging Tutorial-

1. How to Create a Python Virtual Environment?

Before creating any Python virtual environment, please make sure you have already installed Python. To check whether Python is installed, go to the console or terminal and type –

python --version

If Python is already installed, the command prints the installed version number.

If it is not there, install Python first. Follow the article Python Installation Tutorial: Step by Step complete Guide. Most of the time, when you download and install the latest version of Python, the package manager pip comes with it by default. To verify that pip is already on your system, use the command –

pip --version

If it is not there, you can install it using the command –

python -m ensurepip --default-pip
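The same two checks can also be done from inside Python, which is handy in a setup script. This is only a small sketch; the helper name pip_is_available is hypothetical, and it only tells you whether the pip module is importable by the interpreter that runs the script.

```python
import importlib.util
import sys


def pip_is_available() -> bool:
    """Return True if the pip module can be imported by this interpreter."""
    return importlib.util.find_spec("pip") is not None


if __name__ == "__main__":
    version = ".".join(str(part) for part in sys.version_info[:3])
    print("Python version:", version)
    print("pip available:", pip_is_available())
```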

Now let's create a Python virtual environment. In this tutorial, I will mention three different ways to create one. At the end of this section, I will give my personal recommendation for the best way to create a virtual environment.

1.1 Virtual Environment using Virtualenv :

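You can create the virtual environment using the commands below, but first you need to install virtualenv separately. The activate command shown here is for Linux and macOS; on Windows, run <Directory where you want to make virtual environment>\Scripts\activate instead.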

pip install virtualenv # installing virtualenv

virtualenv <Directory where you want to make virtual environment> 

source <Directory where you want to make virtual environment>/bin/activate

1.2 Virtual Environment using venv:

Here is the list of commands for creating a virtual environment using venv. If you are using Python 3, venv comes with it by default.

python3 -m venv <Directory where you want to make virtual environment>
source <Directory where you want to make virtual environment>/bin/activate
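The venv module can also be called from Python code instead of the command line. Here is a minimal sketch, assuming Python 3; the function name create_environment and the directory name demo-env are made up for the example, and the environment is created in a temporary folder so nothing is left behind.

```python
import tempfile
import venv
from pathlib import Path


def create_environment(target_dir: Path) -> Path:
    """Create a virtual environment in target_dir and return the path of its pyvenv.cfg."""
    # with_pip=False keeps the sketch fast and offline-friendly;
    # pass with_pip=True to get pip installed into the new environment.
    venv.create(target_dir, with_pip=False)
    return target_dir / "pyvenv.cfg"


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        config_file = create_environment(Path(tmp) / "demo-env")
        print("Environment created:", config_file.exists())
```

The pyvenv.cfg file is what marks a directory as a virtual environment, so checking for it is a simple way to confirm that the creation worked.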

1.3 Virtual Environment using Pipenv :

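Pipenv combines pip and virtual environment management in a single tool. Before using Pipenv, you need to install it and make sure it is on your PATH. Once that is done, go to the folder where you want to keep the complete project and install your packages there with pipenv install. Pipenv creates the virtual environment for that project automatically.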

 

pip install --user pipenv 

cd <Directory where you want to make project>
pipenv install module_name

 

1.4 Why is Pipenv the best way to create a virtual environment?

While working on a project, you can never be sure how many dependencies you will need to complete it. We just install them as the need arises. Later, when you have to deliver the project to somebody else, how will you provide the list of required dependencies? There are two ways: either track them manually, or use a tool that automates the process. If you are using Pipenv, it records every dependency you install in a Pipfile and pins the exact versions in Pipfile.lock.

2. How to install Python Packages?

While writing code in Python, if you need any external Python distribution, you can use pip. Pip is the installer we use to install Python packages from PyPI (Python Package Index). Installing from PyPI is one of the best ways, but there are many other ways to install a Python package. Especially for data scientists, not every package they need is necessarily available on PyPI. So in this section, we will also cover how to install Python packages from different sources.

 

2.1 Using Pip to install from PyPI –

Suppose you need to install the project pandas, which is available on PyPI. Just use the command –

pip install pandas

 

2.2. You can make the above step more specific. For example, if you need pandas at exactly version 0.21.1, you can do it this way –

pip install 'pandas==0.21.1'

 

2.3. If you are not sure about the exact version of the package, you can specify a version range in this way –

pip install 'pandas>=0.13.1,<0.21.1'
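To make the meaning of such a range concrete, here is a simplified sketch of the comparison behind '>=0.13.1,<0.21.1'. It is not how pip itself is implemented: it assumes plain dotted numeric versions with the same number of parts and ignores pre-releases and the other rules pip follows. The function names parse_version and in_range are my own.

```python
def parse_version(text: str) -> tuple:
    """Turn a dotted version such as '0.21.1' into a tuple of ints: (0, 21, 1)."""
    return tuple(int(part) for part in text.split("."))


def in_range(version: str, minimum: str, maximum: str) -> bool:
    """Check minimum <= version < maximum, mirroring '>=minimum,<maximum'."""
    return parse_version(minimum) <= parse_version(version) < parse_version(maximum)


if __name__ == "__main__":
    for candidate in ["0.13.0", "0.13.1", "0.20.3", "0.21.1"]:
        print(candidate, in_range(candidate, "0.13.1", "0.21.1"))
```

Note that the lower bound is included and the upper bound is excluded, so 0.13.1 is accepted while 0.21.1 is rejected.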

 

2.4. One more interesting scenario: suppose you want to upgrade a currently installed Python package to its latest version, but you do not know the version number of the latest release. You can do it this way (pandas is used here as an example package) –

pip install --upgrade pandas

 

2.5. In the usual scenario, you will use multiple Python packages. Installing them one by one is not a good development practice, so you can list them in “requirements.txt” and install them all at once.

pip install -r requirements.txt
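If you wonder where the lines in such a file come from, the sketch below builds pinned name==version lines from the packages installed in the current environment, similar in spirit to what pip freeze prints. It assumes Python 3.8 or newer (for importlib.metadata), and the function name pinned_requirements is hypothetical.

```python
from importlib import metadata


def pinned_requirements() -> list:
    """Return sorted 'name==version' lines for every installed distribution."""
    lines = set()
    for dist in metadata.distributions():
        name = dist.metadata.get("Name")
        version = dist.metadata.get("Version")
        if name and version:  # skip entries with incomplete metadata
            lines.add(f"{name}=={version}")
    return sorted(lines, key=str.lower)


if __name__ == "__main__":
    # Redirect this output to a file to get a requirements file.
    print("\n".join(pinned_requirements()))
```

Run it inside a virtual environment, otherwise the list will include every globally installed package.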

Apart from these installation methods, pip can also install directly from version control systems such as Git, SVN, etc.

3. Proper Packaging for python Project-

Once you complete your project, you may want to distribute it to others (submit it to the community) on PyPI. Before you upload your code, you must follow these steps –

3.1 Create Initial Files –

By now you have all the code that a user or developer needs to run your project locally. In addition, you should add these initial files to your distribution at the root of your project.

3.2 setup.py –

This file contains the A-Z configuration for your project. It calls a setup() function which takes a list of different arguments. All you need to do is understand them and put them into the setup.py file. There are two types of arguments in setup.py: required and optional.

3.2.1 Required arguments in the setup() function –

name='project_name',                              # the name of your project as it will appear on PyPI
version='1.2.0',                                  # here you have to mention the version of the project
description='A sample Python project',            # short description of the project (strongly recommended, though not strictly required)

3.2.2 Optional arguments in the setup() function –

long_description="A detailed description of the project, usually the contents of your README",
url="Link to the home page of the project",
author="The owner of the project, for example the name of the company",
author_email="[email protected]",
classifiers=[
    'Development Status :: 3 - Alpha',          # how stable your code is (Alpha, Beta, etc.)
    'License :: OSI Approved :: MIT License',   # the license you prefer
    'Programming Language :: Python :: 2',
],
install_requires=['name_of_required_project'],  # projects that must be installed along with yours

 

Apart from the arguments mentioned above, there are a few more which you may add to the setup() function. I think you should look at a complete example of a setup.py file.
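To show how these arguments fit together, here is a small sketch that collects them in one dictionary and checks that the three arguments from section 3.2.1 are filled in. Every value is a placeholder, and the names METADATA, REQUIRED_KEYS and missing_arguments are my own, not part of setuptools. In a real setup.py you would pass the same keys to setup().

```python
# Hypothetical metadata for illustration; every value here is a placeholder.
METADATA = {
    "name": "project_name",
    "version": "1.2.0",
    "description": "A sample Python project",
    "author": "Name of the owner or company",
    "classifiers": [
        "Development Status :: 3 - Alpha",
        "License :: OSI Approved :: MIT License",
        "Programming Language :: Python :: 3",
    ],
    "install_requires": ["pandas>=0.21.1"],
}

# The arguments that section 3.2.1 of this article lists.
REQUIRED_KEYS = ("name", "version", "description")


def missing_arguments(metadata: dict) -> list:
    """Return the required keys that are absent or empty in metadata."""
    return [key for key in REQUIRED_KEYS if not metadata.get(key)]


if __name__ == "__main__":
    problems = missing_arguments(METADATA)
    print("Missing arguments:", problems or "none")
    # In a real setup.py you would finish with:
    #   from setuptools import setup
    #   setup(**METADATA)
```

Keeping the metadata in one place like this makes it easy to spot a forgotten field before you build anything.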

3.3 Other Files –

Describing every remaining file in full would be lengthy, and it might also bore you, so I have a more interesting alternative. Please see the sample Python project below, which has proper Python packaging.

Python Packaging for your code –

link – https://github.com/pypa/sampleproject

4. How to Distribute your python Package?

Source Distribution Vs Built Distribution :

You can distribute your Python code either as a source distribution or as a built distribution. Let's understand the difference. A source distribution is an archive of the project's code together with its data files (for example, it may contain .py files, C/C++ files, etc.). It has to go through a build step on the user's machine at install time, and any C/C++ code is compiled there. This gives whoever installs it full control over how the code is built. A built distribution, on the other hand, contains files that are already built, so installing it is mostly a matter of unpacking them into the right locations. When it includes compiled code, it is specific to a platform. Because there is nothing left to build, it reduces the overhead for others, which is why it is the most popular.

Under the umbrella of built distributions, there are two well-known Python packaging formats which you should know.

  1. Egg
  2. Wheel

The Egg packaging format was introduced in 2004, while Wheel is newer: it was introduced in 2012. If you want to understand the difference between them, I refer you to Egg Vs Wheel here.

How to create a source distribution for your project/code –

python setup.py sdist

How to create a wheel distribution for your project/code –

First of all, you need to install the wheel package. To install it, use the command below –

pip install wheel

Wheel distributions come in three types. The details are below –

1. Universal Wheels

This wheel contains only pure Python files, with no compiled extensions. You can create a universal wheel only if your code runs fine on both Python 2 and 3 without changes, and the project does not contain any C extension. Here is the command to build a universal wheel –

python setup.py bdist_wheel --universal

2. Pure Python Wheels

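These wheels also contain only pure Python files, but the code does not run unchanged on both Python 2 and 3. So you need to create a separate wheel for each Python version (2 and 3) by running the build under that version. Here is the command to create a pure Python wheel –

python setup.py bdist_wheel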

 

3. Platform Wheels

This wheel type contains compiled C extensions along with the Python code, so it is specific to a platform.

python setup.py bdist_wheel

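Whether you choose a built distribution or a source distribution, all the above commands create the distribution files in the dist directory by default. Note that recent versions of setuptools discourage running setup.py directly; the build package (python -m build) produces the same source and wheel files.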
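You can tell the three wheel types apart just by looking at the file names in the dist directory, because a wheel file name ends with a Python tag, an ABI tag and a platform tag. The sketch below classifies a file name along the lines of the three types described above. The function name wheel_kind and the example file names are hypothetical, and the rules are a simplification of the full tag system.

```python
def wheel_kind(filename: str) -> str:
    """Classify a wheel file name as 'universal', 'pure Python' or 'platform'."""
    if not filename.endswith(".whl"):
        raise ValueError(f"not a wheel file: {filename}")
    parts = filename[: -len(".whl")].split("-")
    if len(parts) < 5:
        raise ValueError(f"unexpected wheel file name: {filename}")
    # The last three fields are always: python tag, ABI tag, platform tag.
    python_tag, abi_tag, platform_tag = parts[-3:]
    if platform_tag != "any" or abi_tag != "none":
        return "platform"
    python_versions = python_tag.split(".")
    if "py2" in python_versions and "py3" in python_versions:
        return "universal"
    return "pure Python"


if __name__ == "__main__":
    examples = [
        "sampleproject-1.2.0-py2.py3-none-any.whl",
        "sampleproject-1.2.0-py3-none-any.whl",
        "sampleproject-1.2.0-cp39-cp39-manylinux1_x86_64.whl",
    ]
    for name in examples:
        print(name, "->", wheel_kind(name))
```

A platform tag of "any" means the wheel installs everywhere, which is only possible when there is no compiled code inside.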

How to upload your code to PyPI-

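Finally, we have reached the last step. First create an account on PyPI. Then you only need to run a few commands to upload your very first project to PyPI. For this you need the Python utility Twine, which you can install with pip install twine.

twine upload dist/*                                       # upload everything in the dist directory

If you prefer to sign your distribution before uploading, do it this way instead (note that PyPI has since stopped making use of uploaded GPG signatures) –

gpg --detach-sign -a dist/package-1.0.1.tar.gz            # pre-sign your distribution

twine upload dist/package-1.0.1.tar.gz dist/package-1.0.1.tar.gz.asc    # upload the package together with its signature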

Conclusion –

As programmers, we know the pain and the difficulties that arise when package dependencies are not handled properly. Proper packaging of a project is not limited to Python; it is a must for every programming language. We have now reached a point where you can develop your Python project and distribute it to others. So friends, how did you find this article? Is it enough to solve your problem? In case you need to know anything else, please write in the comment box. And if you want to contribute to making this article, “Python Packaging Complete Guide for every Programmer”, better, you are always welcome. You can reach us via email at [email protected]

Thanks!

Data Science Learner Team

 


Meet Abhishek (Chief Editor), a data scientist with major expertise in NLP and Text Analytics. He has worked on various projects involving text data and has been able to achieve great results. He currently manages Datasciencelearner.com, where he and his team share knowledge and help others learn more about data science.
 