Python comes up in almost every conversation about data science and data analysis these days. In this post, we will see how to set up a Python development environment and the packages you need for data analysis.
Why Python?
Python is an open-source, general-purpose programming language that is easy to pick up, whether you are a first-time programmer or already experienced with other languages. A key strength of Python is the Python Package Index (PyPI), which hosts thousands of third-party modules. These open up endless possibilities, from visualization with Matplotlib to building machine learning models with scikit-learn.
There are two ways to set up a complete Python development environment for data analysis on a Windows 10, Mac or Linux desktop:
- Manual
- Automatic (By downloading Anaconda)
In this post, we’ll go through both methods, so that you can set up the Python development environment either manually or automatically.
Manual method for setting up Python Development Environment:
To install everything manually, we have to install Python first and then add the libraries one by one. Let’s see how to do this:
- First, head over to Python’s official website, download the latest version (Python 3.7.x at the time of writing) and install it.
- To check whether Python is installed correctly, run the command below from the command line. If Python is installed, it prints the version number:
python --version
Python 3.7.0
- If you do not see a version number, check the installation and install Python again.
- Next, check whether pip is installed on your system. If you installed Python from source or with an installer from python.org, you should already have pip. To check, run the command below on the command line; it prints the pip version and its location:
pip --version
pip 18.1 from c:\python37\lib\site-packages\pip (python 3.7)
- If you run into any issue, you have to download and install pip from here.
- After installing Python and pip, we have to install the important libraries used for data analysis:
- Pandas
- Matplotlib
- Every Python library is installed the same way: use the command below and change only the name of the package.
pip install pandas
pip install matplotlib
- pip downloads and installs each package automatically.
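The same installs can also be driven from Python itself. The sketch below is one possible approach, not part of the original steps: it assumes pip is available for the interpreter that runs the script, and it calls pip as `python -m pip`, which is an equivalent way of running the `pip` command. The helper names `pip_install_command` and `install_packages` are made up for illustration.

```python
import subprocess
import sys


def pip_install_command(package):
    """Build the pip command for one package (hypothetical helper).

    sys.executable is the path of the running interpreter, so the package
    is installed for the same Python that runs this script.
    """
    return [sys.executable, "-m", "pip", "install", package]


def install_packages(packages):
    """Run pip once per package; raises CalledProcessError if pip fails."""
    for package in packages:
        subprocess.run(pip_install_command(package), check=True)


if __name__ == "__main__":
    # Only print the commands here; call install_packages(...) to run them.
    for name in ["pandas", "matplotlib"]:
        print(" ".join(pip_install_command(name)))
```

Calling `install_packages(["pandas", "matplotlib"])` would run both installs in turn, just like typing the two commands by hand.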
If you need more help, you can check this tutorial to learn more about the process.
Automatic method of setting up Python Development Environment using Anaconda:
First, go to the official Anaconda website and download the Anaconda software. I am using Windows 10 here, so I will describe the process for that operating system, but the installation process is similar on other operating systems.
Steps and important points to consider:
- The best part of setting up the Python development environment with Anaconda is that you do not need to install Python and packages like Pandas, NumPy and Matplotlib separately. The setup contains all the required files, so you are ready to start learning, creating visualizations and even building machine learning models.
- Make sure you download the correct version, as two are available: Python 3.7 and Python 2.7. I recommend Python 3.7 with the Graphical Installer, as it is the up-to-date version with many improvements. To download, go to this link.
- Once the download finishes, run the setup and follow the default options. The installation will take some time, so have patience. 🙂
- After installation, open Anaconda Navigator. There you will find Jupyter Notebook, which gives you a simple UI for writing and running code.
- If you are new to Jupyter Notebook, you can follow the step-by-step process to learn how to work with it.
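Whichever method you used, a quick check from a Python prompt or a Jupyter Notebook cell can confirm that the environment is ready. This is a minimal sketch that uses only the standard library. The helper names `python_is_recent` and `missing_packages` are illustrative, and the minimum version of 3.7 is simply the version this post uses.

```python
import importlib.util
import sys


def python_is_recent(minimum=(3, 7)):
    """Return True if the running interpreter is at least `minimum`."""
    return sys.version_info[:2] >= minimum


def missing_packages(names):
    """Return the top-level package names that the import system cannot find."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    print("Python version:", sys.version.split()[0])
    print("Python 3.7 or newer:", python_is_recent())
    # pip, pandas and matplotlib are the packages installed in this post.
    missing = missing_packages(["pip", "pandas", "matplotlib"])
    if missing:
        print("Not installed yet:", ", ".join(missing))
    else:
        print("All packages found.")
```

If a package is listed as not installed, go back to the matching installation step above and run it again.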
After setting up the Python development environment on your desktop, you are all set to get your hands dirty with data analysis using Python. Here are some posts you can check to get a basic idea about data analysis:
- 10 Basic Python fundamentals for Data Scientist aspirants and Data Analysis
- 35 Pandas codes every data scientist aspirant must know
For more detailed information about the Pandas library, check out the official Pandas documentation; for Matplotlib, check out Matplotlib’s official documentation.
We hope you liked this post. If you need any help, let us know by commenting below.