Stata/Python integration part 3: How to install Python packages

In my last post, I showed you three ways to use Python within Stata. The examples were simple but they allowed us to start using Python. At this point, you could write your own Python programs within Stata. But the real power of Python lies in the thousands of freely available packages. Today, I want to show you how to download and install Python packages.

Using pip to install Python packages

Let’s begin by typing python query to verify that Python is installed on our system and that Stata is set up to use Python.

. python query
--------------------------------------------------------------------------------
    Python Settings
      set python_exec      C:\Users\ChuckStata\AppData\Local\Programs\Python>
> \Python38\python.exe
      set python_userpath  C:\Users\ChuckStata\AppData\Local\Programs\Python>
> \Python38\

    Python system information
      initialized          yes
      version              3.8.3
      architecture         64-bit
      library path         C:\Users\ChuckStata\AppData\Local\Programs\Python
> \Python38\python38.dll

The results indicate that Stata is set up to use Python 3.8, so we are ready to install packages.

NumPy is a popular package that is described as “the fundamental package for scientific computing with Python”. Many other packages rely on NumPy’s mathematical features, so let’s begin by installing it. It is possible that NumPy is already installed on my system, and I can check by typing python which numpy in Stata.

. python which numpy
Python module numpy not found
r(601);

NumPy is not found on my system, so I am going to install it. I am using Windows 10, so I will type shell in Stata to open a Windows Command Prompt.

Figure 1: Windows Command Prompt

The shell command also opens a terminal on Mac and Linux. Note that experienced Stata users often type ! rather than the word shell.

Next, I will use a program named pip to install NumPy. You can type pip -V in the Windows Command Prompt or terminal in Mac or Linux to see the version and location of your pip program.

Figure 2: pip version and location

The path for pip matches the Python directory returned by python query above. You should verify this if you have multiple versions of Python installed on your system, because pip installs packages only for the Python installation it belongs to.
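If you are unsure which pip a bare pip command refers to, one common way to remove the ambiguity is to run pip as a module of a specific interpreter. The following is a minimal sketch, not part of the original walkthrough. The function names and the python_exe parameter are hypothetical, and the script is assumed to be run from the Command Prompt or terminal with the interpreter that python query reports.

```python
import subprocess
import sys


def pip_install_command(package, python_exe=sys.executable):
    """Build a pip command tied to one specific Python interpreter."""
    # "-m pip" runs the pip that belongs to python_exe, so the package
    # is installed into that interpreter's site-packages directory.
    return [python_exe, "-m", "pip", "install", package]


def install(package, python_exe=sys.executable):
    """Run the install; raises CalledProcessError if pip reports a failure."""
    subprocess.run(pip_install_command(package, python_exe), check=True)


# Show the command that would be run, without installing anything.
print(pip_install_command("numpy"))
```

Calling install("numpy") would then perform the installation for the interpreter running the script. To target a different interpreter, pass the full path shown next to set python_exec as python_exe.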

Next, type pip install numpy in the Command Prompt or terminal, and pip will download and install NumPy in the appropriate location on your system.

Figure 3: pip install numpy

The output tells us that NumPy was installed successfully.

We can verify the installation by typing python which numpy again.

. python which numpy
<module 'numpy' from 'C:\\Users\\ChuckStata\\AppData\\Local\\Programs\\
> Python\\Python38\\lib\\site-packages\\numpy\\__init__.py'>
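The same kind of check can be made from Python itself. The sketch below uses the standard library's importlib.util.find_spec to ask whether a module can be located without importing it. The helper name is_installed is hypothetical, and this is only an analogue of python which, not a description of how Stata implements that command.

```python
import importlib.util


def is_installed(module_name):
    """Return True if Python can locate the module, without importing it."""
    # find_spec returns None when no top-level module of that name is found.
    return importlib.util.find_spec(module_name) is not None


# json ships with Python; numpy is found only after it has been installed.
for name in ("json", "numpy"):
    print(name, "found" if is_installed(name) else "not found")
```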

Let’s install three more packages that we will use in the future. Pandas is a popular Python package used for importing, exporting, and manipulating data. We can install it by typing pip install pandas in the Command Prompt.

Figure 4: pip install pandas

You can watch a video that demonstrates how to use pip to install Pandas on the Stata YouTube channel.

Matplotlib is a popular package that “is a comprehensive library for creating static, animated, and interactive visualizations in Python”. We can install it by typing pip install matplotlib in the Command Prompt.

Figure 5: pip install matplotlib

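Scikit-learn is a popular package for machine learning. We can install it by typing pip install scikit-learn in the Command Prompt. Note that the package is installed under the name scikit-learn but is imported under the name sklearn.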

Figure 6: pip install scikit-learn

Let’s use python which to verify that pandas, matplotlib, and scikit-learn are installed.

. python which pandas
<module 'pandas' from 'C:\\Users\\ChuckStata\\AppData\\Local\\Programs\\
> Python\\Python38\\lib\\site-packages\\pandas\\__init__.py'>

. python which matplotlib
<module 'matplotlib' from 'C:\\Users\\ChuckStata\\AppData\\Local\\Programs\\
> Python\\Python38\\lib\\site-packages\\matplotlib\\__init__.py'>

. python which sklearn
<module 'sklearn' from 'C:\\Users\\ChuckStata\\AppData\\Local\\Programs\\
> Python\\Python38\\lib\\site-packages\\sklearn\\__init__.py'>
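The name given to pip is not always the name used when importing: scikit-learn is installed as scikit-learn but imported as sklearn. The sketch below keeps the two names side by side and reports the installed version of each package using importlib.metadata from the standard library (available in Python 3.8 and later). The PACKAGES dictionary and the installed_version helper are hypothetical names chosen for illustration.

```python
from importlib import metadata

# pip (distribution) name -> name used in an import statement
PACKAGES = {
    "numpy": "numpy",
    "pandas": "pandas",
    "matplotlib": "matplotlib",
    "scikit-learn": "sklearn",
}


def installed_version(pip_name):
    """Return the installed version string, or None if pip_name is absent."""
    try:
        return metadata.version(pip_name)
    except metadata.PackageNotFoundError:
        return None


for pip_name, import_name in PACKAGES.items():
    version = installed_version(pip_name)
    status = version if version is not None else "not installed"
    print(f"{pip_name} (import {import_name}): {status}")
```

The printed versions depend on what is installed on your system, so no expected output is shown here.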

Conclusion

We did it! We successfully installed four of the most popular Python packages using pip. You can use an Internet search engine to find thousands of other Python packages and install them with pip. Next time, I will show you how to use packages in Python.
