Using Python within RStudio

Python and OS logos

Summary

Finding and implementing settings in RStudio to use Python.

RStudio and Python

RStudio can handle a large number of languages (> 50) in addition to its basic support of R. (This month the RStudio company has been renamed Posit.)

A few critical steps help avoid painful issues when using Python, and make it clear which version is used and where it is located.

Which Python?

New users who want to use Python can find plenty of “installation” instructions online, but the suggested methods do not always install things properly and can end up causing conflicts.

The question “Which Python?” has a double meaning (a double entendre, from French): it asks about the source of the Python program (which distribution was installed?) as well as which one is used by default when multiple Pythons or versions are installed. This last question is answered by the Terminal command:

$ which python

So the questions can be formulated as:

– Which distribution to install?
– How to choose (and later change?) which one is used within RStudio?

Which distribution?

The current choices for free versions are essentially a choice between 2 options:

1. Python.org
2. Anaconda.com

Both are good choices, but their installation methods and the subsequent management of additional modules/packages/libraries differ fundamentally, which can lead to conflicts.

A very useful article to read (July 2022) is “Python vs. Anaconda — What’s the Difference?” [Archived]
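When it is unclear which of the two distributions a given interpreter comes from, a small check run with that interpreter can help. The sketch below relies on a common heuristic, not an official test: conda-managed installations usually contain a conda-meta directory under their prefix. The helper name looks_like_conda is made up for this illustration.

```python
import os
import sys


def looks_like_conda(prefix=sys.prefix):
    """Heuristic (assumption): conda-managed prefixes contain a conda-meta folder."""
    return os.path.isdir(os.path.join(prefix, "conda-meta"))


if __name__ == "__main__":
    # Report where this interpreter lives and whether it appears conda-based.
    print("interpreter:", sys.executable)
    print("prefix:     ", sys.prefix)
    print("conda-based:", looks_like_conda())
```

Run it with the interpreter to examine, for example by giving that interpreter's full path on the Terminal command line.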

Which is the default?

This is a “tricky question” as the answer may differ depending on the actual software that will use the “default” version of Python.

Within a command Terminal (i.e. using the bash or zsh shell) the answer is given by the which python command, but it is ultimately determined by the PATH variable (see post PATH: the overlooked crucial variable).

The R command Sys.getenv("PATH") can be useful to check the current value of the PATH variable.
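To see how PATH decides the answer, it can help to list every Python executable it contains, in search order. The following is a minimal sketch using only the standard library; the function name find_pythons and the default names it searches for ("python", "python3") are choices made for this example.

```python
import os
import shutil


def find_pythons(names=("python", "python3"), path=None):
    """Return every executable matching `names` on PATH, in search order."""
    if path is None:
        path = os.environ.get("PATH", "")
    found = []
    for directory in path.split(os.pathsep):
        if not directory:
            continue
        for name in names:
            # Limiting shutil.which to a single directory tells us whether
            # that directory holds an executable with this name.
            hit = shutil.which(name, path=directory)
            if hit and hit not in found:
                found.append(hit)
    return found


if __name__ == "__main__":
    for candidate in find_pythons():
        print(candidate)
```

The first entry listed for a given name is the one the which command would report, since the shell searches the PATH directories in the same order.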

Within RStudio the answer is a bit more subtle or complex as detailed below.

R reticulate package to access Python within RStudio

First, the R package reticulate needs to be installed. Details can be gathered from the repository rstudio.github.io/reticulate – [Archived]. The critical information within is below:

Installation: Install the reticulate package from CRAN as follows within the R Console:

install.packages("reticulate")

Python version: By default, reticulate uses the version of Python found on your PATH (i.e. Sys.which("python") or Sys.which("python3") on some macOS). This is simply passing on the which command to the operating system.

The use_python() function enables you to specify an alternate version, for example:

library(reticulate)
use_python("/usr/local/bin/python")

Current configuration: (From article Python Version Configuration – [Archived will download PDF].)

You can use the py_config() function to query for information about the specific version of Python in use as well as a list of other Python versions discovered on the system: (Note: That statement is not completely true as on my Mac this command does not list any of the Anaconda installations or even the Python version that is part of macOS.)

You can also use the py_discover_config() function to see what version of Python will be used without actually loading Python:

reticulate::py_discover_config()

Providing Hints: There are two ways you can provide hints as to which version of Python should be used:

  1. By setting the value of the RETICULATE_PYTHON environment variable to a Python binary. Note that if you set this environment variable, then the specified version of Python will always be used (i.e. this is prescriptive rather than advisory). To set the value of RETICULATE_PYTHON, insert Sys.setenv(RETICULATE_PYTHON = PATH) into your project’s .Rprofile, where PATH is your preferred Python binary.
  2. By calling one of these functions:

     Function           Description
     use_python()       Specify the path of a specific Python binary.
     use_virtualenv()   Specify the directory containing a Python virtualenv.
     use_condaenv()     Specify the name of a Conda environment.
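Because RETICULATE_PYTHON is prescriptive, a wrong path is worth catching early. The sketch below checks that a candidate path really is a runnable Python interpreter by asking it for its version. It is an illustration only: the helper name interpreter_version is invented here, and falling back to the current interpreter when the variable is unset is an assumption made so the example always has something to check.

```python
import os
import subprocess
import sys


def interpreter_version(path):
    """Run `path` as a Python interpreter; return its version or None."""
    if not (os.path.isfile(path) and os.access(path, os.X_OK)):
        return None
    try:
        result = subprocess.run(
            [path, "-c", "import platform; print(platform.python_version())"],
            capture_output=True, text=True, timeout=30, check=True,
        )
    except (OSError, subprocess.SubprocessError):
        # Not executable as Python, crashed, or took too long.
        return None
    return result.stdout.strip()


if __name__ == "__main__":
    # Assumption: fall back to the running interpreter if the variable is unset.
    candidate = os.environ.get("RETICULATE_PYTHON", sys.executable)
    print(candidate, "->", interpreter_version(candidate))
```

A result of None suggests the path should not be passed to use_python() or stored in RETICULATE_PYTHON.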

Cheat Sheet for reticulate is available as PDF – [Archived]

Manual set-up with RStudio GUI

The choice of Python can also be set up using the RStudio GUI, which may prove easier for many users. Mac users would go to the Preferences menu (RStudio > Preferences…) while Linux/Windows users might find that pane under the “Tools” top menu. Then select the “Python” button on the left side. The “Python interpreter” text box will typically be empty at first. To select an interpreter click on the “Select…” button (see below).

RStudio Preferences Panel
Click on Python within the Preferences pane.

 

Then click the “Select…” button to choose among the Python versions already installed on the computer, presented under 3 choices. Here is what I currently see, as I have both the standard Python and Anaconda installed:

List of installed python.org versions

Python version 3.9.6 is what is installed by default and also an integral part of macOS (13.0.1, Ventura) while version 3.10.4 was installed by the Intelligent Hub (see Post Understanding Python installation mess.) The listed options python3 as well as python3.10 are in fact the very same binary labeled internally.

With this method it is also possible to choose an Anaconda installation:

List of Anaconda Python available

The version 3.9.13 is that of an Anaconda version installed manually recently. (Anaconda does not require an Admin password for installation. However, on current Mac systems, it is necessary to match the chip type, e.g. Intel or M1, to avoid conflict with RStudio.)
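One way to check the chip type an interpreter was built for is to ask the interpreter itself. The sketch below uses the standard platform module; the function name describe_interpreter is made up for this example. On a Mac, the machine value is typically arm64 for an Apple silicon build and x86_64 for an Intel build, although this is stated here as an expectation and not something tested for this post.

```python
import platform
import sys


def describe_interpreter():
    """Collect the location, version, and CPU architecture of this Python."""
    return {
        "executable": sys.executable,
        "version": platform.python_version(),
        "machine": platform.machine(),
    }


if __name__ == "__main__":
    for key, value in describe_interpreter().items():
        print(f"{key:>10}: {value}")
```

The machine value can then be compared with what R reports in R.version$arch, keeping in mind that the two may name the same architecture differently.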

The version labeled 3.7.13 is the Python built into the installed PyMOL software and relates to a PyMOL plug-in (see post Structural Bioinformatics with PyMOL plugin PyMod 3).

GUI sets RETICULATE_PYTHON

The method above is a convenient way to set things up, but requires manual intervention. Running the command reticulate::py_config() after this set-up will show the same Python version as selected manually, and at the end of the list will be a line that says:

NOTE: Python version was forced by RETICULATE_PYTHON

Indeed RStudio also opens an internal Terminal to access the system.

Terminal Tab within RStudio

Switching to that Tab next to the R console we can ask the question:

$ printenv RETICULATE_PYTHON
/usr/local/bin/python3

Conclusion

There are multiple ways to access Python from within RStudio. The manual (GUI) method is also reflected in variables accessible from the command line. Command line methods could be beneficial to switch which Python is used within a document.


Image credits:
Top illustration: Various respective logos over orange fruit background by Pixabay Pheladi Shai.