Finding and implementing settings in RStudio to use Python.
RStudio and Python
A few critical steps help avoid painful issues when using Python, and make it possible to know which version is used and where it is located.
New users who want to use Python can find many “installation” instructions online, but these may end up causing conflicts, as the suggested methods might not install things properly.
The question “Which Python?” is a double entendre (French for double meaning): it asks about the source of the Python program (e.g. which one was installed?) as well as which one is used by default (if multiple Pythons and/or versions are installed). This last question would be answered by the Terminal command:
$ which python
So the questions can be formulated as:
– Which distribution to install?
– How to choose (and later change?) which one is used within RStudio?
The free versions currently come down to essentially 2 options: the standard Python distribution and Anaconda.
Both are good choices, but their installation methods and the subsequent management of additional modules/packages/libraries have fundamental differences that can lead to conflicts.
Which is the default?
This is a “tricky question” as the answer may differ depending on the actual software that will use the “default” version of Python.
Within a command Terminal (i.e. using the zsh shell) the answer will be given by the which python command but is ultimately determined by the PATH variable (see post PATH: the overlooked crucial variable). The R command Sys.getenv("PATH") can be useful to check the current value of the PATH variable.
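As a rough illustration of how PATH decides the answer, the sketch below walks the PATH directories in order and lists every python or python3 executable it finds, much like which -a would in a Terminal. The helper name find_on_path is hypothetical and is not part of any package mentioned in this post. The first hit is the one a shell would normally run.

```python
import os


def find_on_path(names, path_value):
    """Return matching executables in PATH order (hypothetical helper)."""
    hits = []
    for directory in path_value.split(os.pathsep):
        if not directory:
            continue  # skip empty PATH entries
        for name in names:
            candidate = os.path.join(directory, name)
            # Keep only regular files that the current user may execute.
            if os.path.isfile(candidate) and os.access(candidate, os.X_OK):
                hits.append(candidate)
    return hits


if __name__ == "__main__":
    # Print every candidate found on the current PATH, first match first.
    for hit in find_on_path(("python", "python3"), os.environ.get("PATH", "")):
        print(hit)
```

If several lines are printed, more than one Python is reachable, and reordering PATH would change which one counts as the “default”.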
Within RStudio the answer is a bit more subtle or complex as detailed below.
R reticulate package to access Python within RStudio
Installation: Install the reticulate package from CRAN by running install.packages("reticulate") within the R Console.
Python version: By default, reticulate uses the version of Python found on your PATH (i.e. Sys.which("python"), or Sys.which("python3") on some macOS). This is simply passing on the which command to the operating system.
The use_python() function enables you to specify an alternate version by passing it the path to the desired Python binary.
You can use the py_config() function to query for information about the specific version of Python in use as well as a list of other Python versions discovered on the system. (Note: That statement is not completely true, as on my Mac this command does not list any of the Anaconda installations or even the Python version that is part of macOS.)
You can also use the py_discover_config() function to see what version of Python will be used without actually loading Python.
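For comparison, Python can also report on itself. The sketch below is meant to be run in whichever Python session reticulate has started (a Python chunk, for instance) and collects the interpreter path, version and installation prefix. The function name describe_interpreter is made up for this example, and the conda-meta folder check is only a common heuristic for spotting a Conda-based installation, not an official test.

```python
import os
import platform
import sys


def describe_interpreter():
    """Summarize the running interpreter (hypothetical helper)."""
    return {
        # Path of the Python binary that is running this code.
        "executable": sys.executable,
        # Version string of the running interpreter.
        "version": platform.python_version(),
        # Root folder of the installation or environment.
        "prefix": sys.prefix,
        # Heuristic only: Conda environments usually contain a conda-meta folder.
        "looks_like_conda": os.path.isdir(os.path.join(sys.prefix, "conda-meta")),
    }


if __name__ == "__main__":
    for key, value in describe_interpreter().items():
        print(f"{key}: {value}")
```

Comparing the reported executable with what py_config() shows is one way to confirm that both sides agree on which Python is in use.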
Providing Hints: There are two ways you can provide hints as to which version of Python should be used:
- By setting the value of the RETICULATE_PYTHON environment variable to a Python binary. Note that if you set this environment variable, then the specified version of Python will always be used (i.e. this is prescriptive rather than advisory). To set the value of RETICULATE_PYTHON, insert Sys.setenv(RETICULATE_PYTHON = PATH) into your project’s .Rprofile, where PATH is your preferred Python binary.
- By calling one of these functions:
|use_python()|Specify the path to a specific Python binary.|
|use_virtualenv()|Specify the directory containing a Python virtualenv.|
|use_condaenv()|Specify the name of a Conda environment.|
Manual set-up with RStudio GUI
The choice of Python can also be set up using the RStudio GUI, which may prove easier for many users. For this, Mac users would go to the Preferences menu (RStudio > Preferences…) while Linux/Windows users might find that pane under the “Tools” top menu. Then select the “Python” button on the left side. The “Python interpreter” text box will typically be empty at first.
To choose an interpreter, click on the “Select…” button, which presents the Python versions already installed on the computer under 3 choices. Here is what I currently see, as I have both the standard Python and Anaconda installed:
Python version 3.9.6 is installed by default as an integral part of macOS (13.0.1, Ventura), while version 3.10.4 was installed by the Intelligent Hub (see post Understanding Python installation mess). The listed options python3 and python3.10 are in fact the very same binary, labeled internally.
With this method it is also possible to choose an Anaconda installation:
Version 3.9.13 belongs to an Anaconda distribution that was recently installed manually. (Anaconda does not require an Admin password for installation. However, on current Mac systems, it is necessary to match the chip type, e.g. Intel or M1, to avoid conflict with RStudio.)
The version labeled 3.7.13 is the internal Python bundled with the installed PyMOL software and relates to a PyMOL plug-in (see post Structural Bioinformatics with PyMOL plugin PyMod 3).
GUI sets RETICULATE_PYTHON
The method above is a convenient way to set things up, but requires manual intervention. Running the command reticulate::py_config() after this set-up will show the same Python version as selected manually, and at the end of the list will be a line that says:
NOTE: Python version was forced by RETICULATE_PYTHON
Indeed, RStudio also opens an internal Terminal to access the system. Switching to that tab, next to the R console, we can ask the question:
$ printenv RETICULATE_PYTHON
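The same check can be sketched from the Python side. The example below assumes that the environment variable set by R or RStudio is visible to the Python session and that sys.executable reflects the interpreter in use. Both are assumptions rather than documented guarantees, and the function name forced_python_status is hypothetical.

```python
import os
import sys


def forced_python_status(env=None, running=None):
    """Report whether RETICULATE_PYTHON matches the running interpreter."""
    env = os.environ if env is None else env
    # sys.executable may be empty in some embedded set-ups, hence the fallback.
    running = (sys.executable or "") if running is None else running
    forced = env.get("RETICULATE_PYTHON")
    if not forced:
        return "RETICULATE_PYTHON is not set"
    # Resolve symbolic links so that aliases of one binary compare equal.
    if os.path.realpath(forced) == os.path.realpath(running):
        return "RETICULATE_PYTHON matches the running interpreter"
    return "RETICULATE_PYTHON points to a different interpreter"


if __name__ == "__main__":
    print(forced_python_status())
```

Resolving symbolic links matters here because, as noted earlier, entries such as python3 and python3.10 can be the very same binary.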
There are multiple ways to access Python from within RStudio. The manual (GUI) method is also reflected in variables accessible from the command line. Command line methods could be beneficial to switch which Python is used within a document.
Top illustration: Various respective logos over orange fruit background by Pixabay Pheladi Shai.