How to Install RDKit in Jupyter Notebook: A Complete Setup Guide
RDKit is one of the most widely used open-source cheminformatics libraries in Python, built for working with molecular structures, chemical data, and computational chemistry workflows. Getting it running inside a Jupyter Notebook takes a few specific steps that differ from a typical pip install, and the right approach depends on your environment setup.
What Is RDKit and Why Does Installation Work Differently?
RDKit is a C++-backed Python library with compiled binary dependencies. Unlike pure Python packages, it couldn't always be installed cleanly with a simple pip install rdkit, though that has changed in recent versions. Historically, conda was the only reliable installation method because it handled the binary dependencies automatically.
Understanding this distinction matters before you start, because choosing the wrong method for your environment can lead to import errors or broken Jupyter kernels.
Method 1: Installing RDKit with Conda (Most Reliable)
If you're using Anaconda or Miniconda, this is the recommended approach for most users.
Step 1: Create a Dedicated Conda Environment
    conda create -n rdkit-env python=3.10
    conda activate rdkit-env

Creating a separate environment prevents dependency conflicts with other projects.
Step 2: Install RDKit via Conda-Forge
    conda install -c conda-forge rdkit

The conda-forge channel maintains the most up-to-date RDKit builds and handles all binary dependencies behind the scenes.
Step 3: Install Jupyter Inside the Same Environment
    conda install -c conda-forge jupyter

This step is important. Installing Jupyter within the same environment ensures the notebook process has direct access to the RDKit package.
Step 4: Register the Environment as a Jupyter Kernel (Optional but Useful)
If you run Jupyter from a base environment but want to select your RDKit environment as a kernel, run the following with rdkit-env activated:

    conda install ipykernel
    python -m ipykernel install --user --name rdkit-env --display-name "Python (rdkit-env)"

Then launch Jupyter and select "Python (rdkit-env)" from the kernel menu.
Method 2: Installing RDKit with Pip
Since around 2022, RDKit has been available directly on PyPI, which means pip installation now works for many standard setups:
    pip install rdkit

If you're using a virtual environment (venv) rather than conda, this is often the cleaner path. After installing RDKit, install and launch Jupyter from the same environment:

    pip install jupyter
    jupyter notebook

Then in a notebook cell, test the installation:

    from rdkit import Chem
    mol = Chem.MolFromSmiles('CCO')
    print(mol)

If this runs without errors, RDKit is installed and accessible in your notebook.
When Pip Works Well vs. When It Doesn't
| Scenario | Pip Install | Conda Install |
|---|---|---|
| Standard Python 3.9 to 3.12 on modern OS | ✅ Generally works | ✅ Works |
| Older Python versions (3.7, 3.8) | ⚠️ Limited wheel availability | ✅ Better support |
| Complex scientific stack (NumPy, SciPy, etc.) | ⚠️ Potential conflicts | ✅ Handles dependencies |
| Minimal environment / Docker containers | ✅ Lightweight option | ✅ Works with Miniconda |
| Windows systems | ✅ Improved recently | ✅ Most stable |
Common Issues and How to Diagnose Them
RDKit Installs But Won't Import in Jupyter
This almost always comes down to a kernel mismatch: Jupyter is running a different Python environment than the one where RDKit was installed. To check which Python your notebook is actually using:

    import sys
    print(sys.executable)

Compare that path to where you installed RDKit. If they don't match, either install RDKit into the correct environment or register the correct environment as a kernel using the ipykernel method above.
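As a rough sketch of that comparison, the cell below collects the interpreter path, the environment prefix, and whether an rdkit package is visible to the running kernel. The helper name `describe_kernel` is made up for illustration, and it relies only on the standard library.

```python
import importlib.util
import sys


def describe_kernel(package: str = "rdkit") -> dict:
    """Summarize which Python the kernel runs and whether `package` is importable."""
    # find_spec returns None when the package is not on this kernel's sys.path
    spec = importlib.util.find_spec(package)
    return {
        "executable": sys.executable,  # interpreter behind this kernel
        "prefix": sys.prefix,          # root directory of the active environment
        "package_found": spec is not None,
        "package_location": getattr(spec, "origin", None),
    }


for key, value in describe_kernel().items():
    print(f"{key}: {value}")
```

If `package_found` is False while the same check succeeds in a terminal where the RDKit environment is active, the notebook is most likely attached to a different kernel.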
ModuleNotFoundError: No module named 'rdkit'
Run this in a notebook cell to confirm the issue:
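    import subprocess
    import sys
    result = subprocess.run([sys.executable, "-m", "pip", "show", "rdkit"], capture_output=True, text=True)
    print(result.stdout or result.stderr)

If the output is empty or pip reports that the package was not found, RDKit isn't installed in the active kernel's environment. Invoking pip through sys.executable makes the check run against the kernel's own interpreter rather than whichever pip happens to be first on PATH, and capturing the output ensures it is shown in the notebook cell.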
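An alternative that avoids launching a subprocess is to ask the running interpreter directly through the standard library's `importlib.metadata`. The sketch below assumes Python 3.8 or newer; the function name `installed_version` is hypothetical.

```python
from importlib import metadata
from typing import Optional


def installed_version(distribution: str = "rdkit") -> Optional[str]:
    """Return the installed version of `distribution`, or None if it is absent."""
    try:
        return metadata.version(distribution)
    except metadata.PackageNotFoundError:
        # The distribution is not installed in this kernel's environment
        return None


version = installed_version("rdkit")
print(f"rdkit {version}" if version else "rdkit is not installed for this kernel")
```

Because this runs inside the kernel, the answer always reflects the environment the notebook is actually using.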
Conda Environment Not Appearing as Kernel Option
Make sure nb_conda_kernels is installed in your base environment:
    conda install -n base nb_conda_kernels

This allows Jupyter to automatically detect and list conda environments, provided each one has a kernel package such as ipykernel installed.
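To see which interpreter each registered kernel launches, the output of `jupyter kernelspec list --json` can be inspected. The sketch below parses that JSON, assuming the usual layout of a top-level "kernelspecs" object whose entries carry a "spec" with an "argv" list. The helper name `kernel_interpreters` and the sample payload are illustrative only, not taken from a real machine.

```python
import json


def kernel_interpreters(kernelspec_json: str) -> dict:
    """Map each kernel name to the first element of its launch command."""
    specs = json.loads(kernelspec_json).get("kernelspecs", {})
    return {
        # argv[0] is normally the Python executable the kernel starts with
        name: (entry.get("spec", {}).get("argv") or [None])[0]
        for name, entry in specs.items()
    }


# Illustrative sample only; real names and paths will differ on your machine.
sample = json.dumps({
    "kernelspecs": {
        "python3": {"spec": {"argv": [
            "/opt/conda/bin/python", "-m", "ipykernel_launcher", "-f", "{connection_file}"]}},
        "rdkit-env": {"spec": {"argv": [
            "/opt/conda/envs/rdkit-env/bin/python", "-m", "ipykernel_launcher", "-f", "{connection_file}"]}},
    }
})

for name, interpreter in kernel_interpreters(sample).items():
    print(f"{name}: {interpreter}")
```

In a real session, the string passed in would come from running `jupyter kernelspec list --json` in a terminal or capturing it with `subprocess.run`. A kernel whose interpreter path does not point into the RDKit environment will not see the package.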
Variables That Affect Which Method Works for You
Not every setup behaves identically. Several factors shape which installation path will go smoothly:
- Operating system: Windows, macOS (Intel vs. Apple Silicon), and Linux each have slightly different binary compatibility considerations. Apple Silicon (M1/M2/M3) Macs sometimes require Rosetta or ARM-specific conda builds.
- Python version: Newer RDKit pip wheels are built for Python 3.9 and above. Older environments may need conda.
- Existing environment complexity: A scientific Python stack with pinned NumPy or SciPy versions can create conflicts that conda's solver handles better than pip.
- Jupyter setup type: JupyterLab, classic Jupyter Notebook, and VS Code's Jupyter extension each handle kernels slightly differently, which affects how environments are detected.
- Institutional or managed environments: Some research computing setups restrict conda channels or pip access, which changes your options entirely.
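Several of the factors above can be read straight from the running interpreter. The following is a minimal sketch using only the standard library; the function name `environment_summary` is hypothetical, and the conda and venv checks are heuristics (a `conda-meta` directory under the prefix, and a prefix that differs from the base prefix) rather than guarantees.

```python
import platform
import sys
from pathlib import Path


def environment_summary() -> dict:
    """Collect the OS, CPU architecture, Python version, and environment type."""
    return {
        "os": platform.system(),                      # e.g. "Linux", "Darwin", "Windows"
        "architecture": platform.machine(),           # e.g. "x86_64" or "arm64"
        "python_version": platform.python_version(),
        # Heuristic: conda environments keep package records in <prefix>/conda-meta
        "in_conda_env": (Path(sys.prefix) / "conda-meta").is_dir(),
        # Heuristic: inside a venv, sys.prefix differs from sys.base_prefix
        "in_venv": sys.prefix != sys.base_prefix,
    }


for key, value in environment_summary().items():
    print(f"{key}: {value}")
```

Running this in a notebook cell before installing gives a quick picture of which row of the comparison table applies to your setup.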
A Note on JupyterLab vs. Classic Notebook
Both JupyterLab and the classic Jupyter Notebook work with RDKit. Interactive molecular visualization usually comes from add-on packages such as mols2grid, and how well a given package renders can differ between interfaces. If visualization is part of your workflow, your choice of Jupyter interface affects which display tools are available to you.
The core installation steps are the same regardless of which interface you use. The difference shows up when you start adding display and rendering packages on top of RDKit itself.
Verifying a Successful Install
Once installed, run this quick check in a notebook cell:
    from rdkit import Chem
    from rdkit.Chem import Draw
    from rdkit import __version__

    print(f"RDKit version: {__version__}")
    mol = Chem.MolFromSmiles('c1ccccc1')
    print("Benzene molecule object:", mol)

A valid molecule object returned without errors confirms RDKit is fully operational in your notebook environment.
How straightforward this process is depends heavily on what your existing Python environment looks like. A clean setup with a fresh conda environment tends to be the most predictable path, while fitting RDKit into an existing environment introduces more variables worth examining before you start.
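The check above raises an exception if anything is missing. For a shared notebook, a variant that reports the outcome instead of failing can be handy. The sketch below is one way to do that; `verify_rdkit` is a hypothetical helper, and it relies on `Chem.MolFromSmiles` returning None when a SMILES string cannot be parsed.

```python
def verify_rdkit(smiles: str = "c1ccccc1") -> dict:
    """Try to import RDKit and parse one SMILES string, reporting the outcome."""
    try:
        import rdkit
        from rdkit import Chem
    except ImportError as exc:
        # RDKit is not importable from this kernel's environment
        return {"ok": False, "detail": f"import failed: {exc}"}

    mol = Chem.MolFromSmiles(smiles)  # None when the SMILES cannot be parsed
    if mol is None:
        return {"ok": False, "detail": f"could not parse SMILES {smiles!r}"}
    return {
        "ok": True,
        "detail": f"RDKit {rdkit.__version__}, {mol.GetNumAtoms()} atoms parsed",
    }


print(verify_rdkit())
```

An `ok` value of False with an "import failed" detail points back to the kernel mismatch checks described earlier.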