Hertzsprung-Russell diagram with Python

When it comes to stars classification, the Hertzsprung-Russell diagram is probably the most popular way of doing so. These diagrams relate the temperature of the star and its luminosity. After plotting the values for different stars, several groups arise in the figure, which enable to distinguish between different types of stars. Check out this post for more on how to plot H-R figures with Python!

First things first, we need a star catalogue. There are plenty of these data sets, although the most popular ones are the Yale Bright Star Catalogue, the Hipparcos Catalogue and the Gliese catalogue. Therefore, which one should we use? Well, it turns out that there exists a data set which collects all previous ones into a single *.CSV file: the HYG database, who’s license can be found in the official repository.

Preparing the environment

Let us start by downloading the stars catalogue making use of the curl command:

curl https://www.astronexus.com/downloads/catalogs/hygdata_v3.csv.gz --output HYG_catalogue.csv.gz

Now, as with every Python project, the first thing is to create a virtual environment and activate it. The version we will be using for this tutorial is Python 3.8:

python -m venv .venv && source .venv/bin/activate

Because the only dependencies we require are pandas and matplotlib, let us build a requirements.txt file and install these popular libraries:

echo "matplotlib\npandas" >> requirements.txt
pip install -r requirements.txt

Now, we are ready to start working with the HYG catalogue!

Reading the data

Let us start by importing the two libraries we are going to work with. In addition, we can also read the data set and print the different fields on it:

# Import the main libraries of the project
import matplotlib.pyplot as plt
import pandas as pd

# Read the HYG catalogue and print its fields
stars_catalogue = pd.read_csv("HYG_catalogue.csv.gz")
print(f"Catalogue fields: {[field for field in stars_catalogue.columns]}")

which outputs:

>>> Catalogue fiels: ['id', 'hip', 'hd', 'hr', 'gl', 'bf', 'proper', 'ra',
    'dec', 'dist', 'pmra', 'pmdec', 'rv', 'mag', 'absmag', 'spect', 'ci', 'x',
    'y', 'z', 'vx', 'vy', 'vz', 'rarad', 'decrad', 'pmrarad', 'pmdecrad',
    'bayer', 'flam', 'con', 'comp', 'comp_primary', 'base', 'lum', 'var',
    'var_min', 'var_max']

From these fields, we are interested in the ci one, which provides the Color Index. Also, let us get the lum one, which returns the luminosity of each star. It is also possible to use the absmag instead of the luminosity for building the diagram.

However, we see that the temperature of the star is not available. Therefore, how can we compute it? Well, it turns out that the color index (also known as B-V) is related to the temperature via:

$$ T = 4600 \text{K} \left( \frac{1}{0.92 \cdot BV + 1.7}+ \frac{1}{0.92 \cdot BV + 0.62} \right) $$

where the output temperature is in Kelvin. Let us implement previous expression under a function, so we can apply it directly to the ci field.

def T_from_BV(BV):
    """Solves the temperature in Kelvin at given color index."""
    T = 4600 * ((1 / (0.92 * BV + 1.70) + 1 / (0.92 * BV + 0.62)))
    return T

With previous function, we can obtain all the information we are interested in:

# Collect the luminosity, color index and temperature of each star
luminosity = stars_catalogue["lum"]
color_index = stars_catalogue["ci"]
temperature = color_index.apply(T_from_BV)

Building a custom color map

In order to obtain a beautiful H-R diagram, let us build a custom color map. This will help us relating the temperature of each star with a particular color. Although there are amazing color palletes available, see this one by Mitchell Charity, we will be using the one from the color index table, which I extracted using an HTML inspector.

Because we need to create a custom color map, let us make use of the LinearSegmentedColormap class:

# Import the required utilities to build the color map
from matplotlib.colors import LinearSegmentedColormap

# A color list linked to star classes, from higher temperature to lower one
color_list = [
    "#7070ff", "#50a0ff", "#c0cfff", "#cfffff", "#efffdf", "#ffff7f", "#ff7f7f"

# Build a custom colormap linked to the temperature
cmap_MKGFABO = LinearSegmentedColormap.from_list(
    "cmap_MKGFABO", color_list[::-1], N=100

where the name cmap_MKGFABO indicates that it is a color map going from the M class color (colder stars) up to O star class (hotter stars). This is the reason why we had to apply a reversed color scheme by doing color_list[::-1], as the color_list was defined from higher temperature values to lower ones.

Plotting the data

With all data collected, we are finally able to generate the diagram:

# Plotting the figure
with plt.rc_context({"axes.facecolor": "black"}):

    # Build the figure
    fig, ax = plt.subplots(figsize=(12, 16))

    # Plot each star as a point and link its color to the temperature
    ax.scatter(color_index, luminosity, s=0.1, c=temperature, cmap=cmap_MKGFABO)

    # Set title and axis labels
    ax.set_title("Hertzsprung-Russell diagram")
    ax.set_xlabel("Color index B-V")

    # Configure the scales, limits and ticks
    ax.set_xlim((-0.5, 2.5))
    # Define the style of annotations and draw them
    anotations_style = dict(color="white", arrowprops=dict(arrowstyle="simple", facecolor="white"))
    annotations = {
        "Giants": [(1.5, 1e3), (2, 1e4)],
        "White Dwarfs": [(0.25, 1e-4), (-0.35, 1e-5)],
        "Main sequence": [(1, 1), (1.45, 1)],
        "Super Giants": [(1.2, 1e7), (1.45, 1e8)],
    for label in annotations:
        ax.annotate(label, *annotations[label], **anotations_style)

    # Save the figure and display it
    plt.savefig("hr_diagram.png", bbox_inches="tight")

which returns the following beautiful image:

As an exercise, I propose you to plot another scale in the horizontal axes for the temperature, so you can better visualize the relation between the color, the color index and the stars' temperature. The same exercise could be applied to the absolute magnitude, in this case along the vertical axes.

Useful resources and references

For further information about star classes, here are some links I checked while preparing this notebook: