Python: Creating and Importing a custom extension

This entry is more of a big “note to self” than anything else. Until recently, I’ve spent years relying on PHP, Perl, Awk and Bash for scripting, so I never felt an urgent need to pick up Python.

On the other hand, Python is everywhere these days, and that alone makes it worth learning and digging into. One thing I quickly noticed is that many Python scripts begin with a wall of import statements. Naturally, this got me curious: what is going on in there, and what is it like to create something I can import myself?

According to the official documentation, we can use either C or C++ to build extensions for Python. Since I am not a C++ programmer, I will stick to C, a language I’m reasonably comfortable with as a lifelong Linux user.

Building the external code

For the sake of my own sanity, I’m going to stick to a modest goal: a simple extension that calculates the cross product of two three-dimensional vectors. The main benefit of this example is that even with pen and paper, we can easily verify that the extension returns the expected answer for a set of inputs. This way it’ll be easier to stay focused on getting familiar with the Python API.

Recap on vector cross products

There are many different methods for calculating vector cross products, but in my opinion, the simplest one is to use the Levi-Civita permutation symbol. With this trick, all we really need to remember is how to count to 3, and we can calculate vector cross products without too much thought. In three dimensions, the Levi-Civita symbol (ε_ijk) has the following properties:


 ε_ijk = 1 if (i,j,k) is an even permutation of (1,2,3) 
 ε_ijk = -1 if (i,j,k) is an odd permutation of (1,2,3) 
 ε_ijk = 0 otherwise
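
As a quick sanity check of these three rules, here is a minimal Python sketch. The function name levi_civita and the inversion-counting approach are my own choices for illustration; a permutation is even when the number of out-of-order pairs is even.

```python
def levi_civita(i, j, k):
    """Levi-Civita symbol for 1-based indices i, j, k in {1, 2, 3}."""
    if len({i, j, k}) < 3:
        return 0  # a repeated index always gives 0
    # Count inversions: pairs that appear out of ascending order.
    inversions = (i > j) + (i > k) + (j > k)
    return 1 if inversions % 2 == 0 else -1


# Collect the non-zero entries out of the 27 possible index combinations.
nonzero = {
    (i, j, k): levi_civita(i, j, k)
    for i in (1, 2, 3)
    for j in (1, 2, 3)
    for k in (1, 2, 3)
    if levi_civita(i, j, k) != 0
}
print(nonzero)
```

Only six of the 27 combinations are non-zero: three cyclic ones with value 1 and three anti-cyclic ones with value -1.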

Remembering that the cross product of two vectors is itself a vector perpendicular to both of them, we can write it as a sum over the repeated indices i, j and k:


 a × b = ε_ijk a_j b_k ê_i

Calculating through an example:


 (-1, 2, 3) × (-2, 0, 1) = ? 

 ε_ijk a_j b_k ê_i = ε₁₂₃(2)(1)ê₁ + ε₂₃₁(3)(−2)ê₂ + ε₃₁₂(−1)(0)ê₃ ... 
  + ε₁₃₂(3)(0)ê₁ + ε₂₁₃(−1)(1)ê₂ + ε₃₂₁(2)(−2)ê₃ 

 ε_ijk a_j b_k ê_i = (1)(2)ê₁ + (1)(−6)ê₂ + (1)(0)ê₃ ... 
  + (−1)(0)ê₁ + (−1)(−1)ê₂ + (−1)(−4)ê₃ 

 ε_ijk a_j b_k ê_i = 2ê₁ + (−5)ê₂ + 4ê₃

This gives us a clear expected result out of the function we wish to implement here.

For me this approach has been the easiest to remember: just count to 3, apply the rule and you’re done! No need to memorize different component formulas. This also allows for a simple implementation programmatically.
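
Before moving to C, the same sum can be sketched in a few lines of Python to double-check the hand calculation. This is only an illustration: the name cross_product and the closed-form expression for the symbol, (i − j)(j − k)(k − i)/2, are my own picks here, and the indices are 0-based as they will be in C.

```python
def cross_product(a, b):
    """Cross product of two 3-vectors via the Levi-Civita sum."""
    result = [0.0, 0.0, 0.0]
    for i in range(3):
        for j in range(3):
            for k in range(3):
                # Closed form of the symbol. It depends only on index
                # differences, so it works for 0-based indices as well.
                eps = (i - j) * (j - k) * (k - i) // 2
                result[i] += eps * a[j] * b[k]
    return tuple(result)


print(cross_product((-1, 2, 3), (-2, 0, 1)))  # (2.0, -5.0, 4.0)
```

The output matches the pen-and-paper result above.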

Creating the extension core files

To write the C extension, we need a source file with the actual functions, vector_product_py.c:

/* Levi-Civita symbol for 0-based indices i, j, k in {0, 1, 2}. */
int levi_civita(int i, int j, int k) {
    if (i == j || j == k || k == i) {
        return 0;   /* repeated index */
    } else if ((i == 0 && j == 1 && k == 2) || (i == 1 && j == 2 && k == 0) || (i == 2 && j == 0 && k == 1)) {
        return 1;   /* even (cyclic) permutation */
    } else {
        return -1;  /* odd permutation */
    }
}

/* result[i] = sum over j, k of levi_civita(i, j, k) * a[j] * b[k];
   only the two non-zero terms per component are written out. */
void cross_product(double a[3], double b[3], double result[3]) {
    result[0] = levi_civita(0, 1, 2) * a[1] * b[2] + levi_civita(0, 2, 1) * a[2] * b[1];
    result[1] = levi_civita(1, 2, 0) * a[2] * b[0] + levi_civita(1, 0, 2) * a[0] * b[2];
    result[2] = levi_civita(2, 0, 1) * a[0] * b[1] + levi_civita(2, 1, 0) * a[1] * b[0];
}

This should come with a header file, vector_product_py.h:

int levi_civita(int i, int j, int k);
void cross_product(double a[3], double b[3], double result[3]);

The header file declares the functions our extension exposes, while the .c file defines what the functions actually do. This makes it easy to include the functionality from other C files, such as the Python glue code that we’re going to write next.

The “glue” code

Now that we have a raw C extension, we need some kind of “glue” that will allow us to connect it to the Python interpreter. For this, let’s create a new file vector_product_module.c.

We’ll start with the obvious, and include the Python header files, as well as the header file of the extension we just created:

#define PY_SSIZE_T_CLEAN
#include <Python.h>

#include "vector_product_py.h"

Now, for each function in the extension that we want to be able to call from Python, we need a corresponding wrapper that can take a Python object, convert it to a C object, call our C function and convert the result back to a Python object. In our case, we want to expose only the cross_product function.

So let’s implement a wrapper for cross_product so that it can be imported in python:

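static PyObject* py_vector_cross(PyObject* self, PyObject* args) {
    PyObject *a_obj, *b_obj;

    if (!PyArg_ParseTuple(args, "OO", &a_obj, &b_obj))
        return NULL;

    if (PySequence_Size(a_obj) != 3 || PySequence_Size(b_obj) != 3) {
        PyErr_SetString(PyExc_ValueError, "Both input vectors must have length 3");
        return NULL;
    }

    double a[3], b[3], r[3];

    for (int i = 0; i < 3; i++) {
        /* PySequence_GetItem returns a new reference, so release it after use */
        PyObject *a_item = PySequence_GetItem(a_obj, i);
        if (a_item == NULL)
            return NULL;
        a[i] = PyFloat_AsDouble(a_item);
        Py_DECREF(a_item);
        if (a[i] == -1.0 && PyErr_Occurred())  /* item was not a number */
            return NULL;

        PyObject *b_item = PySequence_GetItem(b_obj, i);
        if (b_item == NULL)
            return NULL;
        b[i] = PyFloat_AsDouble(b_item);
        Py_DECREF(b_item);
        if (b[i] == -1.0 && PyErr_Occurred())
            return NULL;
    }

    cross_product(a, b, r);
    return Py_BuildValue("(ddd)", r[0], r[1], r[2]);
}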

Ok, so there’s quite a lot to unpack here, so let’s break it down:

PyArg_ParseTuple() is a variadic function that unpacks the Python objects contained in args according to a format string. We expect two tuples on the Python side, so our format string is “OO”: each O hands us a PyObject pointer without any type conversion (i.e. at this point, on the C side, we have not yet extracted individual values from the inputs). We still need to turn each of them into a C array.

Next we use PySequence_Size() to check the number of elements in each input. This is more or less analogous to doing a len() check in Python. If the length of either input is not exactly 3, we bail and return an error, since the cross product as implemented here needs exactly three components per vector.

Now that we’re (somewhat) sure that we have reasonable input, we can begin converting the Python objects to C arrays. PySequence_GetItem(obj, i) takes a Python object obj and returns its i-th element as a new reference (or NULL on failure), so each item should be released with Py_DECREF() once we are done with it. PyFloat_AsDouble() then converts the item to a native C double; if the item is not a number, it returns -1.0 with an exception set.

With that, we’ve successfully taken our python inputs, and translated them to something that can work with C.

Once the calculation is done, we want to return a tuple, which is why we return with Py_BuildValue(). The format string "(ddd)" means that we wish to build a Python tuple holding three C doubles, which arrive on the Python side as floats.
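
To make those steps concrete, here is roughly what the wrapper does, written as plain Python. This is only a sketch to mirror the C logic: vector_cross_sketch is a made-up name, and the component formula stands in for the call to the C cross_product() function.

```python
def vector_cross_sketch(a_obj, b_obj):
    """Pure-Python mirror of the wrapper's steps (hypothetical helper)."""
    # PySequence_Size() check, analogous to len()
    if len(a_obj) != 3 or len(b_obj) != 3:
        raise ValueError("Both input vectors must have length 3")

    # PySequence_GetItem() + PyFloat_AsDouble(), element by element
    a = [float(a_obj[i]) for i in range(3)]
    b = [float(b_obj[i]) for i in range(3)]

    # Stand-in for the C cross_product() call
    r = (
        a[1] * b[2] - a[2] * b[1],
        a[2] * b[0] - a[0] * b[2],
        a[0] * b[1] - a[1] * b[0],
    )

    # Py_BuildValue("(ddd)", ...) builds a tuple of three floats
    return r


print(vector_cross_sketch((-1, 2, 3), (-2, 0, 1)))  # (2.0, -5.0, 4.0)
```

Passing a vector of the wrong length raises ValueError, just as the C wrapper sets PyExc_ValueError and returns NULL.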

Now, we can map these wrapper functions to a name in python:

static PyMethodDef VecMethods[] = {
    {"vector_cross", py_vector_cross, METH_VARARGS, "Compute 3D cross product."},
    {NULL, NULL, 0, NULL}
};

What this means, is that if this extension is imported, we should be able to call a vector_cross() function in python. Calling said function in python should invoke our C implementation behind the scenes.

And finally some boilerplate to create the extension:

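static struct PyModuleDef vecmodule = {
    PyModuleDef_HEAD_INIT,
    "vec",                   /* module name, as used in "import vec" */
    "3D vector operations",  /* module docstring */
    -1,                      /* per-module state size; -1 = state kept in globals */
    VecMethods               /* method table defined above */
};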

PyMODINIT_FUNC PyInit_vec(void) {
    return PyModule_Create(&vecmodule);
}

Basically, when we build this extension, we’ll need to give it a name that python can find and reference. In the PyModuleDef structure, we tell Python that we want to call our extension vec and that the list of methods we expose is given by the structure VecMethods.

The PyInit_vec() function is the entry point that Python will call when this extension is imported. The name after PyInit_ must match the name specified in PyModuleDef.

What that boils down to for us here, is that once this extension is built, python will know how to initialize it when we do:

import vec

Building the extension

Before pressing the big red button, we need to keep a few things in mind:

-> we need a shared object, not a standalone program

-> the code must be position-independent, i.e. the dynamic loader must be able to map our shared object at whatever memory address it picks

-> the naming must be correct, because Python expects a specific file name pattern when looking for shared objects, and it searches in a defined order

-> we need to compile both the source file vector_product_py.c AND the “glue” file vector_product_module.c

In my case, this boils down to the following gcc command:

gcc -fPIC -shared $(python3-config --includes) vector_product_module.c vector_product_py.c -o vec$(python3-config --extension-suffix)

This produced the file:

vec.cpython-310-x86_64-linux-gnu.so
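
If you are curious where that suffix comes from, the interpreter can tell you. The snippet below is a small sketch using the standard sysconfig and importlib.machinery modules; extension_filename is a helper name I made up, and the exact output depends on your Python version and platform.

```python
import importlib.machinery
import sysconfig


def extension_filename(module_name):
    """File name this interpreter would expect for a C extension module."""
    # EXT_SUFFIX is the value that `python3-config --extension-suffix` prints.
    suffix = sysconfig.get_config_var("EXT_SUFFIX")
    return module_name + suffix


print(extension_filename("vec"))
# Every suffix this interpreter accepts for extension modules.
print(importlib.machinery.EXTENSION_SUFFIXES)
```

On the machine used in this post, the first line should print the same file name that gcc produced above.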

Trying it out

At this point, the shared library must be somewhere Python looks for modules. The simplest option is to start the interpreter from the directory that contains it (search paths are a topic for a future blog post!)

So assuming we’re in the directory where we’ve outputted the shared object:

root@debian-test:~/py# python3
Python 3.10.12 (main, Jun 11 2023, 05:26:28) [GCC 11.4.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import vec
>>> vec.vector_cross((-1, 2, 3),(-2, 0, 1))
(2.0, -5.0, 4.0)

Success! Python is successfully calling into our simple C function.
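
One matching result is nice, but a couple of property checks give a bit more confidence. The sketch below assumes the compiled vec module is importable from the current directory; if it is not, it falls back to a plain Python component formula so the checks still run. The helper dot is my own addition.

```python
try:
    from vec import vector_cross  # the C extension, if it has been built
except ImportError:
    # Fallback so the sketch still runs without the compiled module.
    def vector_cross(a, b):
        return (
            float(a[1] * b[2] - a[2] * b[1]),
            float(a[2] * b[0] - a[0] * b[2]),
            float(a[0] * b[1] - a[1] * b[0]),
        )


def dot(u, v):
    """Dot product of two equal-length sequences."""
    return sum(x * y for x, y in zip(u, v))


a, b = (-1, 2, 3), (-2, 0, 1)
c = vector_cross(a, b)

# The result must be perpendicular to both inputs.
assert dot(a, c) == 0 and dot(b, c) == 0
# Swapping the arguments must flip the sign of every component.
assert vector_cross(b, a) == tuple(-x for x in c)
print("all checks passed:", c)
```

Small integer inputs keep the floating-point arithmetic exact, which is why the comparisons can use == here.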

What now?

Will I use this constantly? Probably not, but Python’s C API is more approachable than I expected. If you’ve ever looked at those import statements and wondered what’s really happening, then now you know.