Introduction

In this post we’ll create a builtin function called atleast with the following signature: atleast(n, iterable). This function will return True if at least n items in an iterable are truthy and False otherwise. It will extend the idea of the already existing builtin called any, which returns True if at least 1 item is truthy and False otherwise. Here are some examples:

atleast(1, [1, 0, 1]) # True, equivalent to: any([1, 0, 1])
atleast(2, [1, 0, 1]) # True
atleast(3, [1, 0, 1]) # False
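Before touching any C, it can help to pin down the intended behavior with a pure-Python model. The sketch below is only an illustration of the semantics described above, not the builtin we are about to write.

```python
def atleast(n, iterable):
    """Pure-Python model: True if at least n items of iterable are truthy."""
    count = 0
    for item in iterable:
        if item:
            count += 1
            # Stop as soon as the threshold is reached, like any() does.
            if count >= n:
                return True
    return False


print(atleast(1, [1, 0, 1]))
print(atleast(2, [1, 0, 1]))
print(atleast(3, [1, 0, 1]))
```

Like any, this model stops iterating once the answer is known.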

Where to begin?

Since we are trying to extend the any builtin, the best place to start looking is at the builtin_any function defined in Python/bltinmodule.c. Here’s what that function looks like:

static PyObject *
builtin_any(PyObject *module, PyObject *iterable)
{
    PyObject *it, *item;
    PyObject *(*iternext)(PyObject *);
    int cmp;

    it = PyObject_GetIter(iterable);
    if (it == NULL)
	return NULL;
    iternext = *Py_TYPE(it)->tp_iternext;

    for (;;) {
	item = iternext(it);
	if (item == NULL)
	    break;
	cmp = PyObject_IsTrue(item);
	Py_DECREF(item);
	if (cmp < 0) {
	    Py_DECREF(it);
	    return NULL;
	}
	if (cmp > 0) {
	    Py_DECREF(it);
	    Py_RETURN_TRUE;
	}
    }
    Py_DECREF(it);
    if (PyErr_Occurred()) {
	if (PyErr_ExceptionMatches(PyExc_StopIteration))
	    PyErr_Clear();
	else
	    return NULL;
    }
    Py_RETURN_FALSE;
}

The module parameter isn’t used in the function body, so we can ignore it. The iterable parameter is used to create an iterator through a call to PyObject_GetIter. Each item the iterator produces is passed to PyObject_IsTrue, which returns -1 for errors, 0 for falsy values, and 1 for truthy values. The loop ends when iternext returns NULL, which signals either exhaustion or an error; the PyErr_Occurred check after the loop tells the two apart and clears a leftover StopIteration. On every execution path after the iterator is created, its reference count is decremented exactly once with a call to Py_DECREF. If an error occurs, NULL is returned. If a truthy value is found, the Py_RETURN_TRUE macro is expanded. Lastly, if no errors occurred and no truthy values were found, the Py_RETURN_FALSE macro is expanded. As a step towards a solution we can copy this function and make the following changes:

  • define a variable n of type Py_ssize_t to represent the minimum number of truthy values and hardcode it to the value 2.
  • define another variable count of type Py_ssize_t to keep track of the number of truthy values.
  • add an if statement to check if count is greater than or equal to n.
  • rename the function to builtin_atleast2.

    static PyObject *
    builtin_atleast2(PyObject *module, PyObject *iterable)
    {
        Py_ssize_t n = 2, count = 0;
        PyObject *it, *item;
        PyObject *(*iternext)(PyObject *);
        int cmp;

        it = PyObject_GetIter(iterable);
        if (it == NULL)
            return NULL;
        iternext = *Py_TYPE(it)->tp_iternext;

        for (;;) {
            item = iternext(it);
            if (item == NULL)
                break;
            cmp = PyObject_IsTrue(item);
            Py_DECREF(item);
            if (cmp < 0) {
                Py_DECREF(it);
                return NULL;
            }
            if (cmp > 0) {
                if (++count >= n) {
                    Py_DECREF(it);
                    Py_RETURN_TRUE;
                }
            }
        }
        Py_DECREF(it);
        if (PyErr_Occurred()) {
            if (PyErr_ExceptionMatches(PyExc_StopIteration))
                PyErr_Clear();
            else
                return NULL;
        }
        Py_RETURN_FALSE;
    }
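The control flow of builtin_atleast2 can be mirrored step by step in Python with explicit iter() and next() calls. This is a rough model for following the C code, with comments naming the C call each line stands in for. Error handling and reference counting have no counterpart here.

```python
def atleast2(iterable):
    n, count = 2, 0
    it = iter(iterable)          # PyObject_GetIter(iterable)
    while True:
        try:
            item = next(it)      # iternext(it)
        except StopIteration:    # iternext returned NULL: exhausted
            break
        if item:                 # PyObject_IsTrue(item) > 0
            count += 1
            if count >= n:
                return True      # Py_RETURN_TRUE
    return False                 # Py_RETURN_FALSE


# The function returns early, so unread items stay in the iterator.
source = iter([1, 0, 1, 0, 1])
result = atleast2(source)
remaining = list(source)
print(result, remaining)
```

The early return is the reason the C version has to call Py_DECREF on the iterator inside the loop as well as after it.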

In order for this new builtin to be picked up we need to add an entry for it in the static array named builtin_methods within the same file. This array contains items of type PyMethodDef, which is a struct containing a name, a function (cast as either PyCFunction or PyCFunctionWithKeywords), a method flag (such as METH_NOARGS, METH_O, METH_VARARGS, or METH_VARARGS | METH_KEYWORDS; METH_KEYWORDS is only used in combination with another flag), and a documentation string. For now we can copy the entry for builtin_any, docstring included, but with the name set to atleast2 and the function set to builtin_atleast2.

static PyMethodDef builtin_methods[] = {
...
    {"atleast2", (PyCFunction)builtin_atleast2, METH_O, builtin_any__doc__},
...
};
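The effect of the METH_O flag can be observed from Python on the existing any builtin, whose entry we just copied: a METH_O function takes exactly one positional argument and no keywords. The helper raises_type_error below is a hypothetical name introduced only for this check.

```python
def raises_type_error(func, *args, **kwargs):
    """Return True if calling func with the given arguments raises TypeError."""
    try:
        func(*args, **kwargs)
    except TypeError:
        return True
    return False


one_arg = any([1, 0, 1])                                    # accepted
two_args_rejected = raises_type_error(any, [1, 0], [1])     # too many arguments
keyword_rejected = raises_type_error(any, iterable=[1, 0])  # no keywords allowed
print(one_arg, two_args_rejected, keyword_rejected)
```

Because atleast2 is registered with the same flag, it should accept and reject arguments in the same way.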

If we recompile with ../configure && make, then the following should work in the Python REPL.

MODIFY_CPy>>> atleast2([1, 0, 1])
True
MODIFY_CPy>>> atleast2([0, 1, 0])
False

The next step

Now we can work on removing the hardcoded value and instead receive it as a parameter. To do so we will need to make the following changes:

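  • change the method flag from METH_O to METH_VARARGS. That way our function will be called with a tuple of all positional arguments instead of a single Python object.
  • rename the parameter from “iterable” to “args”.
  • declare a pointer to PyObject named “iterable”, which takes over the role of the old “iterable” parameter.
  • parse the “args” tuple using PyArg_ParseTuple, passing it the args, a format string, and the addresses of the variables we want to store the values in. In the format string “nO:atleast”, “n” converts a Python integer to a Py_ssize_t, “O” stores a borrowed reference to an arbitrary object, and the text after the colon is the function name used in error messages.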

    PyDoc_STRVAR(atleast_doc,
    "atleast(n, iterable) -> bool\n\
    Returns True if at least n entries are truthy, False otherwise.\n");
    
    static PyObject *
    builtin_atleast(PyObject *module, PyObject *args)
    {
        /* count has the same type as n, so the comparison below never
           mixes an int with a Py_ssize_t */
        Py_ssize_t n, count = 0;
        PyObject *it, *item, *iterable;
        PyObject *(*iternext)(PyObject *);
        int cmp;

        if (!PyArg_ParseTuple(args, "nO:atleast", &n, &iterable))
            return NULL;

        it = PyObject_GetIter(iterable);
        if (it == NULL)
            return NULL;
        iternext = *Py_TYPE(it)->tp_iternext;

        for (;;) {
            item = iternext(it);
            if (item == NULL)
                break;
            cmp = PyObject_IsTrue(item);
            Py_DECREF(item);
            if (cmp < 0) {
                Py_DECREF(it);
                return NULL;
            }
            if (cmp > 0) {
                if (++count >= n) {
                    Py_DECREF(it);
                    Py_RETURN_TRUE;
                }
            }
        }
        Py_DECREF(it);
        if (PyErr_Occurred()) {
            if (PyErr_ExceptionMatches(PyExc_StopIteration))
                PyErr_Clear();
            else
                return NULL;
        }
        Py_RETURN_FALSE;
    }

    static PyMethodDef builtin_methods[] = {
    ...
        {"atleast", (PyCFunction)builtin_atleast, METH_VARARGS, atleast_doc},
    ...
    };
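As a rough mental model, the PyArg_ParseTuple call with the format string "nO:atleast" behaves something like the Python function below. The name parse_atleast_args is hypothetical. The real error messages are worded differently, and the overflow check for values that do not fit in a Py_ssize_t is not modeled.

```python
import operator


def parse_atleast_args(args):
    """Hypothetical stand-in for PyArg_ParseTuple(args, "nO:atleast", ...)."""
    if len(args) != 2:
        raise TypeError(f"atleast expected 2 arguments, got {len(args)}")
    n = operator.index(args[0])  # "n": must be an integer, floats are rejected
    iterable = args[1]           # "O": any object, passed through unchanged
    return n, iterable


print(parse_atleast_args((2, [1, 0, 1])))
```

Note that "O" does no type checking at all, so a non-iterable second argument is only rejected later, when PyObject_GetIter fails.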

Does it work?

After recompiling, we can try out our new builtin function:

MODIFY_CPy>>> atleast(2, [1, 0, 1, 1])
True
MODIFY_CPy>>> atleast(3, [1, 0, 1, 1])
True
MODIFY_CPy>>> atleast(4, [1, 0, 1, 1])
False
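One edge case is worth trying by hand. In the C code above, count is only compared against n after a truthy item has been seen, so a call such as atleast(0, []) falls through to Py_RETURN_FALSE even though "at least zero truthy items" is always satisfied. The Python sketch below models that behavior and one possible adjustment; atleast_model and atleast_checked are hypothetical names, and the early return is a suggestion rather than part of the implementation above.

```python
def atleast_model(n, iterable):
    """Mirrors the C loop: the threshold is only checked on truthy items."""
    count = 0
    for item in iterable:
        if item:
            count += 1
            if count >= n:
                return True
    return False


def atleast_checked(n, iterable):
    """Variant that treats n <= 0 as trivially satisfied."""
    if n <= 0:
        return True
    return atleast_model(n, iterable)


print(atleast_model(0, []), atleast_checked(0, []))
```

In C, the equivalent change would be a check on n placed right after the PyArg_ParseTuple call.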