The strength of the data community and the beauty of open source

Giovanni Lanzani / 12 June, 2020

As part of PyData Amsterdam 2020 (tickets on sale), I was asked to prepare a session on getting started with testing in Python.

The goal was to empower participants to contribute to open source development, on libraries such as pandas.

This is always a daunting and humbling task, even more so now that I basically run the GoDataDriven Academy full-time, with no time to code or to write tests.

So while readying my talk, I thought: wouldn't it be a great challenge if I could not only teach participants something new but also contribute a real test to pandas?

Lo and behold: the pandas developers are ready to merge my contribution!

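import pandas as pd
import pandas._testing as tm
import pytest
from pandas import DataFrame


# `raw` is a pytest fixture from the pandas test suite (parametrized over True and False).
@pytest.mark.parametrize(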
    "grouping,_index",
    [
        (
            {"level": 0},
            pd.MultiIndex.from_tuples(
                [(0, 0), (0, 0), (1, 1), (1, 1), (1, 1)], names=[None, None]
            ),
        ),
        (
            {"by": "X"},
            pd.MultiIndex.from_tuples(
                [(0, 0), (1, 0), (2, 1), (3, 1), (4, 1)], names=["X", None]
            ),
        ),
    ],
)
def test_rolling_positional_argument(grouping, _index, raw):
    # GH 34605

    def scaled_sum(*args):
        if len(args) < 2:
            raise ValueError("The function needs two arguments")
        array, scale = args
        return array.sum() / scale

    df = DataFrame(data={"X": range(5)}, index=[0, 0, 1, 1, 1])

    expected = DataFrame(data={"X": [0.0, 0.5, 1.0, 1.5, 2.0]}, index=_index)
    # args=(2,) is forwarded to scaled_sum as `scale`, so each one-row window is halved.
    result = df.groupby(**grouping).rolling(1).apply(scaled_sum, raw=raw, args=(2,))
    tm.assert_frame_equal(result, expected)
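
To make the expected values in the test easier to follow, here is a rough pure-Python sketch of the behaviour it checks: a rolling window computed within each group, with extra positional arguments forwarded to the applied function. The `grouped_rolling_apply` helper is hypothetical and written only for illustration. It is not how pandas implements `groupby(...).rolling(...).apply(...)`, and it returns a plain list rather than a DataFrame.

```python
from collections import defaultdict


def scaled_sum(*args):
    # Same shape as the helper in the test: a window plus a scale factor.
    if len(args) < 2:
        raise ValueError("The function needs two arguments")
    window, scale = args
    return sum(window) / scale


def grouped_rolling_apply(keys, values, window, func, args=()):
    """Hypothetical stand-in for a grouped rolling apply (illustration only)."""
    seen = defaultdict(list)  # values seen so far, per group key
    out = []
    for key, value in zip(keys, values):
        seen[key].append(value)
        current = seen[key][-window:]
        if len(current) < window:
            # Not enough observations in this group yet (pandas would give NaN).
            out.append(None)
        else:
            # Forward the extra positional arguments, as `args=(2,)` does in the test.
            out.append(func(current, *args))
    return out


result = grouped_rolling_apply(
    keys=[0, 0, 1, 1, 1], values=range(5), window=1, func=scaled_sum, args=(2,)
)
print(result)  # [0.0, 0.5, 1.0, 1.5, 2.0]
```

With a window of 1, every window holds a single value, so the result is each value of `X` divided by the scale of 2. If `args` were dropped along the way, `scaled_sum` would receive only the window and raise the `ValueError`, which is how a test like this one surfaces the problem.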

I learned tons of things about new features of pytest, how the pandas code base is structured, and more!

Also interested in learning about more advanced features of Python for Data Science? Be sure to check out our Advanced Data Science with Python course, or start with the Data Science with Python Foundation course.
