The strength of the data community and the beauty of open source

Giovanni Lanzani / 12 June, 2020

As part of PyData Amsterdam 2020 (tickets on sale), I was asked to prepare a session on getting started with testing in Python.

The goal was to empower participants to contribute to open source libraries such as pandas.

This is always a daunting and humbling task, even more so now that I basically run the GoDataDriven Academy full-time, with no time to code or to write tests.

So while readying my talk, I thought: wouldn't it be a great challenge if I could not only teach participants something new but also contribute a real test to pandas?

Lo and behold: the pandas developers are ready to merge my contribution!

import pandas as pd
import pandas._testing as tm
import pytest
from pandas import DataFrame


@pytest.mark.parametrize(
    "grouping,_index",
    [
        (
            {"level": 0},
            pd.MultiIndex.from_tuples(
                [(0, 0), (0, 0), (1, 1), (1, 1), (1, 1)], names=[None, None]
            ),
        ),
        (
            {"by": "X"},
            pd.MultiIndex.from_tuples(
                [(0, 0), (1, 0), (2, 1), (3, 1), (4, 1)], names=["X", None]
            ),
        ),
    ],
)
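def test_rolling_positional_argument(grouping, _index, raw):
    # GH 34605
    # `raw` is not defined in this snippet; it is supplied by a fixture in the
    # pandas test suite.

    def scaled_sum(*args):
        # Fails unless the extra positional argument from `args=` is forwarded.
        if len(args) < 2:
            raise ValueError("The function needs two arguments")
        array, scale = args
        return array.sum() / scale

    df = DataFrame(data={"X": range(5)}, index=[0, 0, 1, 1, 1])

    # With a window of 1, every value of X is simply divided by scale=2.
    expected = DataFrame(data={"X": [0.0, 0.5, 1.0, 1.5, 2.0]}, index=_index)
    result = df.groupby(**grouping).rolling(1).apply(scaled_sum, raw=raw, args=(2,))
    tm.assert_frame_equal(result, expected)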
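
To see where the expected values come from, here is a rough pure-Python sketch of what the test checks: a function applied to rolling windows within each group, with an extra positional argument forwarded to it. The helper `grouped_rolling_apply` is a hypothetical name of my own and not how pandas implements this; it works on plain lists (so it uses `sum(array)` instead of `array.sum()`) and assumes the group keys are already sorted.

```python
from itertools import groupby


def scaled_sum(*args):
    # Same idea as the helper in the test: a window plus one extra argument.
    if len(args) < 2:
        raise ValueError("The function needs two arguments")
    array, scale = args
    return sum(array) / scale


def grouped_rolling_apply(values, keys, window, func, args=()):
    """Apply func to trailing windows within each group, forwarding args.

    Illustrative stand-in, not pandas code. Assumes keys are sorted and
    returns None where the window is not yet full.
    """
    results = []
    pairs = zip(keys, values)
    for _, group in groupby(pairs, key=lambda pair: pair[0]):
        group_values = [value for _, value in group]
        for end in range(1, len(group_values) + 1):
            if end < window:
                results.append(None)
            else:
                # The extra positional arguments are passed after the window.
                results.append(func(group_values[end - window:end], *args))
    return results


result = grouped_rolling_apply(
    values=list(range(5)), keys=[0, 0, 1, 1, 1], window=1, func=scaled_sum, args=(2,)
)
print(result)  # [0.0, 0.5, 1.0, 1.5, 2.0]
```

If `args` were silently dropped, `scaled_sum` would receive only the window and raise the `ValueError`, which is the kind of regression the pandas test guards against.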

I learned tons of things about new features of pytest, how the pandas code base is structured, and more!

Also interested in learning about more advanced features of Python for Data Science? Be sure to check out our Advanced Data Science with Python course, or start with the Data Science with Python Foundation course.
