Python 3.9 StatsProfile — My first OSS Contribution to cPython

By Daniel Olshansky

I recently made my first open source cPython contribution which adds a dataclass called StatsProfile to address this. After you created your stats object, you can retrieve all the information in the above screenshot by calling ps.get_stats_profile() and analyze it in a programmatic way.

If you’re not on Python3.9 yet, the following code snippet is a slightly modified version of the code in the pull request that you can start using today by importing it directly into your project.

Now, rather than inspecting the profile of a single execution of our code snippet, we can aggregate and analyze the profiles over several different iterations. In a real production service, depending on which logging tool you use, you would likely need to format and stringify the StatsProfile dataclass before logging it, but for the purposes of this example, everything is stored in memory.

To simulate timestamped logging, (timestamp, stats_profile)tuples are appended to a timestamped_stats_profile list with every execution of the loop.

After the data is logged, it needs to be aggregated over a certain timeslice. Most logging/visualization platforms have their functions to process timeseries data, so this would be platform specific. Sumologic has the timeslice function, Elasticsearch has examples of how to do date histogram aggregation, Datadog has an aggregate across time dropdown, etc...

For the purposes of this example, I’m doing the aggregation manually in python. I bucket all the logged (i.e. saved) StatsProfile objects over 10 second intervals, aggregate the cumulative execution time, cumtime, per function call and store the resultant counters in time_slices_counters. If you’re interested in inspecting the number of calls to certain functions rather than the cumulative execution time spent in it, you would simply modify the parameter being access on line 21 in the code snippet below.

In my opinion, a stacked bar graph is a great way to visualize and easily interpret this data. Using the following code snippet:

We can generate a graph that looks like this:

The results aren’t very interesting or surprising given the simplicity of a script calling sleepa bunch of times, but hopefully it’ll be more useful in more complex applications.