
Diffify – Python release

[This article was first published on The Jumping Rivers Blog, and kindly contributed to R-bloggers.]

It has been six months since the launch of Diffify, our website for comparing package releases. We are delighted to announce that, in addition to CRAN’s 20,000 R packages, you can now track 1,600 popular Python packages!

What’s included?

The current criteria for a Python package to be included in Diffify are:

If your favourite package is not currently accessible, don’t worry! We are actively working to expand the list to as many PyPI packages as possible, as we’ll explain below.



New content

The first change you’ll notice is to our homepage, where we now have buttons for both R and Python.

Clicking on the Python button will take you through to the package search bar. For this walkthrough, we will compare versions 3.3.0 and 3.5.0 of the Matplotlib package. Diffify provides a breakdown of the changes to the package dependencies, functions and classes.

Dependencies

We consider three kinds of dependencies:

In our example, we see that the Python version requirement has changed from >=3.6 to >=3.7.
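To make the idea concrete, a dependency comparison can be thought of as a diff between two sets of version requirements. The sketch below is illustrative only and is not Diffify's implementation: the function `diff_requirements`, the dictionary layout and the packages `example-lib` and `another-lib` are all hypothetical. Only the Python requirement change is taken from the example above.

```python
def diff_requirements(old, new):
    """Return {name: (old_spec, new_spec)} for every requirement that differs.

    A spec of None means the requirement is absent in that version.
    """
    changes = {}
    for name in sorted(set(old) | set(new)):
        before, after = old.get(name), new.get(name)
        if before != after:
            changes[name] = (before, after)
    return changes


# Hypothetical requirement sets; only the "python" entries mirror the
# change described in the text.
old_requirements = {"python": ">=3.6", "example-lib": ">=1.0"}
new_requirements = {"python": ">=3.7", "example-lib": ">=1.0", "another-lib": ">=2.0"}

changes = diff_requirements(old_requirements, new_requirements)
for name, (before, after) in changes.items():
    print(f"{name}: {before} -> {after}")
```

Unchanged requirements (here `example-lib`) are left out of the result, so only additions, removals and changed specifiers are reported.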

Functions

Here we provide a list of functions that have been added, removed or changed between the two versions.

Clicking on the “Details” dropdown will bring up the function arguments, including the argument name and default value. If type annotations are included in the package source code, Diffify will also display the argument type and the function return type.
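For readers curious how argument names, defaults and type annotations can be read from source code without importing it, here is a minimal sketch using Python's standard `ast` module (Python 3.9+ for `ast.unparse`). It is an assumption about one possible approach, not a description of how Diffify works. The function `scale` and the helper `describe_function` are hypothetical.

```python
import ast

# Hypothetical source code to analyse; it is parsed, never executed.
SOURCE = '''
def scale(values: list, factor: float = 2.0, verbose=False) -> list:
    return [v * factor for v in values]
'''


def describe_function(node):
    """Summarise the positional arguments and return type of a FunctionDef.

    Simplification: keyword-only arguments, *args and **kwargs are ignored.
    """
    positional = node.args.posonlyargs + node.args.args
    # Defaults apply to the last len(defaults) positional arguments,
    # so pad the front with None to line the two lists up.
    padding = [None] * (len(positional) - len(node.args.defaults))
    defaults = padding + list(node.args.defaults)

    rows = []
    for arg, default in zip(positional, defaults):
        rows.append({
            "name": arg.arg,
            "default": ast.unparse(default) if default is not None else None,
            "type": ast.unparse(arg.annotation) if arg.annotation is not None else None,
        })
    returns = ast.unparse(node.returns) if node.returns is not None else None
    return rows, returns


function_node = ast.parse(SOURCE).body[0]
rows, returns = describe_function(function_node)
for row in rows:
    print(row)
print("returns:", returns)
```

Because the annotations are read as text from the source, an argument without an annotation (here `verbose`) simply has no type to display.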

For the pyplot.grid() function, the name of the first positional argument has changed from b to visible.
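A rename like this can be detected by comparing two signatures position by position. The sketch below uses `inspect.signature` on two simplified stand-in functions, `grid_old` and `grid_new`. These are hypothetical and are not the real Matplotlib functions: only the rename of the first argument from `b` to `visible` comes from the example above, and the remaining parameters are assumed for illustration.

```python
import inspect


# Hypothetical stand-ins for the function in the two releases.
def grid_old(b=None, which="major", axis="both", **kwargs):
    pass


def grid_new(visible=None, which="major", axis="both", **kwargs):
    pass


def changed_parameters(old, new):
    """Return (position, old_name, new_name) for each parameter that differs.

    Simplification: parameters are paired by position, and any extra
    trailing parameters in the longer signature are ignored.
    """
    old_params = list(inspect.signature(old).parameters.values())
    new_params = list(inspect.signature(new).parameters.values())
    changes = []
    for position, (before, after) in enumerate(zip(old_params, new_params)):
        before_key = (before.name, before.default, before.kind)
        after_key = (after.name, after.default, after.kind)
        if before_key != after_key:
            changes.append((position, before.name, after.name))
    return changes


renamed = changed_parameters(grid_old, grid_new)
print(renamed)
```

Comparing the parameter kind and default alongside the name means the same helper would also flag a changed default value, not just a rename.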

Classes

Here we provide a list of classes that have been added, removed or changed.

Clicking on the “Methods” button for a class will bring up a pop-up that lists the methods that belong to that class. The example below shows the methods .__init__() and .from_dict(), which belong to the spines.Spines class.
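As a rough illustration of how a class's methods can be listed from source code, the sketch below walks a parsed module with the standard `ast` module. The `Spines` class here is a hypothetical, heavily simplified stand-in that only borrows the two method names mentioned above. The helper `class_methods` is likewise an assumption and not Diffify's implementation.

```python
import ast

# Hypothetical, simplified stand-in class; parsed as text, never executed.
SOURCE = '''
class Spines:
    def __init__(self, **kwargs):
        self._dict = kwargs

    @classmethod
    def from_dict(cls, d):
        return cls(**d)
'''


def class_methods(source):
    """Map each top-level class name to the names of the methods it defines."""
    result = {}
    for node in ast.parse(source).body:
        if isinstance(node, ast.ClassDef):
            result[node.name] = [
                item.name
                for item in node.body
                if isinstance(item, (ast.FunctionDef, ast.AsyncFunctionDef))
            ]
    return result


methods = class_methods(SOURCE)
print(methods)
```

Only methods defined directly in the class body are found this way; inherited methods would need the base classes to be analysed as well.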

As with functions, you can access the method arguments by clicking on “Details”.

Removing clutter

The functions and classes listed above have been detected by analysing the package source code. We have taken various steps to filter out code that is intended for internal use by the package developers, including

These criteria are intended to leave out internal code and unit tests.
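To give a flavour of what such a filter might look like, here is a minimal sketch based on two widespread Python conventions: a leading underscore marks a name as private, and unit tests usually live in files or directories named `test`, `tests` or `test_*`. These rules are assumptions chosen for illustration and are not Diffify's actual criteria. The helper `looks_public` and the example paths are hypothetical.

```python
from pathlib import PurePosixPath

TEST_DIRECTORY_NAMES = {"test", "tests"}


def looks_public(module_path, name):
    """Guess whether `name`, defined in `module_path`, is part of the public API.

    Assumed conventions (not Diffify's documented rules):
    - path components starting with "_" are private, except __init__.py
    - test directories and test_*.py files hold unit tests
    - names starting with "_" are private
    """
    parts = PurePosixPath(module_path).parts
    if any(part.startswith("_") and part != "__init__.py" for part in parts):
        return False
    if any(part in TEST_DIRECTORY_NAMES or part.startswith("test_") for part in parts):
        return False
    return not name.startswith("_")


# Hypothetical (path, name) pairs.
candidates = [
    ("pkg/plotting.py", "grid"),
    ("pkg/plotting.py", "_helper"),
    ("pkg/_internal/utils.py", "helper"),
    ("pkg/tests/test_plotting.py", "test_grid"),
]
public = [(path, name) for path, name in candidates if looks_public(path, name)]
print(public)
```

Name-based rules like these are heuristics, so some internal code will inevitably slip through and some public code may be hidden.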

Looking ahead

Python has been around for quite a while, and consequently it has many packages – roughly 400,000 on PyPI! Perhaps unsurprisingly, analysing so many packages for Diffify has proven to be a bit of a challenge…

This is why we have initially chosen to focus on the 2,000 most popular PyPI packages. We will soon extend this to the top 5,000, according to Top PyPI Packages. And we won’t be stopping there! It remains to be seen whether we will manage to add all 400,000, but we will certainly try our utmost.

Despite our best efforts to filter out clutter, you may still come across some functions and classes that are clearly intended for internal use or unit testing. We will continue to look at ways to improve our filters.

We hope you enjoy the new content! As always, if you spot any bugs or have any suggestions, please open an issue on our public GitHub repository.

Stay tuned for more updates…
