I remember my first days of coding R: I would search around for how to do X task, and find out that Y package would be really helpful. But that led to more questions: What are packages in the first place, and how do you use them? In this post, we’ll explore what these packages tell us about the nature of open source, and how using them is like using your smartphone apps.
Open source is a license to build
Imagine if you had a smartphone but couldn’t install any apps on it. The phone would still be pretty nifty: you could browse the internet, update your calendar and of course make calls. But much of a smartphone’s power lies in its downloadable apps.
Open source packages serve a similar function. When software is open source, anyone is free to inspect, repurpose and redistribute the code. They’re also welcome to contribute to it — and that’s where packages come in.
A package is nothing but assembled code, usually meant to achieve some specific purpose. Maybe you’ve defined your own function before rather than walking through the same tedious calculations each time. Packages can include functions, objects, and more. Anyone can make a package independently, perhaps for personal use or to share with a team.
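To make the idea of "assembled code" concrete, here is a minimal sketch of the kind of function you might write once instead of repeating a calculation by hand. The name percent_change and its formula are hypothetical, chosen only for illustration.

```python
# Hypothetical helper: the name and formula are for illustration only.
def percent_change(old, new):
    """Return the percent change going from old to new."""
    return (new - old) / old * 100


# Reuse the function instead of retyping the arithmetic each time.
print(percent_change(50, 75))    # 50.0
print(percent_change(200, 150))  # -25.0
```

Collect a few related helpers like this in one place, add some documentation, and you have the beginnings of a package.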
Many languages maintain official clearinghouses for packages: for R, it’s the Comprehensive R Archive Network (CRAN); for Python, it’s the Python Package Index (PyPI). You can think of these as the “app stores” of R and Python. And to paraphrase the famous saying… “There’s a package for that!” Whether you are looking to connect to a database or make geospatial visualizations, there’s probably a package to help.
“There’s a package for that!”
Consider the steps you take to use an app on your phone:
- You install the app to your device once.
- You open the app each time you want to start using it.
- You update, delete, etc. the app as desired.
These steps are the same in R and Python:
- In R, you’ll install a package with install.packages(), then open it with library().
- In Python, you’ll install with either pip install or conda install, depending on whether you’re using the Anaconda distribution and if the package is available there. You’ll then import into your session with import.
- For more information and instructions, check out my book Advancing into Analytics.
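The Python side of these steps can be sketched as follows. This is a minimal illustration rather than a full setup guide: pandas appears only as an example package name in the comments, is_installed is a hypothetical helper written for this post, and the math module stands in for a third-party package because it ships with Python and needs no install.

```python
# Step 1 (once, in a terminal rather than inside Python):
#   pip install pandas      # or: conda install pandas
# Step 3 (as needed, also in a terminal):
#   pip install --upgrade pandas
#   pip uninstall pandas

import importlib.util


def is_installed(package_name):
    """Hypothetical helper: True if Python can find the package."""
    return importlib.util.find_spec(package_name) is not None


# Step 2 (every session): import the package before using it.
# math ships with Python, so no install step is needed for it.
import math

root = math.sqrt(16)
print(root)                  # 4.0
print(is_installed("math"))  # True
```

The R pattern has the same shape: install once with install.packages(), then call library() at the start of each session.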
If you’ve wondered why R and Python have reached such massive user adoption over a short period, packages are a major factor. Best of all, nearly all of them are free — they just take a couple of steps to use!
Want to learn more about popular data packages in Python and R? Check out my book Advancing into Analytics.
You’ll also learn some ways to identify packages that meet your needs. With untold thousands available, there’s likely one in R or Python to help you out with any data project.
As you get started, it’s a very common beginner mistake (heck, everyone does it!) to refer to a package that you haven’t called into your session, which is like trying to navigate an app you don’t have open. Remember the analogy, pick up the book for more, and happy coding.
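Here is a small sketch of what that mistake looks like in Python, assuming a fresh session in which nothing has been imported yet. The statistics module is used only because it ships with Python; the same thing happens with any package you forget to import.

```python
try:
    # statistics has not been imported yet, so Python
    # does not recognize the name and raises a NameError.
    average = statistics.mean([2, 4, 6])
except NameError as error:
    message = str(error)
    print("Forgot to import:", message)

# "Open the app" first, then use it.
import statistics

average = statistics.mean([2, 4, 6])
print(average)  # 4
```

In R, the equivalent slip is calling a package’s function before running library() for that package.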