March has been a busy month here at DataCamp! We’ve released 7 courses, 4 DataChats, and 8 articles. Luckily, this round-up post will cover the highlights, in case you missed them.
Data Manipulation: Tidy Your Data!
We were very excited to release the third course in our Pandas course series! Merging DataFrames with pandas follows on from Pandas Foundations and Manipulating DataFrames with Pandas. All three are taught by Dhavide Aruliah from Continuum Analytics, whom you might already know if you have downloaded the Anaconda Python distribution!
Tip: if you ever need a one-page reference for this Python package, consider our Pandas basics and Pandas Data Wrangling cheat sheets. Make sure you also don’t miss our Python Exploratory Data Analysis tutorial, in which you’ll learn more about how you can use Pandas with other packages such as matplotlib to explore your data!
Apart from the release of our Pandas course, we also launched Network Analysis in Python (Part 1), by Eric Ma. In this course, you’ll learn how to analyze, visualize, and understand networks by applying the concepts that you learn to real-world network data with the NetworkX library.
Eric shows the power of network data in his talk titled “Networks, networks everywhere!”, in which he goes over a real-life example of how networks solved influenza problems. You can check it out in his DataChats episode.
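To give a flavor of the kind of question network analysis answers, here is a minimal sketch that finds the most connected person in a tiny friendship network. It deliberately uses only the Python standard library rather than NetworkX itself, and the names and edges are made up for illustration. The score is degree centrality: the number of neighbors a node has, divided by the number of other nodes in the network.

```python
from collections import defaultdict

# Hypothetical toy data: each pair is an undirected "knows" relationship.
edges = [
    ("Ann", "Bob"),
    ("Ann", "Cara"),
    ("Ann", "Dev"),
    ("Bob", "Cara"),
    ("Dev", "Eli"),
]


def build_adjacency(edge_list):
    """Turn a list of undirected edges into a node -> set-of-neighbors map."""
    adjacency = defaultdict(set)
    for u, v in edge_list:
        adjacency[u].add(v)
        adjacency[v].add(u)
    return dict(adjacency)


def degree_centrality(adjacency):
    """Fraction of the other nodes that each node is directly connected to."""
    n = len(adjacency)
    if n < 2:
        # With fewer than two nodes there is nobody else to connect to.
        return {node: 0.0 for node in adjacency}
    return {node: len(neighbors) / (n - 1) for node, neighbors in adjacency.items()}


adjacency = build_adjacency(edges)
centrality = degree_centrality(adjacency)
most_connected = max(centrality, key=centrality.get)

print(most_connected, centrality[most_connected])
```

In this toy network, Ann knows three of the four other people, so she comes out on top. A library such as NetworkX offers the same kind of measure, along with many others, for much larger graphs.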
For those of you who are interested in getting into Spark with Python, March also brought a PySpark cheat sheet, which covers the basics of Spark’s main building blocks, Resilient Distributed Datasets (RDDs). There’s also a beginner’s guide, which will give you a comprehensive overview of how to install PySpark on your computer, how to work with Spark in Jupyter, how RDDs differ from DataFrames and Datasets, and so much more!
Machine Learning Is Fun!
We added the Unsupervised Learning in R course, by Hank Roark, Senior Data Scientist at Boeing, to the curriculum. In contrast to supervised machine learning, this course teaches you how to find patterns in data without trying to make predictions.
DataCamp instructor Nick Carchedi asked Hank about his career in data science, job automation and the importance of communicating your results as a data scientist in episode 14 of DataChats.
Next, we launched the Supervised Learning with scikit-learn course, taught by Andreas Müller. In this course, you’ll learn how to use Python to perform supervised learning: you’ll build predictive models, tune their parameters, assess how well they will perform on unseen data, and more, all while working with real-world data sets in scikit-learn.
Fun fact: Andreas is one of the core developers of the scikit-learn package!
Do you want to know more about Andreas and his work? In the 15th episode of DataChats, Andreas also gives advice to people starting with data science and explains what the most difficult part of his job is.
Lastly, there’s also a new Machine Learning with the Experts: School Budgets course, taught by Peter Bull, co-founder of DrivenData. You might have already heard about this company: it hosts data science competitions with the goal of saving the world. In this course, you’ll use the scikit-learn skills that you learned from Andreas’ course to tackle a problem related to school district budgeting.
Tip: To prepare for DataCamp’s upcoming Deep Learning in Python course, go and check out Deep Learning with Jupyter Notebooks in the Cloud.
R and Finance: Secure Your Quantitative Finance Job
Last but not least, DataCamp’s R for finance curriculum was enriched with an Introduction to R for Finance course, by Lore Dirick. This course is ideal for those who want to get started with finance in R in an applied way: you’ll learn the essential data structures and you’ll have the chance to apply them directly to financial examples! With this course, you’ll be set to start performing financial analyses in R!
One of the things that this course also covers is correlation, or the relationship between variables: consider taking our R correlation tutorial if you want to find out more.
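If you would like to see what a correlation coefficient actually measures, here is a minimal sketch of the Pearson correlation written out by hand. It is in Python rather than R, purely for illustration, and the daily returns are invented numbers, not real market data. In R itself, the built-in cor() function computes this for you.

```python
from math import sqrt


def pearson_correlation(xs, ys):
    """Pearson correlation between two equally long sequences of numbers."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of the same length (at least 2)")

    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)

    # Sum of products of deviations (covariance, up to a constant factor).
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # Sums of squared deviations (variances, up to the same constant factor).
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)

    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")

    return cov / sqrt(var_x * var_y)


# Hypothetical daily returns, made up for this example.
stock_a = [0.01, -0.02, 0.015, 0.03, -0.01]
stock_b = [2 * r for r in stock_a]   # moves with stock_a, twice as strongly
stock_c = [-r for r in stock_a]      # moves exactly opposite to stock_a

r_ab = pearson_correlation(stock_a, stock_b)
r_ac = pearson_correlation(stock_a, stock_c)

print(round(r_ab, 6), round(r_ac, 6))
```

A value close to 1 means the two series tend to move together, a value close to -1 means they tend to move in opposite directions, and a value near 0 means there is little linear relationship between them.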
For those who have already finished the Introduction to Portfolio Analysis in R course, we also released Intermediate Portfolio Analysis in R, taught by Ross Bennett, in which you’ll explore advanced concepts in the portfolio optimization process with the PortfolioAnalytics package.
Did you know that this package has 863 monthly (direct) downloads, according to RDocumentation? You can read more about which techniques and technologies are used to provide you with these insights in our RDocumentation: Scoring and Ranking post.
Hope you enjoyed this month’s content. Let us know what you think in the comments below! Much more to come for April!
The DataCamp Team