Who wants to work at Google?

[This article was first published on R – Journey of Analytics, and kindly contributed to R-bloggers].

In this tutorial, we will explore the open roles at Google and try to see what common attributes Google is looking for in future employees.

This dataset is a compilation of job descriptions for 1200+ open roles at Google offices across the world. It is available for download from the Kaggle website and contains text information about each position's location, title, department, minimum and preferred qualifications, and responsibilities. You can download the dataset here, and run the code on the Kaggle site itself here.

Using this dataset we will try to answer the following questions:

  1. Where are the open roles?
  2. Which departments have the most openings?
  3. What are the minimum and preferred educational qualifications needed to get hired at Google?
  4. How much experience is needed?
  5. What categories of roles are the most in demand?

Step 1 – Data Preparation and Cleaning:

The data is all free-form text, so we need to do a fair amount of cleanup to remove non-alphanumeric characters. Some of the job locations contain special characters too, so we remove those using basic string manipulation functions. Once we read in the file, this is a snapshot of the resulting dataframe:
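As a rough illustration of this cleanup step, here is a minimal sketch in base R. The toy rows, the file name in the comment, the original column headers, and the helper clean_text() are all assumptions made for the example; only the short name min_qual comes from this post.

```r
# Toy rows standing in for the Kaggle file; the values are invented for illustration.
jobs <- data.frame(
  Title = c("Cloud Sales Engineer", "Marketing Intern, Summer 2018"),
  Location = c("Mountain View, CA, United States", "Dublin, Ireland"),
  `Minimum Qualifications` = c("BA/BS degree;\n  5+ years  (preferred)",
                               "Pursuing a Bachelor's degree."),
  check.names = FALSE,
  stringsAsFactors = FALSE
)
# With the real data this would be something like (file name is an assumption):
# jobs <- read.csv("job_skills.csv", stringsAsFactors = FALSE, check.names = FALSE)

# Shorter column names that are easier to type.
names(jobs) <- c("title", "location", "min_qual")

# Hypothetical helper: replace every character that is not a letter, digit,
# space, or one of , . + / ' - with a space, then collapse repeated whitespace.
clean_text <- function(x) {
  x <- gsub("[^A-Za-z0-9 ,.+/'-]", " ", x)
  x <- gsub("\\s+", " ", x)
  trimws(x)
}

# Apply the cleanup to every text column.
jobs[] <- lapply(jobs, clean_text)
print(jobs)
```

Note that this pattern also strips accented letters, so it may be too aggressive for some city names; widening the allowed character set is an easy tweak.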

Step 2 – Analysis:

Now we will use R programming to identify patterns in the data that help us answer the questions of interest.

a) Job Categories:

First, let us look at which departments have the most open roles. Surprisingly, there are more roles open in the “Marketing and Communications” and “Sales & Account Management” categories than in the traditional technical business units (like Software Engineering or Networking).
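A count like this can be sketched with table() in base R. The column name category and the toy values below are assumptions for illustration, not the real figures from the dataset.

```r
# Toy category column; values are invented for illustration.
jobs <- data.frame(
  category = c("Sales & Account Management", "Marketing & Communications",
               "Software Engineering", "Sales & Account Management",
               "Marketing & Communications", "Sales & Account Management"),
  stringsAsFactors = FALSE
)

# Number of open roles per category, largest first.
category_counts <- sort(table(jobs$category), decreasing = TRUE)

# The same counts expressed as a percentage of all roles.
category_share <- round(100 * prop.table(category_counts), 1)

print(category_counts)
print(category_share)
# A quick bar chart of the counts could then be drawn with:
# barplot(category_counts, las = 2)
```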

b) Full-time versus internships:

Let us see how many roles are full-time and how many are for students. As expected, only ~13% of roles are for students, i.e. internships. The majority are full-time positions.
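One simple way to make this split, assuming that student roles can be recognised by the word "Intern" or "Internship" in the job title (an assumption for this sketch, as are the toy titles):

```r
# Toy job titles; invented for illustration.
titles <- c("Software Engineer",
            "MBA Intern, Summer 2018",
            "International Sales Manager",
            "Account Manager")

# Flag titles containing "intern" or "internship" as a whole word, so that
# words such as "International" are not counted by mistake.
is_student <- grepl("\\bintern(ship)?\\b", titles, ignore.case = TRUE, perl = TRUE)

# Share of student roles versus full-time roles, in percent.
role_split <- round(100 * prop.table(table(
  ifelse(is_student, "Internship", "Full-time")
)), 1)
print(role_split)
```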

c) Technical Roles:

Since Google is a predominantly technical company, let us see how many positions need technical skills, irrespective of the business unit (job category).

a) Roles related to “Google Cloud”:

To check this, we investigate how many roles mention the phrase “Google Cloud” in either the job title or the responsibilities. As shown in the graph below, ~20% of the roles are related to Cloud infrastructure, clearly showing that Google is making Cloud services a high priority.
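The phrase check can be sketched with grepl() on both columns. The column names title and responsibilities and the toy rows are assumptions for illustration.

```r
# Toy rows; invented for illustration.
jobs <- data.frame(
  title = c("Customer Engineer, Google Cloud", "Account Manager",
            "Technical Writer", "Program Manager"),
  responsibilities = c("Work with customers on migrations.",
                       "Grow advertiser relationships.",
                       "Document Google Cloud Platform products.",
                       "Run cross-functional launches."),
  stringsAsFactors = FALSE
)

# A role counts as Cloud-related if the phrase appears in either column.
phrase <- "google cloud"
is_cloud <- grepl(phrase, jobs$title, ignore.case = TRUE) |
  grepl(phrase, jobs$responsibilities, ignore.case = TRUE)

# Percentage of roles related to Google Cloud.
cloud_share <- round(100 * mean(is_cloud), 1)
print(cloud_share)
```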

b) Senior Roles and skills:

A quick word search also reveals how many senior roles (roles that require 10+ years of experience) use the word “strategy” in their list of requirements, under either qualifications or responsibilities. Word association analysis can also show this (not shown here).
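A minimal sketch of this word search follows. It assumes experience is written as "N years" or "N+ years" and only looks at the first such mention in the minimum qualifications; the helper extract_years(), the column names, and the toy rows are assumptions for illustration.

```r
# Toy rows; invented for illustration.
jobs <- data.frame(
  min_qual = c("BA/BS degree. 10 years of experience in sales.",
               "BA/BS degree or equivalent practical experience.",
               "12+ years of experience in consulting.",
               "4+ years of experience in marketing."),
  responsibilities = c("Define go-to-market strategy for the region.",
                       "Write and review code.",
                       "Manage client relationships.",
                       "Plan strategy for campaigns."),
  stringsAsFactors = FALSE
)

# Hypothetical helper: pull the first "N years" / "N+ years" figure from each
# text, returning NA where no such figure is mentioned.
extract_years <- function(text) {
  hit <- regexpr("[0-9]+(?=\\+?\\s+years)", text, perl = TRUE)
  years <- rep(NA_integer_, length(text))
  years[hit > 0] <- as.integer(regmatches(text, hit))
  years
}

jobs$years <- extract_years(jobs$min_qual)

# Senior roles are those asking for 10 or more years of experience.
is_senior <- !is.na(jobs$years) & jobs$years >= 10

# Look for "strategy" / "strategic" in qualifications or responsibilities.
mentions_strategy <- grepl("strateg",
                           paste(jobs$min_qual, jobs$responsibilities),
                           ignore.case = TRUE)

# Share of senior roles that mention strategy, in percent.
senior_strategy_share <- round(100 * mean(mentions_strategy[is_senior]), 1)
print(senior_strategy_share)
```

The extracted years column can also be summarised directly (for example with summary(jobs$years)) to see how much experience is typically requested.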

Educational Qualifications:

Here we are basically parsing the “min_qual” and “pref_qual” columns to see the minimum and preferred qualifications needed for each role. If we only take the minimum qualifications into consideration, we see that 80% of the roles explicitly ask for a bachelor's degree. Less than 5% of roles ask for a master's degree or PhD.

However, when we consider the “preferred” qualifications, the share of roles asking for a master's degree or PhD increases to a whopping ~25%. Thus, a fourth of all roles would be better suited to candidates with master's degrees and above.
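The degree parsing can be approximated with a couple of keyword patterns. This is only a sketch: the patterns, the helper names has_bachelors() and has_advanced(), and the toy rows are assumptions, and real job descriptions will need a longer keyword list.

```r
# Toy rows; invented for illustration.
jobs <- data.frame(
  min_qual = c("BA/BS degree or equivalent practical experience.",
               "Bachelor's degree in Marketing.",
               "PhD in Computer Science.",
               "Currently enrolled in a university program."),
  pref_qual = c("MBA or Master's degree.",
                "5 years of experience.",
                "Publications in top venues.",
                "Interest in sales."),
  stringsAsFactors = FALSE
)

# Hypothetical helpers: keyword checks for each degree level.
has_bachelors <- function(x) {
  grepl("\\b(BA|BS|bachelor'?s?)\\b", x, ignore.case = TRUE, perl = TRUE)
}
has_advanced <- function(x) {
  grepl("\\b(MS|MBA|master'?s?|Ph\\.?D|doctorate)\\b", x,
        ignore.case = TRUE, perl = TRUE)
}

# Shares based on the minimum qualifications only.
min_bachelors_share <- 100 * mean(has_bachelors(jobs$min_qual))
min_advanced_share  <- 100 * mean(has_advanced(jobs$min_qual))

# Share asking for an advanced degree once preferred qualifications count too.
any_advanced_share <- 100 * mean(has_advanced(jobs$min_qual) |
                                   has_advanced(jobs$pref_qual))

print(c(min_bachelors = min_bachelors_share,
        min_advanced  = min_advanced_share,
        any_advanced  = any_advanced_share))
```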

Google Engineers:

Google is famous for hiring engineers for all types of roles. So we will read the job qualification requirements to identify what percentage of roles requires a technical degree or degree in Engineering.
As seen from the data, 35% of roles specifically ask for an Engineering or Computer Science degree, including roles in marketing and other non-engineering departments.
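To see how the technical-degree requirement spreads across departments, the same keyword idea can be combined with tapply(). The keywords, column names, and toy rows below are assumptions for illustration.

```r
# Toy rows; invented for illustration.
jobs <- data.frame(
  category = c("Marketing", "Marketing", "Software Engineering", "Sales"),
  min_qual = c("BA/BS degree in Computer Science or related technical field.",
               "BA/BS degree or equivalent practical experience.",
               "BS degree in Electrical Engineering.",
               "Bachelor's degree."),
  stringsAsFactors = FALSE
)

# Flag roles whose minimum qualifications mention a technical degree.
needs_tech_degree <- grepl("engineering|computer science|technical field",
                           jobs$min_qual, ignore.case = TRUE)

# Overall share, in percent.
overall_share <- 100 * mean(needs_tech_degree)

# Share within each job category, in percent.
share_by_category <- 100 * tapply(needs_tech_degree, jobs$category, mean)

print(overall_share)
print(share_by_category)
```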

 

Role Locations:

The dataset does not have the geographical coordinates needed for mapping. However, this is easily overcome by using the geocode() function and the amazing rworldmap package. The map only marks each location, so it does not show that some places have more roles than others. We see open roles in all parts of the world. However, most positions are in the US, followed by the UK, and then Europe as a whole.
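Since the map itself does not show counts, a per-country tally is a useful companion. The sketch below assumes each location string ends with the country name after the last comma; that format and the toy values are assumptions. The commented lines show one way the tally might be passed to rworldmap as a country-level map, which is a different plot from the point map described above.

```r
# Toy location strings; invented for illustration.
locations <- c("Mountain View, CA, United States",
               "London, United Kingdom",
               "New York, NY, United States",
               "Singapore")

# Keep only the text after the last comma (or the whole string if there is none).
country <- trimws(sub(".*,", "", locations))

# Number of roles per country, largest first.
roles_per_country <- sort(table(country), decreasing = TRUE)
print(roles_per_country)

# Possible next step (needs the rworldmap package installed):
# country_df <- as.data.frame(roles_per_country, stringsAsFactors = FALSE)
# world <- rworldmap::joinCountryData2Map(country_df, joinCode = "NAME",
#                                         nameJoinColumn = "country")
# rworldmap::mapCountryData(world, nameColumnToPlot = "Freq")
```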

Responsibilities – Word Cloud:

Let us create a word cloud to see what skills are most needed for the Cloud engineering roles. We see that words like “partner”, “custom solutions”, “cloud”, “strategy”, and “experience” are more frequent than any specific technical skills. This shows that the Google Cloud roles are best filled by senior candidates, for whom leadership and business skills become more significant than expertise in a specific technology.
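A word cloud is driven by a word-frequency table, which can be sketched in base R as below. The toy responsibilities text and the tiny stop-word list are assumptions for illustration; the commented line shows how the result might be handed to the wordcloud package.

```r
# Toy responsibilities text for Cloud roles; invented for illustration.
responsibilities <- c(
  "Work with partners to design custom cloud solutions.",
  "Drive cloud strategy with customers and partners.",
  "Build partner relationships and cloud strategy."
)

# Lower-case everything and split on anything that is not a letter.
words <- unlist(strsplit(tolower(paste(responsibilities, collapse = " ")),
                         "[^a-z]+"))

# Drop very short tokens and a few common filler words.
stop_words <- c("and", "the", "with", "for", "our")
words <- words[nchar(words) > 2 & !(words %in% stop_words)]

# Word frequencies, most common first.
word_freq <- sort(table(words), decreasing = TRUE)
print(head(word_freq, 5))

# Possible next step (needs the wordcloud package installed):
# wordcloud::wordcloud(names(word_freq), as.integer(word_freq), min.freq = 1)
```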

 

Conclusion:

So who has the best chance of getting hired at Google?

For most of the roles (from this dataset), a candidate with the following traits has the best chance of getting hired:

  1. 5+ years of experience.
  2. Engineering or Computer Science bachelor’s degree.
  3. Master’s degree or higher.
  4. Working in the US.

The code for this script and the graphs are available here on the Kaggle website. If you liked it, don’t forget to upvote the script. And don’t forget to share!

Next Steps:

You can tweak the code to perform the same analysis on a subset of the data, for example only roles in a specific department, only roles in a specific location (such as the HQ in California), or only Google Cloud related roles.
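One way to set up such subsets is a small filter helper that runs before the rest of the analysis. The function filter_roles(), its arguments, the column names, and the toy rows are all hypothetical and only meant as a starting point.

```r
# Toy rows; invented for illustration.
jobs <- data.frame(
  title = c("Customer Engineer, Google Cloud", "Account Manager",
            "Software Engineer", "Sales Manager, Google Cloud"),
  category = c("Technical Solutions", "Sales & Account Management",
               "Software Engineering", "Sales & Account Management"),
  location = c("Mountain View, CA, United States", "London, United Kingdom",
               "Mountain View, CA, United States", "Dublin, Ireland"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: keep only rows matching every filter that is supplied.
filter_roles <- function(df, category = NULL, location_text = NULL,
                         title_text = NULL) {
  keep <- rep(TRUE, nrow(df))
  if (!is.null(category)) {
    keep <- keep & df$category %in% category
  }
  if (!is.null(location_text)) {
    keep <- keep & grepl(location_text, df$location, fixed = TRUE)
  }
  if (!is.null(title_text)) {
    keep <- keep & grepl(title_text, df$title, ignore.case = TRUE)
  }
  df[keep, , drop = FALSE]
}

# Roles at the California HQ, roles in one department, and Cloud roles.
hq_roles    <- filter_roles(jobs, location_text = "Mountain View, CA")
sales_roles <- filter_roles(jobs, category = "Sales & Account Management")
cloud_roles <- filter_roles(jobs, title_text = "google cloud")
print(nrow(hq_roles))
```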

Thanks and happy coding!

 

(Please note that this post has been reposted from the main blog site at http://blog.journeyofanalytics.com/ )
