useR 2015: Networks

July 1, 2015
(This article was first published on Why? » R, and kindly contributed to R-bloggers)

These are my initial notes from useR 2015. Will revise when I have time.

fbRads: Analyzing and managing Facebook ads from R (Gergely Daroczi)

Modern advertising

Google, Amazon and Facebook all use our information to target adverts.

R packages exist for the main ad platforms: RAdwords for Google and fbRads for Facebook. You can use the Facebook API to get information from Facebook. The API works with hashes of email addresses, not the actual addresses. The API has changed over the last few years.

Example

  1. Grab useRs' email addresses from CRAN and the R-help mailing list.
  2. Create a Facebook app with API access to get a token.
  3. Create a custom audience.
  4. Create lookalike audiences: get Facebook users who are similar to my target list.
  5. Define audience, ad and budget.
  6. Upload an image and description.
  7. Run A/B testing.
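
The hashing step mentioned above could look roughly like the sketch below. It is not taken from the talk: it assumes the digest package is installed and that the addresses are trimmed, lower-cased and hashed with SHA-256 before upload, which should be checked against Facebook's current documentation. The helper name hash_email is hypothetical.

```r
library(digest)

# Hypothetical helper: normalise each address, then hash it.
# Assumption: SHA-256 over the trimmed, lower-cased string.
hash_email <- function(email) {
  normalised <- tolower(trimws(email))
  vapply(normalised, digest, character(1),
         algo = "sha256", serialize = FALSE, USE.NAMES = FALSE)
}

# Made-up addresses; both spellings should give the same hash.
hashes <- hash_email(c(" Alice@Example.org ", "alice@example.org"))
```

Only the hashes would then be passed on, so the raw addresses never leave your machine.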

The performance metrics API is still being developed.

Web scraping with R – A fast track overview. (Peter Meißner)

There are a number of R packages for web-scraping.

Two problems:

  • Download: protocols/procedures, e.g. HTTP, cookies, POST, GET
  • Extraction: parsing/extraction/cleansing, e.g. getting XML, JSON and HTML into R

Reading text from the Web

The simplest solution is to use readLines, then use some regular expressions (either with base R or stringr or …).
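
A minimal base R sketch of this approach is given below. To keep it runnable offline, a textConnection stands in for the URL you would normally pass to readLines; the page content, the addresses and the deliberately simple email pattern are all made up for illustration.

```r
# Stand-in for readLines("http://..."): a tiny, made-up page.
con <- textConnection(c(
  "<html><body>",
  "<p>Contact: alice@example.org or bob@example.com</p>",
  "</body></html>"
))
page <- readLines(con)
close(con)

# A simple (not RFC-complete) pattern for email addresses.
pattern <- "[[:alnum:]._-]+@[[:alnum:].-]+\\.[[:alpha:]]+"

# gregexpr finds every match per line; regmatches pulls them out.
emails <- unlist(regmatches(page, gregexpr(pattern, page)))
```

With stringr, the last step could be written with str_extract_all instead.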

Reading HTML/XML

Use rvest to parse the page, and xml_structure to view the structure of the document. To extract text, we need to use XPath (still using rvest). Within rvest there are a number of convenience functions, e.g. html_table to get a list of tables.
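
As a rough sketch of these steps, the example below parses a small, made-up HTML string rather than a live page. It assumes a recent rvest, where xml_structure is called from the xml2 package that rvest builds on; the table content just reuses names from these notes.

```r
library(rvest)

# Made-up page, passed as a literal string instead of a URL.
html <- '<html><body>
  <h1>Talks</h1>
  <table>
    <tr><th>speaker</th><th>package</th></tr>
    <tr><td>Daroczi</td><td>fbRads</td></tr>
    <tr><td>Csardi</td><td>igraph</td></tr>
  </table>
</body></html>'

page <- read_html(html)

# Print the nesting of tags to see what is worth extracting.
xml2::xml_structure(page)

# XPath query for the heading text.
heading <- html_text(html_nodes(page, xpath = "//h1"))

# html_table on a whole document returns a list, one entry per table.
tables <- html_table(page)
```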

JSON

Use jsonlite to translate JSON to a data frame.
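
For instance, a JSON array of records becomes a data frame with one row per record. The JSON string below is made up for illustration (it just pairs packages from these notes with their tasks).

```r
library(jsonlite)

# Made-up JSON: an array of objects with the same fields.
json <- '[
  {"package": "rvest",     "task": "HTML/XML"},
  {"package": "jsonlite",  "task": "JSON"},
  {"package": "RSelenium", "task": "browser automation"}
]'

# fromJSON simplifies an array of records to a data frame by default.
tools <- fromJSON(json)
```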

HTML forms/HTTP methods

Use httr and rvest packages.

Overcoming the Javascript Barrier

Use RSelenium for browser automation.

Conclusion

  • Don't use Windows for web scraping. Use Linux (or if you must, a Mac)
  • Start with stringr, rvest, jsonlite
  • You need to learn regular expressions and file manipulation
  • Before scraping, look for the download button

multiplex: Analysis of Multiple Social Networks with Algebra (Antonio Rivero Ostoic)

Motivation

  • multiplex is a package designed to perform algebraic analyses of multiple networks (but isn't limited to algebra)
  • The function zbind creates multivariate network data from arrays
  • perm manipulates network data

Two-mode networks are represented in a Galois framework. This makes analysis easier(?)

What's new in igraph and networks (Gabor Csardi)

Abstract

igraph is the premier R package for the analysis of network data. It went through a major restructuring recently and has changed a lot since the last time it was featured at useR! in 2008. This talk introduces the new/updated features of igraph:

  • Simplified ways of graph manipulation.
  • New methods for community detection.
  • New layouts for graph visualization.
  • New statistical methods: graphlets, embeddings, graph matching, cohesive blocks, etc.
  • How to use igraph graphs with new visualization tools: DiagrammeR, D3, etc.

The igraph package deals mainly with infrastructure. It's actually a C library, with R and Python interfaces.

What's new: [ and [[

The [ operator makes the graph behave like an adjacency matrix. For example, to check if an edge exists, use air["BOS", "SFO"]. It can also be used to manipulate the network, e.g. to add or remove edges.

The [[ operator can be used to get all adjacent vertices.
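
A small sketch of both operators follows. The air graph from the talk is not available here, so a toy network is built with graph_from_literal; the routes are invented purely for illustration.

```r
library(igraph)

# Toy undirected network; the routes are made up.
air <- graph_from_literal(BOS - SFO, BOS - JFK, JFK - LAX)

# Query like an adjacency matrix: non-zero means the edge exists.
has_bos_sfo <- air["BOS", "SFO"]
has_bos_lax <- air["BOS", "LAX"]

# Assign to add or remove edges.
air["SFO", "LAX"] <- 1       # add an edge
air["BOS", "JFK"] <- FALSE   # remove an edge

# [[ returns the vertices adjacent to BOS.
bos_neighbours <- air[["BOS"]]
```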

What's new: consistent function names and manipulators

  • make_*, sample_*, cluster_*, layout_*, graph_from_*
  • manipulators: make_ and sample_
  • Pipe friendly syntax
  • Easier connection to other packages, e.g. networkD3
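
The naming scheme and the pipe-friendly style can be sketched as below. The particular functions are picked for illustration only, and the example uses R's native |> pipe (R 4.1 or later); at the time of the talk the magrittr %>% pipe would have been used in the same way.

```r
library(igraph)

# make_*: deterministic graphs, modified in a pipeline.
g <- make_ring(10) |> add_edges(c(1, 6))

# sample_*: random graphs (here an Erdos-Renyi G(n, p) graph).
s <- sample_gnp(20, 0.2)

# cluster_*: community detection.
cl <- cluster_fast_greedy(g)

# layout_*: vertex coordinates for plotting.
lay <- layout_in_circle(g)

# graph_from_*: build graphs from other data structures.
el <- data.frame(from = c("a", "b"), to = c("b", "c"))
g2 <- graph_from_data_frame(el, directed = FALSE)
```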

Current work

  • Better connection to other packages
  • Inference
  • Infrastructure cleanup

Please note that the notes/talks section of this post is merely my notes on the
presentation. I may have made mistakes: these notes are not guaranteed to be
correct. Unless explicitly stated, they represent neither my opinions nor the
opinions of my employers. Any errors you can assume to be mine and not the
speaker’s. I’m happy to correct any errors you may spot – just let me know!
