Site icon R-bloggers

Medium + r-bloggers — How to integrate?

[This article was first published on Stories by Sebastian Wolf on Medium, and kindly contributed to R-bloggers]. (You can report issue about the content on this page here)
Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.

Medium + r-bloggers — How to integrate?

Build up a PHP script that allows you to post your Medium articles on r-bloggers.com. The script filters an RSS feed by item tags.

Photo by Ato Aikins on Unsplash

Motivation

I started my blog about R on Medium. Medium is a wonderful platform with a great user interface. The idea to use it came by reading a blog post from Joe Cheng. After I wrote about two or three articles about R I wanted to publish them on r-bloggers.com. r-bloggers.com requires an RSS feed of your blog. This is a simple feature if you are using WordPress or Bloggers. In Medium I noticed my feed contains everything that I’m writing. It also includes short comments of mine (https://medium.com/feed/@zappingseb):

The maintainers of r-bloggers.com did not want such comments inside their website. So I started reading on filtering my articles by certain tags. The only way I found to realize this was inside a PHP script.

Read RSS to PHP

To read my RSS feed into a PHP variable I used the built-in XML reader:

header('Content-Type: text/xml');
$xml = new DOMDocument;
$url = "https://medium.com/feed/@zappingseb";
$xml->load($url);

Iterate over the RSS file

The next step was to get every single item (blog entry) out of the XML. To start an iteration I first needed to get the items into an array. Therefore I made the XML a book. The documentElement function will create such a book. From this book, I extracted all items inside the channel tag.

$book = $xml->documentElement;
// we retrieve the chapter and remove it from the book
$channel= $book->getElementsByTagName('channel')->item(0);
$items = $channel->getElementsByTagName('item');

To store all filtered items, I created an empty array.

$domArray = array(); //set up an array to catch all our node

I found a wonderful stackoverflow entry on how to filter keywords from RSS. It provided me with a function to lookup keywords. So I looped over all items of my Medium RSS feed. For each item, I got the categories inside the tag category . Iterating over the categories I checked if the category “r” was inside. If so I set the variable $found to true and stop the iteration on categories. If the $found variable was never set to true, I added the article to the domArray . All items in the domArray must be filtered out.

//For each article you can derive whether it contains a keyword in 
//it’s tags or not
foreach ($items as $i){

    $categories = $i->getElementsByTagName('category');
    $found = false;
    foreach($categories as $category){
        if($category->nodeValue == "r"){
           $found = true;
           break;
        }
    }
    if(!$found){
       $domArray[] = $i;
       
    }
}

Filtering out articles

It took me a while to understand how PHP handles XML files internally. After some time I found out I can just delete all items within one loop. It was not possible to do it in my items loop. So I inserted this loop at the end of my file:

foreach($domArray as $i){ 
    $channel->removeChild($i);
}

It takes the $channel variable where all items are stored. Afterwards, it deletes the ones that did not contain the keyword r .

Printing out the feed

A difference between PHP and R is that $channel was a copy by reference from $xml . This means the code did not only change the $channel variable, but also the $xml variable. This is the reason I can immediately write out the $xml variable. This command produces the filtered RSS feed.

echo $xml->saveXML();

Final learning

For me, it was a hard task to get to r-bloggers.com. I simply had to pull out all my old PHP skills and google a lot. Finally, I made it. I was really happy to see my post. I hope more people can benefit from this blog entry. It would be a pleasure to see more Medium content on r-bloggers.com.

Dear Reader: It’s always a pleasure to write about my work on R programming. I thank you for reading until the end of this article. If you liked the article, you can clap for it on Medium or star the repository on github. In case of any comment, leave it here or on my LinkedIn profile http://linkedin.com/in/zappingseb.

Final file:

I’m hosting the script of this tutorial on my PHP server. You can find it in the gist below. It is hosted at: http://mail-wolf.de/documents/medium/


Medium + r-bloggers — How to integrate? was originally published in Data Driven Investor on Medium, where people are continuing the conversation by highlighting and responding to this story.

To leave a comment for the author, please follow the link and comment on their blog: Stories by Sebastian Wolf on Medium.

R-bloggers.com offers daily e-mail updates about R news and tutorials about learning R and many other topics. Click here if you're looking to post or find an R/data-science job.
Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.