
Re-Release: `traktok`

[This article was first published on Johannes B. Gruber, and kindly contributed to R-bloggers].

I’m happy to announce that traktok, my package to get content from TikTok, has returned from the dead. That’s slightly exaggerated, because it has always worked in some shape or form, but until about September the most recent version on GitHub had very limited functionality. I have now extended the package substantially and given it an appealing home on a pkgdown site: https://jbgruber.github.io/traktok/.

The main issue I had before, namely that some requests to the unofficial TikTok API need to be signed, remains unresolved. But the remaining functions are now much more stable. I have also moved the ‘authentication’ for the unofficial API to a separate package, cookiemonster, since it seemed silly to maintain two different code bases for using cookies in R (the other being in paperboy, which I will discuss here soon).

What is new, however, is that traktok now supports the Research API! This was also a challenge, because it required me to decide on a new naming scheme. I landed on keeping most of the functions but writing separate versions depending on whether you have Research API access or not. I think most analysis projects will profit from being able to mix and match functions from both APIs:

Description                   Shorthand          Research API       Hidden API
search videos                 tt_search          tt_search_api      tt_search_hidden
get video detail (+file)      tt_videos          -                  tt_videos_hidden
get user videos               tt_user_info       tt_user_info_api   -
get comments under a video    tt_comments        tt_comments_api    -
get who follows a user        tt_get_follower    -                  tt_get_follower_hidden
get who a user is following   tt_get_following   -                  tt_get_following_hidden
get raw video data            -                  -                  tt_request_hidden
authenticate a session        -                  auth_research      auth_hidden
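
To make the naming scheme concrete, here is a minimal sketch of how a shorthand function could choose between the two back ends. It is not traktok's actual implementation: `search_api_stub()`, `search_hidden_stub()`, `search_shorthand()` and the two logical arguments are hypothetical names invented for this illustration, and the stubs only report which back end would have been called.

```r
# Hypothetical stand-ins for a Research API and a hidden API function.
# They do not contact TikTok; they only report which back end was chosen.
search_api_stub    <- function(query) paste("research API:", query)
search_hidden_stub <- function(query) paste("hidden API:", query)

# Hypothetical shorthand: prefer the Research API when a token is available,
# fall back to the hidden API when only cookies are available, and fail
# with a clear message when neither kind of authentication is set up.
search_shorthand <- function(query,
                             has_research_token = FALSE,
                             has_cookies = FALSE) {
  if (has_research_token) {
    search_api_stub(query)
  } else if (has_cookies) {
    search_hidden_stub(query)
  } else {
    stop("No authentication found for either API.", call. = FALSE)
  }
}

search_shorthand("#rstats", has_research_token = TRUE)
# returns "research API: #rstats"
search_shorthand("#rstats", has_cookies = TRUE)
# returns "hidden API: #rstats"
```

The point of such a layout is that analysis code can call the shorthand and stay unchanged when access to one of the APIs is gained or lost, while the suffixed functions remain available when a specific back end is wanted.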

You can install the package from GitHub. I’m not sure it will ever be released on CRAN, since they might not be happy with the reverse engineering of a hidden API (but let me know if you think otherwise).

pak::pak("JBGruber/traktok")

For a very quick demonstration, let’s look up some videos about R on TikTok (this will only work after authenticating):

library(traktok)
rstats_vids_urls <- tt_search("#rstats", max_pages = 1L, verbose = FALSE)
rstats_vids_urls
## # A tibble: 12 × 20
##    video_id            video_timestamp     video_url    video_length video_title
##    <chr>               <dttm>              <glue>              <int> <chr>      
##  1 7115114419314560298 2022-06-30 19:17:53 https://www…          135 "R for Beg…
##  2 7213413598998056234 2023-03-22 16:49:12 https://www…            6 "R and me …
##  3 7252226153828584731 2023-07-05 07:01:45 https://www…           36 "Wow!!! TH…
##  4 7306893853297052960 2023-11-29 14:40:06 https://www…           36 "Don't Mak…
##  5 7242068680484408581 2023-06-07 22:05:16 https://www…           34 "R GRAPHIC…
##  6 7257689890245201153 2023-07-20 00:23:40 https://www…           56 "Pie chart…
##  7 7302970667907992850 2023-11-19 00:56:09 https://www…          163 "What is c…
##  8 7249335886255738158 2023-06-27 12:05:54 https://www…            5 "#CapCut #…
##  9 7278304897911491872 2023-09-13 13:40:21 https://www…           36 "Quick R Q…
## 10 7293317457035431210 2023-10-24 00:36:48 https://www…            9 "#CapCut #…
## 11 7274045053889285419 2023-09-02 02:10:05 https://www…           91 "Easily cr…
## 12 7167010863784693035 2022-11-17 15:42:56 https://www…           58 "Here’s an…
## # ℹ 15 more variables: video_diggcount <int>, video_sharecount <int>,
## #   video_commentcount <int>, video_playcount <int>, video_is_ad <lgl>,
## #   author_name <chr>, author_nickname <chr>, author_followercount <int>,
## #   author_followingcount <int>, author_heartcount <int>,
## #   author_videocount <int>, author_diggcount <int>, music <list>,
## #   challenges <list>, download_url <chr>
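
Since the search result is an ordinary table, it can be narrowed down before anything is downloaded. The sketch below copies the `video_id` and `video_length` values from the printed result above into a plain data frame, which stands in for the tibble, and keeps only the shorter clips. It assumes that `video_length` is measured in seconds, which the output does not state, and the 60 threshold is an arbitrary choice for illustration.

```r
# video_id and video_length copied from the printed search result above.
# IDs are kept as character strings because they exceed R's integer range.
vids <- data.frame(
  video_id = c(
    "7115114419314560298", "7213413598998056234", "7252226153828584731",
    "7306893853297052960", "7242068680484408581", "7257689890245201153",
    "7302970667907992850", "7249335886255738158", "7278304897911491872",
    "7293317457035431210", "7274045053889285419", "7167010863784693035"
  ),
  video_length = c(135L, 6L, 36L, 36L, 34L, 56L, 163L, 5L, 36L, 9L, 91L, 58L),
  stringsAsFactors = FALSE
)

# Keep only videos with a length below 60 (assumed to be seconds).
short_vids <- vids[vids$video_length < 60L, ]

nrow(short_vids)
# returns 9
```

With the real result, the same subsetting would apply to `rstats_vids_urls`, and the filtered `video_url` column could then be handed on to the download step.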

If you want to download these videos as well:

dir.create("videos", showWarnings = FALSE)
tt_videos(rstats_vids_urls$video_url, dir = "videos", verbose = FALSE)
## # A tibble: 12 × 19
##    video_id            video_url    video_timestamp     video_length video_title
##    <glue>              <chr>        <dttm>                     <int> <chr>      
##  1 7115114419314560298 https://www… 2022-06-30 19:17:53          135 "R for Beg…
##  2 7213413598998056234 https://www… 2023-03-22 16:49:12            6 "R and me …
##  3 7252226153828584731 https://www… 2023-07-05 07:01:45           36 "Wow!!! TH…
##  4 7306893853297052960 https://www… 2023-11-29 14:40:06           36 "Don't Mak…
##  5 7242068680484408581 https://www… 2023-06-07 22:05:16           34 "R GRAPHIC…
##  6 7257689890245201153 https://www… 2023-07-20 00:23:40           56 "Pie chart…
##  7 7302970667907992850 https://www… 2023-11-19 00:56:09          163 "What is c…
##  8 7249335886255738158 https://www… 2023-06-27 12:05:54            5 "#CapCut #…
##  9 7278304897911491872 https://www… 2023-09-13 13:40:21           36 "Quick R Q…
## 10 7293317457035431210 https://www… 2023-10-24 00:36:48            9 "#CapCut #…
## 11 7274045053889285419 https://www… 2023-09-02 02:10:05           91 "Easily cr…
## 12 7167010863784693035 https://www… 2022-11-17 15:42:56           58 "Here’s an…
## # ℹ 14 more variables: video_locationcreated <chr>, video_diggcount <int>,
## #   video_sharecount <int>, video_commentcount <int>, video_playcount <int>,
## #   author_username <chr>, author_nickname <chr>, author_bio <chr>,
## #   download_url <chr>, html_status <int>, music <list>, challenges <list>,
## #   is_classified <lgl>, video_fn <chr>
tibble::tibble(file = list.files("videos"),
               size_Mb = file.size(list.files("videos", full.names = TRUE)) / 1000000)
## # A tibble: 12 × 2
##    file                                            size_Mb
##    <chr>                                             <dbl>
##  1 learningcast_video_7167010863784693035.mp4        8.64 
##  2 learningcast_video_7249335886255738158.mp4        0.312
##  3 learningcast_video_7293317457035431210.mp4        0.598
##  4 mattdancho_video_7115114419314560298.mp4          2.60 
##  5 mrpecners_video_7274045053889285419.mp4           2.07 
##  6 sillysciencelady_video_7213413598998056234.mp4    0.612
##  7 smooth.learning.c_video_7257689890245201153.mp4   1.78 
##  8 smooth.learning.c_video_7302970667907992850.mp4   5.22 
##  9 statisticsglobe_video_7242068680484408581.mp4     1.90 
## 10 statisticsglobe_video_7252226153828584731.mp4     1.82 
## 11 statisticsglobe_video_7278304897911491872.mp4     1.99 
## 12 statisticsglobe_video_7306893853297052960.mp4     1.64
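
The file names in this listing appear to follow the pattern `<author>_video_<video_id>.mp4`. Assuming that pattern holds (it is inferred from the listing above, not from the package documentation), a minimal sketch in base R can split the names back into author and video ID, for example to join the files to the metadata. The four file names are copied from the listing.

```r
# A few file names copied from the directory listing above.
files <- c(
  "learningcast_video_7167010863784693035.mp4",
  "mattdancho_video_7115114419314560298.mp4",
  "smooth.learning.c_video_7257689890245201153.mp4",
  "statisticsglobe_video_7242068680484408581.mp4"
)

# Assumed naming pattern: <author>_video_<digits>.mp4
# Group 1 captures the author name, group 2 the numeric video ID.
pattern <- "^(.*)_video_([0-9]+)\\.mp4$"

parsed <- data.frame(
  file     = files,
  author   = sub(pattern, "\\1", files),
  video_id = sub(pattern, "\\2", files),  # kept as character, too large for integer
  stringsAsFactors = FALSE
)

parsed
```

Keeping `video_id` as a character string means the IDs are never converted to numbers and so cannot be altered by rounding.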

And with these two commands, you already have a small TikTok dataset to play with 📊🚀!
