Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.

random_files <- function(path, percent_number, pattern = "wav$|WAV$"){
####################################################################
# path = path to folder with files to select
#
# percent_number = percentage or number of recordings to select. If value is
#   between 0 and 1 percentage of files is assumed, if value greater than 1,
#   number of files is assumed
#
# pattern = file extension to select. By default it selects wav files. For
#   other type of files replace wav and WAV by the desired extension
####################################################################

# Get file list with full path and file names
files <- list.files(path, full.names = TRUE, pattern = pattern)
file_names <- list.files(path, pattern = pattern)

# Select the desired % or number of file by simple random sampling
randomize <- sample(seq(files))
files2analyse <- files[randomize]
names2analyse <- file_names[randomize]
if(percent_number <= 1){
size <- floor(percent_number * length(files))
}else{
size <- percent_number
}
files2analyse <- files2analyse[(1:size)]
names2analyse <- names2analyse[(1:size)]

# Create folder to output
results_folder <- paste0(path, '/selected')
dir.create(results_folder, recursive=TRUE)

# Write csv with file names
write.table(names2analyse, file = paste0(results_folder, "/selected_files.csv"),
col.names = "Files", row.names = FALSE)

# Move files
for(i in seq(files2analyse)){
file.rename(from = files2analyse[i], to = paste0(results_folder, "/", names2analyse[i]) )
}
}

I normally use this function inside a little script for some extra functionalities:

1. first I set up the environment by sourcing the required functions and loading the packages,
2. as I always do when using functions that use randomness, I set a seed to be able to reproduce my results in a later time,
3. as the function uses a folder path, I’ve included a little search window with tcltk to select the folder instead of having to write the full path by hand.
# Load packages and functions
require(tcltk)
source("random_files.R")

# Set seed to reproduce results
set.seed(1001)

# Select folder with recordings
path <- tcltk::tk_choose.dir()

# Percentage or number of recordings
percent_number <- 0.2 # using percentage

# Random sampling of files
random_files(path, percent_number, pattern = "wav$|WAV$")

This function was written for a specific purpose but with some tweaks you can probably adapt it for other purposes other than the one I use it for.

You can find this and other R scripts at: https://github.com/bmsasilva/Rscripts