With help from LLMs (e.g. DeepSeek): an R function that filters a list of vectors using indices from the unlisted data
Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.
In response to user queries, large language models (LLMs) such as ChatGPT and DeepSeek can generate useful code, having been trained on large amounts of code and text available on the Internet. However, at least in my experience, the generated scripts are not always correct or useful. Sometimes I get more useful code if I continue the discussion and refine my queries for a while!
The following is one of those occasions when the code DeepSeek generated on the first attempt was very useful to me. This may have been because the problem was clear and easy to define precisely. The request was to generate an R function that filters a list of vectors (e.g. lst) using indices that refer to positions in the unlisted vector. I noticed that the generated code used two functions that I don't use frequently in R (I use alternative methods to achieve the same outcomes!). These are: 1. the seq_along function and 2. the split function called with two vectors as arguments.
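For readers who, like me, rarely use these two functions, here is a minimal sketch of what they do. The vectors x and g are made up for illustration and are not part of the generated function.

```r
# Toy data (hypothetical, for illustration only)
x <- c(10, 20, 30, 40, 50)
g <- c("b", "a", "b", "a", "c")

# seq_along(x) returns the positions 1..length(x)
idx <- seq_along(x)          # 1 2 3 4 5

# Unlike 1:length(x), it is safe for empty input
seq_along(numeric(0))        # integer(0)
1:length(numeric(0))         # 1 0 (usually not what is wanted)

# split(values, groups) puts each value into the list element named
# after its group label; the groups come back sorted by label
parts <- split(x, g)
parts$a                      # 20 40
parts$b                      # 10 30
parts$c                      # 50
```

Because split sorts the groups by label, any code that relies on the original order of the list has to restore that order afterwards.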
How has your experience been with using LLMs (e.g. ChatGPT and DeepSeek) for coding?
What apps or online tools do you use for this purpose?
When do you get more useful code?
How do you phrase your queries to get better code?
Here is the function and two use cases. Note that I slightly modified it to improve the handling of lists without element names and of cases where the filtering leaves empty vectors within the list.
#Defining the function
filter_list_by_unlisted_indices <- function(lst, remove_indices, remove_empty = TRUE) {
  # Check for and deal with lists that don't have names
  lstBkup <- lst
  if (is.null(names(lstBkup)))
    names(lst) <- paste0("n", seq_along(lst))
  # Unlist the values (names dropped) and record which list element each value came from
  utest <- unlist(lst, use.names = FALSE)
  names_vec <- rep(names(lst), lengths(lst))
  # Get the indices to keep
  all_indices <- seq_along(utest)
  keep_indices <- setdiff(all_indices, remove_indices)
  # Split the kept values back into the list structure
  kept_values <- utest[keep_indices]
  kept_names <- names_vec[keep_indices]
  # Reconstruct the list; using all original names as factor levels keeps
  # emptied elements as zero-length vectors, so remove_empty = FALSE works
  filtered_list <- split(kept_values, factor(kept_names, levels = unique(names(lst))))
  # Preserve original order and structure
  selNames <- names(lst)
  if (remove_empty)
    selNames <- selNames[selNames %in% unique(kept_names)]
  filtered_list <- filtered_list[selNames]
  # Set NULL names for the output list if lst lacks names
  if (is.null(names(lstBkup)))
    names(filtered_list) <- NULL
  return(filtered_list)
}
#Using the function
(tmp=filter_list_by_unlisted_indices(lst=list(a=1:3,b=4:14,c=15:20), remove_indices=c(3:15), remove_empty=TRUE))
#$a
#[1] 1 2
#$c
#[1] 16 17 18 19 20
(tmp=filter_list_by_unlisted_indices(lst=list(1:3,4:14,15:20), remove_indices=c(3:15), remove_empty=TRUE))
#[[1]]
#[1] 1 2
#[[2]]
#[1] 16 17 18 19 20
My typo-filled DeepSeek query was as follows:
Hi! Let’s say that in R programming, I have a list of vectors called test that is defined as test=list(a=1:5, b=6:9, c=10:20) and the vectorized list being utest=unlist(test). How can I remove elements of vectors within the list test if I have the filtering indexes from the utest (i.e. vectorized list). As for instance how can I remove the elements from the ‘test’ list that correspond to 3rd, 5th, 18th and 20th elements of utest. Please wirte an function that does the filtering for any list and index vector of the vectorized list (and reutrns the filtered list).
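The example in the query can be worked through by hand with a compact sketch like the one below. This is not the function shown earlier but a shortened variant of the same idea; the helper variables owner, keep and groups are names I made up for illustration.

```r
# The list and indices from the query
test <- list(a = 1:5, b = 6:9, c = 10:20)
utest <- unlist(test, use.names = FALSE)
remove_indices <- c(3, 5, 18, 20)

# owner[i] is the position in `test` of the element that utest[i] came from
owner <- rep(seq_along(test), lengths(test))

# Positions of utest that survive the filtering
keep <- setdiff(seq_along(utest), remove_indices)

# A factor with one level per list element keeps the original order and
# keeps elements that end up empty as zero-length vectors
groups <- factor(owner[keep], levels = seq_along(test))
filtered <- split(utest[keep], groups)
names(filtered) <- names(test)

filtered
# $a: 1 2 4
# $b: 6 7 8 9
# $c: 10 11 12 13 14 15 16 17 19
```

Grouping by position instead of by name avoids having to invent temporary names for unnamed lists, at the cost of always keeping emptied elements unless they are dropped in a separate step.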
Of course, the picture was generated using yet another LLM-based tool, i.e. Grok! The query used was: Draw “Language Learning models generating programming codes using machine learning”.