Yesterday I learned an unexpected but interesting use of the highr package
from a GitHub issue. This package is
intended for syntax highlighting R code, but the user wanted to identify
function calls in a given piece of R code. His approach was to syntax highlight
the code first, and then look for the LaTeX command \kwd{} in the result. I told
him that this task could be done with getParseData(), but there were a few edge
cases. For example:
getParseData(parse(text = c('lapply(1:10, paste)')))
   line1 col1 line2 col2 id parent                token   text
1      1    1     1    6  1      3 SYMBOL_FUNCTION_CALL lapply
2      1    7     1    7  2     20                  '('      (
4      1    8     1    8  4      5            NUM_CONST      1
6      1    9     1    9  6     10                  ':'      :
7      1   10     1   11  7      8            NUM_CONST     10
9      1   12     1   12  9     20                  ','      ,
14     1   14     1   18 14     16               SYMBOL  paste
15     1   19     1   19 15     20                  ')'      )
In this case, lapply was correctly identified as SYMBOL_FUNCTION_CALL, but
paste was not (instead, it was identified as a SYMBOL). We can try to
evaluate the symbol and check if it is a function:
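find_funs = function(code) {
  env = parent.frame()  # look up symbols where find_funs() was called
  # keep.source = TRUE makes parse data available in non-interactive sessions, too
  d = getParseData(parse(text = code, keep.source = TRUE))
  # tokens that the parser already marks as function calls
  f = d[d$token == 'SYMBOL_FUNCTION_CALL', 'text']
  # any other symbol counts only if it evaluates to a function
  for (s in d[d$token == 'SYMBOL', 'text']) {
    tryCatch({
      ev = eval(as.symbol(s), env)
      if (is.function(ev)) f = c(f, s)
    }, error = function(e) NULL)  # ignore symbols that cannot be evaluated
  }
  f
}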
Then find_funs('lapply(1:10, paste)') can find both lapply and paste.
One caveat is that this approach doesn’t evaluate the code but simply parses it,
so it won’t be able to recognize functions in add-on packages by default. One
way to address this problem is to detect library() or require() calls and
search for possible function names in packages. This won’t be totally robust
(e.g., for the case library(x, character.only = TRUE)). Another way is to
actually evaluate the code before trying to detect if a symbol is a function,
which is more expensive.
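The first idea (detecting library() or require() calls and searching the packages) could be sketched roughly as follows. The helper names find_pkgs() and is_pkg_fun() are hypothetical, and the sketch assumes the calls are at the top level with the package given as the first argument. Calls that use character.only are skipped rather than guessed.

```r
# Hypothetical helper: collect package names from top-level library(pkg) or
# require(pkg) calls; calls with character.only are skipped on purpose.
find_pkgs = function(code) {
  pkgs = character()
  for (e in parse(text = code, keep.source = FALSE)) {
    if (!is.call(e) || length(e) < 2) next
    fn = e[[1]]
    if (!is.symbol(fn) || !as.character(fn) %in% c('library', 'require')) next
    if ('character.only' %in% names(e)) next
    arg = e[[2]]
    # accept only a bare name or a string literal as the package
    if (is.symbol(arg) || is.character(arg)) pkgs = c(pkgs, as.character(arg))
  }
  unique(pkgs)
}

# Hypothetical helper: is `name` an exported function of any installed package
# in `pkgs`? Note that this loads the namespaces but does not attach them.
is_pkg_fun = function(name, pkgs) {
  for (p in pkgs) {
    if (!requireNamespace(p, quietly = TRUE)) next
    if (name %in% getNamespaceExports(p) &&
        is.function(getExportedValue(p, name))) return(TRUE)
  }
  FALSE
}

pkgs = find_pkgs(c('library(stats)', 'require("utils")',
                   'library(x, character.only = TRUE)'))
is_pkg_fun('median', pkgs)
```

A symbol found by getParseData() could then be checked with is_pkg_fun() when evaluating it in the current environment fails.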
Another edge case is function calls in glue::glue(), e.g., str_to_title and
as.character in the following case:
glue("This number is {str_to_title(as.character(123))}")
We can certainly detect glue calls and try to parse the glue templates. I’m
not interested in going that far, so I’ll just stop here.
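For the curious, a minimal sketch of that idea might look like the code below. The function name find_glue_funs() is hypothetical, and the sketch makes two simplifying assumptions: it scans every string constant instead of only the arguments of glue calls, and it only handles {...} chunks without nested braces, so escaped braces and custom delimiters are not treated specially.

```r
# Hypothetical helper: find function calls inside {...} chunks of string
# constants in the given code.
find_glue_funs = function(code) {
  d = getParseData(parse(text = code, keep.source = TRUE))
  funs = character()
  for (s in d[d$token == 'STR_CONST', 'text']) {
    # extract every {...} chunk that contains no nested braces
    m = regmatches(s, gregexpr('\\{[^{}]*\\}', s))[[1]]
    for (x in m) {
      x = substr(x, 2, nchar(x) - 1)  # drop the surrounding braces
      # chunks that are not valid R code are silently skipped
      d2 = tryCatch(
        getParseData(parse(text = x, keep.source = TRUE)),
        error = function(e) NULL
      )
      if (!is.null(d2)) {
        funs = c(funs, d2[d2$token == 'SYMBOL_FUNCTION_CALL', 'text'])
      }
    }
  }
  unique(funs)
}

res = find_glue_funs('glue("This number is {str_to_title(as.character(123))}")')
```

Here res should contain str_to_title and as.character. Neither glue nor stringr needs to be installed, since the code is only parsed.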
Besides getParseData(), I guess codetools::makeCodeWalker() might also work.
I first learned about it from Kohske ten years ago.
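Here is a rough sketch of what that could look like, assuming the codetools package (which ships with R) is available. The function name find_calls() is hypothetical. The walker records the name of each call whose function is a plain symbol, and ignores all leaves.

```r
library(codetools)

# Hypothetical helper: collect the names of all calls in the code by walking
# the parsed expressions instead of inspecting parse data.
find_calls = function(code) {
  funs = character()
  w = makeCodeWalker(
    call = function(e, w) {
      # record the function name when the call is of the form name(...)
      if (is.symbol(e[[1]])) funs <<- c(funs, as.character(e[[1]]))
      # then walk into the function position and the arguments
      for (ee in as.list(e)) if (!missing(ee)) walkCode(ee, w)
    },
    leaf = function(e, w) NULL  # symbols and constants are ignored
  )
  for (e in parse(text = code, keep.source = FALSE)) walkCode(e, w)
  unique(funs)
}

res = find_calls('lapply(1:10, paste)')
```

Note that operators such as : are calls in the syntax tree, so they are reported, too. Like the SYMBOL_FUNCTION_CALL approach, this does not catch paste, which is passed as an argument here rather than called.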