java - Regex to remove prefix of a pattern (R)
I have two strings, s and t. How can I use a regex to remove copies of prefixes of t from the end of s?
More specifically, s consists of some characters followed by copies of t, where the last copy may be truncated. For instance, if t is abcdef and s is asdjb|ak.fvajfabcdefabcdefabcdefabc, I want to get asdjb|ak.fvajf.
Additionally, s and t may contain characters that have special meaning to regex engines, such as .[]*+()\. I'm working in R, but a solution in Java would be fine too.
I believe this does it. It's a bit long.
    s <- "asdjb|ak.fvajfabcdefabcdefabcdefabc"
    t <- "abcdef"
    want <- "asdjb|ak.fvajf"

    # Build every non-empty prefix of t: "a", "ab", ..., "abcdef"
    sp <- strsplit(t, "")[[1]]
    pat <- sapply(seq_along(sp), function(i) {
      paste(sp[seq_len(i)], collapse = "")
    })
    # Match any run of those prefixes anchored at the end of the string
    pat <- paste0("(", paste(pat, collapse = "|"), ")*$")

    result <- gsub(pat, "", s)
    identical(result, want)
    [1] TRUE
If you want to process several vectors, rewrite the above as a function and use sapply (or lapply).
    repl <- function(x, prefix) {
      sp <- strsplit(prefix, "")[[1]]
      pat <- sapply(seq_along(sp), function(i) {
        paste(sp[seq_len(i)], collapse = "")
      })
      pat <- paste0("(", paste(pat, collapse = "|"), ")*$")
      result <- gsub(pat, "", x)
      result
    }

    # Apply repl() pairwise to a vector of strings and a vector of prefixes
    where <- rep(s, 10)
    pref <- rep(t, 10)
    sapply(seq_along(where), function(i) repl(where[[i]], pref[[i]]))
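Since the question also accepts Java and notes that s and t may contain regex metacharacters, here is a minimal Java sketch of the same alternation-of-prefixes idea. The R pattern above is assembled from the raw characters of t, so a t containing characters such as . or + would be read as regex syntax; in this sketch each prefix goes through Pattern.quote so it is matched literally. The names TrailingPrefixStripper, stripTrailingPrefixes and stripAll are made up for illustration and are not part of the answer above.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

// Hypothetical helper class; all names here are illustrative assumptions.
final class TrailingPrefixStripper {

    // Removes a trailing run of prefixes of t from s, treating t as literal text.
    static String stripTrailingPrefixes(String s, String t) {
        if (t.isEmpty()) {
            return s; // no prefixes to strip
        }
        StringBuilder alternatives = new StringBuilder();
        // One quoted alternative per non-empty prefix of t, longest first.
        for (int len = t.length(); len >= 1; len--) {
            if (alternatives.length() > 0) {
                alternatives.append('|');
            }
            alternatives.append(Pattern.quote(t.substring(0, len)));
        }
        // Same shape as the R pattern: (p1|p2|...)*$ anchored at the end.
        Pattern pattern = Pattern.compile("(?:" + alternatives + ")*$");
        return pattern.matcher(s).replaceFirst("");
    }

    // Pairwise version, analogous to the sapply call over two vectors.
    static List<String> stripAll(List<String> inputs, List<String> prefixes) {
        if (inputs.size() != prefixes.size()) {
            throw new IllegalArgumentException("inputs and prefixes must have the same length");
        }
        List<String> out = new ArrayList<>(inputs.size());
        for (int i = 0; i < inputs.size(); i++) {
            out.add(stripTrailingPrefixes(inputs.get(i), prefixes.get(i)));
        }
        return out;
    }

    public static void main(String[] args) {
        String s = "asdjb|ak.fvajfabcdefabcdefabcdefabc";
        String t = "abcdef";
        System.out.println(stripTrailingPrefixes(s, t)); // prints asdjb|ak.fvajf
    }
}
```

Like the R version, this removes any trailing sequence of prefixes of t, which includes the case of whole copies followed by one truncated copy. The pattern relies on a backtracking engine, so for very long inputs a plain loop over the string may be a safer choice.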