How to regex match everything but long words?
I can select long words from a string: re.findall("[a-z]{3,}")
However, for a reason I can use substitution only. Hence I need to substitute everything except words of 3 or more letters with a space (e.g. abc de1 fgh ij -> abc fgh).
What would such a regex look like?
The result should be all "[a-z]{3,}" matches concatenated by spaces. However, I can use substitution only.
Or in Python: find a regex such that
re.sub(regex, " ", text) == " ".join(re.findall("[a-z]{3,}", text))
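To make the target concrete, here is a minimal sketch of the right-hand side of that equation only, applied to the example above. The helper name `expected` is hypothetical and used just for illustration.

```python
import re

def expected(text):
    # Right-hand side of the equation: keep only runs of 3 or more
    # lowercase letters and join them with single spaces.
    return " ".join(re.findall("[a-z]{3,}", text))

print(expected("abc de1 fgh ij"))  # abc fgh
```

Whatever regex is chosen, the substitution on the left-hand side has to reproduce this string.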
Here are some test cases:

    import re

    solution_regex = "..."
    for test_str in ["aaa aa aaa aa",
                     "aaa aa11",
                     "11aaa11 11aa11",
                     "aa aa1aa aaaa"]:
        # What the substitution is supposed to produce
        expected_str = " ".join(re.findall("[a-z]{3,}", test_str))
        print(test_str, "->", expected_str)
        if re.sub(solution_regex, " ", test_str) != expected_str:
            print("error")

Expected results:

    aaa aa aaa aa -> aaa aaa
    aaa aa11 -> aaa
    11aaa11 11aa11 -> aaa
    aa aa1aa aaaa -> aaaa
Note that a space is no different from any other symbol.
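\b(?:[a-zA-Z_]{1,2}|\w*\d+\w*)\b
Explanation:
\b
means that the substring you are looking for starts and ends at a word border
(?: )
- non-capturing group
[a-zA-Z_]{1,2}
- a word of one or two letters or underscores
\w*\d+\w*
- a word that contains at least one digit and consists of digits, '_' and letters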
Here you can see the test.
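As a rough illustration of how this pattern behaves on the example from the question, here is a minimal sketch. The names `SHORT_OR_DIGIT_WORD` and `strip_short_words` are hypothetical, the character class is assumed to mean upper- and lowercase letters plus '_', and the whitespace-collapsing step is an addition that is not part of the answer.

```python
import re

# Pattern from the answer, written as a raw string. Assumption: the first
# alternative is meant to cover upper- and lowercase letters and '_'.
SHORT_OR_DIGIT_WORD = r"\b(?:[a-zA-Z_]{1,2}|\w*\d+\w*)\b"

def strip_short_words(text):
    # Replace every matched word (short, or containing a digit) with a space.
    replaced = re.sub(SHORT_OR_DIGIT_WORD, " ", text)
    # Collapse the resulting runs of spaces (extra step, not in the answer).
    return " ".join(replaced.split())

print(re.findall(SHORT_OR_DIGIT_WORD, "abc de1 fgh ij"))  # ['de1', 'ij']
print(strip_short_words("abc de1 fgh ij"))                # abc fgh
```

A bare re.sub leaves one space per removed word next to the original separators, so without the collapsing step the result is "abc   fgh  " rather than "abc fgh". Also, because \w matches digits, a token such as 11aaa11 is removed as a whole, whereas re.findall("[a-z]{3,}", ...) would still pick aaa out of it.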