How to regex match everything but long words?


I can select the long words in a string with: re.findall("[a-z]{3,}", text)

However, for some reason I can use substitution only. Hence I need to substitute everything except words of 3 or more letters with a space. (e.g. abc de1 fgh ij -> abc fgh)

What would such a regex look like?

The result should be all the "[a-z]{3,}" matches concatenated with spaces. However, I can use substitution only.

Or, in Python: find a regex such that

re.sub(regex, " ", text) == " ".join(re.findall("[a-z]{3,}", text)) 
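To make the target concrete, the right-hand side of that condition can be evaluated on the example string from above. This is only a small sketch; the variable names `long_words` and `expected` are illustrative and not part of the question.

```python
import re

# Example string from the question; "de1" and "ij" should disappear.
text = "abc de1 fgh ij"

# Runs of three or more lowercase letters, in order of appearance.
long_words = re.findall("[a-z]{3,}", text)

# The string that re.sub(regex, " ", text) is supposed to reproduce.
expected = " ".join(long_words)

print(long_words)  # ['abc', 'fgh']
print(expected)    # abc fgh
```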

Here are some test cases:

    import re

    solution_regex = "..."

    for test_str in ["aaa aa aaa aa",
                     "aaa aa11",
                     "11aaa11 11aa11",
                     "aa aa1aa aaaa"
                     ]:
        # What the substitution is supposed to produce
        expected_str = " ".join(re.findall("[a-z]{3,}", test_str))
        print(test_str, "->", expected_str)

        if re.sub(solution_regex, " ", test_str) != expected_str:
            print("error")

Expected results:

    aaa aa aaa aa -> aaa aaa
    aaa aa11 -> aaa
    11aaa11 11aa11 -> aaa
    aa aa1aa aaaa -> aaaa

Note that a space is no different from any other symbol.

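\b(?:[a-zA-Z_]{1,2}|\w*\d+\w*)\b

Explanation:

  • \b means the substring you are looking for starts and ends at a word boundary
  • (?: ) is a non-capturing group
  • [a-zA-Z_]{1,2} is a word of one or two letters or underscores
  • \w*\d+\w* is a word that contains at least one digit and consists of digits, '_' and letters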
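As a rough check, the pattern can be run against the test strings from the question. The sketch below is not part of the original answer. The character class is written here as `[a-zA-Z_]`, without commas, since a comma inside brackets would be matched literally. The helper `normalize` is a hypothetical addition that collapses the runs of spaces the substitution leaves behind, so the comparison is looser than the strict equality the question asks for.

```python
import re

# Pattern from the answer, with the character class written without commas
# (assumption: the commas were not meant to be matched literally).
pattern = r"\b(?:[a-zA-Z_]{1,2}|\w*\d+\w*)\b"


def normalize(s):
    # Hypothetical helper: collapse whitespace runs and trim both ends.
    return " ".join(s.split())


tests = ["aaa aa aaa aa", "aaa aa11", "11aaa11 11aa11", "aa aa1aa aaaa"]

results = {}
for test_str in tests:
    expected = " ".join(re.findall("[a-z]{3,}", test_str))
    actual = normalize(re.sub(pattern, " ", test_str))
    results[test_str] = (actual, expected)
    print(repr(test_str), "->", repr(actual), "| expected", repr(expected))
```

Under these assumptions, three of the four strings agree after normalization. The string "11aaa11 11aa11" does not: \w*\d+\w* treats 11aaa11 as a single word containing digits and removes it entirely, whereas the question counts digits as separators and expects aaa.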

Here you can see the test.
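Because the question treats every character other than a lowercase letter as a separator, a different pattern may fit the test cases more closely. The following is only a sketch and not the answer's method: it uses lookarounds to match maximal runs of non-letters and of one- or two-letter words. Every match becomes a space, so a stray leading or trailing space can remain. The `strip()` in the comparison is therefore an assumption that goes slightly beyond the substitution-only constraint. The name `alt_pattern` is illustrative.

```python
import re

# Sketch of an alternative (not from the original answer):
#   [^a-z]                         any character that is not a lowercase letter
#   (?<![a-z])[a-z]{1,2}(?![a-z])  a run of exactly one or two lowercase letters
# The trailing + merges adjacent unwanted pieces into a single match.
alt_pattern = r"(?:[^a-z]|(?<![a-z])[a-z]{1,2}(?![a-z]))+"

tests = ["aaa aa aaa aa", "aaa aa11", "11aaa11 11aa11", "aa aa1aa aaaa"]

checked = []
for test_str in tests:
    expected = " ".join(re.findall("[a-z]{3,}", test_str))
    # strip() removes the space left by a match at the start or end.
    actual = re.sub(alt_pattern, " ", test_str).strip()
    checked.append(actual == expected)
    print(repr(test_str), "->", repr(actual), "| expected", repr(expected))
```

Without the final `strip()`, strings that begin or end with unwanted characters keep one extra space, which a replacement by " " cannot avoid.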

