swift - Regex: catch a word at the start and end of a UITextView


I'm trying to catch when a word is used in a UITextView. I've got it working for words in the interior of the view.

The problem is when the word is first or last in the view. My code so far:

    private func filteredTermFor(_ word: String) -> String {
        let punctuationFilter = "([\\A|\\W|\\d|\\Z| ])"
        let wordInParens = "(\(word))"
        return punctuationFilter + wordInParens + punctuationFilter
    }

I checked and found that I should use ^ for the start of input and $ for the end of input. When I add either of these, for example:

    "([^|\\A|\\W|\\d|\\Z| ])"

they don't seem to have any effect when the word in question is first or last in the view.

*For the sake of being verbose with the question, the return value of the function above is being used as searchTerm in this:

    func highlightedTextInString(with searchTerm: String, targetString: String) -> NSAttributedString? {
        let attributedString = NSMutableAttributedString(string: targetString)
        do {
            let regex = try NSRegularExpression(pattern: searchTerm, options: .caseInsensitive)
            // NSRange counts UTF-16 code units, hence utf16.count
            let range = NSRange(location: 0, length: targetString.utf16.count)
            for match in regex.matches(in: targetString, options: .withTransparentBounds, range: range) {
                let fontColor = UIColor.red
                attributedString.addAttribute(NSForegroundColorAttributeName, value: fontColor, range: match.range)
            }
            return attributedString
        } catch _ {
            print("Error creating regular expression")
            return nil
        }
    }

** Edit ** Since this was marked as a duplicate: the question reported as the duplicate does not cover the edge cases where the word is typed next to a punctuation mark or a digit without spaces. For example: .word, word9, ?word?

Note that ([^|\\A|\\W|\\d|\\Z| ]) is a capturing group ((...)) containing a character class, which matches a single char defined inside it. The ^ right after [ makes the class a negated one, so it matches any char other than the one(s) defined in the set. So, [^|\\A|\\W|\\d|\\Z| ] matches a single char other than | (it is no longer an alternation operator inside a character class), A (the \ in front is not considered, it is omitted), a non-word char, a digit, Z, and a space. Effectively, it matches _ and all letters other than A and Z.
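The character-class behavior described above can be checked with a small sketch. The helper matchCount and the toy patterns are hypothetical, chosen only for illustration; they are not from the question. The last line shows why a class in front of the word cannot work at the start of the text: a class always has to consume one character.

```swift
import Foundation

// Hypothetical helper (not from the original post): counts matches of `pattern` in `text`.
func matchCount(_ pattern: String, in text: String) -> Int {
    guard let regex = try? NSRegularExpression(pattern: pattern, options: []) else { return 0 }
    let range = NSRange(location: 0, length: text.utf16.count)
    return regex.numberOfMatches(in: text, options: [], range: range)
}

// Inside a class, `|` is a literal char: "a", "|" and "b" each match.
let pipeIsLiteral = matchCount("[a|b]", in: "a|b")           // 3
// A leading `^` negates the class: only "c" is neither a nor b.
let negated = matchCount("[^ab]", in: "abc")                 // 1
// A class must consume a char, and there is none before "word" at offset 0.
let needsPrecedingChar = matchCount("[^ab]word", in: "word") // 0
```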

You state that the words you need to match may occur between word boundaries or digits.

You may use

    return "(?<![^\\W\\d])(\(word))(?![^\\W\\d])"

See the regex demo.
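As a rough check of the suggested pattern against the edge cases from the question, here is a minimal Foundation-only sketch. The helper names delimitedPattern and matchRanges are hypothetical. Escaping the word with NSRegularExpression.escapedPattern(for:) is an addition that is not part of the answer; it only keeps regex metacharacters in the word literal.

```swift
import Foundation

// Hypothetical helper: wraps `word` in the lookarounds suggested above.
func delimitedPattern(for word: String) -> String {
    // Assumption: escaping is added here so that chars like "." or "?" in `word` stay literal.
    let escaped = NSRegularExpression.escapedPattern(for: word)
    return "(?<![^\\W\\d])(\(escaped))(?![^\\W\\d])"
}

// Hypothetical helper: returns the range of every match of `word` in `text`.
func matchRanges(of word: String, in text: String) -> [NSRange] {
    guard let regex = try? NSRegularExpression(pattern: delimitedPattern(for: word),
                                               options: .caseInsensitive) else { return [] }
    let fullRange = NSRange(location: 0, length: text.utf16.count)
    return regex.matches(in: text, options: [], range: fullRange).map { $0.range }
}

// Edge cases from the question, plus one input that should not match.
for sample in ["word", ".word", "word9", "?word?", "swordfish"] {
    print(sample, matchRanges(of: "word", in: sample).count)
}
```

The first four samples should each report one match, and "swordfish" none, because a letter sits directly on both sides of "word" there.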

Here, "(?<![^\\W\\d])" is a negative lookbehind that matches a location that is not preceded by a character other than a non-word or digit char. It sounds cumbersome, but the main point here is that [^\W\d] matches the same texts as \w excluding digits (\w matches letters, digits, and _). So, "(?<![^\\W\\d])" makes sure there is either the start of the string or a non-letter and non-_ char right before the word. If you want to allow the word to match after _, use (?<!\\p{L}) (where \p{L} matches any Unicode letter).

The "(?![^\\W\\d])" negative lookahead makes sure there is either the end of the string or a non-letter and non-_ char (there can be punctuation, symbols and digits) right after the word. Again, if you want to match the word when it is followed by _, you may replace the lookahead with "(?!\\p{L})" (just no letter is allowed after the word).
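The difference between the two lookaround variants can be illustrated with a short sketch. The helper hasMatch and the constant names are hypothetical, and the sample strings are made up for the comparison.

```swift
import Foundation

// Hypothetical helper: true when `pattern` matches somewhere in `text`.
func hasMatch(_ pattern: String, in text: String) -> Bool {
    guard let regex = try? NSRegularExpression(pattern: pattern,
                                               options: .caseInsensitive) else { return false }
    let range = NSRange(location: 0, length: text.utf16.count)
    return regex.firstMatch(in: text, options: [], range: range) != nil
}

// Variant from the answer: a letter or _ next to the word blocks the match.
let strict = "(?<![^\\W\\d])word(?![^\\W\\d])"
// \p{L} variant: only a letter next to the word blocks the match.
let lenient = "(?<!\\p{L})word(?!\\p{L})"

let strictUnderscore = hasMatch(strict, in: "_word_")    // false: _ is adjacent
let lenientUnderscore = hasMatch(lenient, in: "_word_")  // true: _ is not a letter
let lenientLetter = hasMatch(lenient, in: "swordfish")   // false: letters are adjacent
```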

