bash - Using sed to remove period at the end of string (zip code)


I have a file of addresses that I'm attempting to scrub, using sed to get rid of unwanted characters and fix the formatting. In this case, I have zip codes followed by a period:

mr. john doe exclusively stuff, 186  caravelle drive, ponte vedra fl 33487.  

(For the time being, ignore the new lines; I'm focusing only on the zip and the period for now.)

I want to remove the period (.) after the zip as a first step in cleaning this up. I tried to use substrings in sed as follows (using "|" as the delimiter, since it's easier for me to see):

sed 's|\([0-9]{4}\)\.|\1|g' test.txt 

Unfortunately, it doesn't remove the period; it just prints it out as part of the substring. My attempt is based on this post: Replace period surrounded by characters with sed

A point in the right direction would be appreciated.

You specified 4 digits ({4}) but you have 5, and in a basic regular expression you have to escape { and }, for example:

sed 's|\(^[0-9]\{5\}\).*|\1|g' test.txt 

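Notice that you have a space after the dot, so you might want to trim everything following the 5 digits, and to be safe you might want to specify that it must be at the start of the line with ^. Keep in mind that with ^ the pattern only matches when the zip code begins the line; drop the ^ if the zip follows other text, as in your sample.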
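Applied to the sample line from the question, the corrected pattern could look like the sketch below. It assumes the period and any trailing spaces sit at the very end of the line, and it pipes the line in with printf instead of reading test.txt; the variable names line and cleaned are only for illustration.

```shell
# Sample line from the question, standing in for the contents of test.txt
line='mr. john doe exclusively stuff, 186  caravelle drive, ponte vedra fl 33487.  '

# Basic regular expression: braces and parentheses are escaped.
# Capture exactly 5 digits, then match a literal dot and any trailing
# spaces up to the end of the line, and keep only the captured digits.
cleaned=$(printf '%s\n' "$line" | sed 's|\([0-9]\{5\}\)\. *$|\1|')

printf '%s\n' "$cleaned"
```

The period after "mr" is left alone because it is not preceded by five digits.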

In any case, if you type info sed, which is more complete than man sed, you will find this:

'-r'
'--regexp-extended'
     Use extended regular expressions rather than basic regular
     expressions.  Extended regexps are those that 'egrep' accepts; they
     can be clearer because they usually have less backslashes, but are a
     GNU extension and hence scripts that use them are not portable.
     *Note Extended regular expressions: Extended regexps.
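For the zip code case, the same substitution written with extended regular expressions might look like this. It is a sketch that assumes a sed accepting -E (GNU sed treats -E as a synonym for -r, and BSD sed also accepts -E); the shortened sample line and the variable names are illustrative.

```shell
# Shortened sample line ending in a zip code, a period, and trailing spaces
line='ponte vedra fl 33487.  '

# Extended regular expression: ( ) and { } are special without backslashes,
# while the literal dot still needs escaping.
cleaned=$(printf '%s\n' "$line" | sed -E 's|([0-9]{5})\. *$|\1|')

printf '%s\n' "$cleaned"
```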

And under the appendix on extended regular expressions you can read:

The only difference between basic and extended regular expressions is in
the behavior of a few characters: '?', '+', parentheses, braces ('{}'),
and '|'.  While basic regular expressions require these to be escaped if
you want them to behave as special characters, when using extended
regular expressions you must escape them if you want them _to match a
literal character_.  '|' is special here because '\|' is a GNU extension -
standard basic regular expressions do not provide its functionality.

Examples:

'abc?'
     becomes 'abc\?' when using extended regular expressions.  It
     matches the literal string 'abc?'.

'c\+'
     becomes 'c+' when using extended regular expressions.  It matches
     1 or more 'c's.

'a\{3,\}'
     becomes 'a{3,}' when using extended regular expressions.  It
     matches 3 or more 'a's.

'\(abc\)\{2,3\}'
     becomes '(abc){2,3}' when using extended regular expressions.  It
     matches either 'abcabc' or 'abcabcabc'.

'\(abc*\)\1'
     becomes '(abc*)\1' when using extended regular expressions.
     Backreferences must still be escaped when using extended regular
     expressions.
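A few of the pairs above can be checked side by side. This is a small sketch, assuming a sed that accepts -E for extended regular expressions; the inputs and the variable names are made up for the comparison, and each substitution replaces the match with X so the result is easy to see.

```shell
# 'a\{3,\}' (basic) and 'a{3,}' (extended) both match 3 or more 'a's
bre_interval=$(printf 'aaaa\n' | sed 's/a\{3,\}/X/')
ere_interval=$(printf 'aaaa\n' | sed -E 's/a{3,}/X/')

# '\(abc\)\{2,3\}' (basic) and '(abc){2,3}' (extended) both match 'abcabc'
bre_group=$(printf 'abcabc\n' | sed 's/\(abc\)\{2,3\}/X/')
ere_group=$(printf 'abcabc\n' | sed -E 's/(abc){2,3}/X/')

# A literal '?' is written 'abc?' in basic and 'abc\?' in extended syntax
bre_literal=$(printf 'abc?\n' | sed 's/abc?/X/')
ere_literal=$(printf 'abc?\n' | sed -E 's/abc\?/X/')

printf '%s %s %s %s %s %s\n' "$bre_interval" "$ere_interval" \
    "$bre_group" "$ere_group" "$bre_literal" "$ere_literal"
```

In each pair the two commands should produce the same result, which is the point of the quoted appendix: only the escaping changes, not what is matched.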
