Regex - PHP sentence boundary detection
I need to divide text into sentences in PHP. I'm currently using a regex, which gives about 95% accuracy, and I would like to improve on that with a better approach. I've seen NLP tools that do this in Perl, Java, and C, but I didn't see anything that fits PHP. Do you know of such a tool?

An enhanced regex solution: assuming you do care about handling abbreviations such as "Mr.", "Mrs.", etc., the following single-regex solution works pretty well:

    <?php // test.php Rev:20160820_1800
    $split_sentences = '%(?#!php/i split_sentences Rev:20160820_1800)
        # Split sentences on whitespace between them.
        # See: http://stackoverflow.com/a/5844564/433790
        (?<=          # Sentence split location preceded by
          [.!?]       # either an end of sentence punct,
        | [.!?][\'"]  # or end of sentence punct and quote.
        )             # End positive lookbehind.
        (?<!          # Don\'t split after these:
          Mr\.        # Either "Mr."
        | Mrs\.       # or "Mrs."
        | Ms\.        # or "Ms."
        | Jr\.        # or "Jr."
        | Dr\.
        ...
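
To try the lookbehind idea end to end, here is a minimal, self-contained sketch. It is not the full pattern from the answer: the function name `split_sentences_sketch` is hypothetical, the abbreviation list is limited to the five titles mentioned above, and the closing part of the pattern (split on whitespace, but not at the end of the string) is my assumption about how such a pattern is typically finished.

```php
<?php
// Hypothetical helper for illustration; the name and the short
// abbreviation list are assumptions, not the answer's full pattern.
function split_sentences_sketch(string $text): array
{
    // Flags: x = free-spacing mode with # comments, i = case-insensitive.
    $pattern = '%
        (?<= [.!?] | [.!?][\'"] )                 # after end punctuation, with optional quote
        (?<! Mr\. | Mrs\. | Ms\. | Jr\. | Dr\. )  # but not right after these abbreviations
        \s+                                       # split on whitespace between sentences
        (?=\S)                                    # but not at the end of the string
        %xi';

    $parts = preg_split($pattern, $text, -1, PREG_SPLIT_NO_EMPTY);

    // preg_split() returns false on failure; fall back to the unsplit text.
    return $parts === false ? [$text] : $parts;
}

$sentences = split_sentences_sketch('Mr. Smith met Dr. Jones. They talked! Did it help?');
foreach ($sentences as $i => $sentence) {
    echo ($i + 1) . ': ' . $sentence . "\n";
}
```

One caveat with this sketch: because the pattern is case-insensitive and the lookbehind has no word boundary, a sentence ending in a word such as "items." also matches `Ms\.` and will not be split there. Prefixing each abbreviation with `\b` is one way to tighten it.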