
Split Paragraphs into Shorter Paragraphs in PHP

Here is a PHP function that splits a long paragraph into shorter ones. A paragraph counts as too long when it has more than 4 sentences. The function counts the sentences in a paragraph and, if needed, breaks it into shorter paragraphs of at most 4 sentences each.

You need all 4 of the following functions. Examples of how to use them follow the code.

/**
 * Break a paragraph into shorter paragraphs.
 * @param string $p The paragraph to break up
 * @return string|false The shorter paragraphs, separated by blank lines, or false
 *         if the paragraph already had 4 sentences or fewer. Each returned
 *         paragraph has at most 4 sentences.
 */
function shorter_paragraphs($p) {
    $buffer = '';

    do {
        $p_in = $p;
        $c = count_sentences($p_in);

        $replace = true;

        if ($c > 4) {
            // decide how many sentences to take off the top
            $n_arr = get_n($c);
            $n = $n_arr[0];
            $more = $n_arr[1];
            $s = get_sentences($p_in, $n);
            $s = trim($s);
            $buffer .= $s . PHP_EOL . PHP_EOL;
            $p_in = str_replace($s, '', $p_in);

            if (1 === $more) {
                // recount; the loop runs again if the rest is still too long
                $c = count_sentences($p_in);
                if ($c <= 4) {
                    // nothing left to split: keep the rest as the last paragraph
                    $buffer .= trim($p_in);
                }
            } else {
                $buffer .= trim($p_in);
                $c = 0;// kill loop
            }

        } else {

            $buffer .= trim($p_in);
            $c = 0;// kill loop

            // the paragraph was short enough already, so nothing was replaced
            if (trim($buffer) == trim($p_in)) {
                $replace = false;
            }
        }

        $p = $p_in;

    } while ($c > 4);

    if ($replace) {
        return $buffer;
    } else {
        return false;
    }

}

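/**
 * Count the sentences in a string. Every ., ! or ? that directly
 * follows a non-whitespace character counts as one sentence end.
 * @param string $str The text to check
 * @return int The number of sentence ends found
 */
function count_sentences($str) {
    return preg_match_all('/[^\s](\.|\!|\?)/', $str, $match);
}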

/**
 * Get the first $n sentences in a string.
 * @param string $string The text to take the sentences from
 * @param int $n The number of sentences to get
 * @return string The sentences
 */
function get_sentences($string, $n) {
    $split = preg_split('/(\.|\!|\?)/', $string, $n + 1, PREG_SPLIT_DELIM_CAPTURE);
    $sentences = implode('', array_slice($split, 0, 2 * $n));
    return $sentences;
}

/**
 * Get the number of sentences to take off the top, based on the total
 * number of sentences in the paragraph.
 * @param int $c The total count of sentences in the paragraph
 * @return array The number of sentences to take, and 1 if the remainder
 *         needs another pass (0 otherwise)
 */
function get_n($c) {
    switch ($c) {
        case 6:
        case 8:
            $n = $c / 2;
            $do_more = 0;
            break;
        case 5:
        case 7:
            $n = 3;
            $do_more = 0;
            break;
        case 9:
        case 12:
        case 15:
            $n = 3;
            $do_more = 1;
            break;
        case 10:
        case 11:
        case 13:
        case 14:
        case 16:
            $n = 4;
            $do_more = 1;
            break;
        default:
            if ($c > 16) {
                // longer paragraphs: take 4 off the top and check the rest again
                $n = 4;
                $do_more = 1;
            } else {
                // 4 sentences or fewer: nothing to take
                $n = 0;
                $do_more = 0;
            }
    }
    return array($n, $do_more);
}
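
To see how the table in get_n() plays out, the sketch below runs the same take-some-then-recount loop on sentence counts alone and lists the resulting paragraph sizes. The split_plan() helper is hypothetical and not part of the code above. get_n() is restated in condensed form for counts 5 to 16, and only if it is not already defined, so the snippet runs on its own.

```php
<?php
// Condensed restatement of get_n() for counts 5 to 16 (an assumption made
// for this demo), defined only when the original is not already loaded.
if (!function_exists('get_n')) {
    function get_n($c) {
        if (in_array($c, array(6, 8), true))               return array($c / 2, 0);
        if (in_array($c, array(5, 7), true))               return array(3, 0);
        if (in_array($c, array(9, 12, 15), true))          return array(3, 1);
        if (in_array($c, array(10, 11, 13, 14, 16), true)) return array(4, 1);
        return array(0, 0);
    }
}

// Hypothetical helper: lists the paragraph sizes (in sentences) that the loop
// in shorter_paragraphs() would produce for a paragraph of $c sentences.
function split_plan($c) {
    $sizes = array();
    while ($c > 4) {
        list($n, $more) = get_n($c);
        if ($n < 1) {
            break; // count not covered by the table: leave the rest as is
        }
        $sizes[] = $n;
        $c -= $n;
        if (1 !== $more) {
            break; // the remainder is short enough to keep whole
        }
    }
    $sizes[] = $c; // whatever is left becomes the last paragraph
    return $sizes;
}

foreach (array(5, 7, 11, 16) as $count) {
    echo $count . ' => ' . implode(' + ', split_plan($count)) . PHP_EOL;
}
// 5 => 3 + 2
// 7 => 3 + 4
// 11 => 4 + 3 + 4
// 16 => 4 + 3 + 3 + 3 + 3
```

Every count from 5 to 16 ends up as paragraphs of 2 to 4 sentences, which is why the table mixes steps of 3 and 4.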

Examples

  1. Convert a string of text into short paragraphs of a maximum of four sentences each.

    $p = ( $new_paragraphs = shorter_paragraphs($string) ) ? $new_paragraphs : $string;

  2. Check a whole body of text with existing paragraphs, and split any paragraph with more than 4 sentences into shorter paragraphs.

    This returns the entire original text, with every paragraph limited to a maximum of 4 sentences. The $out variable holds the final text.

    $out = $original_text;
    $original_text = trim($original_text);
    
    // split into paragraphs
    
    $paragraphs_array = preg_split('#(\r\n?|\n)+#', $original_text);
    
    // check each paragraph for length
    
    foreach ($paragraphs_array as $p) {
    	$p = trim($p);
    	if ($p) {
    		$new_ps = shorter_paragraphs($p);
    		if ( $new_ps ) {
    
    			// replace this paragraph with split shorter ones
    			$out = str_replace($p, $new_ps, $out);
    		}
    
    	}
    }
    
    // The $out variable holds the final text
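
Because str_replace() replaces every occurrence of $p in $out, a paragraph whose text also appears inside another paragraph can be changed in the wrong place. One way around that, sketched here as an alternative and not as the method above, is to rebuild the text from the paragraph array. The split_long_paragraphs() name, the $splitter parameter and the stand-in splitter are all hypothetical; in practice you would pass 'shorter_paragraphs' as the splitter. Note that this version separates all paragraphs with a blank line instead of keeping the original line breaks.

```php
<?php
// Hypothetical helper: $splitter is any callable with the same contract as
// shorter_paragraphs() (returns the split text, or false if nothing changed).
function split_long_paragraphs($text, $splitter) {
    $result = array();
    foreach (preg_split('#(\r\n?|\n)+#', trim($text)) as $p) {
        $p = trim($p);
        if ('' === $p) {
            continue; // skip blank lines
        }
        $new_ps = call_user_func($splitter, $p);
        // keep the original paragraph when the splitter reports no change
        $result[] = (false === $new_ps) ? $p : $new_ps;
    }
    return implode(PHP_EOL . PHP_EOL, $result);
}

// Stand-in splitter for demonstration only: breaks after the second sentence
// when a paragraph has more than two periods.
$demo_splitter = function ($p) {
    if (substr_count($p, '.') <= 2) {
        return false;
    }
    $parts = preg_split('/(?<=\.)\s+/', $p, 3);
    return $parts[0] . ' ' . $parts[1] . PHP_EOL . PHP_EOL . $parts[2];
};

// Two paragraphs in, three paragraphs out: "A. B.", "C." and "D. E."
echo split_long_paragraphs("A. B. C.\nD. E.", $demo_splitter);
```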
    

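count_sentences() is the most expensive function here, because it runs a regular expression over the remaining text on every pass of the loop in shorter_paragraphs().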
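
It also helps to know what the two regex helpers treat as a sentence boundary, since they do not always agree. The sketch below restates count_sentences() and get_sentences() (only if they are not already defined, so it runs on its own) and tries them on a few inputs. The sample strings are made up for illustration.

```php
<?php
if (!function_exists('count_sentences')) {
    function count_sentences($str) {
        return preg_match_all('/[^\s](\.|\!|\?)/', $str, $match);
    }
}
if (!function_exists('get_sentences')) {
    function get_sentences($string, $n) {
        $split = preg_split('/(\.|\!|\?)/', $string, $n + 1, PREG_SPLIT_DELIM_CAPTURE);
        return implode('', array_slice($split, 0, 2 * $n));
    }
}

// Plain prose: one match per sentence.
echo count_sentences('One. Two! Three?') . PHP_EOL;         // 3

// Decimals and abbreviations contain a period too, so they are counted.
echo count_sentences('It costs 3.50 today.') . PHP_EOL;     // 2
echo count_sentences('Dr. Smith arrived.') . PHP_EOL;       // 2

// get_sentences() splits on every ., ! or ? and glues the first $n pieces
// back together with their punctuation.
echo get_sentences('One. Two! Three? Four.', 2) . PHP_EOL;  // One. Two!

// "Wait..." gives count_sentences() 2 matches but get_sentences() 3 split
// points, so the two helpers can disagree on where a sentence ends.
echo count_sentences('Wait... what?') . PHP_EOL;            // 3
echo get_sentences('Wait... what?', 3) . PHP_EOL;           // Wait...
```

Text with many abbreviations, decimals or ellipses may therefore be split in unexpected places.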
