Class HtmlCompressor
- All Implemented Interfaces:
Compressor
Blocks that should be additionally preserved can be marked with:
<!-- {{{ -->
...
<!-- }}} -->
or with any number of user-defined patterns.
Content inside <script> or <style> tags can optionally be compressed using the Yahoo YUI Compressor or Google Closure Compiler libraries.
- Author:
- Sergiy Kovalchuk
Field Summary
Fields
static final String ALL_TAGS
Can be passed to the setRemoveSurroundingSpaces method to remove all surrounding spaces (not recommended).
static final String BLOCK_TAGS_MAX
Predefined list of tags that are block-level by default, excluding <div> and <li> tags.
static final String BLOCK_TAGS_MIN
Predefined list of tags that are very likely to be block-level.
static final Pattern PHP_TAG_PATTERN
Predefined pattern that matches <?php ... ?> tags.
static final Pattern SERVER_SCRIPT_TAG_PATTERN
Predefined pattern that matches <% ... %> tags.
static final Pattern SERVER_SIDE_INCLUDE_PATTERN
Predefined pattern that matches <!--# ... --> tags.
Constructor Summary
Constructors
HtmlCompressor()
Method Summary
String compress(String html)
The main method that compresses the given HTML source and returns the compressed result.
Compressor getCssCompressor()
Returns the CSS compressor implementation that will be used to compress inline CSS in HTML.
List<Pattern> getPreservePatterns()
Returns a list of Patterns defining custom block preservation rules.
String getRemoveSurroundingSpaces()
Returns a comma-separated list of tags around which spaces will be removed.
HtmlCompressorStatistics getStatistics()
Returns an HtmlCompressorStatistics object containing statistics of the last HTML compression, if enabled.
int getYuiCssLineBreak()
Returns the number of symbols per line Yahoo YUI Compressor will use during CSS compression.
int getYuiJsLineBreak()
Returns the number of symbols per line Yahoo YUI Compressor will use during JavaScript compression.
boolean isCompressCss()
Returns true if CSS compression is enabled.
boolean isCompressJavaScript()
Returns true if JavaScript compression is enabled.
boolean isEnabled()
Returns true if compression is enabled.
boolean isGenerateStatistics()
Returns true if HTML compression statistics are generated.
boolean isPreserveLineBreaks()
Returns true if line breaks will be preserved.
boolean isRemoveComments()
Returns true if all HTML comments will be removed.
boolean isRemoveFormAttributes()
Returns true if method="get" attributes will be removed from <form> tags.
boolean isRemoveHttpProtocol()
Returns true if the HTTP protocol will be removed from href, src, cite, and action tag attributes.
boolean isRemoveHttpsProtocol()
Returns true if the HTTPS protocol will be removed from href, src, cite, and action tag attributes.
boolean isRemoveInputAttributes()
Returns true if type="text" attributes will be removed from <input> tags.
boolean isRemoveIntertagSpaces()
Returns true if all inter-tag whitespace characters will be removed.
boolean isRemoveJavaScriptProtocol()
Returns true if the javascript: pseudo-protocol will be removed from inline event handlers.
boolean isRemoveLinkAttributes()
Returns true if unnecessary attributes will be removed from <link> tags.
boolean isRemoveMultiSpaces()
Returns true if all multiple whitespace characters will be replaced with single spaces.
boolean isRemoveQuotes()
Returns true if all unnecessary quotes will be removed from tag attributes.
boolean isRemoveScriptAttributes()
Returns true if unnecessary attributes will be removed from <script> tags.
boolean isRemoveStyleAttributes()
Returns true if type="text/style" attributes will be removed from <style> tags.
boolean isSimpleBooleanAttributes()
Returns true if boolean attributes will be simplified.
boolean isSimpleDoctype()
Returns true if the existing DOCTYPE declaration will be replaced with a simple <!DOCTYPE html> declaration.
boolean isYuiJsDisableOptimizations()
Returns true if Yahoo YUI Compressor will disable all the built-in micro optimizations during JavaScript compression.
boolean isYuiJsNoMunge()
Returns true if Yahoo YUI Compressor will only minify JavaScript without obfuscating local symbols.
boolean isYuiJsPreserveAllSemiColons()
Returns true if Yahoo YUI Compressor will preserve unnecessary semicolons during JavaScript compression.
void setCompressCss(boolean compressCss)
Enables CSS compression within <style> tags using Yahoo YUI Compressor if set to true.
void setCompressJavaScript(boolean compressJavaScript)
Enables JavaScript compression within <script> tags using Yahoo YUI Compressor if set to true.
void setCssCompressor(Compressor cssCompressor)
Sets the CSS compressor implementation that will be used to compress inline CSS in HTML.
void setEnabled(boolean enabled)
If set to false, all compression will be bypassed.
void setGenerateStatistics(boolean generateStatistics)
If set to true, HTML compression statistics will be generated.
void setPreserveLineBreaks(boolean preserveLineBreaks)
If set to true, line breaks will be preserved.
void setPreservePatterns(List<Pattern> preservePatterns)
Allows setting custom block preservation rules defined by regular expression patterns.
void setRemoveComments(boolean removeComments)
If set to true, all HTML comments will be removed.
void setRemoveFormAttributes(boolean removeFormAttributes)
If set to true, method="get" attributes will be removed from <form> tags.
void setRemoveHttpProtocol(boolean removeHttpProtocol)
If set to true, the HTTP protocol will be removed from href, src, cite, and action tag attributes.
void setRemoveHttpsProtocol(boolean removeHttpsProtocol)
If set to true, the HTTPS protocol will be removed from href, src, cite, and action tag attributes.
void setRemoveInputAttributes(boolean removeInputAttributes)
If set to true, type="text" attributes will be removed from <input> tags.
void setRemoveIntertagSpaces(boolean removeIntertagSpaces)
If set to true, all inter-tag whitespace characters will be removed.
void setRemoveJavaScriptProtocol(boolean removeJavaScriptProtocol)
If set to true, the javascript: pseudo-protocol will be removed from inline event handlers.
void setRemoveLinkAttributes(boolean removeLinkAttributes)
If set to true, the following attributes will be removed from <link rel="stylesheet"> and <link rel="alternate stylesheet"> tags: type="text/css", type="text/plain".
void setRemoveMultiSpaces(boolean removeMultiSpaces)
If set to true, all multiple whitespace characters will be replaced with single spaces.
void setRemoveQuotes(boolean removeQuotes)
If set to true, all unnecessary quotes will be removed from tag attributes.
void setRemoveScriptAttributes(boolean removeScriptAttributes)
If set to true, the following attributes will be removed from <script> tags: type="text/javascript", type="application/javascript", language="javascript".
void setRemoveStyleAttributes(boolean removeStyleAttributes)
If set to true, type="text/style" attributes will be removed from <style> tags.
void setRemoveSurroundingSpaces(String tagList)
Enables removal of surrounding spaces around the provided comma-separated list of tags.
void setSimpleBooleanAttributes(boolean simpleBooleanAttributes)
If set to true, any values of the following boolean attributes will be removed: checked, selected, disabled, readonly.
void setSimpleDoctype(boolean simpleDoctype)
If set to true, the existing DOCTYPE declaration will be replaced with a simple <!DOCTYPE html> declaration.
void setYuiCssLineBreak(int yuiCssLineBreak)
Tells Yahoo YUI Compressor to break lines after the specified number of symbols during CSS compression.
void setYuiJsDisableOptimizations(boolean yuiJsDisableOptimizations)
Tells Yahoo YUI Compressor to disable all the built-in micro optimizations during JavaScript compression.
void setYuiJsLineBreak(int yuiJsLineBreak)
Tells Yahoo YUI Compressor to break lines after the specified number of symbols during JavaScript compression.
void setYuiJsNoMunge(boolean yuiJsNoMunge)
Tells Yahoo YUI Compressor to only minify JavaScript without obfuscating local symbols.
void setYuiJsPreserveAllSemiColons(boolean yuiJsPreserveAllSemiColons)
Tells Yahoo YUI Compressor to preserve unnecessary semicolons during JavaScript compression.
Field Details
PHP_TAG_PATTERN
Predefined pattern that matches <?php ... ?> tags. Can be passed inside a list to the setPreservePatterns method.
SERVER_SCRIPT_TAG_PATTERN
Predefined pattern that matches <% ... %> tags. Can be passed inside a list to the setPreservePatterns method.
SERVER_SIDE_INCLUDE_PATTERN
Predefined pattern that matches <!--# ... --> tags. Can be passed inside a list to the setPreservePatterns method.
BLOCK_TAGS_MIN
Predefined list of tags that are very likely to be block-level. Can be passed to the setRemoveSurroundingSpaces method.
BLOCK_TAGS_MAX
Predefined list of tags that are block-level by default, excluding <div> and <li> tags. Table tags are also included. Can be passed to the setRemoveSurroundingSpaces method.
ALL_TAGS
Can be passed to the setRemoveSurroundingSpaces method to remove all surrounding spaces (not recommended).
Constructor Details
HtmlCompressor
public HtmlCompressor()
Method Details
compress
public String compress(String html)
The main method that compresses the given HTML source and returns the compressed result.
- Specified by:
compress in interface Compressor
- Parameters:
html - HTML content to compress
- Returns:
compressed content.
isCompressJavaScript
public boolean isCompressJavaScript()
Returns true if JavaScript compression is enabled.
- Returns:
current state of JavaScript compression.
setCompressJavaScript
public void setCompressJavaScript(boolean compressJavaScript)
Enables JavaScript compression within <script> tags using Yahoo YUI Compressor if set to true. Default is false for performance reasons.
Note: Compressing JavaScript is not recommended if pages are compressed dynamically on the fly, because of the performance impact. Consider putting JavaScript into a separate file and compressing it with, for example, the standalone YUI Compressor.
- Parameters:
compressJavaScript - set true to enable JavaScript compression. Default is false.
isCompressCss
public boolean isCompressCss()
Returns true if CSS compression is enabled.
- Returns:
current state of CSS compression.
setCompressCss
public void setCompressCss(boolean compressCss)
Enables CSS compression within <style> tags using Yahoo YUI Compressor if set to true. Default is false for performance reasons.
Note: Compressing CSS is not recommended if pages are compressed dynamically on the fly, because of the performance impact. Consider putting CSS into a separate file and compressing it with, for example, the standalone YUI Compressor.
- Parameters:
compressCss - set true to enable CSS compression. Default is false.
isYuiJsNoMunge
public boolean isYuiJsNoMunge()
Returns true if Yahoo YUI Compressor will only minify JavaScript without obfuscating local symbols. This corresponds to the --nomunge command line option.
- Returns:
nomunge parameter value used for JavaScript compression.
setYuiJsNoMunge
public void setYuiJsNoMunge(boolean yuiJsNoMunge)
Tells Yahoo YUI Compressor to only minify JavaScript without obfuscating local symbols. This corresponds to the --nomunge command line option. This option has an effect only if JavaScript compression is enabled. Default is false.
- Parameters:
yuiJsNoMunge - set true to enable nomunge mode
isYuiJsPreserveAllSemiColons
public boolean isYuiJsPreserveAllSemiColons()
Returns true if Yahoo YUI Compressor will preserve unnecessary semicolons during JavaScript compression. This corresponds to the --preserve-semi command line option.
- Returns:
preserve-semi parameter value used for JavaScript compression.
setYuiJsPreserveAllSemiColons
public void setYuiJsPreserveAllSemiColons(boolean yuiJsPreserveAllSemiColons)
Tells Yahoo YUI Compressor to preserve unnecessary semicolons during JavaScript compression. This corresponds to the --preserve-semi command line option. This option has an effect only if JavaScript compression is enabled. Default is false.
- Parameters:
yuiJsPreserveAllSemiColons - set true to enable preserve-semi mode
isYuiJsDisableOptimizations
public boolean isYuiJsDisableOptimizations()
Returns true if Yahoo YUI Compressor will disable all the built-in micro optimizations during JavaScript compression. This corresponds to the --disable-optimizations command line option.
- Returns:
disable-optimizations parameter value used for JavaScript compression.
setYuiJsDisableOptimizations
public void setYuiJsDisableOptimizations(boolean yuiJsDisableOptimizations)
Tells Yahoo YUI Compressor to disable all the built-in micro optimizations during JavaScript compression. This corresponds to the --disable-optimizations command line option. This option has an effect only if JavaScript compression is enabled. Default is false.
- Parameters:
yuiJsDisableOptimizations - set true to enable disable-optimizations mode
getYuiJsLineBreak
public int getYuiJsLineBreak()
Returns the number of symbols per line Yahoo YUI Compressor will use during JavaScript compression. This corresponds to the --line-break command line option.
- Returns:
line-break parameter value used for JavaScript compression.
setYuiJsLineBreak
public void setYuiJsLineBreak(int yuiJsLineBreak)
Tells Yahoo YUI Compressor to break lines after the specified number of symbols during JavaScript compression. This corresponds to the --line-break command line option. This option has an effect only if JavaScript compression is enabled. Default is -1, which disables line breaks.
- Parameters:
yuiJsLineBreak - number of symbols per line
getYuiCssLineBreak
public int getYuiCssLineBreak()
Returns the number of symbols per line Yahoo YUI Compressor will use during CSS compression. This corresponds to the --line-break command line option.
- Returns:
line-break parameter value used for CSS compression.
setYuiCssLineBreak
public void setYuiCssLineBreak(int yuiCssLineBreak)
Tells Yahoo YUI Compressor to break lines after the specified number of symbols during CSS compression. This corresponds to the --line-break command line option. This option has an effect only if CSS compression is enabled. Default is -1, which disables line breaks.
- Parameters:
yuiCssLineBreak - number of symbols per line
isRemoveQuotes
public boolean isRemoveQuotes()
Returns true if all unnecessary quotes will be removed from tag attributes.
setRemoveQuotes
public void setRemoveQuotes(boolean removeQuotes)
If set to true, all unnecessary quotes will be removed from tag attributes. Default is false.
Note: Even though quotes are removed only when it is safe to do so, this might still break strict HTML validation. Turn this option on only if page validation is not very important, or to squeeze the most out of the compression. This option has no performance impact.
- Parameters:
removeQuotes - set true to remove unnecessary quotes from tag attributes
isEnabled
public boolean isEnabled()
Returns true if compression is enabled.
- Returns:
true if compression is enabled.
setEnabled
public void setEnabled(boolean enabled)
If set to false, all compression will be bypassed. Might be useful for testing purposes. Default is true.
- Parameters:
enabled - set false to bypass all compression
isRemoveComments
public boolean isRemoveComments()
Returns true if all HTML comments will be removed.
- Returns:
true if all HTML comments will be removed.
setRemoveComments
public void setRemoveComments(boolean removeComments)
If set to true, all HTML comments will be removed. Default is true.
- Parameters:
removeComments - set true to remove all HTML comments
isRemoveMultiSpaces
public boolean isRemoveMultiSpaces()
Returns true if all multiple whitespace characters will be replaced with single spaces.
- Returns:
true if all multiple whitespace characters will be replaced with single spaces.
setRemoveMultiSpaces
public void setRemoveMultiSpaces(boolean removeMultiSpaces)
If set to true, all multiple whitespace characters will be replaced with single spaces. Default is true.
- Parameters:
removeMultiSpaces - set true to replace all multiple whitespace characters with single spaces.
isRemoveIntertagSpaces
public boolean isRemoveIntertagSpaces()
Returns true if all inter-tag whitespace characters will be removed.
- Returns:
true if all inter-tag whitespace characters will be removed.
setRemoveIntertagSpaces
public void setRemoveIntertagSpaces(boolean removeIntertagSpaces)
If set to true, all inter-tag whitespace characters will be removed. Default is false.
Note: It is fairly safe to turn this option on unless you rely on spaces for page formatting. Even if you do, you can always preserve required spaces with &nbsp;. This option has no performance impact.
- Parameters:
removeIntertagSpaces - set true to remove all inter-tag whitespace characters
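The two whitespace options described above can be illustrated with a minimal sketch. It uses only java.util.regex and does not call the library; the class name WhitespaceSketch and both patterns are assumptions made for illustration, and the library's own rules (including its handling of preserved blocks) may differ.

```java
import java.util.regex.Pattern;

public class WhitespaceSketch {
    // Assumed patterns for illustration only; not taken from the library.
    private static final Pattern MULTI_SPACE = Pattern.compile("\\s+");
    private static final Pattern INTERTAG_SPACE = Pattern.compile(">\\s+<");

    /** Replaces every run of whitespace with a single space. */
    public static String collapseMultiSpaces(String html) {
        return MULTI_SPACE.matcher(html).replaceAll(" ");
    }

    /** Removes whitespace that sits directly between two tags. */
    public static String removeIntertagSpaces(String html) {
        return INTERTAG_SPACE.matcher(html).replaceAll("><");
    }

    public static void main(String[] args) {
        String html = "<ul>\n  <li>one   two</li>\n  <li>three</li>\n</ul>";
        System.out.println(collapseMultiSpaces(html));
        System.out.println(removeIntertagSpaces(html));
    }
}
```

Note that the sketch applies the rules to the whole input, so it would also alter whitespace inside blocks that a real compressor is expected to leave untouched.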
getPreservePatterns
public List<Pattern> getPreservePatterns()
Returns a list of Patterns defining custom block preservation rules.
- Returns:
list of Pattern objects defining the rules for preserving blocks
setPreservePatterns
public void setPreservePatterns(List<Pattern> preservePatterns)
This method allows setting custom block preservation rules defined by regular expression patterns. Blocks that match the provided patterns will be skipped during HTML compression.
Custom preservation rules have higher priority than default rules. Priority between custom rules is defined by their position in the list (the beginning of the list has higher priority).
Besides custom patterns, you can use 3 predefined patterns: PHP_TAG_PATTERN, SERVER_SCRIPT_TAG_PATTERN, SERVER_SIDE_INCLUDE_PATTERN.
- Parameters:
preservePatterns - List of Pattern objects that will be used to skip matched blocks during compression
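As a rough illustration of what a preservation rule matches, the sketch below defines two stand-in patterns and lists the blocks they would set aside. The class name PreservePatternsSketch, the helper findPreservedBlocks, and both regular expressions are assumptions; the library's predefined constants may be defined differently, and this sketch does not resolve overlapping matches between patterns.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class PreservePatternsSketch {
    // Assumed stand-ins for PHP_TAG_PATTERN and SERVER_SCRIPT_TAG_PATTERN.
    public static final Pattern PHP_TAG =
            Pattern.compile("<\\?php.*?\\?>", Pattern.DOTALL | Pattern.CASE_INSENSITIVE);
    public static final Pattern SERVER_SCRIPT_TAG =
            Pattern.compile("<%.*?%>", Pattern.DOTALL);

    /** Collects every block matched by the given patterns, in list order. */
    public static List<String> findPreservedBlocks(String html, List<Pattern> patterns) {
        List<String> blocks = new ArrayList<>();
        for (Pattern pattern : patterns) {
            Matcher matcher = pattern.matcher(html);
            while (matcher.find()) {
                blocks.add(matcher.group());
            }
        }
        return blocks;
    }

    public static void main(String[] args) {
        String html = "<p><?php echo 1; ?></p><% x %>";
        List<Pattern> patterns = Arrays.asList(PHP_TAG, SERVER_SCRIPT_TAG);
        for (String block : findPreservedBlocks(html, patterns)) {
            System.out.println(block);
        }
    }
}
```

With the library itself, a list built the same way would be handed to setPreservePatterns before calling compress.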
getCssCompressor
public Compressor getCssCompressor()
Returns the CSS compressor implementation that will be used to compress inline CSS in HTML.
- Returns:
Compressor implementation that will be used to compress inline CSS in HTML.
setCssCompressor
public void setCssCompressor(Compressor cssCompressor)
Sets the CSS compressor implementation that will be used to compress inline CSS in HTML.
HtmlCompressor currently comes with a basic implementation for Yahoo YUI Compressor (called YuiCssCompressor), but users can also create their own CSS compressors for custom needs.
If no compressor is set, YuiCssCompressor will be used by default.
- Parameters:
cssCompressor - Compressor implementation that will be used for inline CSS compression
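A custom CSS compressor might look roughly like the sketch below. To keep it self-contained, it declares a local stand-in Compressor interface with a single compress(String) method, which is an assumption about the library's interface; SimpleCssCompressor and its rules are hypothetical and far simpler than a real CSS minifier.

```java
public class CssCompressorSketch {
    /** Local stand-in for the library's Compressor interface (assumed shape). */
    public interface Compressor {
        String compress(String source);
    }

    /** Hypothetical compressor: strips comments and collapses whitespace. */
    public static class SimpleCssCompressor implements Compressor {
        @Override
        public String compress(String source) {
            return source
                    .replaceAll("(?s)/\\*.*?\\*/", "")      // drop /* ... */ comments
                    .replaceAll("\\s+", " ")                 // collapse whitespace runs
                    .replaceAll("\\s*([{};,])\\s*", "$1")   // trim around punctuation
                    .trim();
        }
    }

    public static void main(String[] args) {
        Compressor css = new SimpleCssCompressor();
        System.out.println(css.compress("a {\n  color: red; /* note */\n}"));
    }
}
```

In real use, a class implementing the library's own Compressor interface would be passed to setCssCompressor, with CSS compression enabled through setCompressCss(true).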
isSimpleDoctype
public boolean isSimpleDoctype()
Returns true if the existing DOCTYPE declaration will be replaced with a simple <!DOCTYPE html> declaration.
- Returns:
true if the existing DOCTYPE declaration will be replaced with a simple <!DOCTYPE html> declaration.
setSimpleDoctype
public void setSimpleDoctype(boolean simpleDoctype)
If set to true, the existing DOCTYPE declaration will be replaced with a simple <!DOCTYPE html> declaration. Default is false.
- Parameters:
simpleDoctype - set true to replace the existing DOCTYPE declaration with <!DOCTYPE html>
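The DOCTYPE replacement described above can be sketched as a single regular-expression substitution. The class name DoctypeSketch and the pattern are assumptions for illustration, not the library's implementation.

```java
import java.util.regex.Pattern;

public class DoctypeSketch {
    // Assumed pattern: any DOCTYPE declaration up to its closing '>'.
    private static final Pattern DOCTYPE =
            Pattern.compile("<!DOCTYPE[^>]*>", Pattern.CASE_INSENSITIVE);

    /** Replaces the first DOCTYPE declaration with the short HTML5 form. */
    public static String simplifyDoctype(String html) {
        return DOCTYPE.matcher(html).replaceFirst("<!DOCTYPE html>");
    }

    public static void main(String[] args) {
        String html = "<!DOCTYPE HTML PUBLIC \"-//W3C//DTD HTML 4.01//EN\" "
                + "\"http://www.w3.org/TR/html4/strict.dtd\"><html></html>";
        System.out.println(simplifyDoctype(html));
    }
}
```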
isRemoveScriptAttributes
public boolean isRemoveScriptAttributes()
Returns true if unnecessary attributes will be removed from <script> tags.
- Returns:
true if unnecessary attributes will be removed from <script> tags
setRemoveScriptAttributes
public void setRemoveScriptAttributes(boolean removeScriptAttributes)
If set to true, the following attributes will be removed from <script> tags:
- type="text/javascript"
- type="application/javascript"
- language="javascript"
Default is false.
- Parameters:
removeScriptAttributes - set true to remove unnecessary attributes from <script> tags
isRemoveStyleAttributes
public boolean isRemoveStyleAttributes()
Returns true if type="text/style" attributes will be removed from <style> tags.
- Returns:
true if type="text/style" attributes will be removed from <style> tags
setRemoveStyleAttributes
public void setRemoveStyleAttributes(boolean removeStyleAttributes)
If set to true, type="text/style" attributes will be removed from <style> tags. Default is false.
- Parameters:
removeStyleAttributes - set true to remove type="text/style" attributes from <style> tags
isRemoveLinkAttributes
public boolean isRemoveLinkAttributes()
Returns true if unnecessary attributes will be removed from <link> tags.
- Returns:
true if unnecessary attributes will be removed from <link> tags
setRemoveLinkAttributes
public void setRemoveLinkAttributes(boolean removeLinkAttributes)
If set to true, the following attributes will be removed from <link rel="stylesheet"> and <link rel="alternate stylesheet"> tags:
- type="text/css"
- type="text/plain"
Default is false.
- Parameters:
removeLinkAttributes - set true to remove unnecessary attributes from <link> tags
isRemoveFormAttributes
public boolean isRemoveFormAttributes()
Returns true if method="get" attributes will be removed from <form> tags.
- Returns:
true if method="get" attributes will be removed from <form> tags
setRemoveFormAttributes
public void setRemoveFormAttributes(boolean removeFormAttributes)
If set to true, method="get" attributes will be removed from <form> tags. Default is false.
- Parameters:
removeFormAttributes - set true to remove method="get" attributes from <form> tags
isRemoveInputAttributes
public boolean isRemoveInputAttributes()
Returns true if type="text" attributes will be removed from <input> tags.
- Returns:
true if type="text" attributes will be removed from <input> tags
setRemoveInputAttributes
public void setRemoveInputAttributes(boolean removeInputAttributes)
If set to true, type="text" attributes will be removed from <input> tags. Default is false.
- Parameters:
removeInputAttributes - set true to remove type="text" attributes from <input> tags
isSimpleBooleanAttributes
public boolean isSimpleBooleanAttributes()
Returns true if boolean attributes will be simplified.
- Returns:
true if boolean attributes will be simplified
setSimpleBooleanAttributes
public void setSimpleBooleanAttributes(boolean simpleBooleanAttributes)
If set to true, any values of the following boolean attributes will be removed:
- checked
- selected
- disabled
- readonly
For example, <input readonly="readonly"> would become <input readonly>.
Default is false.
- Parameters:
simpleBooleanAttributes - set true to simplify boolean attributes
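The boolean attribute simplification can be sketched as follows. The class name BooleanAttributesSketch and both patterns are assumptions; the sketch finds tags with a simple regular expression (so a '>' inside an attribute value would confuse it) and strips the value of the four listed attributes inside each tag.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class BooleanAttributesSketch {
    // Assumed patterns for illustration only.
    private static final Pattern TAG = Pattern.compile("<[a-zA-Z][^>]*>");
    private static final Pattern BOOL_ATTR = Pattern.compile(
            "(\\s(?:checked|selected|disabled|readonly))\\s*=\\s*"
                    + "(?:\"[^\"]*\"|'[^']*'|[^\\s>]+)",
            Pattern.CASE_INSENSITIVE);

    /** Drops the value of checked, selected, disabled and readonly inside tags. */
    public static String simplify(String html) {
        Matcher tag = TAG.matcher(html);
        StringBuffer out = new StringBuffer();
        while (tag.find()) {
            String simplified = BOOL_ATTR.matcher(tag.group()).replaceAll("$1");
            tag.appendReplacement(out, Matcher.quoteReplacement(simplified));
        }
        tag.appendTail(out);
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(simplify("<input readonly=\"readonly\">"));
    }
}
```

Restricting the substitution to tag text keeps ordinary page text such as "checked = yes" unchanged.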
isRemoveJavaScriptProtocol
public boolean isRemoveJavaScriptProtocol()
Returns true if the javascript: pseudo-protocol will be removed from inline event handlers.
- Returns:
true if the javascript: pseudo-protocol will be removed from inline event handlers.
setRemoveJavaScriptProtocol
public void setRemoveJavaScriptProtocol(boolean removeJavaScriptProtocol)
If set to true, the javascript: pseudo-protocol will be removed from inline event handlers.
For example, <a onclick="javascript:alert()"> would become <a onclick="alert()">.
Default is false.
- Parameters:
removeJavaScriptProtocol - set true to remove the javascript: pseudo-protocol from inline event handlers.
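A minimal sketch of this rule, assuming that an inline event handler is any attribute whose name starts with "on". The class name JavaScriptProtocolSketch and the pattern are hypothetical and not taken from the library.

```java
import java.util.regex.Pattern;

public class JavaScriptProtocolSketch {
    // Assumed pattern: an on* attribute whose value starts with "javascript:".
    private static final Pattern HANDLER_PROTOCOL = Pattern.compile(
            "(\\son[a-z]+\\s*=\\s*[\"']?)\\s*javascript:",
            Pattern.CASE_INSENSITIVE);

    /** Removes the javascript: prefix from inline event handler values. */
    public static String removeJavaScriptProtocol(String html) {
        return HANDLER_PROTOCOL.matcher(html).replaceAll("$1");
    }

    public static void main(String[] args) {
        System.out.println(removeJavaScriptProtocol("<a onclick=\"javascript:alert()\">"));
    }
}
```

Attributes such as href are left alone, since javascript: in a URL attribute is not an event handler prefix.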
isRemoveHttpProtocol
public boolean isRemoveHttpProtocol()
Returns true if the HTTP protocol will be removed from href, src, cite, and action tag attributes.
- Returns:
true if the HTTP protocol will be removed from href, src, cite, and action tag attributes.
setRemoveHttpProtocol
public void setRemoveHttpProtocol(boolean removeHttpProtocol)
If set to true, the HTTP protocol will be removed from href, src, cite, and action tag attributes. A URL without a protocol makes the browser use the document's current protocol instead.
Tags marked with rel="external" will be skipped.
For example:
<a href="http://example.com"> <script src="http://google.com/js.js" rel="external">
would become:
<a href="//example.com"> <script src="http://google.com/js.js" rel="external">
Default is false.
- Parameters:
removeHttpProtocol - set true to remove the HTTP protocol from tag attributes
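The protocol removal rule, including the rel="external" exception, can be sketched as below. The class name ProtocolSketch, the method removeProtocol, and the patterns are assumptions made for illustration; the scheme is passed as a parameter so the same sketch also covers the HTTPS variant of the option.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class ProtocolSketch {
    // Assumed patterns for illustration only.
    private static final Pattern TAG = Pattern.compile("<[a-zA-Z][^>]*>");
    private static final Pattern EXTERNAL = Pattern.compile(
            "\\srel\\s*=\\s*[\"']?external\\b", Pattern.CASE_INSENSITIVE);

    /** Turns scheme://host into //host in href, src, cite and action attributes. */
    public static String removeProtocol(String html, String scheme) {
        Pattern urlAttr = Pattern.compile(
                "(\\s(?:href|src|cite|action)\\s*=\\s*[\"']?)"
                        + Pattern.quote(scheme + "://"),
                Pattern.CASE_INSENSITIVE);
        Matcher tag = TAG.matcher(html);
        StringBuffer out = new StringBuffer();
        while (tag.find()) {
            String text = tag.group();
            // Tags marked rel="external" are copied through unchanged.
            if (!EXTERNAL.matcher(text).find()) {
                text = urlAttr.matcher(text).replaceAll("$1//");
            }
            tag.appendReplacement(out, Matcher.quoteReplacement(text));
        }
        tag.appendTail(out);
        return out.toString();
    }

    public static void main(String[] args) {
        String html = "<a href=\"http://example.com\"> "
                + "<script src=\"http://google.com/js.js\" rel=\"external\">";
        System.out.println(removeProtocol(html, "http"));
    }
}
```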
isRemoveHttpsProtocol
public boolean isRemoveHttpsProtocol()
Returns true if the HTTPS protocol will be removed from href, src, cite, and action tag attributes.
- Returns:
true if the HTTPS protocol will be removed from href, src, cite, and action tag attributes.
setRemoveHttpsProtocol
public void setRemoveHttpsProtocol(boolean removeHttpsProtocol)
If set to true, the HTTPS protocol will be removed from href, src, cite, and action tag attributes. A URL without a protocol makes the browser use the document's current protocol instead.
Tags marked with rel="external" will be skipped.
For example:
<a href="https://example.com"> <script src="https://google.com/js.js" rel="external">
would become:
<a href="//example.com"> <script src="https://google.com/js.js" rel="external">
Default is false.
- Parameters:
removeHttpsProtocol - set true to remove the HTTPS protocol from tag attributes
isGenerateStatistics
public boolean isGenerateStatistics()
Returns true if HTML compression statistics are generated.
- Returns:
true if HTML compression statistics are generated
setGenerateStatistics
public void setGenerateStatistics(boolean generateStatistics)
If set to true, HTML compression statistics will be generated.
Important: Enabling statistics makes the HTML compressor not thread-safe.
Default is false.
- Parameters:
generateStatistics - set true to generate HTML compression statistics
getStatistics
public HtmlCompressorStatistics getStatistics()
Returns an HtmlCompressorStatistics object containing statistics of the last HTML compression, if enabled. Should be called after compress(String).
- Returns:
HtmlCompressorStatistics object containing the last HTML compression statistics
isPreserveLineBreaks
public boolean isPreserveLineBreaks()
Returns true if line breaks will be preserved.
- Returns:
true if line breaks will be preserved.
setPreserveLineBreaks
public void setPreserveLineBreaks(boolean preserveLineBreaks)
If set to true, line breaks will be preserved.
Default is false.
- Parameters:
preserveLineBreaks - set true to preserve line breaks
getRemoveSurroundingSpaces
public String getRemoveSurroundingSpaces()
Returns a comma-separated list of tags around which spaces will be removed.
- Returns:
a comma-separated list of tags around which spaces will be removed.
setRemoveSurroundingSpaces
public void setRemoveSurroundingSpaces(String tagList)
Enables removal of surrounding spaces around the provided comma-separated list of tags.
Besides custom-defined lists, you can pass one of 3 predefined lists of tags: BLOCK_TAGS_MIN, BLOCK_TAGS_MAX, ALL_TAGS.
- Parameters:
tagList - a comma-separated list of tags around which spaces will be removed
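To show how a comma-separated tag list could drive surrounding space removal, here is a minimal sketch. The class name SurroundingSpacesSketch, the helper method, and the generated pattern are assumptions; the contents of the predefined lists and the library's actual matching rules are not reproduced here.

```java
import java.util.regex.Pattern;

public class SurroundingSpacesSketch {
    /** Removes whitespace before and after opening and closing tags named in tagList. */
    public static String removeSurroundingSpaces(String html, String tagList) {
        // Build an alternation such as "p|br" from the comma-separated list.
        StringBuilder names = new StringBuilder();
        for (String tag : tagList.split(",")) {
            String name = tag.trim();
            if (name.isEmpty()) {
                continue;
            }
            if (names.length() > 0) {
                names.append('|');
            }
            names.append(Pattern.quote(name));
        }
        if (names.length() == 0) {
            return html; // nothing to do for an empty list
        }
        Pattern surrounding = Pattern.compile(
                "\\s*(</?(?:" + names + ")(?:\\s[^>]*)?/?>)\\s*",
                Pattern.CASE_INSENSITIVE);
        return surrounding.matcher(html).replaceAll("$1");
    }

    public static void main(String[] args) {
        System.out.println(removeSurroundingSpaces("a <p> b </p> <span> c </span>", "p"));
    }
}
```

Only the listed tags are affected, so spaces around inline tags such as <span> in the example stay in place.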