WordCount: Add types #22077

Merged
merged 22 commits into from Oct 19, 2020
4 changes: 2 additions & 2 deletions packages/wordcount/README.md
@@ -30,8 +30,8 @@ const numberOfWords = count( 'Words to count', 'words', {} )
_Parameters_

- _text_ `string`: The text being processed
- _type_ `string`: The type of count. Accepts ;words', 'characters_excluding_spaces', or 'characters_including_spaces'.
- _userSettings_ `Object`: Custom settings object.
- _type_ `WPWordCountStrategy`: The type of count. Accepts 'words', 'characters_excluding_spaces', or 'characters_including_spaces'.
- _userSettings_ `WPWordCountUserSettings`: Custom settings object.

_Returns_

37 changes: 37 additions & 0 deletions packages/wordcount/src/defaultSettings.js
@@ -1,3 +1,40 @@
/** @typedef {import('./index').WPWordCountStrategy} WPWordCountStrategy */

/** @typedef {Partial<{type: WPWordCountStrategy, shortcodes: string[]}>} WPWordCountL10n */

/**
* @typedef WPWordCountSettingsFields
* @property {RegExp} HTMLRegExp Regular expression that matches HTML tags
* @property {RegExp} HTMLcommentRegExp Regular expression that matches HTML comments
* @property {RegExp} spaceRegExp Regular expression that matches spaces in HTML
* @property {RegExp} HTMLEntityRegExp Regular expression that matches HTML entities
* @property {RegExp} connectorRegExp Regular expression that matches word connectors, like em-dash
* @property {RegExp} removeRegExp Regular expression that matches various characters to be removed when counting
* @property {RegExp} astralRegExp Regular expression that matches astral UTF-16 code points
* @property {RegExp} wordsRegExp Regular expression that matches words
* @property {RegExp} characters_excluding_spacesRegExp Regular expression that matches characters excluding spaces
* @property {RegExp} characters_including_spacesRegExp Regular expression that matches characters including spaces
* @property {RegExp} shortcodesRegExp Regular expression that matches WordPress shortcodes
* @property {string[]} shortcodes List of all shortcodes
* @property {WPWordCountStrategy} type Describes what and how we are counting
* @property {WPWordCountL10n} l10n Object with human translations
*/

/**
* Lower-level settings for word counting that can be overridden.
*
* @typedef {Partial<WPWordCountSettingsFields>} WPWordCountUserSettings
*/

// Disable reason: JSDoc linter doesn't seem to parse the intersection (`&`) correctly: https://github.com/jsdoc/jsdoc/issues/1285
/* eslint-disable jsdoc/valid-types */
/**
* Word counting settings that include non-optional values we set if missing
*
* @typedef {WPWordCountUserSettings & typeof defaultSettings} WPWordCountDefaultSettings
*/
/* eslint-enable jsdoc/valid-types */
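The typedefs above describe two shapes: a `Partial` type for what callers may pass, and an intersection type for what the package works with once defaults are filled in. The following is a minimal sketch of that idea, not the package's code. The names `SketchFields`, `SketchUserSettings`, `sketchDefaults`, and `withDefaults` are hypothetical, and only two fields are modelled.

```js
/**
 * @typedef SketchFields
 * @property {RegExp}   wordsRegExp Regular expression that matches words.
 * @property {string[]} shortcodes  List of shortcodes.
 */

/** @typedef {Partial<SketchFields>} SketchUserSettings */

// Hypothetical defaults; the real settings object has many more fields.
const sketchDefaults = {
	wordsRegExp: /\S+/g,
	shortcodes: [],
};

/**
 * Every key of the argument is optional; every key of the result is present.
 *
 * @param {SketchUserSettings} [userSettings] Caller overrides.
 * @return {SketchUserSettings & typeof sketchDefaults} Settings with defaults applied.
 */
function withDefaults( userSettings = {} ) {
	// Later spreads win, so user values override the defaults.
	return { ...sketchDefaults, ...userSettings };
}

const merged = withDefaults( { shortcodes: [ 'gallery' ] } );
```

Here `merged.wordsRegExp` comes from the defaults while `merged.shortcodes` comes from the caller, which is the guarantee the intersection type is meant to express.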

export const defaultSettings = {
HTMLRegExp: /<\/?[a-z][^>]*?>/gi,
HTMLcommentRegExp: /<!--[\s\S]*?-->/g,
111 changes: 61 additions & 50 deletions packages/wordcount/src/index.js
@@ -17,18 +17,29 @@ import stripShortcodes from './stripShortcodes';
import stripSpaces from './stripSpaces';
import transposeHTMLEntitiesToCountableChars from './transposeHTMLEntitiesToCountableChars';

/**
* @typedef {import('./defaultSettings').WPWordCountDefaultSettings} WPWordCountSettings
* @typedef {import('./defaultSettings').WPWordCountUserSettings} WPWordCountUserSettings
*/

/**
* Possible ways of counting.
*
* @typedef {'words'|'characters_excluding_spaces'|'characters_including_spaces'} WPWordCountStrategy
*/

/**
* Private function to manage the settings.
*
* @param {string} type The type of count to be done.
* @param {Object} userSettings Custom settings for the count.
* @param {WPWordCountStrategy} type The type of count to be done.
* @param {WPWordCountUserSettings} userSettings Custom settings for the count.
*
* @return {void|Object|*} The combined settings object to be used.
* @return {WPWordCountSettings} The combined settings object to be used.
*/
function loadSettings( type, userSettings ) {
const settings = extend( defaultSettings, userSettings );
const settings = extend( {}, defaultSettings, userSettings );

settings.shortcodes = settings.l10n.shortcodes || {};
settings.shortcodes = settings.l10n?.shortcodes ?? [];

if ( settings.shortcodes && settings.shortcodes.length ) {
settings.shortcodesRegExp = new RegExp(
@@ -37,7 +48,7 @@ function loadSettings( type, userSettings ) {
);
}

settings.type = type || settings.l10n.type;
settings.type = type;

if (
settings.type !== 'characters_excluding_spaces' &&
@@ -50,56 +61,56 @@ function loadSettings( type, userSettings ) {
}
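
Two changes in `loadSettings` are easy to miss. Passing `{}` as the first argument keeps the merge from writing into `defaultSettings`, and `l10n?.shortcodes ?? []` tolerates a missing `l10n` object. Below is a small sketch of both. It assumes that `extend` mutates and returns its first argument the way `Object.assign` does (the import of `extend` is not shown in this diff), and the `defaults` object and `loadSketchSettings` name are hypothetical.

```js
// Hypothetical defaults, for illustration only.
const defaults = { type: 'words', l10n: undefined };

function loadSketchSettings( userSettings ) {
	// Merging into a fresh object leaves `defaults` untouched between calls.
	const settings = Object.assign( {}, defaults, userSettings );

	// Optional chaining avoids a TypeError when `l10n` is undefined;
	// `??` then supplies an empty list instead of `undefined`.
	settings.shortcodes = settings.l10n?.shortcodes ?? [];

	return settings;
}

const first = loadSketchSettings( { l10n: { shortcodes: [ 'gallery' ] } } );
const second = loadSketchSettings( {} );
```

With the old one-argument-less form, the first call's `l10n` would have leaked into the shared defaults and shown up in the second call.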

/**
* Match the regex for the type 'words'
* Count the words in text
*
* @param {string} text The text being processed
* @param {string} regex The regular expression pattern being matched
* @param {Object} settings Settings object containing regular expressions for each strip function
* @param {string} text The text being processed
* @param {RegExp} regex The regular expression pattern being matched
* @param {WPWordCountSettings} settings Settings object containing regular expressions for each strip function
*
* @return {Array|{index: number, input: string}} The matched string.
* @return {number} Count of words.
*/
function matchWords( text, regex, settings ) {
function countWords( text, regex, settings ) {
text = flow(
stripTags.bind( this, settings ),
stripHTMLComments.bind( this, settings ),
stripShortcodes.bind( this, settings ),
stripSpaces.bind( this, settings ),
stripHTMLEntities.bind( this, settings ),
stripConnectors.bind( this, settings ),
stripRemovables.bind( this, settings )
stripTags.bind( null, settings ),
stripHTMLComments.bind( null, settings ),
stripShortcodes.bind( null, settings ),
stripSpaces.bind( null, settings ),
stripHTMLEntities.bind( null, settings ),
stripConnectors.bind( null, settings ),
stripRemovables.bind( null, settings )
)( text );
text = text + '\n';
return text.match( regex );
return text.match( regex )?.length ?? 0;
}
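
`String.prototype.match` called with a global regex returns `null`, not an empty array, when nothing matches. That is why the new return line reads `?.length ?? 0`. A minimal sketch, using a hypothetical helper `countMatches` and a simplified word regex:

```js
// Returns the number of matches, treating "no match" (null) as zero.
function countMatches( text, regex ) {
	return text.match( regex )?.length ?? 0;
}

const none = countMatches( '   ', /\S+/g );
const three = countMatches( 'one two three', /\S+/g );
```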

/**
* Match the regex for either 'characters_excluding_spaces' or 'characters_including_spaces'
* Count the characters in text
*
* @param {string} text The text being processed
* @param {string} regex The regular expression pattern being matched
* @param {Object} settings Settings object containing regular expressions for each strip function
* @param {string} text The text being processed
* @param {RegExp} regex The regular expression pattern being matched
* @param {WPWordCountSettings} settings Settings object containing regular expressions for each strip function
*
* @return {Array|{index: number, input: string}} The matched string.
* @return {number} Count of characters.
*/
function matchCharacters( text, regex, settings ) {
function countCharacters( text, regex, settings ) {
text = flow(
stripTags.bind( this, settings ),
stripHTMLComments.bind( this, settings ),
stripShortcodes.bind( this, settings ),
stripSpaces.bind( this, settings ),
transposeAstralsToCountableChar.bind( this, settings ),
transposeHTMLEntitiesToCountableChars.bind( this, settings )
stripTags.bind( null, settings ),
stripHTMLComments.bind( null, settings ),
stripShortcodes.bind( null, settings ),
transposeAstralsToCountableChar.bind( null, settings ),
stripSpaces.bind( null, settings ),
transposeHTMLEntitiesToCountableChars.bind( null, settings )
)( text );
text = text + '\n';
return text.match( regex );
return text.match( regex )?.length ?? 0;
}
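
Both counting functions build a pipeline by partially applying `settings` to each step with `bind( null, settings )`, so every step becomes a `text => text` function. The sketch below shows that pattern with a hand-written `flow` standing in for the imported helper. `HTMLRegExp` is copied from this diff; the `astralRegExp` value (a UTF-16 surrogate pair) is an assumption, since the default is not shown here.

```js
// Minimal stand-in for a left-to-right function composer.
const flow = ( ...fns ) => ( input ) =>
	fns.reduce( ( acc, fn ) => fn( acc ), input );

const sketchSettings = {
	HTMLRegExp: /<\/?[a-z][^>]*?>/gi,
	// Assumption: one high surrogate followed by one low surrogate.
	astralRegExp: /[\uD800-\uDBFF][\uDC00-\uDFFF]/g,
};

function stripTags( settings, text ) {
	return text.replace( settings.HTMLRegExp, '\n' );
}

function transposeAstralsToCountableChar( settings, text ) {
	return text.replace( settings.astralRegExp, 'a' );
}

// None of the steps reads `this`, so `null` is a safe first argument to bind.
const prepare = flow(
	stripTags.bind( null, sketchSettings ),
	transposeAstralsToCountableChar.bind( null, sketchSettings )
);

// U+1F600 occupies two UTF-16 code units; after transposition it is one.
const prepared = prepare( '<p>Hi \u{1F600}</p>' );
```

Replacing each surrogate pair with a single `a` means a later per-character regex counts the emoji once rather than twice.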

/**
* Count some words.
*
* @param {string} text The text being processed
* @param {string} type The type of count. Accepts ;words', 'characters_excluding_spaces', or 'characters_including_spaces'.
* @param {Object} userSettings Custom settings object.
* @param {string} text The text being processed
* @param {WPWordCountStrategy} type The type of count. Accepts 'words', 'characters_excluding_spaces', or 'characters_including_spaces'.
* @param {WPWordCountUserSettings} userSettings Custom settings object.
*
* @example
* ```js
@@ -109,20 +120,20 @@ function matchCharacters( text, regex, settings ) {
*
* @return {number} The word or character count.
*/

export function count( text, type, userSettings ) {
if ( '' === text ) {
return 0;
}

if ( text ) {
const settings = loadSettings( type, userSettings );
const matchRegExp = settings[ type + 'RegExp' ];
const results =
'words' === settings.type
? matchWords( text, matchRegExp, settings )
: matchCharacters( text, matchRegExp, settings );

return results ? results.length : 0;
const settings = loadSettings( type, userSettings );
let matchRegExp;
switch ( settings.type ) {
case 'words':
matchRegExp = settings.wordsRegExp;
return countWords( text, matchRegExp, settings );
case 'characters_including_spaces':
matchRegExp = settings.characters_including_spacesRegExp;
return countCharacters( text, matchRegExp, settings );
case 'characters_excluding_spaces':
matchRegExp = settings.characters_excluding_spacesRegExp;
return countCharacters( text, matchRegExp, settings );
default:
return 0;
Comment on lines +126 to +137
I really like how this section turned out. It's nice that the count…() functions return a number result.

}
}
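
The old code looked up the regex with a computed key, `settings[ type + 'RegExp' ]`. The switch names each property literally instead, which presumably lets a type checker verify every access, and it gives unknown strategies an explicit `default`. A self-contained sketch of the same dispatch follows. The three regexes are simplified, hypothetical stand-ins for the package defaults, and `sketchCount` skips all of the stripping steps.

```js
// Simplified stand-ins; the package's default regexes are more thorough.
const sketchSettings = {
	wordsRegExp: /\S+/g,
	characters_excluding_spacesRegExp: /\S/g,
	characters_including_spacesRegExp: /[\s\S]/g,
};

const countMatches = ( text, regex ) => text.match( regex )?.length ?? 0;

function sketchCount( text, type ) {
	switch ( type ) {
		case 'words':
			return countMatches( text, sketchSettings.wordsRegExp );
		case 'characters_including_spaces':
			return countMatches( text, sketchSettings.characters_including_spacesRegExp );
		case 'characters_excluding_spaces':
			return countMatches( text, sketchSettings.characters_excluding_spacesRegExp );
		default:
			// Unknown strategies count as zero rather than throwing.
			return 0;
	}
}

const words = sketchCount( 'Words to count', 'words' );
const withSpaces = sketchCount( 'Words to count', 'characters_including_spaces' );
const withoutSpaces = sketchCount( 'Words to count', 'characters_excluding_spaces' );
```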
9 changes: 3 additions & 6 deletions packages/wordcount/src/stripConnectors.js
@@ -1,14 +1,11 @@
/**
* Replaces items matched in the regex with spaces.
*
* @param {Object} settings The main settings object containing regular expressions
* @param {string} text The string being counted.
* @param {import('./index').WPWordCountSettings} settings The main settings object containing regular expressions
* @param {string} text The string being counted.
*
* @return {string} The manipulated text.
*/
export default function stripConnectors( settings, text ) {
if ( settings.connectorRegExp ) {
return text.replace( settings.connectorRegExp, ' ' );
}
return text;
return text.replace( settings.connectorRegExp, ' ' );
}
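
The `if ( settings.connectorRegExp )` guard can go because the settings type now promises that the regex is always present. Note that connectors are replaced with a space, not removed, so two words joined by a dash still count as two. A short sketch, assuming a hypothetical `connectorRegExp` that matches a double hyphen or an em dash (the default value is not shown in this diff):

```js
// Assumption: connectors are "--" or U+2014 (em dash).
const sketchSettings = { connectorRegExp: /--|\u2014/g };

function stripConnectors( settings, text ) {
	return text.replace( settings.connectorRegExp, ' ' );
}

const stripped = stripConnectors( sketchSettings, 'well\u2014known fact' );
const wordTotal = stripped.match( /\S+/g )?.length ?? 0;
```

Had the connector been deleted outright, `wellknown` would have been counted as a single word.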
9 changes: 3 additions & 6 deletions packages/wordcount/src/stripHTMLComments.js
@@ -1,14 +1,11 @@
/**
* Removes items matched in the regex.
*
* @param {Object} settings The main settings object containing regular expressions
* @param {string} text The string being counted.
* @param {import('./index').WPWordCountSettings} settings The main settings object containing regular expressions
* @param {string} text The string being counted.
*
* @return {string} The manipulated text.
*/
export default function stripHTMLComments( settings, text ) {
if ( settings.HTMLcommentRegExp ) {
return text.replace( settings.HTMLcommentRegExp, '' );
}
return text;
return text.replace( settings.HTMLcommentRegExp, '' );
}
9 changes: 3 additions & 6 deletions packages/wordcount/src/stripHTMLEntities.js
@@ -1,14 +1,11 @@
/**
* Removes items matched in the regex.
*
* @param {Object} settings The main settings object containing regular expressions
* @param {string} text The string being counted.
* @param {import('./index').WPWordCountSettings} settings The main settings object containing regular expressions
* @param {string} text The string being counted.
*
* @return {string} The manipulated text.
*/
export default function stripHTMLEntities( settings, text ) {
if ( settings.HTMLEntityRegExp ) {
return text.replace( settings.HTMLEntityRegExp, '' );
}
return text;
return text.replace( settings.HTMLEntityRegExp, '' );
}
9 changes: 3 additions & 6 deletions packages/wordcount/src/stripRemovables.js
@@ -1,14 +1,11 @@
/**
* Removes items matched in the regex.
*
* @param {Object} settings The main settings object containing regular expressions
* @param {string} text The string being counted.
* @param {import('./index').WPWordCountSettings} settings The main settings object containing regular expressions
* @param {string} text The string being counted.
*
* @return {string} The manipulated text.
*/
export default function stripRemovables( settings, text ) {
if ( settings.removeRegExp ) {
return text.replace( settings.removeRegExp, '' );
}
return text;
return text.replace( settings.removeRegExp, '' );
}
4 changes: 2 additions & 2 deletions packages/wordcount/src/stripShortcodes.js
@@ -1,8 +1,8 @@
/**
* Replaces items matched in the regex with a new line.
*
* @param {Object} settings The main settings object containing regular expressions
* @param {string} text The string being counted.
* @param {import('./index').WPWordCountSettings} settings The main settings object containing regular expressions
* @param {string} text The string being counted.
*
* @return {string} The manipulated text.
*/
8 changes: 3 additions & 5 deletions packages/wordcount/src/stripSpaces.js
@@ -1,13 +1,11 @@
/**
* Replaces items matched in the regex with spaces.
*
* @param {Object} settings The main settings object containing regular expressions
* @param {string} text The string being counted.
* @param {import('./index').WPWordCountSettings} settings The main settings object containing regular expressions
* @param {string} text The string being counted.
*
* @return {string} The manipulated text.
*/
export default function stripSpaces( settings, text ) {
if ( settings.spaceRegExp ) {
return text.replace( settings.spaceRegExp, ' ' );
}
return text.replace( settings.spaceRegExp, ' ' );
}
8 changes: 3 additions & 5 deletions packages/wordcount/src/stripTags.js
@@ -1,13 +1,11 @@
/**
* Replaces items matched in the regex with a new line.
*
* @param {Object} settings The main settings object containing regular expressions
* @param {string} text The string being counted.
* @param {import('./index').WPWordCountSettings} settings The main settings object containing regular expressions
* @param {string} text The string being counted.
*
* @return {string} The manipulated text.
*/
export default function stripTags( settings, text ) {
if ( settings.HTMLRegExp ) {
return text.replace( settings.HTMLRegExp, '\n' );
}
return text.replace( settings.HTMLRegExp, '\n' );
}
9 changes: 3 additions & 6 deletions packages/wordcount/src/transposeAstralsToCountableChar.js
@@ -1,14 +1,11 @@
/**
* Replaces items matched in the regex with a character.
*
* @param {Object} settings The main settings object containing regular expressions
* @param {string} text The string being counted.
* @param {import('./index').WPWordCountSettings} settings The main settings object containing regular expressions
* @param {string} text The string being counted.
*
* @return {string} The manipulated text.
*/
export default function transposeAstralsToCountableChar( settings, text ) {
if ( settings.astralRegExp ) {
return text.replace( settings.astralRegExp, 'a' );
}
return text;
return text.replace( settings.astralRegExp, 'a' );
}
9 changes: 3 additions & 6 deletions packages/wordcount/src/transposeHTMLEntitiesToCountableChars.js
@@ -1,17 +1,14 @@
/**
* Replaces items matched in the regex with a single character.
*
* @param {Object} settings The main settings object containing regular expressions
* @param {string} text The string being counted.
* @param {import('./index').WPWordCountSettings} settings The main settings object containing regular expressions
* @param {string} text The string being counted.
*
* @return {string} The manipulated text.
*/
export default function transposeHTMLEntitiesToCountableChars(
settings,
text
) {
if ( settings.HTMLEntityRegExp ) {
return text.replace( settings.HTMLEntityRegExp, 'a' );
}
return text;
return text.replace( settings.HTMLEntityRegExp, 'a' );
}
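
For character counts an HTML entity should count as the one character it renders, which is why this function substitutes a placeholder `a` while `stripHTMLEntities` (used for word counts) removes the entity entirely. A minimal sketch, assuming a hypothetical `HTMLEntityRegExp` of `/&\S+?;/g` and a shortened function name:

```js
// Assumption: an entity is "&", one or more non-space characters, then ";".
const sketchSettings = { HTMLEntityRegExp: /&\S+?;/g };

function transposeEntities( settings, text ) {
	return text.replace( settings.HTMLEntityRegExp, 'a' );
}

// "&amp;" is five characters of markup but renders as a single "&".
const transposed = transposeEntities( sketchSettings, 'Fish &amp; chips' );
```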
8 changes: 8 additions & 0 deletions packages/wordcount/tsconfig.json
@@ -0,0 +1,8 @@
{
"extends": "../../tsconfig.base.json",
"compilerOptions": {
"rootDir": "src",
"declarationDir": "build-types"
},
"include": [ "src/**/*" ]
}
3 changes: 2 additions & 1 deletion tsconfig.json
@@ -21,7 +21,8 @@
{ "path": "packages/project-management-automation" },
{ "path": "packages/token-list" },
{ "path": "packages/url" },
{ "path": "packages/warning" }
{ "path": "packages/warning" },
{ "path": "packages/wordcount" }
],
"files": []
}