Find The Common Occurrences Of Words In Two String Values
Solution 1:
You can use a first regular expression as a tokenizer to split the tester string into a list of words, then use those words to build a second regular expression that matches any word in the list. For example:
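var tester = "a string with a lot of words";

function getMeRepeatedWordsDetails ( sentence ) {
    // A trailing space lets \W match after the last word.
    sentence = sentence + " ";
    // Tokenizer: runs of non-whitespace characters.
    var regex = /[^\s]+/g;
    // Alternation of every word in tester, followed by one non-word character.
    var regex2 = new RegExp ( "(" + tester.match ( regex ).join ( "|" ) + ")\\W", "g" );
    // match() returns null when nothing matches, so fall back to an empty array.
    var matches = sentence.match ( regex2 ) || [];
    var words = {};
    for ( var i = 0; i < matches.length; i++ ) {
        // Strip the extra character captured by \W.
        var match = matches [ i ].replace ( /\W/g, "" );
        var w = words [ match ];
        if ( ! w )
            words [ match ] = 1;
        else
            words [ match ]++;
    }
    return words;
}

console.log ( getMeRepeatedWordsDetails ( "another string with some words" ) );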
The tokenizer is the line:
var regex = /[^\s]+/g;
When you do:
tester.match ( regex )
you get the list of words contained in tester:
[ "a", "string", "with", "a", "lot", "of", "words" ]
With such an array we build a second regular expression that matches all the words; regex2 has the form:
/(a|string|with|a|lot|of|words)\W/g
The \W is added so that a match must stop at the end of a word; otherwise the a element would match any word beginning with a. Note that nothing anchors the start of a match, so a short word such as a can still match the tail of a longer word (for example the final a in "data"). The result of applying regex2 to sentence is another array containing only the words that appear in regex2, that is, the words contained in both tester and sentence. The for loop then counts the words in the matches array, turning it into the object you requested.
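To make the intermediate values concrete, the following sketch traces the same steps for the two example strings. The variable names (wordList, pattern, rawMatches, counts) are illustrative only and do not appear in the function above.

```javascript
// Trace of the intermediate values for the two example strings.
var tester = "a string with a lot of words";
// The trailing space is needed so that \W can match after the last word.
var sentence = "another string with some words" + " ";

// Step 1: tokenize tester on whitespace.
var wordList = tester.match(/[^\s]+/g);
// ["a", "string", "with", "a", "lot", "of", "words"]

// Step 2: build the alternation; "\\W" in the string becomes \W in the pattern.
var pattern = new RegExp("(" + wordList.join("|") + ")\\W", "g");

// Step 3: each raw match still carries the trailing non-word character.
var rawMatches = sentence.match(pattern) || [];
// ["string ", "with ", "words "]

// Step 4: strip that character and count the occurrences.
var counts = {};
rawMatches.forEach(function (m) {
  var word = m.replace(/\W/g, "");
  counts[word] = (counts[word] || 0) + 1;
});

console.log(counts); // { string: 1, with: 1, words: 1 }
```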
But beware that:
- you have to put at least one space at the end of sentence, otherwise the \W in regex2 doesn't match after the last word: sentence = sentence + " "
- you have to remove the extra character that the \W captured from each match: match = matches [ i ].replace ( /\W/g, "" )
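Both caveats come from using \W to delimit words. As a possible alternative, the sketch below uses the \b word-boundary assertion, which consumes no characters, so neither the trailing space nor the cleanup step is needed; it also keeps a short word such as a from matching the end of a longer word. The names countCommonWords and escapeRegExp are hypothetical, and the escaping step is an added assumption so that tokens containing characters such as ( or + do not break the pattern. It assumes the words consist of ASCII letters, digits, or underscores; tokens with attached punctuation will generally not match under \b.

```javascript
// Hypothetical variant: word boundaries (\b) instead of a trailing \W.
// escapeRegExp and countCommonWords are illustrative names, not from the answer above.
function escapeRegExp(text) {
  // Escape characters that have a special meaning inside a regular expression.
  return text.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

function countCommonWords(tester, sentence) {
  // match() returns null when tester contains no words.
  var tokens = tester.match(/\S+/g) || [];
  if (tokens.length === 0) {
    return {}; // avoid building an empty alternation, which would match everywhere
  }
  var pattern = new RegExp(
    "\\b(?:" + tokens.map(escapeRegExp).join("|") + ")\\b",
    "g"
  );
  var counts = {};
  (sentence.match(pattern) || []).forEach(function (word) {
    counts[word] = (counts[word] || 0) + 1;
  });
  return counts;
}

console.log(countCommonWords("a string with a lot of words", "another string with some words"));
// { string: 1, with: 1, words: 1 }

// With the \W version, the final "a" of "data" would be counted as the word "a".
console.log(countCommonWords("a lot", "some data"));
// {}
```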