<?xml version="1.0" encoding="utf-8"?>
<overlay xmlns="http://hoa-project.net/xyl/xylophone">
<yield id="chapter">
<p>Strings can sometimes be <strong>complex</strong>, especially when they use
the <code>Unicode</code> encoding format. The <code>Hoa\Ustring</code> library
provides several operations on UTF-8 strings.</p>
<h2 id="Table_of_contents">Table of contents</h2>
<tableofcontents id="main-toc" />
<h2 id="Introduction" for="main-toc">Introduction</h2>
<p>When we manipulate strings, the <a href="http://unicode.org/">Unicode</a>
format establishes itself because of its <strong>compatibility</strong> with
historical formats (like ASCII) and its capacity to represent a
<strong>large</strong> range of characters and symbols for all cultures and
all regions of the world. PHP provides several tools to manipulate such
strings, like the following extensions:
<a href="http://php.net/mbstring"><code>mbstring</code></a>,
<a href="http://php.net/iconv"><code>iconv</code></a> and the excellent
<a href="http://php.net/intl"><code>intl</code></a>, which is based on
<a href="http://icu-project.org/">ICU</a>, the reference implementation of
Unicode. Unfortunately, we sometimes have to mix these extensions to achieve
our aims, at the cost of a certain <strong>complexity</strong> along with
a regrettable <strong>verbosity</strong>.</p>
<p>The <code>Hoa\Ustring</code> library addresses these issues by providing a
<strong>simple</strong> way to manipulate strings with
<strong>performance</strong> and <strong>efficiency</strong> in mind. It
also provides some advanced algorithms to perform <strong>search</strong>
operations on strings.</p>
<h2 id="Unicode_strings" for="main-toc">Unicode strings</h2>
<p>The <code>Hoa\Ustring\Ustring</code> class represents a
<strong>UTF-8</strong> Unicode string and makes it easy to manipulate.
This class implements the
<a href="http://php.net/arrayaccess"><code>ArrayAccess</code></a>,
<a href="http://php.net/countable"><code>Countable</code></a> and
<a href="http://php.net/iteratoraggregate"><code>IteratorAggregate</code></a>
interfaces. We are going to use three examples in three different languages:
French, Arabic and Japanese. Thus:</p>
<pre><code class="language-php">$french = new Hoa\Ustring\Ustring('Je t\'aime');
$arabic = new Hoa\Ustring\Ustring('أحبك');
$japanese = new Hoa\Ustring\Ustring('私はあなたを愛して');</code></pre>
<p>Now, let's see what we can do on these three strings.</p>
<h3 id="String_manipulation" for="main-toc">String manipulation</h3>
<p>Let's start with <strong>elementary</strong> operations. If we would like
to <strong>count</strong> the number of characters (not bytes), we will use
the <a href="http://php.net/count"><code>count</code> function</a>. Thus:</p>
<pre><code class="language-php">var_dump(
count($french),
count($arabic),
count($japanese)
);
/**
* Will output:
* int(9)
* int(4)
* int(9)
*/</code></pre>
<p>When we speak about text position, it is not suitable to speak about the
right or the left, but rather about a <strong>beginning</strong> or an
<strong>end</strong>, and based on the <strong>direction</strong> of writing.
We can know this direction thanks to the
<code>Hoa\Ustring\Ustring::getDirection</code> method. It returns the value of
one of the following constants:</p>
<ul>
<li><code>Hoa\Ustring\Ustring::LTR</code>, for left-to-right, if the text is
written from the left to the right,</li>
<li><code>Hoa\Ustring\Ustring::RTL</code>, for right-to-left, if the text is
written from the right to the left.</li>
</ul>
<p>Let's observe the result with our examples:</p>
<pre><code class="language-php">var_dump(
$french->getDirection() === Hoa\Ustring\Ustring::LTR, // is left-to-right?
$arabic->getDirection() === Hoa\Ustring\Ustring::RTL, // is right-to-left?
$japanese->getDirection() === Hoa\Ustring\Ustring::LTR // is left-to-right?
);
/**
* Will output:
* bool(true)
* bool(true)
* bool(true)
*/</code></pre>
<p>The result of this method is computed thanks to the
<code>Hoa\Ustring\Ustring::getCharDirection</code> static method which computes
the direction of only one character.</p>
<p>If we would like to <strong>concatenate</strong> another string to the end
or to the beginning, we will respectively use the
<code>Hoa\Ustring\Ustring::append</code> and
<code>Hoa\Ustring\Ustring::prepend</code> methods. These methods, like most of
those that modify the string, return the object itself, so that calls can be
chained. For instance:</p>
<pre><code class="language-php">echo $french->append('… et toi, m\'aimes-tu ?')->prepend('Mam\'zelle ! ');
/**
* Will output:
* Mam'zelle ! Je t'aime… et toi, m'aimes-tu ?
*/</code></pre>
<p>We also have the <code>Hoa\Ustring\Ustring::toLowerCase</code> and
<code>Hoa\Ustring\Ustring::toUpperCase</code> methods to, respectively, set
the case of the string to lower or upper. For instance:</p>
<pre><code class="language-php">echo $french->toUpperCase();
/**
* Will output:
* MAM'ZELLE ! JE T'AIME… ET TOI, M'AIMES-TU ?
*/</code></pre>
<p>We can also add characters to the beginning or to the end of the string to
reach a <strong>minimum</strong> length. This operation is frequently called
<em>padding</em> (for historical reasons dating back to typewriters).
That's why we have the <code>Hoa\Ustring\Ustring::pad</code> method, which
takes three arguments: the minimum length, the characters to add and a constant
indicating whether we have to add at the end or at the beginning of the string
(respectively <code>Hoa\Ustring\Ustring::END</code>, by default, and
<code>Hoa\Ustring\Ustring::BEGINNING</code>).</p>
<pre><code class="language-php">echo $arabic->pad(20, ' ');
/**
* Will output:
* أحبك
*/</code></pre>
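<p>Padding has to count characters, not bytes. The following is only a minimal
sketch of that idea in plain PHP, assuming the <code>mbstring</code> extension
is loaded and the padding piece is a single character; <code>padEnd</code> is
a hypothetical helper, not a method of <code>Hoa\Ustring</code>:</p>
<pre><code class="language-php">// Hypothetical helper (not part of Hoa\Ustring): pad at the end, counting
// characters rather than bytes. Assumes $piece is a single character.
function padEnd(string $string, int $minimum, string $piece = ' '): string
{
    $missing = max(0, $minimum - mb_strlen($string, 'UTF-8'));

    return $string . str_repeat($piece, $missing);
}

$arabic = 'أحبك'; // 4 characters stored on 8 bytes

var_dump(
    mb_strlen(str_pad($arabic, 20, ' '), 'UTF-8'), // byte-based padding
    mb_strlen(padEnd($arabic, 20), 'UTF-8')        // character-based padding
);

/**
 * Will output:
 *     int(16)
 *     int(20)
 */</code></pre>
<p>The native <code>str_pad</code> function stops at 20 bytes, so the padded
string is only 16 characters long.</p>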
<p>A similar operation removes, by default, <strong>spaces</strong>
at the beginning and at the end of the string thanks to the
<code>Hoa\Ustring\Ustring::trim</code> method. For example, to retrieve our
original Arabic string:</p>
<pre><code class="language-php">echo $arabic->trim();
/**
* Will output:
* أحبك
*/</code></pre>
<p>If we would like to remove other characters, we can use the first argument
of <code>trim</code>, which must be a regular expression. Finally, its second
argument specifies the side from which characters are removed: the beginning,
the end or both, still by using the
<code>Hoa\Ustring\Ustring::BEGINNING</code> and
<code>Hoa\Ustring\Ustring::END</code> constants. We can combine these
constants to express “both sides”, which is the default value:
<code class="language-php">Hoa\Ustring\Ustring::BEGINNING |
Hoa\Ustring\Ustring::END</code>. For example, to remove all the numbers and
the spaces only at the end, we will write:</p>
<pre><code class="language-php">$arabic->trim('\s|\d', Hoa\Ustring\Ustring::END);</code></pre>
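<p>To make the effect of such a regular expression concrete, here is a rough
equivalent written with <code>preg_replace</code>. It is a sketch only:
<code>trimEnd</code> is a hypothetical helper, it assumes the expression does
not contain the <code>#</code> delimiter, and it does not claim to reproduce
the library's implementation:</p>
<pre><code class="language-php">// Hypothetical helper: remove, at the end only, every run of characters
// matching $regex. The u modifier makes PCRE read the subject as UTF-8.
function trimEnd(string $string, string $regex = '\s'): string
{
    return preg_replace('#(?:' . $regex . ')+$#u', '', $string);
}

var_dump(trimEnd('أحبك 42  ', '\s|\d'));

/**
 * Will output:
 *     string(8) "أحبك"
 */</code></pre>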
<p>We can also <strong>reduce</strong> the string to a
<strong>sub-string</strong> by specifying the position of the first character
followed by the length of the sub-string to the
<code>Hoa\Ustring\Ustring::reduce</code> method:</p>
<pre><code class="language-php">echo $french->reduce(3, 6)->reduce(2, 4);
/**
* Will output:
* aime
*/</code></pre>
<p>If we would like to get a specific character, we can rely on the
<code>ArrayAccess</code> interface. For instance, to get the first character
of each of our examples (from their original definitions):</p>
<pre><code class="language-php">var_dump(
$french[0],
$arabic[0],
$japanese[0]
);
/**
* Will output:
* string(1) "J"
* string(2) "أ"
* string(3) "私"
*/</code></pre>
<p>If we would like the last character, we will use the -1 index. The index is
not bounded by the length of the string: if it exceeds this length, a
<em>modulo</em> is applied.</p>
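<p>One way to picture this modulo is the sketch below, in plain PHP with the
<code>mbstring</code> extension. <code>wrapIndex</code> is a hypothetical
helper and the exact wrapping rule is an assumption made for illustration, not
a copy of the library's code:</p>
<pre><code class="language-php">// Hypothetical helper: map any integer index onto 0 … length - 1.
function wrapIndex(int $index, int $length): int
{
    return (($index % $length) + $length) % $length;
}

$french = 'Je t\'aime';
$length = mb_strlen($french, 'UTF-8'); // 9 characters

var_dump(
    mb_substr($french, wrapIndex(-1, $length), 1, 'UTF-8'), // last character
    mb_substr($french, wrapIndex(9, $length), 1, 'UTF-8'),  // wraps to index 0
    mb_substr($french, wrapIndex(-10, $length), 1, 'UTF-8') // wraps to index 8
);

/**
 * Will output:
 *     string(1) "e"
 *     string(1) "J"
 *     string(1) "e"
 */</code></pre>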
<p>We can also modify or remove a specific character with this method. For
example:</p>
<pre><code class="language-php">$french->append(' ?');
$french[-1] = '!';
echo $french;
/**
* Will output:
* Je t'aime !
*/</code></pre>
<p>Another very useful method is the <strong>ASCII</strong> transformation.
Be careful, this is not always possible, according to your settings. For
example:</p>
<pre><code class="language-php">$title = new Hoa\Ustring\Ustring('Un été brûlant sur la côte');
echo $title->toAscii();
/**
* Will output:
* Un ete brulant sur la cote
*/</code></pre>
<p>We can also transform from Arabic or Japanese to ASCII. Symbols, like
mathematical symbols or emojis, are also transformed:</p>
<pre><code class="language-php">$emoji = new Hoa\Ustring\Ustring('I ❤ Unicode');
$maths = new Hoa\Ustring\Ustring('∀ i ∈ ℕ');
echo
$arabic->toAscii(), "\n",
$japanese->toAscii(), "\n",
$emoji->toAscii(), "\n",
$maths->toAscii(), "\n";
/**
* Will output:
* ahbk
* sihaanatawo aishite
* I (heavy black heart) Unicode
* (for all) i (element of) N
*/</code></pre>
<p>For this method to work correctly, the
<a href="http://php.net/intl"><code>intl</code></a> extension needs to be
present, so that the
<a href="http://php.net/transliterator"><code>Transliterator</code></a> class
is available. If it does not exist, the
<a href="http://php.net/normalizer"><code>Normalizer</code></a> class must
exist. If this class does not exist either, the
<code>Hoa\Ustring\Ustring::toAscii</code> method can still try a
transformation, but it is less efficient. To activate this last solution,
<code>true</code> must be passed as a single argument. This <em lang="fr">tour
de force</em> is not recommended in most cases.</p>
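<p>The fallback order described above can be sketched in plain PHP as follows.
This is an illustration under assumptions, not the library's code:
<code>toAsciiSketch</code> is a hypothetical helper, the
<code>Any-Latin; Latin-ASCII</code> identifier is our own choice, and the
function simply gives up with <code>null</code> when neither class
exists:</p>
<pre><code class="language-php">// Hypothetical helper: try the intl classes in the order described above
// and return null when neither of them is available.
function toAsciiSketch(string $string): ?string
{
    if (class_exists('Transliterator')) {
        $ascii = transliterator_transliterate('Any-Latin; Latin-ASCII', $string);

        return false === $ascii ? null : $ascii;
    }

    if (class_exists('Normalizer')) {
        // Decompose accented letters, then drop the combining marks.
        $decomposed = Normalizer::normalize($string, Normalizer::FORM_D);

        return preg_replace('#\p{Mn}+#u', '', $decomposed);
    }

    return null;
}

var_dump(toAsciiSketch('Un été brûlant sur la côte'));

/**
 * Will output, when the intl extension is loaded:
 *     string(26) "Un ete brulant sur la cote"
 */</code></pre>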
<p>We also find the <code>getTransliterator</code> method which returns a
<code>Transliterator</code> object, or <code>null</code> if this class does
not exist. This method takes a transliteration identifier as argument. We
suggest <a href="http://userguide.icu-project.org/transforms/general">reading
the documentation about the transliterator of ICU</a> to understand this
identifier. The <code>transliterate</code> method transliterates the
current string based on an identifier, a beginning index and an end
index. This method works the same way as the
<a href="http://php.net/transliterator.transliterate"><code>Transliterator::transliterate</code></a>
method.</p>
<p>More generally, to change the <strong>encoding</strong> format, we can use
the <code>Hoa\Ustring\Ustring::transcode</code> static method, with a string
as first argument, the original encoding format as second argument and the
expected encoding format as third argument (UTF-8 by default). To get the
list of encoding formats, we have to refer to the
<a href="http://php.net/iconv"><code>iconv</code></a> extension or to use the
following command line in a terminal:</p>
<pre><code class="language-php">$ iconv --list</code></pre>
<p>To know if a string is encoded in UTF-8, we can use the
<code>Hoa\Ustring\Ustring::isUtf8</code> static method; for instance:</p>
<pre><code class="language-php">var_dump(
Hoa\Ustring\Ustring::isUtf8('a'),
Hoa\Ustring\Ustring::isUtf8(Hoa\Ustring\Ustring::transcode('a', 'UTF-8', 'UTF-16'))
);
/**
* Will output:
* bool(true)
* bool(false)
*/</code></pre>
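<p>The same kind of check can be reproduced with the <code>mbstring</code>
extension alone. The sketch below is an analogy in plain PHP, not what
<code>isUtf8</code> or <code>transcode</code> do internally. It uses
<code>é</code> and UTF-16LE because the resulting bytes (<code>0xE9
0x00</code>) are not a valid UTF-8 sequence:</p>
<pre><code class="language-php">$utf8  = 'é';                                             // 2 bytes in UTF-8
$utf16 = mb_convert_encoding($utf8, 'UTF-16LE', 'UTF-8'); // 0xE9 0x00

var_dump(
    mb_check_encoding($utf8, 'UTF-8'),
    mb_check_encoding($utf16, 'UTF-8'),
    mb_convert_encoding($utf16, 'UTF-8', 'UTF-16LE') === $utf8
);

/**
 * Will output:
 *     bool(true)
 *     bool(false)
 *     bool(true)
 */</code></pre>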
<p>We can <strong>split</strong> the string into several sub-strings by using
the <code>Hoa\Ustring\Ustring::split</code> method. As first argument, we have
a regular expression (in the <a href="http://pcre.org/">PCRE</a> syntax), then
an integer representing the maximum number of elements to return and finally a
combination of constants. These constants are the same as the ones of
<a href="http://php.net/preg_split"><code>preg_split</code></a>.</p>
<p>By default, the second argument is set to -1, which means infinity, and the
last argument is set to <code>PREG_SPLIT_NO_EMPTY</code>. Thus, if we would
like to get all the words of a string, we will write:</p>
<pre><code class="language-php">print_r($title->split('#\b|\s#'));
/**
* Will output:
* Array
* (
* [0] => Un
* [1] => ete
* [2] => brulant
* [3] => sur
* [4] => la
* [5] => cote
* )
*/</code></pre>
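<p>Since the arguments mirror those of <code>preg_split</code>, the effect of
the second one (the maximum number of elements) can be observed with the
native function directly. This is a sketch in plain PHP, with the
<code>u</code> modifier added by hand to get Unicode support:</p>
<pre><code class="language-php">$title = 'Un été brûlant sur la côte';

// Pattern, limit and flags, in the order described above.
print_r(preg_split('#\s+#u', $title, 2, PREG_SPLIT_NO_EMPTY));

/**
 * Will output:
 *     Array
 *     (
 *         [0] => Un
 *         [1] => été brûlant sur la côte
 *     )
 */</code></pre>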
<p>If we would like to <strong>iterate</strong> over all the
<strong>characters</strong>, it is recommended to rely on the
<code>IteratorAggregate</code> interface, that is, on the
<code>Hoa\Ustring\Ustring::getIterator</code> method. Let's see with the Arabic
example:</p>
<pre><code class="language-php">foreach ($arabic as $letter) {
echo $letter, "\n";
}
/**
* Will output:
* أ
* ح
* ب
* ك
*/</code></pre>
<p>We notice that the iteration follows the text direction: the first element
of the iteration is the first letter of the string, starting from its
beginning.</p>
<p>Of course, if we would like to get an array of characters, we can use the
<a href="http://php.net/iterator_to_array"><code>iterator_to_array</code></a>
PHP function:</p>
<pre><code class="language-php">print_r(iterator_to_array($arabic));
/**
* Will output:
* Array
* (
* [0] => أ
* [1] => ح
* [2] => ب
* [3] => ك
* )
*/</code></pre>
<h3 id="Comparison_and_search" for="main-toc">Comparison and search</h3>
<p>Strings can also be <strong>compared</strong> thanks to the
<code>Hoa\Ustring\Ustring::compare</code> method:</p>
<pre><code class="language-php">$string = new Hoa\Ustring\Ustring('abc');
var_dump(
$string->compare('wxyz')
);
/**
* Will output:
* int(-1)
*/</code></pre>
<p>This method returns -1 if the initial string comes before (in the
alphabetical order), 0 if it is identical and 1 if it comes after. If we
would like to use all the power of the underlying mechanism, we can call the
<code>Hoa\Ustring\Ustring::getCollator</code> static method (if the
<a href="http://php.net/Collator"><code>Collator</code></a> class exists, else
<code>Hoa\Ustring\Ustring::compare</code> will use a simple byte-by-byte
comparison without taking care of the other parameters). Thus, if we would
like to sort an array of strings, we will write:</p>
<pre><code class="language-php">$strings = array('c', 'Σ', 'd', 'x', 'α', 'a');
Hoa\Ustring\Ustring::getCollator()->sort($strings);
print_r($strings);
/**
* Could output:
* Array
* (
* [0] => a
* [1] => c
* [2] => d
* [3] => x
* [4] => α
* [5] => Σ
* )
*/</code></pre>
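<p>The two comparison paths mentioned above (a collator when the
<code>Collator</code> class exists, a byte comparison otherwise) can be
sketched like this in plain PHP. <code>compareSketch</code> and its
<code>en_US</code> default locale are hypothetical and only serve the
illustration:</p>
<pre><code class="language-php">// Hypothetical helper: locale-aware comparison when Collator is available,
// plain byte comparison otherwise. Always returns -1, 0 or 1.
function compareSketch(string $left, string $right, string $locale = 'en_US'): int
{
    if (class_exists('Collator')) {
        $result = (new Collator($locale))->compare($left, $right);

        if (false !== $result) {
            return $result;
        }
    }

    // strcmp only guarantees the sign of its result, so clamp it.
    return max(-1, min(1, strcmp($left, $right)));
}

var_dump(
    compareSketch('abc', 'wxyz'),
    compareSketch('abc', 'abc'),
    compareSketch('wxyz', 'abc')
);

/**
 * Will output:
 *     int(-1)
 *     int(0)
 *     int(1)
 */</code></pre>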
<p>Comparison between two strings depends on the <strong>locale</strong>,
that is, on the localization settings of the system, like the language, the
country, the region etc. We can use the
<a href="@hack:chapter=Locale"><code>Hoa\Locale</code> library</a> to modify
these data, but it is not a dependency of <code>Hoa\Ustring</code>.</p>
<p>We can also know if a string <strong>matches</strong> a certain pattern,
still expressed with a regular expression. To achieve that, we will use the
<code>Hoa\Ustring\Ustring::match</code> method. This method relies on the
<a href="http://php.net/preg_match"><code>preg_match</code></a> and
<a href="http://php.net/preg_match_all"><code>preg_match_all</code></a> PHP
functions, but by modifying the pattern's options to ensure the Unicode
support. We have the following parameters: the pattern, a variable passed by
reference to collect the matches, flags, an offset and finally a boolean
indicating whether the search is global or not (respectively if we have to use
<code>preg_match_all</code> or <code>preg_match</code>). By default, the
search is not global.</p>
<p>Thus, we will check that our French example contains <code>aime</code> with
a direct object complement:</p>
<pre><code class="language-php">$french->match('#(?:(?&lt;direct_object>\w)[\'\b])aime#', $matches);
var_dump($matches['direct_object']);
/**
* Will output:
* string(1) "t"
*/</code></pre>
<p>This method returns <code>false</code> if an error is raised (for example
if the pattern is not correct), 0 if no match has been found, and the number
of matches otherwise.</p>
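<p>These return values are those of the underlying PHP functions. As a sketch
in plain PHP (with the <code>u</code> modifier written by hand, and a
simplified pattern of our own), the non-global and global searches behave as
follows:</p>
<pre><code class="language-php">$french = 'Je t\'aime';

$found  = preg_match('#(\w)\'aime#u', $french, $first); // first match only
$none   = preg_match('#\d#u', $french);                 // no digit at all
$global = preg_match_all('#\w+#u', $french);            // every word

var_dump($found, $first[1], $none, $global);

/**
 * Will output:
 *     int(1)
 *     string(1) "t"
 *     int(0)
 *     int(3)
 */</code></pre>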
<p>Similarly, we can <strong>search</strong> and <strong>replace</strong>
sub-strings by other sub-strings based on a pattern, still expressed with a
regular expression. To achieve that, we will use the
<code>Hoa\Ustring\Ustring::replace</code> method. This method uses the
<a href="http://php.net/preg_replace"><code>preg_replace</code></a> and
<a href="http://php.net/preg_replace_callback"><code>preg_replace_callback</code></a>
PHP functions, but still by modifying the pattern's options to ensure the
Unicode support. As first argument, we find one or more patterns, as second
argument, one or more replacements and as last argument the limit of
replacements to apply. If the replacement is a callable, then the
<code>preg_replace_callback</code> function will be used.</p>
<p>Thus, we will modify our French example to be more polite:</p>
<pre><code class="language-php">$french->replace('#(?:\w[\'\b])(?&lt;verb>aime)#', function ($matches) {
return 'vous ' . $matches['verb'];
});
echo $french;
/**
* Will output:
* Je vous aime
*/</code></pre>
<p>The <code>Hoa\Ustring\Ustring</code> class provides constants which are
aliases of existing PHP constants and ensure a better readability of the
code:</p>
<ul>
<li><code>Hoa\Ustring\Ustring::WITHOUT_EMPTY</code>, alias of
<code>PREG_SPLIT_NO_EMPTY</code>,</li>
<li><code>Hoa\Ustring\Ustring::WITH_DELIMITERS</code>, alias of
<code>PREG_SPLIT_DELIM_CAPTURE</code>,</li>
<li><code>Hoa\Ustring\Ustring::WITH_OFFSET</code>, alias of
<code>PREG_OFFSET_CAPTURE</code> and
<code>PREG_SPLIT_OFFSET_CAPTURE</code>,</li>
<li><code>Hoa\Ustring\Ustring::GROUP_BY_PATTERN</code>, alias of
<code>PREG_PATTERN_ORDER</code>,</li>
<li><code>Hoa\Ustring\Ustring::GROUP_BY_TUPLE</code>, alias of
<code>PREG_SET_ORDER</code>.</li>
</ul>
<p>Because they are strict aliases, we can write:</p>
<pre><code class="language-php">$string = new Hoa\Ustring\Ustring('abc1 defg2 hikl3 xyz4');
$string->match(
'#(\w+)(\d)#',
$matches,
Hoa\Ustring\Ustring::WITH_OFFSET
| Hoa\Ustring\Ustring::GROUP_BY_TUPLE,
0,
true
);</code></pre>
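<p>To see what this combination of flags produces, the sketch below calls
<code>preg_match_all</code> directly with the native constants. It only
illustrates the shape of the result: one entry per match, each group being a
pair made of the captured text and its byte offset:</p>
<pre><code class="language-php">preg_match_all(
    '#(\w+)(\d)#u',
    'abc1 defg2 hikl3 xyz4',
    $matches,
    PREG_OFFSET_CAPTURE | PREG_SET_ORDER
);

// Number of matches, then the second match encoded compactly.
echo count($matches), "\n";
echo json_encode($matches[1]), "\n";

/**
 * Will output:
 *     4
 *     [["defg2",5],["defg",5],["2",9]]
 */</code></pre>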
<h3 id="Characters" for="main-toc">Characters</h3>
<p>The <code>Hoa\Ustring\Ustring</code> class offers static methods working on
a single Unicode character. We have already mentioned the
<code>getCharDirection</code> method which gives the
<strong>direction</strong> of a character. We also have the
<code>getCharWidth</code> method which counts the <strong>number of
columns</strong> necessary to print a single character. Thus:</p>
<pre><code class="language-php">var_dump(
Hoa\Ustring\Ustring::getCharWidth(Hoa\Ustring\Ustring::fromCode(0x7f)),
Hoa\Ustring\Ustring::getCharWidth('a'),
Hoa\Ustring\Ustring::getCharWidth('㽠')
);
/**
* Will output:
* int(-1)
* int(1)
* int(2)
*/</code></pre>
<p>This method returns -1 or 0 if the character is not
<strong>printable</strong> (for instance, if this is a control character, like
<code>0x7f</code> which corresponds to <code>DELETE</code>), 1 or more if this
is a character that can be printed. In our example, <code>㽠</code> requires
2 columns to be printed.</p>
<p>To get more semantics, we have the
<code>Hoa\Ustring\Ustring::isCharPrintable</code> method which tells
whether a character is printable or not.</p>
<p>If we would like to count the number of columns necessary for a whole
string, we have to use the <code>Hoa\Ustring\Ustring::getWidth</code> method.
Thus:</p>
<pre><code class="language-php">var_dump(
$french->getWidth(),
$arabic->getWidth(),
$japanese->getWidth()
);
/**
* Will output:
* int(9)
* int(4)
* int(18)
*/</code></pre>
<p>Try this in your terminal with a <strong>monospaced</strong> font. You will
observe that Japanese requires 18 columns to be printed. This measure is very
useful if we would like to know the length of a string to position it
efficiently.</p>
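<p>For comparison, PHP's own <code>mb_strwidth</code> function gives a similar
measure for these three strings. This is only an analogy: it is a different
implementation from <code>getWidth</code>, and the two may disagree on some
characters (control characters in particular):</p>
<pre><code class="language-php">var_dump(
    mb_strwidth('Je t\'aime', 'UTF-8'),
    mb_strwidth('أحبك', 'UTF-8'),
    mb_strwidth('私はあなたを愛して', 'UTF-8')
);

/**
 * Will output:
 *     int(9)
 *     int(4)
 *     int(18)
 */</code></pre>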
<p>The <code>getCharWidth</code> method differs from <code>getWidth</code>
in that it takes control characters into account. This method is intended to
be used, for example, with terminals (please, see the
<a href="@hack:chapter=Console"><code>Hoa\Console</code> library</a>).</p>
<p>Finally, if this time we are not interested in Unicode characters but
rather in <strong>machine</strong> characters <code>char</code> (being
1 byte), we have an extra operation. The
<code>Hoa\Ustring\Ustring::getBytesLength</code> method will count the
<strong>length</strong> of the string in bytes:</p>
<pre><code class="language-php">var_dump(
$arabic->getBytesLength(),
$japanese->getBytesLength()
);
/**
* Will output:
* int(8)
* int(27)
*/</code></pre>
<p>If we compare these results with the ones of the
<code>Hoa\Ustring\Ustring::count</code> method, we understand that the Arabic
characters are encoded with 2 bytes whereas Japanese characters are encoded
with 3 bytes. We can also get a specific byte thanks to the
<code>Hoa\Ustring\Ustring::getByteAt</code> method. Once again, the index is
not bounded.</p>
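<p>The ratio between bytes and characters can be checked with two native
functions, <code>strlen</code> (bytes) and <code>mb_strlen</code>
(characters). A minimal sketch in plain PHP, assuming the
<code>mbstring</code> extension is loaded:</p>
<pre><code class="language-php">foreach (['أحبك', '私はあなたを愛して'] as $string) {
    $characters = mb_strlen($string, 'UTF-8');
    $bytes      = strlen($string);

    echo $characters, ' characters, ', $bytes, ' bytes, ',
         $bytes / $characters, ' bytes per character', "\n";
}

/**
 * Will output:
 *     4 characters, 8 bytes, 2 bytes per character
 *     9 characters, 27 bytes, 3 bytes per character
 */</code></pre>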
<h3 id="Code-point" for="main-toc">Code-point</h3>
<p>Each character is represented by an integer, called a
<strong>code-point</strong>. To get the code-point of a character, we can
use the <code>Hoa\Ustring\Ustring::toCode</code> static method, and to get a
character based on its code-point, we can use the
<code>Hoa\Ustring\Ustring::fromCode</code> static method. We also have the
<code>Hoa\Ustring\Ustring::toBinaryCode</code> method which returns the binary
representation of a character. Let's take an example:</p>
<pre><code class="language-php">var_dump(
Hoa\Ustring\Ustring::toCode('Σ'),
Hoa\Ustring\Ustring::toBinaryCode('Σ'),
Hoa\Ustring\Ustring::fromCode(0x3a3)
);
/**
* Will output:
* int(931)
* string(16) "1100111010100011"
* string(2) "Σ"
*/</code></pre>
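<p>The relation between a character, its code-point and its bytes can be
cross-checked with native functions. The sketch below uses
<code>mb_ord</code> and <code>mb_chr</code> (PHP 7.2 or later) and builds the
binary form by hand. It is an illustration in plain PHP, not the library's
code:</p>
<pre><code class="language-php">$sigma = 'Σ';

// Concatenate the 8-bit representation of every byte of the character.
$binary = '';

foreach (str_split($sigma) as $byte) {
    $binary .= sprintf('%08b', ord($byte));
}

var_dump(
    mb_ord($sigma, 'UTF-8'),
    dechex(mb_ord($sigma, 'UTF-8')),
    $binary,
    mb_chr(0x3a3, 'UTF-8')
);

/**
 * Will output:
 *     int(931)
 *     string(3) "3a3"
 *     string(16) "1100111010100011"
 *     string(2) "Σ"
 */</code></pre>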
<h2 id="Search_algorithms" for="main-toc">Search algorithms</h2>
<p>The <code>Hoa\Ustring</code> library provides sophisticated
<strong>search</strong> algorithms on strings through the
<code>Hoa\Ustring\Search</code> class.</p>
<p>We will study the <code>Hoa\Ustring\Search::approximated</code> algorithm
which searches a sub-string in a string up to <strong><em>k</em>
differences</strong> (a difference is an addition, a deletion or a
modification). Let's take the classical example of a DNA representation: We
will search all the sub-strings approximating <code>GATAA</code> with
1 difference (maximum) in <code>CAGATAAGAGAA</code>. So, we will write:</p>
<pre><code class="language-php">$x = 'GATAA';
$y = 'CAGATAAGAGAA';
$k = 1;
$search = Hoa\Ustring\Search::approximated($y, $x, $k);
$n = count($search);
echo 'Try to match ', $x, ' in ', $y, ' with at most ', $k, ' difference(s):', "\n";
echo $n, ' match(es) found:', "\n";
foreach ($search as $position) {
echo ' • ', substr($y, $position['i'], $position['l']), "\n";
}
/**
* Will output:
* Try to match GATAA in CAGATAAGAGAA with at most 1 difference(s):
* 4 match(es) found:
* • AGATA
* • GATAA
* • ATAAG
* • GAGAA
*/</code></pre>
<p>This method returns an array of arrays. Each sub-array represents a result
and contains three indexes: <code>i</code> for the position of the first
character (byte) of the result, <code>j</code> for the position of the last
character and <code>l</code> for the length of the result (simply
<code>j</code> - <code>i</code>). Thus, we can compute the results by using
our initial string (here <code class="language-php">$y</code>) and its
indexes.</p>
<p>With our example, we have four results. The first is <code>AGATA</code>,
being <code>GATA<em>A</em></code> with one moved character, and
<code>AGATA</code> exists in <code>C<em>AGATA</em>AGAGAA</code>. The second
result is <code>GATAA</code>, our sub-string, which well and truly exists in
<code>CA<em>GATAA</em>GAGAA</code>. The third result is <code>ATAAG</code>,
being <code><em>G</em>ATAA</code> with one moved character, and
<code>ATAAG</code> exists in <code>CAG<em>ATAAG</em>AGAA</code>. Finally, the
last result is <code>GAGAA</code>, being <code>GA<em>T</em>AA</code> with one
modified character, and <code>GAGAA</code> exists in
<code>CAGATAA<em>GAGAA</em></code>.</p>
<p>Another example, more concrete this time. We will consider the
<code>--testIt --foobar --testThat --testAt</code> string (which represents
possible options of a command line), and we will search <code>--testot</code>,
an option that should have been given by the user. This option does not exist
as it is. We will then use our search algorithm with at most 1 difference.
Let's see:</p>
<pre><code class="language-php">$x = 'testot';
$y = '--testIt --foobar --testThat --testAt';
$k = 1;
$search = Hoa\Ustring\Search::approximated($y, $x, $k);
$n = count($search);
// …
/**
* Will output:
* Try to match testot in --testIt --foobar --testThat --testAt with at most 1 difference(s)
* 2 match(es) found:
* • testIt
* • testAt
*/</code></pre>
<p>The <code>testIt</code> and <code>testAt</code> results are true options,
so we can suggest them to the user. This is a mechanism used by
<code>Hoa\Console</code> to suggest corrections to the user in case of a
typo.</p>
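<p>A simpler variant of the same idea can be sketched with PHP's built-in
<code>levenshtein</code> function. Note that this is a different technique
from <code>Hoa\Ustring\Search::approximated</code>: it compares whole option
names instead of searching inside one string, and it works on bytes. The
option list below is split by hand for the illustration:</p>
<pre><code class="language-php">$typed   = 'testot';
$options = ['testIt', 'foobar', 'testThat', 'testAt'];

// Keep the options whose edit distance to the typed one is at most $k.
$k           = 1;
$suggestions = array_values(array_filter(
    $options,
    function ($option) use ($typed, $k) {
        return $k >= levenshtein($typed, $option);
    }
));

print_r($suggestions);

/**
 * Will output:
 *     Array
 *     (
 *         [0] => testIt
 *         [1] => testAt
 *     )
 */</code></pre>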
<h2 id="Conclusion" for="main-toc">Conclusion</h2>
<p>The <code>Hoa\Ustring</code> library provides facilities to manipulate
strings encoded with the Unicode format, but also to perform sophisticated
searches on strings.</p>
</yield>
</overlay>