Unified Diff: icu4c/source/test/testdata/rbbitst.txt

Issue 323400043: ticket:13274 Break Iterator Test, cleanups & Java porting. (Closed)
Base URL: svn+ssh://source.icu-project.org/repos/icu/trunk/
Patch Set: Created 6 years, 7 months ago
Index: icu4c/source/test/testdata/rbbitst.txt
===================================================================
--- icu4c/source/test/testdata/rbbitst.txt (revision 40309)
+++ icu4c/source/test/testdata/rbbitst.txt (working copy)
@@ -14,7 +14,9 @@
# <sent> any following data is for sentence break testing
# <line> any following data is for line break testing
# <char> any following data is for char break testing
-# <locale local_name> Switch to the named locale at the next occurence of <word>, <sent>, etc.
+# <rules> rules ... </rules> following data is tested against these rules.
+# Applies until a following occurrence of <word>, <sent>, etc. or another <rules>
+# <locale locale_name> Switch to the named locale at the next occurrence of <word>, <sent>, etc.
# <data> ... </data> test data. May span multiple lines.
# <> Break position, status == 0
# • Break position, status == 0 (Bullet, \u2022)
@@ -37,9 +39,18 @@
# Temp debugging tests
<locale en>
<word>
-<data><0>1•2•3•4•</data>
-# <data><0>ク<400>ライアン<400>トサーバー<400></data>
+<data><0>コンピューター<400>は<400>、<0>本質<400>的<400>に<400>は<400>数字<400>しか<400>扱う<400>こと<400>が<400>でき<400>ま<400>せん<400>。<0>\
+コンピューター<400>は<400>、<0>文字<400>や<400>記号<400>など<400>の<400>それぞれに<400>番号<400>を<400>割り振る<400>こと<400>によって<400>扱える<400>\
+よう<400>にし<400>ます<400>。<0>ユニ<400>コード<400>が<400>出来る<400>まで<400>は<400>、<0>これらの<400>番号<400>を<400>割り振る<400>仕組み<400>が<400>\
+何<400>百<400>種類<400>も<400>存在<400>しま<400>した<400>。<0>どの<400>一つ<400>を<400>とっても<400>、<0>十分<400>な<400>文字<400>を<400>含<400>\
+んで<400>は<400>いま<400>せん<400>で<400>した<400>。<0>例えば<400>、<0>欧州<400>連合<400>一つ<400>を<400>見<400>て<400>も<400>、<0>その<400>\
+すべて<400>の<400>言語<400>を<400>カバー<400>する<400>ため<400>に<400>は<400>、<0>いくつか<400>の<400>異なる<400>符号<400>化<400>の<400>仕組み<400>\
+が<400>必要<400>で<400>した<400>。<0>英語<400>の<400>よう<400>な<400>一つ<400>の<400>言語<400>に<400>限<400>って<400>も<400>、<0>一つ<400>だけ<400>\
+の<400>符号<400>化<400>の<400>仕組み<400>では<400>、<0>一般<400>的<400>に<400>使<400>われる<400>すべて<400>の<400>文字<400>、<0>句読点<400>、<0>\
+。<0></data>
+#<data><0>コンピューター<400>は<400>、<0>本質<400>的<400>に<400>は<400>数字<400>しか<400>扱う<400>こと<400>が<400>でき<400>ま<400>せん<400>。<0>\
+
## FILTERED BREAK TESTS
# (William Bradford, public domain. http://catalog.hathitrust.org/Record/008651224 ) - edited.
@@ -1308,3 +1319,48 @@
<data>•\U0001F468\u200D\u2695\uFE0F•\U0001F468\u200D\u2695•\U0001F468\U0001F3FD\u200D\u2695\uFE0F•\U0001F468\U0001F3FD\u200D\u2695\u0020•</data>
# woman astronaut, woman astronaut / fitz4
<data>•\U0001F469\u200D\U0001F680•\U0001F469\U0001F3FD\u200D\U0001F680\u0020•</data>
+
+
+####################################################################################
+#
+# Test rule status values
+#
+####################################################################################
+<rules> $Letters = [:L:];
+ $Numbers = [:N:];
+ $Letters+{1};
+ $Numbers+{2};
+ Help\ me\!{4};
+ [^$Letters $Numbers];
+ !.*;
+</rules>
+<data>•abc<1>123<2>.•.•abc<1> •Help<1> •me<1> •Help me!<4></data>
+
+# Test option to prohibit unquoted literals.
+
+<rules>
+!!forward;
+ Hello\ World;
+!!reverse;
+ .*;
+</rules>
+<data>•Hello World•</data>
+
+<badrules>
+!!quoted_literals_only;
+!!forward;
+ Hello\ World;
+!!reverse;
+ .*;
+</badrules>
+
+<rules>
+#TODO: uncomment this line when quoted_literals_only is implemented.
+#!!quoted_literals_only;
+!!forward;
+ 'Hello World';
+!!reverse;
+ .*;
+</rules>
+<data>•Hello World•</data>
+
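
The <data> markup described in the file header (• or <> for a break with status 0, <N> for a break with rule status N) can be illustrated with a small parser. The following is a minimal sketch, not the ICU test harness: the function name parse_break_data is hypothetical, positions are counted in Python code points (the real harness may count differently), and only \uXXXX and \UXXXXXXXX escapes are decoded. Line continuations and other escapes are not handled.

```python
import re

# One token per match: a bullet, a <> or <digits> break marker,
# a \uXXXX or \UXXXXXXXX escape, or any single literal character.
_TOKEN = re.compile(
    r"(?P<bullet>•)"
    r"|<(?P<status>\d*)>"
    r"|\\u(?P<u4>[0-9A-Fa-f]{4})"
    r"|\\U(?P<u8>[0-9A-Fa-f]{8})"
    r"|(?P<char>.)",
    re.DOTALL,
)


def parse_break_data(markup):
    """Split test markup into plain text and expected breaks.

    Returns (text, breaks), where breaks maps a position in text
    (counted in code points, an assumption of this sketch) to the
    expected rule status at that boundary.
    """
    text = []
    breaks = {}
    for m in _TOKEN.finditer(markup):
        if m.group("bullet") is not None:
            # Bullet: break position with status 0.
            breaks[len(text)] = 0
        elif m.group("status") is not None:
            # <> means status 0; <N> means status N.
            breaks[len(text)] = int(m.group("status") or 0)
        elif m.group("u4") is not None:
            text.append(chr(int(m.group("u4"), 16)))
        elif m.group("u8") is not None:
            text.append(chr(int(m.group("u8"), 16)))
        else:
            text.append(m.group("char"))
    return "".join(text), breaks


if __name__ == "__main__":
    # Opening part of the rule-status test data line in the patch above.
    print(parse_break_data("•abc<1>123<2>.•"))
```

Returning the breaks as a position-to-status mapping keeps the comparison against a break iterator's output simple: iterate the boundaries it reports and look each one up.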
