object KoreanTokenizer
Provides Korean tokenization.
- Chunk (어절): a unit delimited by whitespace (사랑하는사람을)
- Word (단어): a single constituent of a sentence (사랑하는, 사람을)
- Token (토큰): a unit similar to a morpheme, though not necessarily grammatically exact (사랑, 하는, 사람, 을)
Whenever the behavior of KoreanParser changes, the initial cache has to be updated by running tools.CreateInitialCache.
Linear Supertypes
- AnyRef
- Any
Type Members
- case class KoreanToken (text: String, pos: KoreanPos, offset: Int, length: Int, stem: Option[String] = None, unknown: Boolean = false) extends Product with Serializable
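The following is a minimal sketch of how the KoreanToken fields fit together, using the token example from the definitions above. KoreanPos is replaced by a small stand-in, the part-of-speech tags and the stem are hand-picked for illustration, and offset and length are assumed to count characters in the source text; none of this is taken from the library's actual output.

```scala
// Hypothetical stand-in for the library's KoreanPos; only the tags used below are modeled.
sealed trait KoreanPos
case object Noun extends KoreanPos
case object Verb extends KoreanPos
case object Josa extends KoreanPos

// Mirrors the documented KoreanToken signature, including its two default arguments.
case class KoreanToken(text: String, pos: KoreanPos, offset: Int, length: Int,
                       stem: Option[String] = None, unknown: Boolean = false)

object KoreanTokenExample {
  val source = "사랑하는사람을"

  // Hand-written tokens; tags and stem are illustrative assumptions.
  val tokens: Seq[KoreanToken] = Seq(
    KoreanToken("사랑", Noun, offset = 0, length = 2),
    KoreanToken("하는", Verb, offset = 2, length = 2, stem = Some("하다")),
    KoreanToken("사람", Noun, offset = 4, length = 2),
    KoreanToken("을", Josa, offset = 6, length = 1)
  )

  // Recover a token's surface form from the source via offset and length
  // (assumes both are measured in characters).
  def slice(t: KoreanToken): String =
    source.substring(t.offset, t.offset + t.length)

  def main(args: Array[String]): Unit =
    tokens.foreach(t => println(s"${t.text} ${t.pos} [${t.offset}, ${t.offset + t.length}) ${slice(t)}"))
}
```

Under these assumptions every token's text equals the slice of the source it points to, and tokens built without stem or unknown fall back to None and false.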
Value Members
- final def !=(arg0: Any): Boolean
  Definition Classes: AnyRef → Any
- final def ##(): Int
  Definition Classes: AnyRef → Any
- final def ==(arg0: Any): Boolean
  Definition Classes: AnyRef → Any
- final def asInstanceOf[T0]: T0
  Definition Classes: Any
- def clone(): AnyRef
  Attributes: protected[java.lang]
  Definition Classes: AnyRef
  Annotations: @throws( ... )
- final def eq(arg0: AnyRef): Boolean
  Definition Classes: AnyRef
- def equals(arg0: Any): Boolean
  Definition Classes: AnyRef → Any
- def finalize(): Unit
  Attributes: protected[java.lang]
  Definition Classes: AnyRef
  Annotations: @throws( classOf[java.lang.Throwable] )
- final def getClass(): Class[_]
  Definition Classes: AnyRef → Any
- def hashCode(): Int
  Definition Classes: AnyRef → Any
- final def isInstanceOf[T0]: Boolean
  Definition Classes: Any
- final def ne(arg0: AnyRef): Boolean
  Definition Classes: AnyRef
- final def notify(): Unit
  Definition Classes: AnyRef
- final def notifyAll(): Unit
  Definition Classes: AnyRef
- final def synchronized[T0](arg0: ⇒ T0): T0
  Definition Classes: AnyRef
- def toString(): String
  Definition Classes: AnyRef → Any
- def tokenize(text: CharSequence, profile: TokenizerProfile = TokenizerProfile.defaultProfile): Seq[KoreanToken]
  Parse Korean text into a sequence of KoreanTokens with custom parameters.
  text: Input Korean chunk
  profile: Tokenizer profile to apply; defaults to TokenizerProfile.defaultProfile
  returns: Sequence of KoreanTokens
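As a rough illustration of consuming the Seq[KoreanToken] that tokenize returns, the sketch below post-processes a hand-written result. TokenStandIn mirrors the documented KoreanToken fields but simplifies pos to a String, and the sample tokens, tags, and stem are assumptions made for the example rather than real tokenizer output.

```scala
// Hypothetical stand-in for KoreanToken; pos is simplified to a String for brevity.
case class TokenStandIn(text: String, pos: String, offset: Int, length: Int,
                        stem: Option[String] = None, unknown: Boolean = false)

object TokenizeResultExample {
  // Hand-written stand-in for a tokenize result; not produced by the library.
  val result: Seq[TokenStandIn] = Seq(
    TokenStandIn("사랑", "Noun", 0, 2),
    TokenStandIn("하는", "Verb", 2, 2, stem = Some("하다")),
    TokenStandIn("사람", "Noun", 4, 2),
    TokenStandIn("을", "Josa", 6, 1),
    TokenStandIn("뷁", "Noun", 7, 1, unknown = true)
  )

  // Drop tokens flagged as unknown and prefer the stem over the surface text when one is set.
  def normalized(tokens: Seq[TokenStandIn]): Seq[String] =
    tokens.filterNot(_.unknown).map(t => t.stem.getOrElse(t.text))

  // Keep only the surface text of tokens with a given part-of-speech tag.
  def withPos(tokens: Seq[TokenStandIn], pos: String): Seq[String] =
    tokens.filter(_.pos == pos).map(_.text)

  def main(args: Array[String]): Unit = {
    println(normalized(result))
    println(withPos(result, "Noun"))
  }
}
```

The same filtering and mapping should carry over to real KoreanToken values, since only the documented fields text, pos, stem, and unknown are used.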
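- def tokenizeTopN(text: CharSequence, topN: Int = 1, profile: TokenizerProfile = TokenizerProfile.defaultProfile): Seq[Seq[Seq[KoreanToken]]]
  Parse Korean text into candidate sequences of KoreanTokens with custom parameters.
  text: Input Korean chunk
  topN: Number of top candidates to return; defaults to 1
  profile: Tokenizer profile to apply; defaults to TokenizerProfile.defaultProfile
  returns: Nested sequences of KoreanTokens (Seq[Seq[Seq[KoreanToken]]]); the nesting presumably runs from chunks, to candidate parses per chunk, to the tokens of each parse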
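The triple nesting in the tokenizeTopN return type can be easier to read with a small sketch. It assumes, without confirmation from this page, that the outer level holds chunks, the middle level holds up to topN candidate parses per chunk with the best one first, and the inner level holds the tokens of one parse. Plain strings stand in for KoreanToken, and the candidates are hand-written.

```scala
object TopNShapeExample {
  // Stand-in for KoreanToken so the example stays self-contained.
  type Token = String

  // Hypothetical topN = 2 result for two chunks:
  // outer = chunks, middle = candidate parses (best first), inner = tokens.
  val candidates: Seq[Seq[Seq[Token]]] = Seq(
    Seq(Seq("사랑", "하는"), Seq("사랑하는")),
    Seq(Seq("사람", "을"), Seq("사람을"))
  )

  // Flatten to a single token sequence by taking the first candidate of every chunk.
  // Chunks without candidates contribute nothing.
  def best(result: Seq[Seq[Seq[Token]]]): Seq[Token] =
    result.flatMap(_.headOption.getOrElse(Seq.empty))

  // Select the n-th ranked candidate (0-based) of each chunk, where one exists.
  def nth(result: Seq[Seq[Seq[Token]]], n: Int): Seq[Seq[Token]] =
    result.flatMap(_.lift(n))

  def main(args: Array[String]): Unit = {
    println(best(candidates))   // List(사랑, 하는, 사람, 을)
    println(nth(candidates, 1)) // List(List(사랑하는), List(사람을))
  }
}
```

If the assumption about ordering holds, taking the first candidate per chunk yields a flat token sequence of the same shape as the one tokenize returns.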
- final def wait(): Unit
  Definition Classes: AnyRef
  Annotations: @throws( ... )
- final def wait(arg0: Long, arg1: Int): Unit
  Definition Classes: AnyRef
  Annotations: @throws( ... )
- final def wait(arg0: Long): Unit
  Definition Classes: AnyRef
  Annotations: @throws( ... )