
A

add(int, IWord) - Method in class org.lionsoul.jcseg.tokenizer.core.ADictionary
directly add an IWord item to the dictionary
add(int, String, int, int, String) - Method in class org.lionsoul.jcseg.tokenizer.core.ADictionary
add a new word to the dictionary with its statistical frequency
add(int, String, int, int) - Method in class org.lionsoul.jcseg.tokenizer.core.ADictionary
add a new word to the dictionary
add(int, String, int) - Method in class org.lionsoul.jcseg.tokenizer.core.ADictionary
add a new word to the dictionary
add(int, String, int, String) - Method in class org.lionsoul.jcseg.tokenizer.core.ADictionary
add a new word to the dictionary
add(int, IWord) - Method in class org.lionsoul.jcseg.tokenizer.Dictionary
 
add(int, String, int, int, String) - Method in class org.lionsoul.jcseg.tokenizer.Dictionary
 
add(int, String, int) - Method in class org.lionsoul.jcseg.tokenizer.Dictionary
 
add(int, String, int, int) - Method in class org.lionsoul.jcseg.tokenizer.Dictionary
 
add(int, String, int, String) - Method in class org.lionsoul.jcseg.tokenizer.Dictionary
 
add(T) - Method in class org.lionsoul.jcseg.util.IHashQueue
append an item at the tail
add(int) - Method in class org.lionsoul.jcseg.util.IntArrayList
Append a new Integer to the end.
addPartSpeech(String) - Method in interface org.lionsoul.jcseg.tokenizer.core.IWord
add a new part of speech to the word.
addPartSpeech(String) - Method in class org.lionsoul.jcseg.tokenizer.Word
 
addSyn(String) - Method in interface org.lionsoul.jcseg.tokenizer.core.IWord
add a new synonym to the word.
addSyn(String) - Method in class org.lionsoul.jcseg.tokenizer.Word
 
ADictionary - Class in org.lionsoul.jcseg.tokenizer.core
Dictionary abstract super class
ADictionary(JcsegTaskConfig, Boolean) - Constructor for class org.lionsoul.jcseg.tokenizer.core.ADictionary
initialize the ADictionary
AL_TODO_FILE - Static variable in class org.lionsoul.jcseg.tokenizer.core.ADictionary
the default auto load task file name
append(String) - Method in class org.lionsoul.jcseg.util.IStringBuffer
append a string to the buffer
append(char[], int, int) - Method in class org.lionsoul.jcseg.util.IStringBuffer
append parts of the chars to the buffer
append(char[], int) - Method in class org.lionsoul.jcseg.util.IStringBuffer
append the rest of the chars to the buffer
append(char[]) - Method in class org.lionsoul.jcseg.util.IStringBuffer
append some chars to the buffer
append(char) - Method in class org.lionsoul.jcseg.util.IStringBuffer
append a char to the buffer
append(boolean) - Method in class org.lionsoul.jcseg.util.IStringBuffer
append a boolean value
append(short) - Method in class org.lionsoul.jcseg.util.IStringBuffer
append a short value
append(int) - Method in class org.lionsoul.jcseg.util.IStringBuffer
append an int value
append(long) - Method in class org.lionsoul.jcseg.util.IStringBuffer
append a long value
append(float) - Method in class org.lionsoul.jcseg.util.IStringBuffer
append a float value
append(double) - Method in class org.lionsoul.jcseg.util.IStringBuffer
append a double value
APPEND_CJK_ENTITY - Variable in class org.lionsoul.jcseg.tokenizer.core.JcsegTaskConfig
whether to do the entity recognition
APPEND_CJK_PINYIN - Variable in class org.lionsoul.jcseg.tokenizer.core.JcsegTaskConfig
append the Pinyin to the split IWord
APPEND_CJK_SYN - Variable in class org.lionsoul.jcseg.tokenizer.core.JcsegTaskConfig
append the synonyms to the split IWord.
APPEND_PART_OF_SPEECH - Variable in class org.lionsoul.jcseg.tokenizer.core.JcsegTaskConfig
append the part of speech.
appendCJKPinyin() - Method in class org.lionsoul.jcseg.tokenizer.core.JcsegTaskConfig
 
appendCJKSyn() - Method in class org.lionsoul.jcseg.tokenizer.core.JcsegTaskConfig
 
appendLatinSyn(IWord) - Method in class org.lionsoul.jcseg.tokenizer.ASegment
check and append the synonyms of the specified word, including CJK and basic Latin words. All the synonyms share the same position, part of speech and word type as the primitive word
appendWordFeatures(IWord) - Method in class org.lionsoul.jcseg.tokenizer.ASegment
check and append the pinyin and the synonyms words of the specified word
ASegment - Class in org.lionsoul.jcseg.tokenizer
abstract segmentation super class: 1.
ASegment(Reader, JcsegTaskConfig, ADictionary) - Constructor for class org.lionsoul.jcseg.tokenizer.ASegment
initialize the segment
ASegment(JcsegTaskConfig, ADictionary) - Constructor for class org.lionsoul.jcseg.tokenizer.ASegment
 
autoFilter - Variable in class org.lionsoul.jcseg.extractor.impl.TextRankKeywordsExtractor
auto filter the words with low score
autoLoad() - Method in class org.lionsoul.jcseg.tokenizer.core.JcsegTaskConfig
initialize the value of its options by auto searching the jcseg.properties file:
AutoLoadFile - Class in org.lionsoul.jcseg.tokenizer.core
AutoLoad file to describe the autoload configuration files
AutoLoadFile(String) - Constructor for class org.lionsoul.jcseg.tokenizer.core.AutoLoadFile
 
autoMinLength - Variable in class org.lionsoul.jcseg.extractor.impl.TextRankKeyphraseExtractor
auto append the words with a length over the specified value as a phrase
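
The IStringBuffer.append overloads listed in this section (String, char[] with a start and length, char[] with a start only, char, int) suggest a growable char buffer. The following is a minimal sketch of that idea, not the Jcseg source: the class name SketchCharBuffer, its fields, and its growth policy are all hypothetical, and only the overload shapes are taken from the index.

```java
import java.util.Arrays;

/** Hypothetical growable char buffer; NOT the Jcseg IStringBuffer source. */
final class SketchCharBuffer {
    private char[] buff;   // backing storage (assumed layout)
    private int count;     // number of chars currently in use

    SketchCharBuffer(int capacity) {
        buff = new char[Math.max(1, capacity)];
    }

    /** Grow the backing array so that at least minCapacity chars fit. */
    private void ensureCapacity(int minCapacity) {
        if (minCapacity > buff.length) {
            buff = Arrays.copyOf(buff, Math.max(minCapacity, buff.length * 2));
        }
    }

    /** Append length chars of chars, starting at index start. */
    SketchCharBuffer append(char[] chars, int start, int length) {
        if (start < 0 || length < 0 || start + length > chars.length) {
            throw new IndexOutOfBoundsException();
        }
        ensureCapacity(count + length);
        System.arraycopy(chars, start, buff, count, length);
        count += length;
        return this;
    }

    /** Append the rest of chars, from index start to the end. */
    SketchCharBuffer append(char[] chars, int start) {
        return append(chars, start, chars.length - start);
    }

    /** Append every char of the string. */
    SketchCharBuffer append(String str) {
        return append(str.toCharArray(), 0, str.length());
    }

    /** Append a single char. */
    SketchCharBuffer append(char c) {
        ensureCapacity(count + 1);
        buff[count++] = c;
        return this;
    }

    /** Append the decimal text of an int value. */
    SketchCharBuffer append(int value) {
        return append(String.valueOf(value));
    }

    int length() {
        return count;
    }

    @Override
    public String toString() {
        return new String(buff, 0, count);
    }
}
```

Returning `this` from each overload lets calls be chained, for example `new SketchCharBuffer(16).append("jc").append('s')`. Whether the real class chains in the same way is not stated in the index.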

B

B - Static variable in class org.lionsoul.jcseg.extractor.impl.TextRankSummaryExtractor
 
behindLatin - Variable in class org.lionsoul.jcseg.tokenizer.ASegment
global Latin word that follows the CJK word; added on 2016/11/22 for a better mixed-word implementation
bucketSort(int[], int) - Static method in class org.lionsoul.jcseg.util.Sort
bucket sort algorithm
bucketSort(Integer[], int) - Static method in class org.lionsoul.jcseg.util.Sort
bucket sort algorithm
buffer() - Method in class org.lionsoul.jcseg.util.IStringBuffer
return the chars of the buffer
ByteCharCounter - Class in org.lionsoul.jcseg.util
counter class for all basic printable Latin chars, including the English punctuation and the letters
ByteCharCounter() - Constructor for class org.lionsoul.jcseg.util.ByteCharCounter
 
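
The Sort.bucketSort(int[], int) entry above names the algorithm but not its contract. As a rough illustration, here is a sketch that uses one bucket per distinct value (the counting variant of bucket sort). It assumes the second parameter is the largest value present in the array, which the index does not confirm, and the class name BucketSortSketch is hypothetical.

```java
/** Hypothetical illustration of bucket sort; NOT the Jcseg Sort class. */
final class BucketSortSketch {
    /**
     * Sort non-negative ints in place.
     * Assumption: max is an upper bound for every value in arr.
     */
    static void bucketSort(int[] arr, int max) {
        // One bucket per possible value, each holding an occurrence count.
        int[] buckets = new int[max + 1];
        for (int v : arr) {
            if (v < 0 || v > max) {
                throw new IllegalArgumentException("value out of range: " + v);
            }
            buckets[v]++;
        }
        // Walk the buckets in ascending order and write the values back.
        int k = 0;
        for (int v = 0; v <= max; v++) {
            for (int c = buckets[v]; c > 0; c--) {
                arr[k++] = v;
            }
        }
    }
}
```

This variant runs in time proportional to the array length plus `max`, so it only pays off when the value range is small relative to the input.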

C

charAt(int) - Method in class org.lionsoul.jcseg.util.IStringBuffer
get the char at a specified position in the buffer
CHECK_CE_MASk - Static variable in interface org.lionsoul.jcseg.tokenizer.core.ISegment
 
CHECK_CF_MASK - Static variable in interface org.lionsoul.jcseg.tokenizer.core.ISegment
 
CHECK_EC_MASK - Static variable in interface org.lionsoul.jcseg.tokenizer.core.ISegment
whether to check the English-Chinese mixed suffix, for the new implementation of the mixed word recognition; added on 2016/11/22
Chunk - Class in org.lionsoul.jcseg.tokenizer
chunk concept for the MMSeg Chinese word segmentation algorithm; implements the IChunk interface
Chunk(IWord[]) - Constructor for class org.lionsoul.jcseg.tokenizer.Chunk
 
CJK_CHAR - Static variable in interface org.lionsoul.jcseg.tokenizer.core.ILexicon
CJK single word
CJK_UNIT - Static variable in interface org.lionsoul.jcseg.tokenizer.core.ILexicon
Chinese single units
CJK_WORD - Static variable in interface org.lionsoul.jcseg.tokenizer.core.ILexicon
Chinese, Japanese, Korean words
CJKIndexOf(String, int) - Static method in class org.lionsoul.jcseg.util.StringUtil
get the index of the first CJK char of the specified string
CJKIndexOf(String) - Static method in class org.lionsoul.jcseg.util.StringUtil
 
clear() - Method in class org.lionsoul.jcseg.util.IntArrayList
 
clear() - Method in class org.lionsoul.jcseg.util.IStringBuffer
clear the buffer by resetting the count to 0
CLEAR_STOPWORD - Variable in class org.lionsoul.jcseg.tokenizer.core.JcsegTaskConfig
clear away the stop word.
clearStopwords() - Method in class org.lionsoul.jcseg.tokenizer.core.JcsegTaskConfig
 
clone() - Method in interface org.lionsoul.jcseg.tokenizer.core.IWord
make clone available
clone() - Method in class org.lionsoul.jcseg.tokenizer.core.JcsegTaskConfig
rewrite the clone method
clone() - Method in class org.lionsoul.jcseg.tokenizer.Word
Interface to clone the current object
CN_DNAME_1 - Static variable in interface org.lionsoul.jcseg.tokenizer.core.ILexicon
first word of Chinese double name
CN_DNAME_2 - Static variable in interface org.lionsoul.jcseg.tokenizer.core.ILexicon
second word of Chinese double name
CN_LNAME - Static variable in interface org.lionsoul.jcseg.tokenizer.core.ILexicon
Chinese last name
CN_LNAME_ADORN - Static variable in interface org.lionsoul.jcseg.tokenizer.core.ILexicon
the adorn (修饰, modifier) char placed before the last name, as in "老陈", "小陈"
CN_SNAME - Static variable in interface org.lionsoul.jcseg.tokenizer.core.ILexicon
Chinese single name
CNFRA_TO_ARABIC - Variable in class org.lionsoul.jcseg.tokenizer.core.JcsegTaskConfig
Chinese fraction to Arabic fraction.
cnFractionToArabic() - Method in class org.lionsoul.jcseg.tokenizer.core.JcsegTaskConfig
 
CNNUM_TO_ARABIC - Variable in class org.lionsoul.jcseg.tokenizer.core.JcsegTaskConfig
Chinese numeric to Arabic.
cnNumericToArabic(String, boolean) - Static method in class org.lionsoul.jcseg.util.NumericUtil
a static method to turn the Chinese numeric to Arabic numbers
cnNumToArabic() - Method in class org.lionsoul.jcseg.tokenizer.core.JcsegTaskConfig
 
compareTo(TextRankSummaryExtractor.Document) - Method in class org.lionsoul.jcseg.extractor.impl.TextRankSummaryExtractor.Document
override the compareTo method: compare documents by their relevance score
COMPLEX_MODE - Static variable in class org.lionsoul.jcseg.tokenizer.core.JcsegTaskConfig
 
ComplexSeg - Class in org.lionsoul.jcseg.tokenizer
Jcseg complex segmentation implementation, extended from the ASegment class; this needs the filter work of the four MMSeg rules:
ComplexSeg(JcsegTaskConfig, ADictionary) - Constructor for class org.lionsoul.jcseg.tokenizer.ComplexSeg
 
ComplexSeg(Reader, JcsegTaskConfig, ADictionary) - Constructor for class org.lionsoul.jcseg.tokenizer.ComplexSeg
 
config - Variable in class org.lionsoul.jcseg.tokenizer.ASegment
 
config - Variable in class org.lionsoul.jcseg.tokenizer.core.ADictionary
 
contains(T) - Method in class org.lionsoul.jcseg.util.IHashQueue
check whether the specified T already exists in the queue
createDateTimePool() - Static method in class org.lionsoul.jcseg.util.TimeUtil
create and return a date-time pool
createDefaultDictionary(JcsegTaskConfig, boolean, boolean) - Static method in class org.lionsoul.jcseg.tokenizer.core.DictionaryFactory
create a default ADictionary instance: 1.
createDefaultDictionary(JcsegTaskConfig) - Static method in class org.lionsoul.jcseg.tokenizer.core.DictionaryFactory
create the ADictionary according to the JcsegTaskConfig; checks and loads the lexicon by default
createDefaultDictionary(JcsegTaskConfig, boolean) - Static method in class org.lionsoul.jcseg.tokenizer.core.DictionaryFactory
create the ADictionary according to the JcsegTaskConfig
createDictionary(Class<? extends ADictionary>, Class<?>[], Object[]) - Static method in class org.lionsoul.jcseg.tokenizer.core.DictionaryFactory
create a new ADictionary instance
createJcseg(int, Object...) - Static method in class org.lionsoul.jcseg.tokenizer.core.SegmentFactory
create the specified mode Jcseg instance
createSegment(Class<? extends ISegment>, Class<?>[], Object[]) - Static method in class org.lionsoul.jcseg.tokenizer.core.SegmentFactory
load the ISegment class with the given path
createSingletonDictionary(JcsegTaskConfig) - Static method in class org.lionsoul.jcseg.tokenizer.core.DictionaryFactory
create a singleton ADictionary object according to the JcsegTaskConfig; checks and loads the lexicon by default
createSingletonDictionary(JcsegTaskConfig, boolean) - Static method in class org.lionsoul.jcseg.tokenizer.core.DictionaryFactory
create a singleton ADictionary object according to the JcsegTaskConfig
ctrlMask - Variable in class org.lionsoul.jcseg.tokenizer.ASegment
segmentation runtime function control mask
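
The ctrlMask variable and the CHECK_*_MASK constants in this section point to a bit-flag scheme for switching runtime checks on and off. A minimal sketch of that pattern follows. The class name, the flag names, and the bit positions are all hypothetical, since the index lists neither the real values nor how ASegment reads them.

```java
/** Hypothetical bit-flag control mask; NOT the Jcseg ASegment/ISegment source. */
final class CtrlMaskSketch {
    // Assumed bit positions, chosen only for illustration.
    static final int SKETCH_CHECK_CE = 1 << 0;
    static final int SKETCH_CHECK_CF = 1 << 1;
    static final int SKETCH_CHECK_EC = 1 << 2;

    private int ctrlMask = 0;

    /** Turn the given flag (or combination of flags) on. */
    void enable(int flag) {
        ctrlMask |= flag;
    }

    /** Turn the given flag (or combination of flags) off. */
    void disable(int flag) {
        ctrlMask &= ~flag;
    }

    /** True when at least one bit of flag is set in the mask. */
    boolean isEnabled(int flag) {
        return (ctrlMask & flag) != 0;
    }
}
```

Packing the switches into one int keeps each check down to a single bitwise AND, which matters in a tokenizer's inner loop.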

D

D - Static variable in class org.lionsoul.jcseg.extractor.impl.TextRankKeyphraseExtractor
 
D - Static variable in class org.lionsoul.jcseg.extractor.impl.TextRankKeywordsExtractor
 
D - Static variable in class org.lionsoul.jcseg.extractor.impl.TextRankSummaryExtractor
 
data - Variable in class org.lionsoul.jcseg.util.IHashQueue.Entry
 
data - Variable in class org.lionsoul.jcseg.util.IIntFIFO.Entry
 
data - Variable in class org.lionsoul.jcseg.util.IIntQueue.Entry
 
DATETIME_A - Static variable in class org.lionsoul.jcseg.util.TimeUtil
 
DATETIME_AV - Static variable in class org.lionsoul.jcseg.util.TimeUtil
 
DATETIME_D - Static variable in class org.lionsoul.jcseg.util.TimeUtil
 
DATETIME_DV - Static variable in class org.lionsoul.jcseg.util.TimeUtil
 
DATETIME_H - Static variable in class org.lionsoul.jcseg.util.TimeUtil
 
DATETIME_HV - Static variable in class org.lionsoul.jcseg.util.TimeUtil
 
DATETIME_I - Static variable in class org.lionsoul.jcseg.util.TimeUtil
 
DATETIME_IV - Static variable in class org.lionsoul.jcseg.util.TimeUtil
 
DATETIME_M - Static variable in class org.lionsoul.jcseg.util.TimeUtil
 
DATETIME_MV - Static variable in class org.lionsoul.jcseg.util.TimeUtil
 
DATETIME_NONE - Static variable in class org.lionsoul.jcseg.util.TimeUtil
date-time part index constants; we consider a date-time as the following seven parts:
+------+-------+-----+---------------+------+--------+--------+
|  0   |   1   |  2  |       3       |  4   |   5    |   6    |
+------+-------+-----+---------------+------+--------+--------+
| year | month | day | timing method | hour | minute | second |
+------+-------+-----+---------------+------+--------+--------+
and the numeric value before every part.
DATETIME_S - Static variable in class org.lionsoul.jcseg.util.TimeUtil
 
DATETIME_SV - Static variable in class org.lionsoul.jcseg.util.TimeUtil
 
DATETIME_Y - Static variable in class org.lionsoul.jcseg.util.TimeUtil
 
DATETIME_YV - Static variable in class org.lionsoul.jcseg.util.TimeUtil
 
decrease(char) - Method in class org.lionsoul.jcseg.util.ByteCharCounter
 
decrease(char, int) - Method in class org.lionsoul.jcseg.util.ByteCharCounter
 
deleteCharAt(int) - Method in class org.lionsoul.jcseg.util.IStringBuffer
delete the char at the specified position
DELIMITER_MODE - Static variable in class org.lionsoul.jcseg.tokenizer.core.JcsegTaskConfig
 
DelimiterSeg - Class in org.lionsoul.jcseg.tokenizer
delimiter segmentation algorithm implementation, extended from the common segment interface ISegment
DelimiterSeg(JcsegTaskConfig, ADictionary) - Constructor for class org.lionsoul.jcseg.tokenizer.DelimiterSeg
method to create a new ISegment
DelimiterSeg(Reader, JcsegTaskConfig, ADictionary) - Constructor for class org.lionsoul.jcseg.tokenizer.DelimiterSeg
method to create a new ISegment
deQueue() - Method in class org.lionsoul.jcseg.util.IIntFIFO
remove the first item from the queue
deQueue() - Method in class org.lionsoul.jcseg.util.IIntQueue
remove the node from the head; make sure the size is larger than 0 by calling size() before you invoke this method, or you will just get -1
DETECT_MODE - Static variable in class org.lionsoul.jcseg.tokenizer.core.JcsegTaskConfig
 
DetectSeg - Class in org.lionsoul.jcseg.tokenizer
detect segmentation mode: returns only words found in the loaded dictionary; when a word is matched it is returned, otherwise the search continues with the next word in the dictionary
DetectSeg(JcsegTaskConfig, ADictionary) - Constructor for class org.lionsoul.jcseg.tokenizer.DetectSeg
method to create the new ISegment
DetectSeg(Reader, JcsegTaskConfig, ADictionary) - Constructor for class org.lionsoul.jcseg.tokenizer.DetectSeg
method to create a new ISegment
dic - Variable in class org.lionsoul.jcseg.tokenizer.ASegment
the dictionary and task configuration instance
Dictionary - Class in org.lionsoul.jcseg.tokenizer
Dictionary class
Dictionary(JcsegTaskConfig, Boolean) - Constructor for class org.lionsoul.jcseg.tokenizer.Dictionary
 
DictionaryFactory - Class in org.lionsoul.jcseg.tokenizer.core
dictionary factory to create Dictionary instances; the path of a class that extends the ADictionary class must be given first
Document(int, Sentence, List<IWord>, double) - Constructor for class org.lionsoul.jcseg.extractor.impl.TextRankSummaryExtractor.Document
constructor
DOMAIN_SUFFIX - Static variable in interface org.lionsoul.jcseg.tokenizer.core.ILexicon
domain name suffix dictionary for the URL recognition
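
The IIntQueue.deQueue() entry in this section describes a sentinel contract: the head node is removed, and an empty queue yields -1, so callers should check size() first. The sketch below illustrates that contract with a linked queue. The class name IntQueueSketch and the enQueue method are hypothetical (enQueue does not appear in this part of the index), and the node layout is only guessed from the Entry.data field listed above.

```java
/** Hypothetical linked int queue; NOT the Jcseg IIntQueue source. */
final class IntQueueSketch {
    /** Singly linked node holding one int. */
    private static final class Entry {
        final int data;
        Entry next;

        Entry(int data) {
            this.data = data;
        }
    }

    private Entry head;
    private Entry tail;
    private int size;

    /** Append a value at the tail (assumed counterpart of deQueue). */
    void enQueue(int value) {
        Entry e = new Entry(value);
        if (tail == null) {
            head = e;
        } else {
            tail.next = e;
        }
        tail = e;
        size++;
    }

    /** Remove and return the head value, or -1 when the queue is empty. */
    int deQueue() {
        if (size == 0) {
            return -1;
        }
        int value = head.data;
        head = head.next;
        if (head == null) {
            tail = null;
        }
        size--;
        return value;
    }

    int size() {
        return size;
    }
}
```

Because -1 is also a legal int, a caller that stores negative values cannot tell an empty queue from a real element by the return value alone, which is why the description asks for a size() check before the call.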

E

E_ANGLE - Static variable in class org.lionsoul.jcseg.tokenizer.core.Entity
 
E_ANGLE_360 - Static variable in class org.lionsoul.jcseg.tokenizer.core.Entity
 
E_ANGLE_90 - Static variable in class org.lionsoul.jcseg.tokenizer.core.Entity
 
E_ANGLE_DU - Static variable in class org.lionsoul.jcseg.tokenizer.core.Entity
 
E_ANGLE_FEN - Static variable in class org.lionsoul.jcseg.tokenizer.core.Entity
 
E_ANGLE_GON - Static variable in class org.lionsoul.jcseg.tokenizer.core.Entity
 
E_ANGLE_MRAD - Static variable in class org.lionsoul.jcseg.tokenizer.core.Entity
 
E_ANGLE_RAD - Static variable in class org.lionsoul.jcseg.tokenizer.core.Entity
 
E_AREA - Static variable in class org.lionsoul.jcseg.tokenizer.core.Entity
 
E_AREA_ACRE - Static variable in class org.lionsoul.jcseg.tokenizer.core.Entity
 
E_AREA_ARE - Static variable in class org.lionsoul.jcseg.tokenizer.core.Entity
 
E_AREA_CM2 - Static variable in class org.lionsoul.jcseg.tokenizer.core.Entity
 
E_AREA_DM2 - Static variable in class org.lionsoul.jcseg.tokenizer.core.Entity
 
E_AREA_FT2 - Static variable in class org.lionsoul.jcseg.tokenizer.core.Entity
 
E_AREA_HA - Static variable in class org.lionsoul.jcseg.tokenizer.core.Entity
 
E_AREA_IN2 - Static variable in class org.lionsoul.jcseg.tokenizer.core.Entity
 
E_AREA_KM2 - Static variable in class org.lionsoul.jcseg.tokenizer.core.Entity
 
E_AREA_M2 - Static variable in class org.lionsoul.jcseg.tokenizer.core.Entity
 
E_AREA_MM2 - Static variable in class org.lionsoul.jcseg.tokenizer.core.Entity
 
E_AREA_MU - Static variable in class org.lionsoul.jcseg.tokenizer.core.Entity
 
E_AREA_NM2 - Static variable in class org.lionsoul.jcseg.tokenizer.core.Entity
 
E_AREA_QING - Static variable in class org.lionsoul.jcseg.tokenizer.core.Entity
 
E_AREA_SQ_FT - Static variable in class org.lionsoul.jcseg.tokenizer.core.Entity
 
E_AREA_SQ_IN - Static variable in class org.lionsoul.jcseg.tokenizer.core.Entity
 
E_AREA_SQ_MI - Static variable in class org.lionsoul.jcseg.tokenizer.core.Entity
 
E_AREA_SQ_RD - Static variable in class org.lionsoul.jcseg.tokenizer.core.Entity
 
E_AREA_SQ_YD - Static variable in class org.lionsoul.jcseg.tokenizer.core.Entity
 
E_AREA_UM2 - Static variable in class org.lionsoul.jcseg.tokenizer.core.Entity
 
E_DATETIME - Static variable in class org.lionsoul.jcseg.tokenizer.core.Entity
 
E_DATETIME_A - Static variable in class org.lionsoul.jcseg.tokenizer.core.Entity
 
E_DATETIME_AH - Static variable in class org.lionsoul.jcseg.tokenizer.core.Entity
 
E_DATETIME_AHI - Static variable in class org.lionsoul.jcseg.tokenizer.core.Entity
 
E_DATETIME_AHIS - Static variable in class org.lionsoul.jcseg.tokenizer.core.Entity
 
E_DATETIME_D - Static variable in class org.lionsoul.jcseg.tokenizer.core.Entity
 
E_DATETIME_H - Static variable in class org.lionsoul.jcseg.tokenizer.core.Entity
 
E_DATETIME_HI - Static variable in class org.lionsoul.jcseg.tokenizer.core.Entity
 
E_DATETIME_HIS - Static variable in class org.lionsoul.jcseg.tokenizer.core.Entity
 
E_DATETIME_I - Static variable in class org.lionsoul.jcseg.tokenizer.core.Entity
 
E_DATETIME_M - Static variable in class org.lionsoul.jcseg.tokenizer.core.Entity
 
E_DATETIME_S - Static variable in class org.lionsoul.jcseg.tokenizer.core.Entity
 
E_DATETIME_Y - Static variable in class org.lionsoul.jcseg.tokenizer.core.Entity
 
E_DATETIME_YM - Static variable in class org.lionsoul.jcseg.tokenizer.core.Entity
 
E_DATETIME_YMD - Static variable in class org.lionsoul.jcseg.tokenizer.core.Entity
 
E_DATETIME_YMDHIS - Static variable in class org.lionsoul.jcseg.tokenizer.core.Entity
 
E_DATETIME_YMDZHIS - Static variable in class org.lionsoul.jcseg.tokenizer.core.Entity
 
E_DISTANCE - Static variable in class org.lionsoul.jcseg.tokenizer.core.Entity
 
E_DISTANCE_KM - Static variable in class org.lionsoul.jcseg.tokenizer.core.Entity
 
E_DISTANCE_LI - Static variable in class org.lionsoul.jcseg.tokenizer.core.Entity
 
E_DISTANCE_LY - Static variable in class org.lionsoul.jcseg.tokenizer.core.Entity
 
E_DISTANCE_MI - Static variable in class org.lionsoul.jcseg.tokenizer.core.Entity
 
E_DISTANCE_NMI - Static variable in class org.lionsoul.jcseg.tokenizer.core.Entity
 
E_EMAIL - Static variable in class org.lionsoul.jcseg.tokenizer.core.Entity
 
E_FORCE - Static variable in class org.lionsoul.jcseg.tokenizer.core.Entity
 
E_FORCE_GF - Static variable in class org.lionsoul.jcseg.tokenizer.core.Entity
 
E_FORCE_KGF - Static variable in class org.lionsoul.jcseg.tokenizer.core.Entity
 
E_FORCE_KIP - Static variable in class org.lionsoul.jcseg.tokenizer.core.Entity
 
E_FORCE_KN - Static variable in class org.lionsoul.jcseg.tokenizer.core.Entity
 
E_FORCE_LBF - Static variable in class org.lionsoul.jcseg.tokenizer.core.Entity
 
E_FORCE_N - Static variable in class org.lionsoul.jcseg.tokenizer.core.Entity
 
E_FORCE_TF - Static variable in class org.lionsoul.jcseg.tokenizer.core.Entity
 
E_IP - Static variable in class org.lionsoul.jcseg.tokenizer.core.Entity
 
E_LENGTH - Static variable in class org.lionsoul.jcseg.tokenizer.core.Entity
 
E_LENGTH_CFT - Static variable in class org.lionsoul.jcseg.tokenizer.core.Entity
 
E_LENGTH_CIN - Static variable in class org.lionsoul.jcseg.tokenizer.core.Entity
 
E_LENGTH_CM - Static variable in class org.lionsoul.jcseg.tokenizer.core.Entity
 
E_LENGTH_DM - Static variable in class org.lionsoul.jcseg.tokenizer.core.Entity
 
E_LENGTH_FM - Static variable in class org.lionsoul.jcseg.tokenizer.core.Entity
 
E_LENGTH_FT - Static variable in class org.lionsoul.jcseg.tokenizer.core.Entity
 
E_LENGTH_FUR - Static variable in class org.lionsoul.jcseg.tokenizer.core.Entity
 
E_LENGTH_IN - Static variable in class org.lionsoul.jcseg.tokenizer.core.Entity
 
E_LENGTH_KM - Static variable in class org.lionsoul.jcseg.tokenizer.core.Entity
 
E_LENGTH_M - Static variable in class org.lionsoul.jcseg.tokenizer.core.Entity
 
E_LENGTH_MM - Static variable in class org.lionsoul.jcseg.tokenizer.core.Entity
 
E_LENGTH_NM - Static variable in class org.lionsoul.jcseg.tokenizer.core.Entity
 
E_LENGTH_TFT - Static variable in class org.lionsoul.jcseg.tokenizer.core.Entity
 
E_LENGTH_UM - Static variable in class org.lionsoul.jcseg.tokenizer.core.Entity
 
E_LENGTH_YD - Static variable in class org.lionsoul.jcseg.tokenizer.core.Entity
 
E_MOBILE_NUMBER - Static variable in class org.lionsoul.jcseg.tokenizer.core.Entity
 
E_MONEY - Static variable in class org.lionsoul.jcseg.tokenizer.core.Entity
 
E_MONEY_AUD - Static variable in class org.lionsoul.jcseg.tokenizer.core.Entity
 
E_MONEY_BUK - Static variable in class org.lionsoul.jcseg.tokenizer.core.Entity
 
E_MONEY_CAD - Static variable in class org.lionsoul.jcseg.tokenizer.core.Entity
 
E_MONEY_CNY - Static variable in class org.lionsoul.jcseg.tokenizer.core.Entity
 
E_MONEY_CSK - Static variable in class org.lionsoul.jcseg.tokenizer.core.Entity
 
E_MONEY_CUP - Static variable in class org.lionsoul.jcseg.tokenizer.core.Entity
 
E_MONEY_DEM - Static variable in class org.lionsoul.jcseg.tokenizer.core.Entity
 
E_MONEY_DKK - Static variable in class org.lionsoul.jcseg.tokenizer.core.Entity
 
E_MONEY_EGP - Static variable in class org.lionsoul.jcseg.tokenizer.core.Entity
 
E_MONEY_EUR - Static variable in class org.lionsoul.jcseg.tokenizer.core.Entity
 
E_MONEY_FRF - Static variable in class org.lionsoul.jcseg.tokenizer.core.Entity
 
E_MONEY_GBP - Static variable in class org.lionsoul.jcseg.tokenizer.core.Entity
 
E_MONEY_HKD - Static variable in class org.lionsoul.jcseg.tokenizer.core.Entity
 
E_MONEY_INR - Static variable in class org.lionsoul.jcseg.tokenizer.core.Entity
 
E_MONEY_ISK - Static variable in class org.lionsoul.jcseg.tokenizer.core.Entity
 
E_MONEY_ITL - Static variable in class org.lionsoul.jcseg.tokenizer.core.Entity
 
E_MONEY_JPY - Static variable in class org.lionsoul.jcseg.tokenizer.core.Entity
 
E_MONEY_KRW - Static variable in class org.lionsoul.jcseg.tokenizer.core.Entity
 
E_MONEY_KWP - Static variable in class org.lionsoul.jcseg.tokenizer.core.Entity
 
E_MONEY_MOP - Static variable in class org.lionsoul.jcseg.tokenizer.core.Entity
 
E_MONEY_MXP - Static variable in class org.lionsoul.jcseg.tokenizer.core.Entity
 
E_MONEY_MYR - Static variable in class org.lionsoul.jcseg.tokenizer.core.Entity
 
E_MONEY_NOK - Static variable in class org.lionsoul.jcseg.tokenizer.core.Entity
 
E_MONEY_NZD - Static variable in class org.lionsoul.jcseg.tokenizer.core.Entity
 
E_MONEY_PHP - Static variable in class org.lionsoul.jcseg.tokenizer.core.Entity
 
E_MONEY_SEK - Static variable in class org.lionsoul.jcseg.tokenizer.core.Entity
 
E_MONEY_SGD - Static variable in class org.lionsoul.jcseg.tokenizer.core.Entity
 
E_MONEY_SKK - Static variable in class org.lionsoul.jcseg.tokenizer.core.Entity
 
E_MONEY_SUR - Static variable in class org.lionsoul.jcseg.tokenizer.core.Entity
 
E_MONEY_THB - Static variable in class org.lionsoul.jcseg.tokenizer.core.Entity
 
E_MONEY_TWD - Static variable in class org.lionsoul.jcseg.tokenizer.core.Entity
 
E_MONEY_USD - Static variable in class org.lionsoul.jcseg.tokenizer.core.Entity
 
E_MONEY_VND - Static variable in class org.lionsoul.jcseg.tokenizer.core.Entity
 
E_MONEY_ZAR - Static variable in class org.lionsoul.jcseg.tokenizer.core.Entity
 
E_NAME - Static variable in class org.lionsoul.jcseg.tokenizer.core.Entity
 
E_NAME_CN - Static variable in class org.lionsoul.jcseg.tokenizer.core.Entity
 
E_NAME_FOREIGN - Static variable in class org.lionsoul.jcseg.tokenizer.core.Entity
 
E_NAME_NICKNAME - Static variable in class org.lionsoul.jcseg.tokenizer.core.Entity
 
E_NUMERIC - Static variable in class org.lionsoul.jcseg.tokenizer.core.Entity
 
E_NUMERIC_ARABIC - Static variable in class org.lionsoul.jcseg.tokenizer.core.Entity
 
E_NUMERIC_CN - Static variable in class org.lionsoul.jcseg.tokenizer.core.Entity
 
E_NUMERIC_CN_FRACTION - Static variable in class org.lionsoul.jcseg.tokenizer.core.Entity
 
E_NUMERIC_DECIMAL - Static variable in class org.lionsoul.jcseg.tokenizer.core.Entity
 
E_NUMERIC_FRACTION - Static variable in class org.lionsoul.jcseg.tokenizer.core.Entity
 
E_NUMERIC_PERCENTAGE - Static variable in class org.lionsoul.jcseg.tokenizer.core.Entity
 
E_PLACE - Static variable in class org.lionsoul.jcseg.tokenizer.core.Entity
 
E_PLACE_CITY - Static variable in class org.lionsoul.jcseg.tokenizer.core.Entity
 
E_PLACE_CONTINENT - Static variable in class org.lionsoul.jcseg.tokenizer.core.Entity
 
E_PLACE_DISTRICT - Static variable in class org.lionsoul.jcseg.tokenizer.core.Entity
 
E_PLACE_NATION - Static variable in class org.lionsoul.jcseg.tokenizer.core.Entity
 
E_PLACE_PROVINCE - Static variable in class org.lionsoul.jcseg.tokenizer.core.Entity
 
E_PLACE_TOWNSHIP - Static variable in class org.lionsoul.jcseg.tokenizer.core.Entity
 
E_PLACE_VIEWPOINT - Static variable in class org.lionsoul.jcseg.tokenizer.core.Entity
 
E_PLACE_VILLAGE - Static variable in class org.lionsoul.jcseg.tokenizer.core.Entity
 
E_PRESSURE - Static variable in class org.lionsoul.jcseg.tokenizer.core.Entity
 
E_PRESSURE_ATM - Static variable in class org.lionsoul.jcseg.tokenizer.core.Entity
 
E_PRESSURE_BAR - Static variable in class org.lionsoul.jcseg.tokenizer.core.Entity
 
E_PRESSURE_HG_IN - Static variable in class org.lionsoul.jcseg.tokenizer.core.Entity
 
E_PRESSURE_HG_MM - Static variable in class org.lionsoul.jcseg.tokenizer.core.Entity
 
E_PRESSURE_HPA - Static variable in class org.lionsoul.jcseg.tokenizer.core.Entity
 
E_PRESSURE_KPA - Static variable in class org.lionsoul.jcseg.tokenizer.core.Entity
 
E_PRESSURE_MBAR - Static variable in class org.lionsoul.jcseg.tokenizer.core.Entity
 
E_PRESSURE_PA - Static variable in class org.lionsoul.jcseg.tokenizer.core.Entity
 
E_PRESSURE_WG_MM - Static variable in class org.lionsoul.jcseg.tokenizer.core.Entity
 
E_QUALITY - Static variable in class org.lionsoul.jcseg.tokenizer.core.Entity
 
E_QUALITY_BP - Static variable in class org.lionsoul.jcseg.tokenizer.core.Entity
 
E_QUALITY_CT - Static variable in class org.lionsoul.jcseg.tokenizer.core.Entity
 
E_QUALITY_DAN - Static variable in class org.lionsoul.jcseg.tokenizer.core.Entity
 
E_QUALITY_DAN_UK - Static variable in class org.lionsoul.jcseg.tokenizer.core.Entity
 
E_QUALITY_DAN_US - Static variable in class org.lionsoul.jcseg.tokenizer.core.Entity
 
E_QUALITY_DR - Static variable in class org.lionsoul.jcseg.tokenizer.core.Entity
 
E_QUALITY_G - Static variable in class org.lionsoul.jcseg.tokenizer.core.Entity
 
E_QUALITY_GR - Static variable in class org.lionsoul.jcseg.tokenizer.core.Entity
 
E_QUALITY_JIN - Static variable in class org.lionsoul.jcseg.tokenizer.core.Entity
 
E_QUALITY_KG - Static variable in class org.lionsoul.jcseg.tokenizer.core.Entity
 
E_QUALITY_LB - Static variable in class org.lionsoul.jcseg.tokenizer.core.Entity
 
E_QUALITY_LIANG - Static variable in class org.lionsoul.jcseg.tokenizer.core.Entity
 
E_QUALITY_LT - Static variable in class org.lionsoul.jcseg.tokenizer.core.Entity
 
E_QUALITY_MG - Static variable in class org.lionsoul.jcseg.tokenizer.core.Entity
 
E_QUALITY_OZ - Static variable in class org.lionsoul.jcseg.tokenizer.core.Entity
 
E_QUALITY_Q - Static variable in class org.lionsoul.jcseg.tokenizer.core.Entity
 
E_QUALITY_QIAN - Static variable in class org.lionsoul.jcseg.tokenizer.core.Entity
 
E_QUALITY_ST - Static variable in class org.lionsoul.jcseg.tokenizer.core.Entity
 
E_QUALITY_T - Static variable in class org.lionsoul.jcseg.tokenizer.core.Entity
 
E_STORAGE - Static variable in class org.lionsoul.jcseg.tokenizer.core.Entity
 
E_STORAGE_B - Static variable in class org.lionsoul.jcseg.tokenizer.core.Entity
 
E_STORAGE_BIT - Static variable in class org.lionsoul.jcseg.tokenizer.core.Entity
 
E_STORAGE_EB - Static variable in class org.lionsoul.jcseg.tokenizer.core.Entity
 
E_STORAGE_GB - Static variable in class org.lionsoul.jcseg.tokenizer.core.Entity
 
E_STORAGE_KB - Static variable in class org.lionsoul.jcseg.tokenizer.core.Entity
 
E_STORAGE_MB - Static variable in class org.lionsoul.jcseg.tokenizer.core.Entity
 
E_STORAGE_PB - Static variable in class org.lionsoul.jcseg.tokenizer.core.Entity
 
E_STORAGE_TB - Static variable in class org.lionsoul.jcseg.tokenizer.core.Entity
 
E_TEMPERATURE - Static variable in class org.lionsoul.jcseg.tokenizer.core.Entity
 
E_TEMPERATURE_C - Static variable in class org.lionsoul.jcseg.tokenizer.core.Entity
 
E_TEMPERATURE_F - Static variable in class org.lionsoul.jcseg.tokenizer.core.Entity
 
E_TEMPERATURE_K - Static variable in class org.lionsoul.jcseg.tokenizer.core.Entity
 
E_TEMPERATURE_R - Static variable in class org.lionsoul.jcseg.tokenizer.core.Entity
 
E_TEMPERATURE_RE - Static variable in class org.lionsoul.jcseg.tokenizer.core.Entity
 
E_TIME - Static variable in class org.lionsoul.jcseg.tokenizer.core.Entity
 
E_TIME_D - Static variable in class org.lionsoul.jcseg.tokenizer.core.Entity
 
E_TIME_H - Static variable in class org.lionsoul.jcseg.tokenizer.core.Entity
 
E_TIME_I - Static variable in class org.lionsoul.jcseg.tokenizer.core.Entity
 
E_TIME_MON - Static variable in class org.lionsoul.jcseg.tokenizer.core.Entity
 
E_TIME_MS - Static variable in class org.lionsoul.jcseg.tokenizer.core.Entity
 
E_TIME_NS - Static variable in class org.lionsoul.jcseg.tokenizer.core.Entity
 
E_TIME_S - Static variable in class org.lionsoul.jcseg.tokenizer.core.Entity
 
E_TIME_US - Static variable in class org.lionsoul.jcseg.tokenizer.core.Entity
 
E_TIME_WEEK - Static variable in class org.lionsoul.jcseg.tokenizer.core.Entity
 
E_TIME_YEAR - Static variable in class org.lionsoul.jcseg.tokenizer.core.Entity
 
E_UNIT_BAG - Static variable in class org.lionsoul.jcseg.tokenizer.core.Entity
 
E_UNIT_BOTTLE - Static variable in class org.lionsoul.jcseg.tokenizer.core.Entity
 
E_UNIT_BOX - Static variable in class org.lionsoul.jcseg.tokenizer.core.Entity
 
E_UNIT_DAN - Static variable in class org.lionsoul.jcseg.tokenizer.core.Entity
 
E_UNIT_DISCOUNT - Static variable in class org.lionsoul.jcseg.tokenizer.core.Entity
 
E_UNIT_ITEM - Static variable in class org.lionsoul.jcseg.tokenizer.core.Entity
 
E_URL - Static variable in class org.lionsoul.jcseg.tokenizer.core.Entity
 
E_URL_FTP - Static variable in class org.lionsoul.jcseg.tokenizer.core.Entity
 
E_URL_HTTP - Static variable in class org.lionsoul.jcseg.tokenizer.core.Entity
 
E_VOLUME - Static variable in class org.lionsoul.jcseg.tokenizer.core.Entity
 
E_VOLUME_CL - Static variable in class org.lionsoul.jcseg.tokenizer.core.Entity
 
E_VOLUME_CM3 - Static variable in class org.lionsoul.jcseg.tokenizer.core.Entity
 
E_VOLUME_DL - Static variable in class org.lionsoul.jcseg.tokenizer.core.Entity
 
E_VOLUME_DM3 - Static variable in class org.lionsoul.jcseg.tokenizer.core.Entity
 
E_VOLUME_FT3 - Static variable in class org.lionsoul.jcseg.tokenizer.core.Entity
 
E_VOLUME_GAL_UK - Static variable in class org.lionsoul.jcseg.tokenizer.core.Entity
 
E_VOLUME_GAL_US - Static variable in class org.lionsoul.jcseg.tokenizer.core.Entity
 
E_VOLUME_HL - Static variable in class org.lionsoul.jcseg.tokenizer.core.Entity
 
E_VOLUME_IN3 - Static variable in class org.lionsoul.jcseg.tokenizer.core.Entity
 
E_VOLUME_L - Static variable in class org.lionsoul.jcseg.tokenizer.core.Entity
 
E_VOLUME_M3 - Static variable in class org.lionsoul.jcseg.tokenizer.core.Entity
 
E_VOLUME_MFT - Static variable in class org.lionsoul.jcseg.tokenizer.core.Entity
 
E_VOLUME_ML - Static variable in class org.lionsoul.jcseg.tokenizer.core.Entity
 
E_VOLUME_MM3 - Static variable in class org.lionsoul.jcseg.tokenizer.core.Entity
 
E_VOLUME_NL - Static variable in class org.lionsoul.jcseg.tokenizer.core.Entity
 
E_VOLUME_UL - Static variable in class org.lionsoul.jcseg.tokenizer.core.Entity
 
E_VOLUME_YD3 - Static variable in class org.lionsoul.jcseg.tokenizer.core.Entity
 
EN_LETTER - Static variable in class org.lionsoul.jcseg.util.StringUtil
 
EN_NUMERIC - Static variable in class org.lionsoul.jcseg.util.StringUtil
 
EN_POSPEECH - Static variable in interface org.lionsoul.jcseg.tokenizer.core.IWord
 
EN_PUNCTUATION - Static variable in class org.lionsoul.jcseg.util.StringUtil
 
EN_SECOND_SEG - Variable in class org.lionsoul.jcseg.tokenizer.core.JcsegTaskConfig
whether to do the secondary split for complex Latin compounds
EN_UNKNOW - Static variable in class org.lionsoul.jcseg.util.StringUtil
 
EN_WHITESPACE - Static variable in class org.lionsoul.jcseg.util.StringUtil
 
enQueue(int) - Method in class org.lionsoul.jcseg.util.IIntFIFO
add a new item to the queue
enQueue(int) - Method in class org.lionsoul.jcseg.util.IIntQueue
append an int at the tail
enSecondSeg(IWord, boolean) - Method in class org.lionsoul.jcseg.tokenizer.ASegment
Do the secondary split for the specified complex Latin word. This splits a word composed of English letters, Arabic numerals, and punctuation into multiple simple parts; for example, 'qq2013' is split into 'qq' and '2013'.
Entity - Class in org.lionsoul.jcseg.tokenizer.core
word item entity class
Entity() - Constructor for class org.lionsoul.jcseg.tokenizer.core.Entity
 
EntityFormat - Class in org.lionsoul.jcseg.util
Entity format manager class
EntityFormat() - Constructor for class org.lionsoul.jcseg.util.EntityFormat
 
Entry(T, IHashQueue.Entry<T>, IHashQueue.Entry<T>) - Constructor for class org.lionsoul.jcseg.util.IHashQueue.Entry
 
Entry(int, IIntFIFO.Entry) - Constructor for class org.lionsoul.jcseg.util.IIntFIFO.Entry
 
Entry(int, IIntQueue.Entry, IIntQueue.Entry) - Constructor for class org.lionsoul.jcseg.util.IIntQueue.Entry
 
equals(Object) - Method in interface org.lionsoul.jcseg.tokenizer.core.IWord
implementations have to override the equals method because Jcseg requires it
equals(Object) - Method in class org.lionsoul.jcseg.tokenizer.Word
 

F

fieldsArr - Static variable in class org.lionsoul.jcseg.tokenizer.core.Entity
 
fillDateTimePool(IWord[], IWord) - Static method in class org.lionsoul.jcseg.util.TimeUtil
fill the specified part of the date-time pool from the specified time entity string.
fillDateTimePool(IWord[], int, IWord) - Static method in class org.lionsoul.jcseg.util.TimeUtil
fill the specified part of the date-time pool, selected by a part index constant
fillTimeToPool(IWord[], String) - Static method in class org.lionsoul.jcseg.util.TimeUtil
fill the time part, given in a standard time format like '15:45:36', into the specified time pool
filter(IWord) - Method in class org.lionsoul.jcseg.extractor.KeyphraseExtractor
word item filter
filter(IWord) - Method in class org.lionsoul.jcseg.extractor.KeywordsExtractor
word item filter
findCHName(char[], int, IChunk) - Method in class org.lionsoul.jcseg.tokenizer.ASegment
find a Chinese name from the current position of the input chars
findCHName(IWord, IChunk) - Method in class org.lionsoul.jcseg.tokenizer.ASegment
Deprecated.
first() - Method in class org.lionsoul.jcseg.util.IStringBuffer
always return the first char
fwsTohws(String) - Static method in class org.lionsoul.jcseg.util.StringUtil
a static method to replace full-width chars with half-width chars in a given string (65281-65374 for full-width chars)
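As an illustration of the fwsTohws entry above, here is a minimal Java sketch of a full-width to half-width conversion. It is not Jcseg's implementation: the class name `WidthSketch` and the method name `fullToHalf` are hypothetical, and the sketch only handles the 65281-65374 range quoted in the description. It relies on the fact that each of those code points sits exactly 65248 above its half-width ASCII counterpart.

```java
final class WidthSketch {
    // Distance between a full-width ASCII variant and its half-width form
    // (65281 - 33 = 65248).
    private static final int OFFSET = 65248;

    // Hypothetical helper: fold full-width chars in 65281-65374 to half-width.
    static String fullToHalf(String s) {
        StringBuilder sb = new StringBuilder(s.length());
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            if (c >= 65281 && c <= 65374) {
                sb.append((char) (c - OFFSET));
            } else {
                // Everything outside the range is copied unchanged.
                sb.append(c);
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Full-width "Jcseg2" written with Unicode escapes.
        System.out.println(fullToHalf("\uFF2A\uFF43\uFF53\uFF45\uFF47\uFF12")); // prints Jcseg2
    }
}
```

Characters outside the range pass through untouched, so mixed CJK and Latin input keeps its CJK text and only has the full-width Latin forms folded.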

G

get(int, String) - Method in class org.lionsoul.jcseg.tokenizer.core.ADictionary
return the IWord associated with the given key.
get(String) - Static method in class org.lionsoul.jcseg.tokenizer.core.Entity
get the entity string by the specified key
get(int, String) - Method in class org.lionsoul.jcseg.tokenizer.Dictionary
 
get(char) - Method in class org.lionsoul.jcseg.util.ByteCharCounter
 
get(int) - Method in class org.lionsoul.jcseg.util.IntArrayList
 
getAutoMinLength() - Method in class org.lionsoul.jcseg.extractor.impl.TextRankKeyphraseExtractor
 
getAverageWordsLength() - Method in class org.lionsoul.jcseg.tokenizer.Chunk
 
getAverageWordsLength() - Method in interface org.lionsoul.jcseg.tokenizer.core.IChunk
return the average word length for all the chunks.
getBestCJKChunk(char[], int) - Method in class org.lionsoul.jcseg.tokenizer.ASegment
an abstract method to get a CJK word from the current position.
getBestCJKChunk(char[], int) - Method in class org.lionsoul.jcseg.tokenizer.ComplexSeg
 
getBestCJKChunk(char[], int) - Method in class org.lionsoul.jcseg.tokenizer.SearchSeg
here we don't have to do anything
getBestCJKChunk(char[], int) - Method in class org.lionsoul.jcseg.tokenizer.SimpleSeg
 
getConfig() - Method in class org.lionsoul.jcseg.tokenizer.ASegment
get the current task configuration instance.
getConfig() - Method in class org.lionsoul.jcseg.tokenizer.core.ADictionary
 
getConfig() - Method in class org.lionsoul.jcseg.tokenizer.DelimiterSeg
get the current JcsegTaskConfig instance
getConfig() - Method in class org.lionsoul.jcseg.tokenizer.DetectSeg
get the current task config instance
getDateTimeIndex(String) - Static method in class org.lionsoul.jcseg.util.TimeUtil
get and return the time part index of the specified IWord#entity
getDelimiter() - Method in class org.lionsoul.jcseg.tokenizer.DelimiterSeg
get the current delimiter
getDic() - Method in class org.lionsoul.jcseg.tokenizer.DelimiterSeg
get the current dictionary instance
getDict() - Method in class org.lionsoul.jcseg.tokenizer.ASegment
get the current dictionary instance.
getDict() - Method in class org.lionsoul.jcseg.tokenizer.DetectSeg
get the current dictionary instance
getEnCharType(int) - Static method in class org.lionsoul.jcseg.util.StringUtil
get the type of the English char, one of the constants defined in this class that start with EN_.
getEnSecondSeg() - Method in class org.lionsoul.jcseg.tokenizer.core.JcsegTaskConfig
 
getEntity() - Method in interface org.lionsoul.jcseg.tokenizer.core.IWord
get the entity name of the word
getEntity() - Method in class org.lionsoul.jcseg.tokenizer.Word
 
getFile() - Method in class org.lionsoul.jcseg.tokenizer.core.AutoLoadFile
 
getFrequency() - Method in interface org.lionsoul.jcseg.tokenizer.core.IWord
return the frequency of the word, use only when the word's length is one.
getFrequency() - Method in class org.lionsoul.jcseg.tokenizer.Word
 
getIndex() - Method in class org.lionsoul.jcseg.extractor.impl.TextRankSummaryExtractor.Document
 
getIndex(String) - Static method in class org.lionsoul.jcseg.tokenizer.core.ADictionary
get the type index of the key, as defined in the ILexicon interface
getJarHome(Object) - Static method in class org.lionsoul.jcseg.util.Util
get the absolute parent path for the jar file.
getKeyphrase(Reader) - Method in class org.lionsoul.jcseg.extractor.impl.TextRankKeyphraseExtractor
 
getKeyphrase(Reader) - Method in class org.lionsoul.jcseg.extractor.KeyphraseExtractor
get the keyphrase list from a reader
getKeyphraseFromFile(String) - Method in class org.lionsoul.jcseg.extractor.KeyphraseExtractor
get the keyphrase list from a file
getKeyphraseFromString(String) - Method in class org.lionsoul.jcseg.extractor.KeyphraseExtractor
get the keyphrase list from a string
getKeySentence(Reader) - Method in class org.lionsoul.jcseg.extractor.impl.TextRankSummaryExtractor
 
getKeySentence(Reader) - Method in class org.lionsoul.jcseg.extractor.SummaryExtractor
get the key sentence from a reader
getKeySentenceFromFile(String) - Method in class org.lionsoul.jcseg.extractor.SummaryExtractor
get key sentence from a file path
getKeySentenceFromString(String) - Method in class org.lionsoul.jcseg.extractor.SummaryExtractor
get key sentence from a string
getKeywords(Reader) - Method in class org.lionsoul.jcseg.extractor.impl.TextRankKeywordsExtractor
 
getKeywords(Reader) - Method in class org.lionsoul.jcseg.extractor.KeywordsExtractor
get the keywords list from a reader
getKeywordsFromFile(String) - Method in class org.lionsoul.jcseg.extractor.KeywordsExtractor
get the keywords list from a file
getKeywordsFromString(String) - Method in class org.lionsoul.jcseg.extractor.KeywordsExtractor
get the keywords list from a string
getKeywordsNum() - Method in class org.lionsoul.jcseg.extractor.impl.TextRankKeyphraseExtractor
 
getKeywordsNum() - Method in class org.lionsoul.jcseg.extractor.impl.TextRankKeywordsExtractor
 
getLargestAverageWordLengthChunks(IChunk[]) - Static method in class org.lionsoul.jcseg.tokenizer.core.MMSegFilter
2.
getLargestSingleMorphemicFreedomChunks(IChunk[]) - Static method in class org.lionsoul.jcseg.tokenizer.core.MMSegFilter
the largest sum of degree of morphemic freedom of one-character words: this rule returns the chunks that have the largest sum of degree of morphemic freedom of one-character words
getLastUpdateTime() - Method in class org.lionsoul.jcseg.tokenizer.core.AutoLoadFile
 
getLength() - Method in class org.lionsoul.jcseg.sentence.Sentence
 
getLength() - Method in class org.lionsoul.jcseg.tokenizer.Chunk
 
getLength() - Method in interface org.lionsoul.jcseg.tokenizer.core.IChunk
return the length of the chunk (the number of words)
getLength() - Method in interface org.lionsoul.jcseg.tokenizer.core.IWord
return the length of the word
getLength() - Method in class org.lionsoul.jcseg.tokenizer.Word
 
getLexiconPath() - Method in class org.lionsoul.jcseg.tokenizer.core.JcsegTaskConfig
return the lexicon directory path
getMaxCnLnadron() - Method in class org.lionsoul.jcseg.tokenizer.core.JcsegTaskConfig
 
getMaximumMatchChunks(IChunk[]) - Static method in class org.lionsoul.jcseg.tokenizer.core.MMSegFilter
1.
getMaxIterateNum() - Method in class org.lionsoul.jcseg.extractor.impl.TextRankKeyphraseExtractor
 
getMaxIterateNum() - Method in class org.lionsoul.jcseg.extractor.impl.TextRankKeywordsExtractor
 
getMaxIterateNum() - Method in class org.lionsoul.jcseg.extractor.impl.TextRankSummaryExtractor
 
getMaxLength() - Method in class org.lionsoul.jcseg.tokenizer.core.JcsegTaskConfig
 
getMaxWordsNum() - Method in class org.lionsoul.jcseg.extractor.impl.TextRankKeyphraseExtractor
 
getNameSingleThreshold() - Method in class org.lionsoul.jcseg.tokenizer.core.JcsegTaskConfig
 
getNextCJKWord(int, int) - Method in class org.lionsoul.jcseg.tokenizer.ASegment
get the next CJK word from the current position of the input stream
getNextCJKWord(int, int) - Method in class org.lionsoul.jcseg.tokenizer.NLPSeg
 
getNextCJKWord(int, int) - Method in class org.lionsoul.jcseg.tokenizer.SearchSeg
get the next CJK word from the current position of the input stream; this function is the core part of most segmentation implementations
getNextDatetimeWord(IWord) - Method in class org.lionsoul.jcseg.tokenizer.NLPSeg
get and return the next date-time word
getNextLatinWord(int, int) - Method in class org.lionsoul.jcseg.tokenizer.ASegment
get the next Latin word from the current position of the input stream
getNextMatch(char[], int) - Method in class org.lionsoul.jcseg.tokenizer.ASegment
match the next CJK word in the dictionary
getNextMixedWord(char[], int) - Method in class org.lionsoul.jcseg.tokenizer.ASegment
get the next mixed word, CJK-English or CJK-English-CJK or whatever
getNextPunctuationPairWord(int, int) - Method in class org.lionsoul.jcseg.tokenizer.ASegment
get the next punctuation pair word from the current position of the input stream.
getNextTimeMergedWord(IWord) - Method in class org.lionsoul.jcseg.tokenizer.NLPSeg
get and return the next time merged date-time word
getPairPunctuationText(int) - Method in class org.lionsoul.jcseg.tokenizer.ASegment
find the pair punctuation of the given punctuation char; the purpose is to get the text between them
getPartSpeech() - Method in interface org.lionsoul.jcseg.tokenizer.core.IWord
return the part of speech of the word.
getPartSpeech() - Method in class org.lionsoul.jcseg.tokenizer.Word
 
getPinyin() - Method in interface org.lionsoul.jcseg.tokenizer.core.IWord
return the pinyin of the word
getPinyin() - Method in class org.lionsoul.jcseg.tokenizer.Word
 
getPollTime() - Method in class org.lionsoul.jcseg.tokenizer.core.JcsegTaskConfig
 
getPosition() - Method in class org.lionsoul.jcseg.sentence.Sentence
 
getPosition() - Method in interface org.lionsoul.jcseg.tokenizer.core.IWord
return the start position of the word.
getPosition() - Method in class org.lionsoul.jcseg.tokenizer.Word
 
getPPTMaxLength() - Method in class org.lionsoul.jcseg.tokenizer.core.JcsegTaskConfig
 
getPropertieFile() - Method in class org.lionsoul.jcseg.tokenizer.core.JcsegTaskConfig
 
getPunctuationPair(char) - Static method in class org.lionsoul.jcseg.util.StringUtil
get the matching partner of the given pair punctuation
getQueueSize() - Method in class org.lionsoul.jcseg.util.IPushbackReader
get the buffer size - the number of buffered data
getScore() - Method in class org.lionsoul.jcseg.extractor.impl.TextRankSummaryExtractor.Document
 
getSeg() - Method in class org.lionsoul.jcseg.extractor.KeyphraseExtractor
 
getSeg() - Method in class org.lionsoul.jcseg.extractor.KeywordsExtractor
 
getSentence() - Method in class org.lionsoul.jcseg.extractor.impl.TextRankSummaryExtractor.Document
 
getSentenceNum() - Method in class org.lionsoul.jcseg.extractor.impl.TextRankSummaryExtractor
 
getSentenceSeg() - Method in class org.lionsoul.jcseg.extractor.SummaryExtractor
 
getSingleWordsMorphemicFreedom() - Method in class org.lionsoul.jcseg.tokenizer.Chunk
 
getSingleWordsMorphemicFreedom() - Method in interface org.lionsoul.jcseg.tokenizer.core.IChunk
return the degree of morphemic freedom for all the single words.
getSmallestVarianceWordLengthChunks(IChunk[]) - Static method in class org.lionsoul.jcseg.tokenizer.core.MMSegFilter
the smallest variance of word length: this rule returns the chunks that have the smallest variance of word length
getSTokenMinLen() - Method in class org.lionsoul.jcseg.tokenizer.core.JcsegTaskConfig
 
getStreamPosition() - Method in class org.lionsoul.jcseg.tokenizer.ASegment
 
getStreamPosition() - Method in interface org.lionsoul.jcseg.tokenizer.core.ISegment
get the current length of the stream
getStreamPosition() - Method in class org.lionsoul.jcseg.tokenizer.DelimiterSeg
 
getStreamPosition() - Method in class org.lionsoul.jcseg.tokenizer.DetectSeg
 
getSummary(Reader, int) - Method in class org.lionsoul.jcseg.extractor.impl.TextRankSummaryExtractor
 
getSummary(Reader, int) - Method in class org.lionsoul.jcseg.extractor.SummaryExtractor
get summary from a reader
getSummaryFromFile(String, int) - Method in class org.lionsoul.jcseg.extractor.SummaryExtractor
get document summary from a file
getSummaryFromString(String, int) - Method in class org.lionsoul.jcseg.extractor.SummaryExtractor
get document summary from a string
getSyn() - Method in interface org.lionsoul.jcseg.tokenizer.core.IWord
return the syn words of the word.
getSyn() - Method in class org.lionsoul.jcseg.tokenizer.Word
 
getTimeKey(String) - Static method in class org.lionsoul.jcseg.util.TimeUtil
get and return the time key part of the specified entity string
getTimeKey(IWord) - Static method in class org.lionsoul.jcseg.util.TimeUtil
 
getTimeKey(int) - Static method in class org.lionsoul.jcseg.util.TimeUtil
get and return the time key part with the part index value
getType() - Method in interface org.lionsoul.jcseg.tokenizer.core.IWord
return the type of the word
getType() - Method in class org.lionsoul.jcseg.tokenizer.Word
 
getValue() - Method in class org.lionsoul.jcseg.sentence.Sentence
 
getValue() - Method in interface org.lionsoul.jcseg.tokenizer.core.IWord
return the value of the word
getValue() - Method in class org.lionsoul.jcseg.tokenizer.Word
 
getWindowSize() - Method in class org.lionsoul.jcseg.extractor.impl.TextRankKeyphraseExtractor
 
getWindowSize() - Method in class org.lionsoul.jcseg.extractor.impl.TextRankKeywordsExtractor
 
getWords() - Method in class org.lionsoul.jcseg.extractor.impl.TextRankSummaryExtractor.Document
 
getWords() - Method in class org.lionsoul.jcseg.tokenizer.Chunk
 
getWords() - Method in interface org.lionsoul.jcseg.tokenizer.core.IChunk
get all the words in the chunk.
getWordSeg() - Method in class org.lionsoul.jcseg.extractor.SummaryExtractor
 
getWordsVariance() - Method in class org.lionsoul.jcseg.tokenizer.Chunk
 
getWordsVariance() - Method in interface org.lionsoul.jcseg.tokenizer.core.IChunk
return the variance of all the words in all the chunks.
gisb - Variable in class org.lionsoul.jcseg.sentence.SentenceSeg
global string buffer
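The MMSegFilter entries in this section (getMaximumMatchChunks, getLargestAverageWordLengthChunks, getSmallestVarianceWordLengthChunks) name chunk-filtering rules that are applied one after another. The following Java sketch shows how such rules can be chained. It is an illustration under assumptions, not Jcseg's code: chunks are modelled as plain `String[]` arrays rather than IChunk, the class `MmsegRulesSketch` and all of its methods are hypothetical, the candidate list and every chunk are assumed non-empty, and the morphemic-freedom rule is left out because it needs word frequencies.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.ToDoubleFunction;

final class MmsegRulesSketch {
    // Total number of chars covered by the chunk.
    static double totalLength(String[] chunk) {
        double sum = 0;
        for (String w : chunk) sum += w.length();
        return sum;
    }

    // Average word length of the chunk.
    static double averageLength(String[] chunk) {
        return totalLength(chunk) / chunk.length;
    }

    // Variance of the word lengths in the chunk.
    static double variance(String[] chunk) {
        double avg = averageLength(chunk), sum = 0;
        for (String w : chunk) {
            double d = w.length() - avg;
            sum += d * d;
        }
        return sum / chunk.length;
    }

    // Keep only the chunks that reach the largest score.
    static List<String[]> keepBest(List<String[]> chunks, ToDoubleFunction<String[]> score) {
        double best = Double.NEGATIVE_INFINITY;
        for (String[] c : chunks) best = Math.max(best, score.applyAsDouble(c));
        List<String[]> kept = new ArrayList<>();
        for (String[] c : chunks) {
            if (score.applyAsDouble(c) == best) kept.add(c);
        }
        return kept;
    }

    // Apply the rules in order, stopping as soon as one chunk remains.
    static String[] pick(List<String[]> chunks) {
        List<String[]> r = keepBest(chunks, MmsegRulesSketch::totalLength);
        if (r.size() > 1) r = keepBest(r, MmsegRulesSketch::averageLength);
        if (r.size() > 1) r = keepBest(r, c -> -variance(c)); // smallest variance wins
        return r.get(0);
    }

    public static void main(String[] args) {
        List<String[]> candidates = Arrays.asList(
                new String[] {"abc", "d", "ef"},
                new String[] {"ab", "cd", "ef"},
                new String[] {"a", "b", "c"});
        System.out.println(String.join(" ", pick(candidates))); // prints ab cd ef
    }
}
```

In the example, the first two candidates tie on total length and on average word length, so the variance rule decides in favour of the evenly split chunk.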

H

hashCode() - Method in class org.lionsoul.jcseg.tokenizer.Word
override the hash code generation algorithm, taking the value as the main factor
hwsTofws(String) - Static method in class org.lionsoul.jcseg.util.StringUtil
a static method to replace half-width chars with full-width chars in a given string

I

I_CN_NAME - Variable in class org.lionsoul.jcseg.tokenizer.core.JcsegTaskConfig
whether to identify Chinese names
ialist - Variable in class org.lionsoul.jcseg.tokenizer.ASegment
 
IChunk - Interface in org.lionsoul.jcseg.tokenizer.core
chunk interface for Jcseg.
identifyCnName() - Method in class org.lionsoul.jcseg.tokenizer.core.JcsegTaskConfig
 
idx - Variable in class org.lionsoul.jcseg.sentence.SentenceSeg
 
idx - Variable in class org.lionsoul.jcseg.tokenizer.ASegment
the index value of the current input stream, mainly for tracking the start position of the token
IHashQueue<T extends IWord> - Class in org.lionsoul.jcseg.util
A normal queue based on a singly linked list but with a hash index, so it is fast for searching
IHashQueue() - Constructor for class org.lionsoul.jcseg.util.IHashQueue
 
IHashQueue.Entry<T> - Class in org.lionsoul.jcseg.util
inner Entry node class
IIntFIFO - Class in org.lionsoul.jcseg.util
int first-in-first-out queue based on a singly linked list
IIntFIFO() - Constructor for class org.lionsoul.jcseg.util.IIntFIFO
 
IIntFIFO.Entry - Class in org.lionsoul.jcseg.util
Item Entry inner class
IIntQueue - Class in org.lionsoul.jcseg.util
char queue class based on a doubly linked list. Not thread safe
IIntQueue() - Constructor for class org.lionsoul.jcseg.util.IIntQueue
 
IIntQueue.Entry - Class in org.lionsoul.jcseg.util
inner Entry node class
ILexicon - Interface in org.lionsoul.jcseg.tokenizer.core
lexicon configuration class.
increase(char) - Method in class org.lionsoul.jcseg.util.ByteCharCounter
 
increase(char, int) - Method in class org.lionsoul.jcseg.util.ByteCharCounter
 
insertionSort(T[]) - Static method in class org.lionsoul.jcseg.util.Sort
insertion sort method
insertionSort(T[], int, int) - Static method in class org.lionsoul.jcseg.util.Sort
method to sort a subarray from start to end with the insertion sort algorithm
IntArrayList - Class in org.lionsoul.jcseg.util
array list for the basic int data type, to use instead of ArrayList. This saves a lot of boxing and unboxing work
IntArrayList() - Constructor for class org.lionsoul.jcseg.util.IntArrayList
 
IntArrayList(int) - Constructor for class org.lionsoul.jcseg.util.IntArrayList
 
IPushbackReader - Class in org.lionsoul.jcseg.util
IPushbackReader based on Reader. Not thread safe. Supports unlimited unread operations
IPushbackReader(Reader) - Constructor for class org.lionsoul.jcseg.util.IPushbackReader
 
isAutoFilter() - Method in class org.lionsoul.jcseg.extractor.impl.TextRankKeywordsExtractor
 
isAutoload() - Method in class org.lionsoul.jcseg.tokenizer.core.JcsegTaskConfig
about lexicon autoload
isb - Variable in class org.lionsoul.jcseg.tokenizer.ASegment
 
isCJK(String) - Static method in class org.lionsoul.jcseg.util.StringUtil
check if the specified string is all CJK chars
isCJK(String, int, int) - Static method in class org.lionsoul.jcseg.util.StringUtil
 
isCJKChar(int) - Static method in class org.lionsoul.jcseg.util.StringUtil
check if the specified char is CJK, Thai...
isCNNumeric(char) - Static method in class org.lionsoul.jcseg.util.NumericUtil
check if the given char is a Chinese numeric or not
isCnPunctuation(int) - Static method in class org.lionsoul.jcseg.util.StringUtil
 
isDate(String, char) - Static method in class org.lionsoul.jcseg.util.EntityFormat
check if the specified string is a valid Latin date string like "2017/02/22", "2017-02-22" or "2017.02.22"
isDecimal(String) - Static method in class org.lionsoul.jcseg.util.StringUtil
check if the specified char is a decimal, including the full-width char
isDecimal(String, int, int) - Static method in class org.lionsoul.jcseg.util.StringUtil
 
isDigit(String) - Static method in class org.lionsoul.jcseg.util.StringUtil
check whether the specified char is a digit: returns true if it is, false otherwise. This method can recognize full-width chars
isDigit(String, int, int) - Static method in class org.lionsoul.jcseg.util.StringUtil
 
ISegment - Interface in org.lionsoul.jcseg.tokenizer.core
Jcseg segment interface
isEnChar(int) - Static method in class org.lionsoul.jcseg.util.StringUtil
check if the specified char is a basic Latin, Russian, or Greek letter.
isENKeepPunctuaton(char) - Static method in class org.lionsoul.jcseg.util.StringUtil
check if the given char is an English keep punctuation
isEnLetter(int) - Static method in class org.lionsoul.jcseg.util.StringUtil
include the full-width and half-width char
isEnNumeric(int) - Static method in class org.lionsoul.jcseg.util.StringUtil
check if the specified char is an English numeric (48-57), including the full-width char
isEnPunctuation(int) - Static method in class org.lionsoul.jcseg.util.StringUtil
check if the given char is half-width punctuation
isFWEnChar(int) - Static method in class org.lionsoul.jcseg.util.StringUtil
check if the given char is a full-width char. Note: the full-width punctuation is not included here
isHWEnChar(int) - Static method in class org.lionsoul.jcseg.util.StringUtil
check if the given char is a half-width char or not
isIpAddress(String) - Static method in class org.lionsoul.jcseg.util.EntityFormat
check if the specified string is an IPv4/v6 address; v6 is not supported for now
isKeepPunctuation(char) - Method in class org.lionsoul.jcseg.tokenizer.core.JcsegTaskConfig
 
isLatin(String) - Static method in class org.lionsoul.jcseg.util.StringUtil
check if the specified string is all Latin chars
isLatin(String, int, int) - Static method in class org.lionsoul.jcseg.util.StringUtil
 
isLetter(String) - Static method in class org.lionsoul.jcseg.util.StringUtil
check if the specified string is Latin letter
isLetter(String, int, int) - Static method in class org.lionsoul.jcseg.util.StringUtil
 
isLetterNumber(int) - Static method in class org.lionsoul.jcseg.util.StringUtil
check if the specified char is a letter number like 'ⅠⅡ'; returns true if it is, false otherwise
isLetterOrNumeric(String) - Static method in class org.lionsoul.jcseg.util.StringUtil
check if the specified string is Latin numeric or letter
isLetterOrNumeric(String, int, int) - Static method in class org.lionsoul.jcseg.util.StringUtil
 
isLowerCaseLetter(int) - Static method in class org.lionsoul.jcseg.util.StringUtil
 
isMailAddress(String) - Static method in class org.lionsoul.jcseg.util.EntityFormat
check if the specified string is an email address or not
isMobileNumber(String) - Static method in class org.lionsoul.jcseg.util.EntityFormat
check if the specified string is a mobile number
isNoTailingPunctuation(char) - Static method in class org.lionsoul.jcseg.util.StringUtil
check if the given punctuation is one that needs to be cleared
isNumeric(String) - Static method in class org.lionsoul.jcseg.util.StringUtil
check if the specified string is Latin numeric
isNumeric(String, int, int) - Static method in class org.lionsoul.jcseg.util.StringUtil
 
isOtherNumber(int) - Static method in class org.lionsoul.jcseg.util.StringUtil
check if the specified char is an other number like '①⑩⑽㈩'; returns true if it is, false otherwise
isPairPunctuation(char) - Static method in class org.lionsoul.jcseg.util.StringUtil
check if the given char is pair punctuation or not
isSync() - Method in class org.lionsoul.jcseg.tokenizer.core.ADictionary
 
isTime(String) - Static method in class org.lionsoul.jcseg.util.EntityFormat
check if the specified string is a valid time string like '12:45', '12:45:12'
IStringBuffer - Class in org.lionsoul.jcseg.util
string buffer class
IStringBuffer() - Constructor for class org.lionsoul.jcseg.util.IStringBuffer
create a buffer with a default length 16
IStringBuffer(int) - Constructor for class org.lionsoul.jcseg.util.IStringBuffer
create a buffer with a specified length
IStringBuffer(String) - Constructor for class org.lionsoul.jcseg.util.IStringBuffer
create a buffer with a specified string
isUpperCaseLetter(int) - Static method in class org.lionsoul.jcseg.util.StringUtil
 
isUrlAddress(String, ADictionary) - Static method in class org.lionsoul.jcseg.util.EntityFormat
check if the specified string is a URL address or not
isWhitespace(int) - Static method in class org.lionsoul.jcseg.util.StringUtil
check the given string is a whitespace
IWord - Interface in org.lionsoul.jcseg.tokenizer.core
Word interface
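For the insertionSort(T[], int, int) entry in this section, a subarray insertion sort can be sketched as follows. This is not the code of org.lionsoul.jcseg.util.Sort: the class name `InsertionSortSketch` is hypothetical, and treating both `start` and `end` as inclusive indices is an assumption, since the index does not say how the bounds are interpreted.

```java
import java.util.Arrays;

final class InsertionSortSketch {
    // Sort arr[start..end] in place; both bounds are inclusive (assumption).
    static <T extends Comparable<? super T>> void insertionSort(T[] arr, int start, int end) {
        for (int i = start + 1; i <= end; i++) {
            T key = arr[i];
            int j = i - 1;
            // Shift larger elements of the sorted prefix one slot to the right.
            while (j >= start && arr[j].compareTo(key) > 0) {
                arr[j + 1] = arr[j];
                j--;
            }
            arr[j + 1] = key;
        }
    }

    public static void main(String[] args) {
        Integer[] data = {9, 5, 3, 4, 1, 0};
        insertionSort(data, 1, 4); // elements outside 1..4 stay in place
        System.out.println(Arrays.toString(data)); // prints [9, 1, 3, 4, 5, 0]
    }
}
```

Insertion sort is quadratic in the worst case but cheap on short or nearly sorted ranges, which is why it is commonly used for small subarrays.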

J

JcsegException - Exception in org.lionsoul.jcseg.tokenizer.core
JCSeg exception class
JcsegException(String) - Constructor for exception org.lionsoul.jcseg.tokenizer.core.JcsegException
 
JcsegException(Throwable) - Constructor for exception org.lionsoul.jcseg.tokenizer.core.JcsegException
 
JcsegException(String, Throwable) - Constructor for exception org.lionsoul.jcseg.tokenizer.core.JcsegException
 
JcsegTaskConfig - Class in org.lionsoul.jcseg.tokenizer.core
Jcseg segmentation task configuration class
JcsegTaskConfig() - Constructor for class org.lionsoul.jcseg.tokenizer.core.JcsegTaskConfig
create the config without initializing it. Note: this may cause incompatibility problems for older code that uses this constructor
JcsegTaskConfig(boolean) - Constructor for class org.lionsoul.jcseg.tokenizer.core.JcsegTaskConfig
create and initialize the config by auto load
JcsegTaskConfig(String) - Constructor for class org.lionsoul.jcseg.tokenizer.core.JcsegTaskConfig
create and initialize the task config from a properties file
JcsegTaskConfig(InputStream) - Constructor for class org.lionsoul.jcseg.tokenizer.core.JcsegTaskConfig
create and initialize the task config from an InputStream
JcsegTest - Class in org.lionsoul.jcseg.test
Jcseg test program.
JcsegTest() - Constructor for class org.lionsoul.jcseg.test.JcsegTest
 

K

K1 - Static variable in class org.lionsoul.jcseg.extractor.impl.TextRankSummaryExtractor
 
KEEP_UNREG_WORDS - Variable in class org.lionsoul.jcseg.tokenizer.core.JcsegTaskConfig
 
keepUnregWords() - Method in class org.lionsoul.jcseg.tokenizer.core.JcsegTaskConfig
 
keyphrase(String) - Method in class org.lionsoul.jcseg.test.JcsegTest
keyphrase extractor
KeyphraseExtractor - Class in org.lionsoul.jcseg.extractor
key phrase extractor
KeyphraseExtractor(ISegment) - Constructor for class org.lionsoul.jcseg.extractor.KeyphraseExtractor
constructor
keywords(String) - Method in class org.lionsoul.jcseg.test.JcsegTest
keywords extractor
KeywordsExtractor - Class in org.lionsoul.jcseg.extractor
document keywords extractor
KeywordsExtractor(ISegment) - Constructor for class org.lionsoul.jcseg.extractor.KeywordsExtractor
constructor
keywordsNum - Variable in class org.lionsoul.jcseg.extractor.impl.TextRankKeyphraseExtractor
 
keywordsNum - Variable in class org.lionsoul.jcseg.extractor.impl.TextRankKeywordsExtractor
 

L

ladCJKPos() - Method in class org.lionsoul.jcseg.tokenizer.core.JcsegTaskConfig
 
last() - Method in class org.lionsoul.jcseg.util.IStringBuffer
always return the last char
latinIndexOf(String, int) - Static method in class org.lionsoul.jcseg.util.StringUtil
get the index of the first Latin char of the specified string
latinIndexOf(String) - Static method in class org.lionsoul.jcseg.util.StringUtil
 
length() - Method in class org.lionsoul.jcseg.util.IStringBuffer
return the length of the buffer
LEX_PROPERTY_FILE - Static variable in class org.lionsoul.jcseg.tokenizer.core.JcsegTaskConfig
default lexicon property file name
LexiconException - Exception in org.lionsoul.jcseg.tokenizer.core
JCSeg Dictionary configuration exception class
LexiconException(String) - Constructor for exception org.lionsoul.jcseg.tokenizer.core.LexiconException
 
LexiconException(Throwable) - Constructor for exception org.lionsoul.jcseg.tokenizer.core.LexiconException
 
LexiconException(String, Throwable) - Constructor for exception org.lionsoul.jcseg.tokenizer.core.LexiconException
 
load(File) - Method in class org.lionsoul.jcseg.tokenizer.core.ADictionary
load all the words from a specified lexicon file
load(String) - Method in class org.lionsoul.jcseg.tokenizer.core.ADictionary
load all the words from a specified lexicon path
load(InputStream) - Method in class org.lionsoul.jcseg.tokenizer.core.ADictionary
load all the words from a specified lexicon input stream
load(String) - Method in class org.lionsoul.jcseg.tokenizer.core.JcsegTaskConfig
initialize the values of its options from a specified jcseg.properties properties file
load(InputStream) - Method in class org.lionsoul.jcseg.tokenizer.core.JcsegTaskConfig
initialize the values of its options from an InputStream of a jcseg.properties properties file
LOAD_CJK_ENTITY - Variable in class org.lionsoul.jcseg.tokenizer.core.JcsegTaskConfig
whether to load the entity definition
LOAD_CJK_PINYIN - Variable in class org.lionsoul.jcseg.tokenizer.core.JcsegTaskConfig
whether to load the Pinyin of the CJK_WORDS
LOAD_CJK_POS - Variable in class org.lionsoul.jcseg.tokenizer.core.JcsegTaskConfig
whether to load the word's part of speech
LOAD_CJK_SYN - Variable in class org.lionsoul.jcseg.tokenizer.core.JcsegTaskConfig
whether to load the syn word of the CJK_WORDS.
loadCJKEntity() - Method in class org.lionsoul.jcseg.tokenizer.core.JcsegTaskConfig
 
loadCJKPinyin() - Method in class org.lionsoul.jcseg.tokenizer.core.JcsegTaskConfig
 
loadCJKSyn() - Method in class org.lionsoul.jcseg.tokenizer.core.JcsegTaskConfig
 
loadClassPath() - Method in class org.lionsoul.jcseg.tokenizer.core.ADictionary
load all the words from all the files under the specified class path.
loadDirectory(String) - Method in class org.lionsoul.jcseg.tokenizer.core.ADictionary
load all the words from all the files under a specified lexicon directory
loadWords(JcsegTaskConfig, ADictionary, File) - Static method in class org.lionsoul.jcseg.tokenizer.core.ADictionary
load all the words in the specified lexicon file into the dictionary
loadWords(JcsegTaskConfig, ADictionary, String) - Static method in class org.lionsoul.jcseg.tokenizer.core.ADictionary
load all the words from a specified lexicon file path
loadWords(JcsegTaskConfig, ADictionary, InputStream) - Static method in class org.lionsoul.jcseg.tokenizer.core.ADictionary
load words from an InputStream

M

main(String[]) - Static method in class org.lionsoul.jcseg.test.JcsegTest
 
match(int, String) - Method in class org.lionsoul.jcseg.tokenizer.core.ADictionary
look up the dictionary to check whether the given key is in it
match(int, String) - Method in class org.lionsoul.jcseg.tokenizer.Dictionary
 
MAX_CN_LNADRON - Variable in class org.lionsoul.jcseg.tokenizer.core.JcsegTaskConfig
the maximum length of the adron of a Chinese last name, like “老” in 老陈
MAX_LATIN_LENGTH - Variable in class org.lionsoul.jcseg.tokenizer.core.JcsegTaskConfig
maximum length for Latin words
MAX_LENGTH - Variable in class org.lionsoul.jcseg.tokenizer.core.JcsegTaskConfig
maximum length for maximum matching (5-7)
MAX_UNIT_LENGTH - Variable in class org.lionsoul.jcseg.tokenizer.core.JcsegTaskConfig
maximum length for unit words for the NLP algorithm (added on 2016/11/18)
maxIterateNum - Variable in class org.lionsoul.jcseg.extractor.impl.TextRankKeyphraseExtractor
 
maxIterateNum - Variable in class org.lionsoul.jcseg.extractor.impl.TextRankKeywordsExtractor
 
maxIterateNum - Variable in class org.lionsoul.jcseg.extractor.impl.TextRankSummaryExtractor
 
maxWordsNum - Variable in class org.lionsoul.jcseg.extractor.impl.TextRankKeyphraseExtractor
max phrase length
mergeSort(T[]) - Static method in class org.lionsoul.jcseg.util.Sort
merge sort algorithm
MIX_ASSIST_WORD - Static variable in interface org.lionsoul.jcseg.tokenizer.core.ILexicon
special lexicon for Chinese-English[-Chinese] mixed word recognition; used to optimize the mixed word recognition
MIX_POSPEECH - Static variable in interface org.lionsoul.jcseg.tokenizer.core.IWord
 
mixPrefixLength - Variable in class org.lionsoul.jcseg.tokenizer.core.ADictionary
 
mixSuffixLength - Variable in class org.lionsoul.jcseg.tokenizer.core.ADictionary
maximum length of the Chinese words after the Latin word, or of the ones before it; used to match Chinese-English mixed words such as 'B超', 'AA制', or the reverse composition style such as '卡拉ok'.
MMSegFilter - Class in org.lionsoul.jcseg.tokenizer.core
mmseg default filter class
MMSegFilter() - Constructor for class org.lionsoul.jcseg.tokenizer.core.MMSegFilter
 

N

NAME_POSPEECH - Static variable in interface org.lionsoul.jcseg.tokenizer.core.IWord
 
NAME_SINGLE_THRESHOLD - Variable in class org.lionsoul.jcseg.tokenizer.core.JcsegTaskConfig
the threshold of the single word that is a single word when it and the last char of the name make up a word.
next() - Method in class org.lionsoul.jcseg.sentence.SentenceSeg
get the next sentence
next() - Method in class org.lionsoul.jcseg.tokenizer.ASegment
 
next() - Method in interface org.lionsoul.jcseg.tokenizer.core.ISegment
segment a word from a char array from a specified position.
next() - Method in class org.lionsoul.jcseg.tokenizer.DelimiterSeg
 
next() - Method in class org.lionsoul.jcseg.tokenizer.DetectSeg
 
next() - Method in class org.lionsoul.jcseg.tokenizer.NLPSeg
Overrides the next method to add date-time entity recognition; the parent's next method is still invoked to get the next token
next - Variable in class org.lionsoul.jcseg.util.IHashQueue.Entry
 
next - Variable in class org.lionsoul.jcseg.util.IIntFIFO.Entry
 
next - Variable in class org.lionsoul.jcseg.util.IIntQueue.Entry
 
nextCJKSentence(int) - Method in class org.lionsoul.jcseg.tokenizer.ASegment
load a CJK char list from the stream, starting from the current position and continuing until the char is not a CJK char
nextCNNumeric(char[], int) - Method in class org.lionsoul.jcseg.tokenizer.ASegment
find the Chinese number starting from the current position; count until the char at the specified position is not an other number or is whitespace
nextLatinString(int) - Method in class org.lionsoul.jcseg.tokenizer.ASegment
the simple version of the next basic Latin fetch logic; just returns the next Latin string with the keep punctuation after it
nextLatinWord(int, int) - Method in class org.lionsoul.jcseg.tokenizer.ASegment
find the letter or digit word starting from the current position; count until the char is whitespace or not a letter_digit
nextLatinWord(int, int) - Method in class org.lionsoul.jcseg.tokenizer.NLPSeg
find the letter or digit word starting from the current position; count until the char is whitespace or not a letter_digit
nextLetterNumber(int) - Method in class org.lionsoul.jcseg.tokenizer.ASegment
find the letter number starting from the current position; count until the char at the specified position is not a letter number or is whitespace
nextOtherNumber(int) - Method in class org.lionsoul.jcseg.tokenizer.ASegment
find the other number starting from the current position; count until the char at the specified position is not an other number or is whitespace
NLP_MODE - Static variable in class org.lionsoul.jcseg.tokenizer.core.JcsegTaskConfig
 
NLPSeg - Class in org.lionsoul.jcseg.tokenizer
NLP segmentation implementation; it extends all the properties of the Complex one, and the rest are built for NLP only
NLPSeg(Reader, JcsegTaskConfig, ADictionary) - Constructor for class org.lionsoul.jcseg.tokenizer.NLPSeg
 
NLPSeg(JcsegTaskConfig, ADictionary) - Constructor for class org.lionsoul.jcseg.tokenizer.NLPSeg
 
NUMERIC_POSPEECH - Static variable in interface org.lionsoul.jcseg.tokenizer.core.IWord
 
NumericUtil - Class in org.lionsoul.jcseg.util
a class to deal with Chinese numerals
NumericUtil() - Constructor for class org.lionsoul.jcseg.util.NumericUtil
 

O

org.lionsoul.jcseg.extractor - package org.lionsoul.jcseg.extractor
 
org.lionsoul.jcseg.extractor.impl - package org.lionsoul.jcseg.extractor.impl
 
org.lionsoul.jcseg.sentence - package org.lionsoul.jcseg.sentence
 
org.lionsoul.jcseg.test - package org.lionsoul.jcseg.test
 
org.lionsoul.jcseg.tokenizer - package org.lionsoul.jcseg.tokenizer
 
org.lionsoul.jcseg.tokenizer.core - package org.lionsoul.jcseg.tokenizer.core
 
org.lionsoul.jcseg.util - package org.lionsoul.jcseg.util
 

P

PPT_MAX_LENGTH - Variable in class org.lionsoul.jcseg.tokenizer.core.JcsegTaskConfig
the maximum length for the text between paired punctuation.
PPT_POSPEECH - Static variable in interface org.lionsoul.jcseg.tokenizer.core.IWord
 
prev - Variable in class org.lionsoul.jcseg.util.IHashQueue.Entry
 
prev - Variable in class org.lionsoul.jcseg.util.IIntQueue.Entry
 
printMatrix(double[][]) - Static method in class org.lionsoul.jcseg.util.Util
print the specified matrix
PUNCTUATION - Static variable in interface org.lionsoul.jcseg.tokenizer.core.IWord
 
pushBack(int) - Method in class org.lionsoul.jcseg.tokenizer.ASegment
push back the data to the stream.
pushBack(String) - Method in class org.lionsoul.jcseg.tokenizer.ASegment
push back a string to the stream
pushBack(int) - Method in class org.lionsoul.jcseg.tokenizer.DelimiterSeg
push back the data to the stream
pushBack(int) - Method in class org.lionsoul.jcseg.tokenizer.DetectSeg
push back the data to the stream

Q

qCNNumericToArabic(String) - Static method in class org.lionsoul.jcseg.util.NumericUtil
 
quickSelect(T[], int) - Static method in class org.lionsoul.jcseg.util.Sort
quick select algorithm
quicksort(T[]) - Static method in class org.lionsoul.jcseg.util.Sort
quick sort algorithm
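
The index only names `quickSelect(T[], int)` and describes it as a "quick select algorithm". For readers unfamiliar with the technique, the following is a minimal, standalone sketch of quickselect. It is not the Jcseg implementation: the class name `QuickSelectSketch`, the 0-based meaning of `k`, the return value, and the choice to work on a copy are all assumptions made for illustration.

```java
import java.util.Arrays;

// Hypothetical illustration class; not part of org.lionsoul.jcseg.util.
final class QuickSelectSketch {

    // Returns the k-th smallest element (k is 0-based, an assumption of this sketch).
    // Works on a copy so the caller's array is left untouched.
    static <T extends Comparable<T>> T quickSelect(T[] input, int k) {
        if (k < 0 || k >= input.length) {
            throw new IllegalArgumentException("k out of range: " + k);
        }
        T[] a = Arrays.copyOf(input, input.length);
        int lo = 0;
        int hi = a.length - 1;
        // Invariant: the k-th smallest element always lies in a[lo..hi].
        while (lo < hi) {
            int p = partition(a, lo, hi);
            if (p == k) {
                return a[p];
            }
            if (p < k) {
                lo = p + 1;   // answer is to the right of the pivot
            } else {
                hi = p - 1;   // answer is to the left of the pivot
            }
        }
        return a[lo];
    }

    // Lomuto partition using the last element as the pivot.
    // Returns the final index of the pivot.
    private static <T extends Comparable<T>> int partition(T[] a, int lo, int hi) {
        T pivot = a[hi];
        int store = lo;
        for (int i = lo; i < hi; i++) {
            if (a[i].compareTo(pivot) < 0) {
                swap(a, i, store);
                store++;
            }
        }
        swap(a, store, hi);
        return store;
    }

    private static <T> void swap(T[] a, int i, int j) {
        T tmp = a[i];
        a[i] = a[j];
        a[j] = tmp;
    }
}
```

Unlike a full sort, quickselect only recurses into the side of the partition that contains the requested rank, which gives linear expected time for a single lookup.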

R

read() - Method in class org.lionsoul.jcseg.util.IPushbackReader
read the next int from the stream; this checks the buffer queue first and takes the first item of the buffer as the result
read(char[], int, int) - Method in class org.lionsoul.jcseg.util.IPushbackReader
read the specified block from the stream
reader - Variable in class org.lionsoul.jcseg.sentence.SentenceSeg
 
reader - Variable in class org.lionsoul.jcseg.tokenizer.ASegment
 
readNext() - Method in class org.lionsoul.jcseg.sentence.SentenceSeg
read the next char from the current position
readNext() - Method in class org.lionsoul.jcseg.tokenizer.ASegment
read the next char from the current position
readNext() - Method in class org.lionsoul.jcseg.tokenizer.DelimiterSeg
read the next char from the current position
readNext() - Method in class org.lionsoul.jcseg.tokenizer.DetectSeg
read the next char from the current position
readUntil(char) - Method in class org.lionsoul.jcseg.sentence.SentenceSeg
loop the reader until the specified char is found.
remove(int, String) - Method in class org.lionsoul.jcseg.tokenizer.core.ADictionary
remove the mapping associated with the given key
remove(int, String) - Method in class org.lionsoul.jcseg.tokenizer.Dictionary
 
remove() - Method in class org.lionsoul.jcseg.util.IHashQueue
remove the node from the head; make sure the size is larger than 0 by calling size() before invoking this method, or you will just get null.
remove(int) - Method in class org.lionsoul.jcseg.util.IntArrayList
remove the element at the specified position; using System.arraycopy instead of a loop may be more efficient
reset(Reader) - Method in class org.lionsoul.jcseg.sentence.SentenceSeg
stream/reader reset.
reset(Reader) - Method in class org.lionsoul.jcseg.tokenizer.ASegment
input stream and reader reset.
reset(Reader) - Method in interface org.lionsoul.jcseg.tokenizer.core.ISegment
reset the reader
reset(Reader) - Method in class org.lionsoul.jcseg.tokenizer.DelimiterSeg
 
reset(Reader) - Method in class org.lionsoul.jcseg.tokenizer.DetectSeg
 
resetMode(int) - Method in class org.lionsoul.jcseg.test.JcsegTest
 
resetPrefixLength(JcsegTaskConfig, ADictionary, int) - Static method in class org.lionsoul.jcseg.tokenizer.core.ADictionary
check and reset the value of ADictionary.mixPrefixLength
resetSuffixLength(JcsegTaskConfig, ADictionary, int) - Static method in class org.lionsoul.jcseg.tokenizer.core.ADictionary
check and reset the value of the ADictionary.mixSuffixLength
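
The `read()` entry for `IPushbackReader` says the buffer queue is checked first, and the `pushBack` and `unread` entries describe pushing data back to the stream. The sketch below shows that general pattern using only the JDK. It is an assumption-laden illustration, not the Jcseg class: the name `PushbackSketch`, the last-in-first-out order of the buffer, and the wrapping of `IOException` are all choices made here.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.UncheckedIOException;
import java.util.ArrayDeque;
import java.util.Deque;

// Hypothetical illustration class; not org.lionsoul.jcseg.util.IPushbackReader.
final class PushbackSketch {
    private final Reader in;
    // Chars that were pushed back; the most recently unread char is returned first
    // (an assumption of this sketch).
    private final Deque<Integer> buffer = new ArrayDeque<>();

    PushbackSketch(Reader in) {
        this.in = in;
    }

    // Returns the next char, preferring pushed-back data over the underlying stream.
    // Returns -1 at end of stream, like java.io.Reader#read().
    int read() {
        if (!buffer.isEmpty()) {
            return buffer.pop();
        }
        try {
            return in.read();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Pushes a char back so that the next read() returns it again.
    void unread(int c) {
        buffer.push(c);
    }
}
```

A tokenizer can use this to peek: read one char, decide it belongs to the next token, and unread it so the next scanning step sees it again.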

S

SEARCH_MODE - Static variable in class org.lionsoul.jcseg.tokenizer.core.JcsegTaskConfig
 
SearchSeg - Class in org.lionsoul.jcseg.tokenizer
search mode implementation; all the possible combinations are returned. Built for search use.
SearchSeg(JcsegTaskConfig, ADictionary) - Constructor for class org.lionsoul.jcseg.tokenizer.SearchSeg
 
SearchSeg(Reader, JcsegTaskConfig, ADictionary) - Constructor for class org.lionsoul.jcseg.tokenizer.SearchSeg
 
seg - Variable in class org.lionsoul.jcseg.extractor.KeyphraseExtractor
the ISegment object
seg - Variable in class org.lionsoul.jcseg.extractor.KeywordsExtractor
the ISegment object
SegmentFactory - Class in org.lionsoul.jcseg.tokenizer.core
Segment factory to create a singleton ISegment object; the path of a class that implements the ISegment interface must be given first
SegmentFactory() - Constructor for class org.lionsoul.jcseg.tokenizer.core.SegmentFactory
 
Sentence - Class in org.lionsoul.jcseg.sentence
sentence description class
Sentence(String, int) - Constructor for class org.lionsoul.jcseg.sentence.Sentence
constructor
Sentence(String) - Constructor for class org.lionsoul.jcseg.sentence.Sentence
 
sentence(String) - Method in class org.lionsoul.jcseg.test.JcsegTest
key sentence extractor
sentenceNum - Variable in class org.lionsoul.jcseg.extractor.impl.TextRankSummaryExtractor
 
sentenceSeg - Variable in class org.lionsoul.jcseg.extractor.SummaryExtractor
sentence splitter object
SentenceSeg - Class in org.lionsoul.jcseg.sentence
document sentence splitter
SentenceSeg(Reader) - Constructor for class org.lionsoul.jcseg.sentence.SentenceSeg
constructor
SentenceSeg() - Constructor for class org.lionsoul.jcseg.sentence.SentenceSeg
 
set(int, int) - Method in class org.lionsoul.jcseg.util.IntArrayList
 
set(int, char) - Method in class org.lionsoul.jcseg.util.IStringBuffer
set the char at the specified index
setAppendCJKPinyin(boolean) - Method in class org.lionsoul.jcseg.tokenizer.core.JcsegTaskConfig
 
setAppendCJKSyn(boolean) - Method in class org.lionsoul.jcseg.tokenizer.core.JcsegTaskConfig
 
setAppendPartOfSpeech(boolean) - Method in class org.lionsoul.jcseg.tokenizer.core.JcsegTaskConfig
 
setAutoFilter(boolean) - Method in class org.lionsoul.jcseg.extractor.impl.TextRankKeywordsExtractor
 
setAutoload(boolean) - Method in class org.lionsoul.jcseg.tokenizer.core.JcsegTaskConfig
 
setAutoMinLength(int) - Method in class org.lionsoul.jcseg.extractor.impl.TextRankKeyphraseExtractor
 
setClearStopwords(boolean) - Method in class org.lionsoul.jcseg.tokenizer.core.JcsegTaskConfig
 
setCnFactionToArabic(boolean) - Method in class org.lionsoul.jcseg.tokenizer.core.JcsegTaskConfig
 
setCnNumToArabic(boolean) - Method in class org.lionsoul.jcseg.tokenizer.core.JcsegTaskConfig
 
setConfig(JcsegTaskConfig) - Method in class org.lionsoul.jcseg.tokenizer.ASegment
set the current task configuration instance.
setConfig(JcsegTaskConfig) - Method in class org.lionsoul.jcseg.tokenizer.core.ADictionary
 
setConfig(JcsegTaskConfig) - Method in class org.lionsoul.jcseg.tokenizer.DelimiterSeg
set the current configuration
setConfig(JcsegTaskConfig) - Method in class org.lionsoul.jcseg.tokenizer.DetectSeg
set the current task config
setDelimiter(char) - Method in class org.lionsoul.jcseg.tokenizer.DelimiterSeg
set the delimiter; defaults to whitespace
setDic(ADictionary) - Method in class org.lionsoul.jcseg.tokenizer.DelimiterSeg
set the current dictionary
setDict(ADictionary) - Method in class org.lionsoul.jcseg.tokenizer.ASegment
set the current dictionary
setDict(ADictionary) - Method in class org.lionsoul.jcseg.tokenizer.DetectSeg
set the current dictionary instance
setEnSecondSeg(boolean) - Method in class org.lionsoul.jcseg.tokenizer.core.JcsegTaskConfig
 
setEntity(String) - Method in interface org.lionsoul.jcseg.tokenizer.core.IWord
set the entity name of the word
setEntity(String) - Method in class org.lionsoul.jcseg.tokenizer.Word
 
setFile(File) - Method in class org.lionsoul.jcseg.tokenizer.core.AutoLoadFile
 
setICnName(boolean) - Method in class org.lionsoul.jcseg.tokenizer.core.JcsegTaskConfig
 
setIndex(int) - Method in class org.lionsoul.jcseg.extractor.impl.TextRankSummaryExtractor.Document
 
setKeepPunctuations(String) - Method in class org.lionsoul.jcseg.tokenizer.core.JcsegTaskConfig
 
setKeepUnregWords(boolean) - Method in class org.lionsoul.jcseg.tokenizer.core.JcsegTaskConfig
 
setKeywordsNum(int) - Method in class org.lionsoul.jcseg.extractor.impl.TextRankKeyphraseExtractor
 
setKeywordsNum(int) - Method in class org.lionsoul.jcseg.extractor.impl.TextRankKeywordsExtractor
 
setLastUpdateTime(long) - Method in class org.lionsoul.jcseg.tokenizer.core.AutoLoadFile
 
setLength(int) - Method in class org.lionsoul.jcseg.sentence.Sentence
 
setLength(int) - Method in interface org.lionsoul.jcseg.tokenizer.core.IWord
set a self-defined length
setLength(int) - Method in class org.lionsoul.jcseg.tokenizer.Word
 
setLexiconPath(String[]) - Method in class org.lionsoul.jcseg.tokenizer.core.JcsegTaskConfig
 
setLoadCJKPinyin(boolean) - Method in class org.lionsoul.jcseg.tokenizer.core.JcsegTaskConfig
 
setLoadCJKPos(boolean) - Method in class org.lionsoul.jcseg.tokenizer.core.JcsegTaskConfig
 
setLoadCJKSyn(boolean) - Method in class org.lionsoul.jcseg.tokenizer.core.JcsegTaskConfig
 
setLoadEntity(boolean) - Method in class org.lionsoul.jcseg.tokenizer.core.JcsegTaskConfig
 
setMaxCnLnadron(int) - Method in class org.lionsoul.jcseg.tokenizer.core.JcsegTaskConfig
 
setMaxIterateNum(int) - Method in class org.lionsoul.jcseg.extractor.impl.TextRankKeyphraseExtractor
 
setMaxIterateNum(int) - Method in class org.lionsoul.jcseg.extractor.impl.TextRankKeywordsExtractor
 
setMaxIterateNum(int) - Method in class org.lionsoul.jcseg.extractor.impl.TextRankSummaryExtractor
 
setMaxLength(int) - Method in class org.lionsoul.jcseg.tokenizer.core.JcsegTaskConfig
 
setMaxWordsNum(int) - Method in class org.lionsoul.jcseg.extractor.impl.TextRankKeyphraseExtractor
 
setNameSingleThreshold(int) - Method in class org.lionsoul.jcseg.tokenizer.core.JcsegTaskConfig
 
setPartSpeech(String[]) - Method in interface org.lionsoul.jcseg.tokenizer.core.IWord
 
setPartSpeech(String[]) - Method in class org.lionsoul.jcseg.tokenizer.Word
 
setPinyin(String) - Method in interface org.lionsoul.jcseg.tokenizer.core.IWord
set the pinyin of the word
setPinyin(String) - Method in class org.lionsoul.jcseg.tokenizer.Word
 
setPollTime(int) - Method in class org.lionsoul.jcseg.tokenizer.core.JcsegTaskConfig
 
setPosition(int) - Method in class org.lionsoul.jcseg.sentence.Sentence
 
setPosition(int) - Method in interface org.lionsoul.jcseg.tokenizer.core.IWord
set the position of the word
setPosition(int) - Method in class org.lionsoul.jcseg.tokenizer.Word
 
setPPT_MAX_LENGTH(int) - Method in class org.lionsoul.jcseg.tokenizer.core.JcsegTaskConfig
 
setScore(double) - Method in class org.lionsoul.jcseg.extractor.impl.TextRankSummaryExtractor.Document
 
setSeg(ISegment) - Method in class org.lionsoul.jcseg.extractor.KeyphraseExtractor
 
setSeg(ISegment) - Method in class org.lionsoul.jcseg.extractor.KeywordsExtractor
 
setSentence(Sentence) - Method in class org.lionsoul.jcseg.extractor.impl.TextRankSummaryExtractor.Document
 
setSentenceNum(int) - Method in class org.lionsoul.jcseg.extractor.impl.TextRankSummaryExtractor
 
setSentenceSeg(SentenceSeg) - Method in class org.lionsoul.jcseg.extractor.SummaryExtractor
 
setSTokenMinLen(int) - Method in class org.lionsoul.jcseg.tokenizer.core.JcsegTaskConfig
 
setSyn(String[]) - Method in interface org.lionsoul.jcseg.tokenizer.core.IWord
 
setSyn(String[]) - Method in class org.lionsoul.jcseg.tokenizer.Word
 
setValue(String) - Method in class org.lionsoul.jcseg.sentence.Sentence
 
setWindowSize(int) - Method in class org.lionsoul.jcseg.extractor.impl.TextRankKeyphraseExtractor
 
setWindowSize(int) - Method in class org.lionsoul.jcseg.extractor.impl.TextRankKeywordsExtractor
 
setWords(List<IWord>) - Method in class org.lionsoul.jcseg.extractor.impl.TextRankSummaryExtractor.Document
 
setWordSeg(ISegment) - Method in class org.lionsoul.jcseg.extractor.SummaryExtractor
 
shellSort(T[]) - Static method in class org.lionsoul.jcseg.util.Sort
shell sort algorithm
SIMPLE_MODE - Static variable in class org.lionsoul.jcseg.tokenizer.core.JcsegTaskConfig
simple algorithm or complex algorithm
SimpleSeg - Class in org.lionsoul.jcseg.tokenizer
Jcseg simple segmentation implementation, extended from ASegment
SimpleSeg(JcsegTaskConfig, ADictionary) - Constructor for class org.lionsoul.jcseg.tokenizer.SimpleSeg
 
SimpleSeg(Reader, JcsegTaskConfig, ADictionary) - Constructor for class org.lionsoul.jcseg.tokenizer.SimpleSeg
 
SIMSTR - Static variable in class org.lionsoul.jcseg.util.STConverter
 
SimToTraditional(String) - Static method in class org.lionsoul.jcseg.util.STConverter
convert the simplified words to traditional words of the specified string.
SimToTraditional(String, IStringBuffer) - Static method in class org.lionsoul.jcseg.util.STConverter
 
size(int) - Method in class org.lionsoul.jcseg.tokenizer.core.ADictionary
return the size of the dictionary
size(int) - Method in class org.lionsoul.jcseg.tokenizer.Dictionary
 
size() - Method in class org.lionsoul.jcseg.util.IHashQueue
get the size of the queue
size() - Method in class org.lionsoul.jcseg.util.IIntFIFO
get the size of the queue
size() - Method in class org.lionsoul.jcseg.util.IIntQueue
get the size of the queue
size() - Method in class org.lionsoul.jcseg.util.IntArrayList
 
Sort - Class in org.lionsoul.jcseg.util
Implementations of various sort algorithms; they use the default compare method
Sort() - Constructor for class org.lionsoul.jcseg.util.Sort
 
START_SS_MASK - Static variable in interface org.lionsoul.jcseg.tokenizer.core.ISegment
 
startAutoload() - Method in class org.lionsoul.jcseg.tokenizer.core.ADictionary
start the lexicon autoload thread
STConverter - Class in org.lionsoul.jcseg.util
Simplified and traditional Chinese conversion class; all the search work is based on String.indexOf(int). You may store all the words in a HashMap for the purpose of a faster fetch
STConverter() - Constructor for class org.lionsoul.jcseg.util.STConverter
 
STOKEN_MIN_LEN - Variable in class org.lionsoul.jcseg.tokenizer.core.JcsegTaskConfig
minimum length for the second split to make up a word
STOP_WORD - Static variable in interface org.lionsoul.jcseg.tokenizer.core.ILexicon
stop words
stopAutoload() - Method in class org.lionsoul.jcseg.tokenizer.core.ADictionary
 
StringUtil - Class in org.lionsoul.jcseg.util
a class to deal with the English stop char like the English punctuation
StringUtil() - Constructor for class org.lionsoul.jcseg.util.StringUtil
 
summary(String) - Method in class org.lionsoul.jcseg.test.JcsegTest
summary extractor
SummaryExtractor - Class in org.lionsoul.jcseg.extractor
document summary extractor
SummaryExtractor(ISegment, SentenceSeg) - Constructor for class org.lionsoul.jcseg.extractor.SummaryExtractor
constructor
sync - Variable in class org.lionsoul.jcseg.tokenizer.core.ADictionary
 

T

T_BASIC_LATIN - Static variable in interface org.lionsoul.jcseg.tokenizer.core.IWord
Latin series.
T_CJK_PINYIN - Static variable in interface org.lionsoul.jcseg.tokenizer.core.IWord
Pinyin
T_CJK_WORD - Static variable in interface org.lionsoul.jcseg.tokenizer.core.IWord
Chinese, Japanese, and Korean words
T_CN_NAME - Static variable in interface org.lionsoul.jcseg.tokenizer.core.IWord
Chinese last name.
T_CN_NICKNAME - Static variable in interface org.lionsoul.jcseg.tokenizer.core.IWord
Chinese nickname like: 老陈
T_CN_NUMERIC - Static variable in interface org.lionsoul.jcseg.tokenizer.core.IWord
Chinese numeric
T_LEN - Static variable in interface org.lionsoul.jcseg.tokenizer.core.ILexicon
 
T_LETTER_NUMBER - Static variable in interface org.lionsoul.jcseg.tokenizer.core.IWord
letter number like 'ⅠⅡ'
T_MIXED_WORD - Static variable in interface org.lionsoul.jcseg.tokenizer.core.IWord
Chinese and English mix word like B超,SIM卡.
T_OTHER_NUMBER - Static variable in interface org.lionsoul.jcseg.tokenizer.core.IWord
other number like '①⑩⑽㈩'
T_PUNCTUATION - Static variable in interface org.lionsoul.jcseg.tokenizer.core.IWord
 
T_UNRECOGNIZE_WORD - Static variable in interface org.lionsoul.jcseg.tokenizer.core.IWord
useless chars like the CJK punctuation
TextRankKeyphraseExtractor - Class in org.lionsoul.jcseg.extractor.impl
document key phrase extractor based on the TextRank algorithm
TextRankKeyphraseExtractor(ISegment) - Constructor for class org.lionsoul.jcseg.extractor.impl.TextRankKeyphraseExtractor
 
TextRankKeywordsExtractor - Class in org.lionsoul.jcseg.extractor.impl
document keywords extractor based on the TextRank algorithm
TextRankKeywordsExtractor(ISegment) - Constructor for class org.lionsoul.jcseg.extractor.impl.TextRankKeywordsExtractor
 
textRankSortedDocuments(List<Sentence>, List<List<IWord>>) - Method in class org.lionsoul.jcseg.extractor.impl.TextRankSummaryExtractor
get the documents ordered by relevance score.
TextRankSummaryExtractor - Class in org.lionsoul.jcseg.extractor.impl
summary extractor based on the TextRank algorithm
TextRankSummaryExtractor(ISegment, SentenceSeg) - Constructor for class org.lionsoul.jcseg.extractor.impl.TextRankSummaryExtractor
 
TextRankSummaryExtractor.Document - Class in org.lionsoul.jcseg.extractor.impl
summary document inner class
TIME_POSPEECH - Static variable in interface org.lionsoul.jcseg.tokenizer.core.IWord
 
TimeUtil - Class in org.lionsoul.jcseg.util
Time Util class
TimeUtil() - Constructor for class org.lionsoul.jcseg.util.TimeUtil
 
tokenize(String) - Method in class org.lionsoul.jcseg.test.JcsegTest
string tokenize handler
toLowerCase(int) - Static method in class org.lionsoul.jcseg.util.StringUtil
 
toString() - Method in class org.lionsoul.jcseg.sentence.Sentence
override the toString method
toString() - Method in class org.lionsoul.jcseg.tokenizer.Chunk
 
toString() - Method in class org.lionsoul.jcseg.tokenizer.Word
 
toString() - Method in class org.lionsoul.jcseg.util.ByteCharCounter
 
toString() - Method in class org.lionsoul.jcseg.util.IStringBuffer
return the string of the current buffer
toUpperCase(int) - Static method in class org.lionsoul.jcseg.util.StringUtil
 
TRASTR - Static variable in class org.lionsoul.jcseg.util.STConverter
 
TraToSimplified(String) - Static method in class org.lionsoul.jcseg.util.STConverter
convert the traditional words to simplified words of the specified string.
TraToSimplified(String, IStringBuffer) - Static method in class org.lionsoul.jcseg.util.STConverter
 

U

UNMATCH_CJK_WORD - Static variable in interface org.lionsoul.jcseg.tokenizer.core.ILexicon
unmatched word
unread(int) - Method in class org.lionsoul.jcseg.util.IPushbackReader
unread the specified data to the stream; in practice, this pushes the data back to the queue
unread(char[], int, int) - Method in class org.lionsoul.jcseg.util.IPushbackReader
unread a block from a char array to the stream
UNRECOGNIZE - Static variable in interface org.lionsoul.jcseg.tokenizer.core.IWord
 
Util - Class in org.lionsoul.jcseg.util
static utility methods for jcseg.
Util() - Constructor for class org.lionsoul.jcseg.util.Util
 

V

version - Static variable in class org.lionsoul.jcseg.tokenizer.core.SegmentFactory
 

W

windowSize - Variable in class org.lionsoul.jcseg.extractor.impl.TextRankKeyphraseExtractor
 
windowSize - Variable in class org.lionsoul.jcseg.extractor.impl.TextRankKeywordsExtractor
 
Word - Class in org.lionsoul.jcseg.tokenizer
word class for Jcseg, implementing the org.lionsoul.jcseg.core.IWord interface. As of 2017/03/29, the synonym methods Word.getSyn(), Word.setSyn(String[]), Word.addSyn(String), the part-of-speech methods Word.getPartSpeech(), Word.setPartSpeech(String[]), Word.addPartSpeech(String), and the Word.clone() method are synchronized to allow for possible concurrent access.
Word(String, int, int, String) - Constructor for class org.lionsoul.jcseg.tokenizer.Word
constructor that initializes the newly created Word instance
Word(String, int, int) - Constructor for class org.lionsoul.jcseg.tokenizer.Word
 
Word(String, int) - Constructor for class org.lionsoul.jcseg.tokenizer.Word
 
Word(String, int, String) - Constructor for class org.lionsoul.jcseg.tokenizer.Word
 
wordPool - Variable in class org.lionsoul.jcseg.tokenizer.ASegment
CJK word cache pool, reusable string buffer, and array list for basic integers
wordPool - Variable in class org.lionsoul.jcseg.tokenizer.DelimiterSeg
 
wordSeg - Variable in class org.lionsoul.jcseg.extractor.SummaryExtractor
ISegment word tokenizer object

_

__toString() - Method in class org.lionsoul.jcseg.tokenizer.Word
for debug only

Copyright © 2017. All Rights Reserved.