Utils
Miscellaneous utilities
nlp utils
- clstk.utils.nlp.getSentenceSplitter()
  Get a sentence splitter function.
  Returns: a function that takes a string and returns a list of sentences as strings
- clstk.utils.nlp.getTokenizer(lang)
  Get a tokenizer for a given language.
  Parameters: lang – language
  Returns: a tokenizer, which takes a sentence as a string and returns a list of tokens
- clstk.utils.nlp.getDetokenizer(lang)
  Get a detokenizer for a given language.
  Parameters: lang – language
  Returns: a detokenizer, which takes a list of tokens and returns a sentence as a string
- clstk.utils.nlp.getStemmer()
  Get a stemmer. Currently returns the Porter stemmer.
  Returns: a stemmer, which takes a token and returns its stem
- clstk.utils.nlp.getStopwords(lang)
  Get the list of stopwords for a given language.
  Parameters: lang – language
  Returns: a list of stopwords, including common punctuation marks
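The helpers above are described as returning plain callables and a list, so they can be chained into a small preprocessing pipeline. The sketch below shows one way to do that. It assumes the return values behave exactly as documented (splitter: string to list of sentences; tokenizer: sentence to list of tokens; stemmer: token to stem; stopwords: list of strings). The function `content_stems` and all `demo_*` names are hypothetical stand-ins used so that the example runs without clstk; they are not clstk's implementations.

```python
import re


def content_stems(text, split_sentences, tokenize, stem, stopwords):
    """Return, for each sentence, the stems of tokens that are not stopwords.

    The four helpers are passed in as arguments. With clstk they would
    presumably come from getSentenceSplitter(), getTokenizer(lang),
    getStemmer() and getStopwords(lang); that wiring is an assumption.
    """
    stop = set(stopwords)  # set lookup is faster than scanning a list
    result = []
    for sentence in split_sentences(text):
        tokens = tokenize(sentence)
        result.append([stem(tok) for tok in tokens if tok.lower() not in stop])
    return result


# Hypothetical stand-ins with the documented shapes (NOT clstk's code).
def demo_split(text):
    # Split after sentence-final punctuation followed by whitespace.
    return [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]


def demo_tokenize(sentence):
    # Words and individual punctuation marks become separate tokens.
    return re.findall(r"\w+|[^\w\s]", sentence)


def demo_stem(token):
    # Placeholder "stemmer": only lowercases the token.
    return token.lower()


DEMO_STOPWORDS = ["the", "a", "is", ".", ","]

stems = content_stems(
    "The cat sleeps. A dog barks.",
    demo_split, demo_tokenize, demo_stem, DEMO_STOPWORDS,
)
print(stems)
```

Because the stopword list is documented as including common punctuation, a single membership test can drop both function words and punctuation tokens.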
ProgressBar class
- class clstk.utils.progress.ProgressBar(totalCount)
  Bases: object
  Class to manage and show a pretty progress bar in the console.
  - __init__(totalCount)
    Initialize the progress bar.
    Parameters: totalCount – total number of items to be processed
  - done(doneCount)
    Move the progress bar ahead.
    Parameters: doneCount – number of items, out of totalCount, that have been processed so far
  - complete()
    Complete the progress bar.
  - __weakref__
    List of weak references to the object (if defined).
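The calling pattern implied by the methods above is: construct the bar with the total, report the cumulative count with `done` after each item, and call `complete` at the end. The sketch below illustrates that pattern. `DemoProgressBar` is a hypothetical stand-in that only mirrors the documented method names and parameters, and its `stream` and `width` arguments are assumptions added for the example. It does not reproduce how `clstk.utils.progress.ProgressBar` renders its output.

```python
import io
import sys


class DemoProgressBar:
    """Stand-in with the documented interface (NOT clstk's implementation)."""

    def __init__(self, totalCount, stream=None, width=20):
        self.totalCount = totalCount
        # `stream` and `width` are hypothetical extras for this sketch.
        self.stream = stream if stream is not None else sys.stdout
        self.width = width

    def done(self, doneCount):
        # doneCount is cumulative: items processed so far, out of totalCount.
        fraction = doneCount / self.totalCount if self.totalCount else 1.0
        fraction = min(max(fraction, 0.0), 1.0)
        filled = int(round(self.width * fraction))
        bar = "#" * filled + "-" * (self.width - filled)
        self.stream.write("\r[%s] %d/%d" % (bar, doneCount, self.totalCount))
        self.stream.flush()

    def complete(self):
        # Draw the full bar and end the line.
        self.done(self.totalCount)
        self.stream.write("\n")
        self.stream.flush()


def process_all(items, bar):
    """Double every item, reporting cumulative progress to `bar`."""
    results = []
    for index, item in enumerate(items, start=1):
        results.append(item * 2)
        bar.done(index)  # pass the running total, not an increment of 1
    bar.complete()
    return results


buffer = io.StringIO()
doubled = process_all([1, 2, 3, 4], DemoProgressBar(4, stream=buffer))
```

The point to note is that `done` takes the running total rather than a per-step increment, which follows from the parameter description of `doneCount`.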