State of the art; Word class assignment, automatic; Tagging (computational linguistics); Automatic word class assignment; Sampling; Language corpus; Data collection; Text corpus (method.); Corpus linguistics; Quantitative linguistics; Mathematical linguistics; Techniques (in research); Research techniques; Word classes; Mandarin; Cantonese; Putonghua; Sinitic; Hakka; Yue; Chinese