Urdu Linguistics and Computing

Tuesday, May 10, 2016

Urdu Text Classification

http://www.percipienceanalytics.com/papers/Urdu%20Text%20Classification.pdf

Posted by Syed Muhammad Humayun at 4:57 PM
Labels: affix elimination stemming, character normalization, diacritic elimination, frequency based stop words, lexicon lookup word segmentation, manual affix list, manual lexicon creation, paper 2009, stop words
