Corpus Analysis Tools | தமிழ் இணையக் கல்விக்கழகம் 
Tamil Virtual Academy




Corpus Analysis Tools

The following tools have been developed for Tamil corpus analysis.
 
1. Word Count

This tool counts the words in a file. Options are provided for sorting the words in Tamil order and for removing duplicate words.
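The counting, sorting, and de-duplication steps can be sketched in Python as follows. This is only an illustration, not the tool's actual implementation: the function names are hypothetical, words are assumed to be separated by whitespace, and sorting by Unicode code point only approximates Tamil alphabetical order.

```python
from collections import Counter


def count_words(text):
    """Return the total number of words and a per-word frequency table.

    Assumption: words are separated by whitespace.
    """
    words = text.split()
    return len(words), Counter(words)


def unique_sorted(words):
    """Remove duplicates and sort the remaining words.

    Assumption: plain code-point order stands in for Tamil sorting;
    a full Tamil collation would need a dedicated collation library.
    """
    return sorted(set(words))


total, freq = count_words("அம்மா அப்பா அம்மா")
word_list = unique_sorted(freq)
```

Here `total` counts every token, while `word_list` holds each distinct word once.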

 
2. Word Analyzer

The Word Analyzer extracts single-letter words, two-letter words (clusters / gemination), and three-letter words. An option is also provided for Tamil sorting and for removing duplicate words.
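Grouping words by letter count could look like the sketch below. It assumes that a Tamil "letter" is a base character together with any vowel sign or pulli attached to it, so combining marks are not counted separately. The function names are hypothetical and the real tool may count letters differently.

```python
import unicodedata


def letter_count(word):
    """Count letters in a word, ignoring combining marks.

    Assumption: Tamil vowel signs and the pulli have Unicode
    categories starting with "M" and belong to the preceding letter.
    """
    return sum(1 for ch in word if not unicodedata.category(ch).startswith("M"))


def group_by_length(words, lengths=(1, 2, 3)):
    """Collect words whose letter count is one of the requested lengths."""
    groups = {n: [] for n in lengths}
    for word in words:
        n = letter_count(word)
        if n in groups:
            groups[n].append(word)
    return groups


groups = group_by_length(["தீ", "கண்", "அம்மா", "வணக்கம்"])
```

Words longer than three letters, such as the last one in the sample list, are simply left out of the result.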

 
3. Character Identifier
 

The Character Identifier splits the text on the chosen delimiter and lists the resulting words according to whether they start with a Tamil letter (script), an English (Roman) letter, or an Arabic numeral.
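A minimal sketch of this classification is shown below. It assumes that "Tamil" means the Unicode Tamil block (U+0B80 to U+0BFF), that Roman letters and Arabic numerals are the ASCII ones, and that anything else goes into an "other" group. The function and group names are illustrative only.

```python
def classify(word):
    """Classify a non-empty word by its first character."""
    first = word[0]
    if "\u0b80" <= first <= "\u0bff":  # Unicode Tamil block (assumption)
        return "tamil"
    if first.isascii() and first.isalpha():
        return "roman"
    if first.isascii() and first.isdigit():
        return "numeral"
    return "other"


def identify(text, delimiter=" "):
    """Split text on the delimiter and group words by starting character."""
    groups = {"tamil": [], "roman": [], "numeral": [], "other": []}
    for word in text.split(delimiter):
        if word:  # skip empty pieces left by repeated delimiters
            groups[classify(word)].append(word)
    return groups


groups = identify("தமிழ்,Tamil,2024,கல்வி", delimiter=",")
```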

 
4.Word Concordance
 

This tool identifies the words that occur before and after a particular keyword in a given sentence and saves them in a separate text file. An option is also provided for Tamil sorting and for removing duplicate words.
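The idea behind a concordance can be sketched as follows. The sketch assumes whitespace-separated words, exact keyword matching, and a configurable context window. The function names, the `window` parameter, and the tab-separated output layout are hypothetical, not taken from the tool itself.

```python
def concordance(sentence, keyword, window=1):
    """Return (before, keyword, after) for each occurrence of keyword.

    `window` is the number of context words kept on each side.
    """
    words = sentence.split()
    hits = []
    for i, word in enumerate(words):
        if word == keyword:
            before = words[max(0, i - window):i]
            after = words[i + 1:i + 1 + window]
            hits.append((before, word, after))
    return hits


def save_hits(hits, path):
    """Write one tab-separated line per hit to a UTF-8 text file."""
    with open(path, "w", encoding="utf-8") as out:
        for before, word, after in hits:
            out.write(" ".join(before) + "\t" + word + "\t" + " ".join(after) + "\n")


hits = concordance("நான் தமிழ் மொழி படிக்கிறேன்", "தமிழ்")
```

A keyword at the start or end of a sentence simply gets an empty context on that side.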

 
5. Renaming the files
 

This Windows application renames all the downloaded files according to the format required by the user.
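The source does not describe the naming format, so the sketch below only illustrates batch renaming in general. The pattern `corpus_{index:03d}{suffix}` and the function name are assumptions made for the example.

```python
from pathlib import Path


def rename_files(folder, pattern="corpus_{index:03d}{suffix}"):
    """Rename every file in folder to a numbered name (hypothetical format).

    Files are processed in sorted name order; the original extension is kept.
    """
    folder = Path(folder)
    files = sorted(p for p in folder.iterdir() if p.is_file())
    renamed = []
    for index, path in enumerate(files, start=1):
        target = path.with_name(pattern.format(index=index, suffix=path.suffix))
        if target != path:
            if target.exists():
                # Refuse to overwrite a file that already has the target name.
                raise FileExistsError(target)
            path.rename(target)
        renamed.append(target)
    return renamed
```

Calling `rename_files("downloads")` on a folder holding `a.txt` and `b.txt` would leave `corpus_001.txt` and `corpus_002.txt`.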

 
6. Domain Identifier
 

This tool identifies the domain of each word and places the words in a separate file. An option is also provided for merging the files into one.
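How the tool decides a word's domain is not stated. One simple possibility, assumed here purely for illustration, is a lookup in a word-to-domain lexicon. The lexicon contents, the domain labels, and the function names below are all hypothetical.

```python
from collections import defaultdict

# Hypothetical lexicon mapping words to domain labels (assumption).
DOMAIN_LEXICON = {
    "மருத்துவர்": "medicine",
    "மருந்து": "medicine",
    "வயல்": "agriculture",
}


def group_by_domain(words, lexicon=DOMAIN_LEXICON):
    """Group words by domain; words not in the lexicon go to "unknown"."""
    groups = defaultdict(list)
    for word in words:
        groups[lexicon.get(word, "unknown")].append(word)
    return dict(groups)


def merge_groups(groups):
    """Merge the per-domain lists into one list of "domain<TAB>word" lines."""
    return [f"{domain}\t{word}" for domain in sorted(groups) for word in groups[domain]]


groups = group_by_domain(["மருந்து", "வயல்", "கணினி"])
merged = merge_groups(groups)
```

Each entry in `groups` corresponds to one per-domain file, and `merged` corresponds to the single combined file produced by the merge option.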
