DiscoverInfo is a tool for exploring a collection of documents. It was developed at SILS, UNC Chapel Hill, by Chirag Shah under the guidance of Prof. Gary Marchionini. It supports exploration in three ways.
- Search: one can run a full-text search over the collection. DiscoverInfo indexes text, HTML, XML, and PDF documents.
- Browse: the DiscoverInfo system builds term clouds based on term occurrences in the collection as a whole as well as across the individual documents. These clouds provide a good overview of the underlying collection. The term clouds are clickable, so the curator can browse through them.
- Discover: the system not only retrieves relevant information from the indexed collection, but can also evaluate how novel a piece of information (here, a document) is with respect to other documents. This helps the curator discover information that is not only relevant but also novel. At present, novelty between two documents is measured as 1 minus the overlap of unique words between the two documents.
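
The novelty measure described under Discover can be illustrated with a small sketch. This is not the DiscoverInfo implementation: the description above does not say how the overlap is normalized or how words are tokenized, so the sketch assumes the overlap is the Jaccard ratio (shared unique words divided by all unique words in either document) and uses a simple lowercase alphanumeric tokenizer. The function names unique_words and novelty are hypothetical.

```python
import re


def unique_words(text):
    """Return the set of unique lowercase word tokens in a text.

    Assumption: words are runs of letters and digits; the real system
    may tokenize differently.
    """
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def novelty(doc_a, doc_b):
    """Novelty between two documents as 1 - overlap of unique words.

    Assumption: overlap is the Jaccard ratio
    |A intersect B| / |A union B| of the two unique-word sets.
    """
    a = unique_words(doc_a)
    b = unique_words(doc_b)
    if not a and not b:
        # Two empty documents share everything they have: no novelty.
        return 0.0
    return 1.0 - len(a & b) / len(a | b)
```

Under these assumptions, two documents with the same vocabulary score 0, and two documents with no words in common score 1. Other normalizations (for example, dividing by the size of the smaller document's vocabulary) would also fit the description and would give different scores.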
Note: at present, the DiscoverInfo system has only one collection: The North Carolina Election of 1898 from UNC Library. We are working on extending it to include other collections. Your feedback is highly valuable.