Information Systems homework help
You have collected the following unstructured documents and plan to apply an indexing technique to convert them into an inverted index.
Doc 1: Information retrieval is the activity of obtaining information resources relevant to an information need from a collection of information resources. Searches can be based on full-text or other content-based indexing.
Doc 2: Information retrieval is finding material of an unstructured nature that satisfies an information need from within large collections.
Doc 3: Information systems is the study of complementary networks of hardware and software that people and organizations use to collect, filter, process, create, and distribute data.
In the process of creating the inverted index, please complete the following steps:
(a) Remove all stop words and punctuation, and then apply Porter’s stemming algorithm to the documents. The list of stop words for this task is as follows:
Is, The, Of, To, An, A, From, Can, Be, On, Or, That, Within, And, Use
(b) Create a merged inverted list that includes the within-document frequency of each term.
(c) Use the index created in part (b) to create a dictionary and the related postings file.
(d) You may wish to test the inverted index with the following keywords: information, system, index
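
The steps above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not a reference solution: the function names (preprocess, build_inverted_list, split_index, lookup, identity) are hypothetical, stop words are matched case-insensitively, hyphenated words such as "full-text" are split into separate tokens, and the stemmer is a pluggable parameter. The identity placeholder used here does no stemming at all; a real Porter implementation has to be passed in for the stemming step.

```python
import re
from collections import Counter

# Stop-word list from the task, lowercased (assumes case-insensitive matching).
STOP_WORDS = {"is", "the", "of", "to", "an", "a", "from", "can", "be",
              "on", "or", "that", "within", "and", "use"}

DOCS = {
    1: ("Information retrieval is the activity of obtaining information "
        "resources relevant to an information need from a collection of "
        "information resources. Searches can be based on full-text or other "
        "content-based indexing."),
    2: ("Information retrieval is finding material of an unstructured nature "
        "that satisfies an information need from within large collections."),
    3: ("Information systems is the study of complementary networks of "
        "hardware and software that people and organizations use to collect, "
        "filter, process, create, and distribute data."),
}


def identity(token):
    """Placeholder stemmer (NOT Porter): returns the token unchanged."""
    return token


def preprocess(text, stem):
    """Lowercase, drop punctuation and stop words, then stem each token."""
    tokens = re.findall(r"[a-z0-9]+", text.lower())
    return [stem(t) for t in tokens if t not in STOP_WORDS]


def build_inverted_list(docs, stem):
    """Merged inverted list: term -> {doc_id: within-document frequency}."""
    index = {}
    for doc_id, text in docs.items():
        for term, tf in Counter(preprocess(text, stem)).items():
            index.setdefault(term, {})[doc_id] = tf
    return index


def split_index(index):
    """Split the merged list into a dictionary and a postings file.

    dictionary: term -> (document frequency, offset into postings)
    postings:   flat list of (doc_id, tf) pairs, grouped by sorted term
    """
    dictionary, postings = {}, []
    for term in sorted(index):
        entries = sorted(index[term].items())
        dictionary[term] = (len(entries), len(postings))
        postings.extend(entries)
    return dictionary, postings


def lookup(query, dictionary, postings, stem):
    """Return the (doc_id, tf) postings for a single query keyword."""
    term = stem(query.lower())
    if term not in dictionary:
        return []
    df, offset = dictionary[term]
    return postings[offset:offset + df]


dictionary, postings = split_index(build_inverted_list(DOCS, identity))
for keyword in ("information", "system", "index"):
    print(keyword, lookup(keyword, dictionary, postings, identity))
```

With the identity placeholder, only "information" finds postings, because "systems" and "indexing" are stored unstemmed. If NLTK is installed, passing PorterStemmer().stem from nltk.stem as the stem argument to both build_inverted_list and lookup should let "system" and "index" match as well. The query keyword must go through the same stemmer as the documents, which is why lookup takes the stemmer too.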