California University of Management Bar Chart of Usenet Message Groups Project

 

Here is the source for the data: http://qwone.com/~jason/20Newsgroups/. Download it and put the folder in your working directory. I renamed the folder to 20news to make it easier to use.

  1. Create a stacked bar chart of Usenet newsgroups showing the number of messages in each group. Use color to differentiate among the group types (the letters before the first period in the group name, e.g. ‘sci’ in sci.space). Comment your code, line by line.
  2. Clean the text and remove stop words. Show word counts by newsgroup. Apply the tf-idf metric. Show stacked bar charts of word usage for the ‘sci’ groups. Comment your code, line by line. Explore another newsgroup type in the same way.
  3. Compute pairwise correlations of word usage between newsgroups. Visualize the strongest correlations among groups. Comment your code, line by line.
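
To make the three calculations concrete, here is a minimal sketch in plain Python using made-up word counts. It is not the R Markdown deliverable, only an illustration of the arithmetic behind the charts. The function names (`group_type`, `tf_idf`, `pairwise_correlations`) and the toy counts are hypothetical. The tf-idf formula assumed here is term frequency times the natural log of (number of groups / number of groups containing the word), and the correlation assumed is Pearson correlation of per-group word counts, with a word missing from a group counted as 0.

```python
import math
from collections import Counter
from itertools import combinations


def group_type(newsgroup):
    """Return the letters before the first period, e.g. 'sci.space' -> 'sci'."""
    return newsgroup.split(".", 1)[0]


def tf_idf(word_counts):
    """Score each word per group: (n / total words in group) * ln(groups / groups with word)."""
    n_groups = len(word_counts)
    # Count how many groups each word appears in.
    groups_with_word = Counter()
    for counts in word_counts.values():
        groups_with_word.update(counts.keys())
    scores = {}
    for group, counts in word_counts.items():
        total = sum(counts.values())
        scores[group] = {
            word: (n / total) * math.log(n_groups / groups_with_word[word])
            for word, n in counts.items()
        }
    return scores


def pearson(xs, ys):
    """Pearson correlation of two equal-length count vectors (assumes neither is constant)."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def pairwise_correlations(word_counts):
    """Correlate every pair of groups over the shared vocabulary; missing words count as 0."""
    vocab = sorted(set().union(*word_counts.values()))
    vectors = {g: [c[w] for w in vocab] for g, c in word_counts.items()}
    return {
        (a, b): pearson(vectors[a], vectors[b])
        for a, b in combinations(sorted(vectors), 2)
    }


# Toy counts (invented for illustration; stop words assumed already removed).
word_counts = {
    "sci.space": Counter({"orbit": 3, "launch": 2, "people": 1}),
    "sci.med": Counter({"doctor": 3, "patient": 2, "people": 1}),
    "rec.autos": Counter({"car": 4, "engine": 1, "people": 1}),
}

scores = tf_idf(word_counts)
correlations = pairwise_correlations(word_counts)

for group in word_counts:
    top_word = max(scores[group], key=scores[group].get)
    print(group_type(group), group, top_word)
```

A word that appears in every group, like "people" in the toy data, gets a tf-idf of zero, which is why the metric surfaces words distinctive to one newsgroup. In the R version, the tidytext and widyr packages used in the linked chapter are the natural place to do these steps.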

Please add your commentary, in your own words, on what the graphs show. Related material and reference code can be found here: https://www.tidytextmining.com/usenet.html

The code and explanations are already in the text linked above, so you are not doing anything extra. I am on COVID duty and have no time to work on this. I need the .Rmd file so that I can run the code on my Mac, plus a PDF of the output from your run. Always repeat the question you are answering. Thank you!