Explain Mining World Wide Web.

 Mining World Wide Web/Mining Web Data

  • The World Wide Web serves as a huge, widely distributed, global information center for news, advertisements, consumer information, financial management, education, government, and e-commerce. It contains a rich and dynamic collection of information about web page contents with hypertext structures and multimedia, hyperlink information, and access and usage information, providing fertile sources for data mining. 
  • Web mining is the application of data mining techniques to discover patterns, structures, and knowledge from the Web. According to analysis targets, web mining can be organized into three main areas: web content mining, web structure mining, and web usage mining.
  • The sheer volume of information on the Web makes it a rich source for data mining.
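
The three areas named above can be illustrated with a minimal Python sketch. It assumes a tiny in-memory "web" of three pages and a short access log. The names `PAGES`, `ACCESS_LOG`, `content_terms`, `in_degree`, and `page_visits` are hypothetical and chosen only for this illustration. Real web mining systems use far more sophisticated techniques for each task.

```python
from collections import Counter
from html.parser import HTMLParser
import re

# Hypothetical toy "web": page URL -> HTML source (illustrative data only).
PAGES = {
    "/home": '<html><body><h1>News</h1>'
             '<a href="/sports">Sports</a> <a href="/shop">Shop</a></body></html>',
    "/sports": '<html><body><p>Sports news and scores</p>'
               '<a href="/home">Home</a></body></html>',
    "/shop": '<html><body><p>Shop for sports gear</p>'
             '<a href="/home">Home</a> <a href="/sports">Sports</a></body></html>',
}

# Hypothetical access log: (user, requested page) pairs.
ACCESS_LOG = [
    ("u1", "/home"), ("u1", "/sports"),
    ("u2", "/home"), ("u2", "/shop"),
    ("u3", "/sports"),
]


class PageParser(HTMLParser):
    """Collects the visible text and the hyperlink targets of one page."""

    def __init__(self):
        super().__init__()
        self.links = []
        self.text = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)

    def handle_data(self, data):
        self.text.append(data)


def parse_page(html):
    """Return (text, links) for one HTML document."""
    parser = PageParser()
    parser.feed(html)
    parser.close()  # flush any buffered text
    return " ".join(parser.text), parser.links


def content_terms(pages):
    """Web content mining (sketch): word frequencies over page text."""
    counts = Counter()
    for html in pages.values():
        text, _links = parse_page(html)
        counts.update(re.findall(r"[a-z]+", text.lower()))
    return counts


def in_degree(pages):
    """Web structure mining (sketch): number of distinct pages linking to each page."""
    counts = Counter()
    for url, html in pages.items():
        _text, links = parse_page(html)
        for target in set(links):
            if target != url:  # ignore self-links
                counts[target] += 1
    return counts


def page_visits(log):
    """Web usage mining (sketch): number of requests per page."""
    return Counter(page for _user, page in log)


if __name__ == "__main__":
    print("Most common terms:", content_terms(PAGES).most_common(3))
    print("Incoming links:", dict(in_degree(PAGES)))
    print("Visits per page:", dict(page_visits(ACCESS_LOG)))
```

Each function looks at a different analysis target: the text of the pages, the hyperlinks between them, and the record of how users accessed them.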

  • The Web poses great challenges for resource and knowledge discovery, based on the following observations:

  • The web is too huge − The Web is very large and still growing rapidly, which makes it seem too large for data warehousing and data mining.
  • The complexity of Web pages − Web pages have no unifying structure and are far more complex than traditional text documents. The Web can be viewed as a huge digital library holding an enormous number of documents, but these documents are not arranged in any particular sorted order.
  • The web is a dynamic information source − Information on the Web is updated rapidly. News, stock market data, weather, sports, and shopping information, for example, are updated regularly.
  • Diversity of user communities − The user community on the Web is expanding rapidly, and its users have different backgrounds, interests, and usage purposes. More than 100 million workstations are connected to the Internet, and the number is still increasing rapidly.
  • Relevancy of Information − A particular person is generally interested in only a small portion of the Web. The rest contains information that is irrelevant to that user and may swamp the desired results.
                                                OR,

Mining World Wide Web (WWW)

  • The term Web Mining was coined by Oren Etzioni (1996) to denote the use of data mining techniques to automatically discover Web documents and services, extract information from Web resources, and uncover general patterns on the Web.
  • The World Wide Web is a rich, enormous knowledge base that can be useful for many applications. The WWW is a huge, widely distributed, global information service center for news, advertisements, consumer information, financial management, education, government, and e-commerce. It also carries hyperlink information and access and usage information.
  • The Web’s large size and its unstructured and dynamic content, as well as its multilingual nature, make extracting useful knowledge from it a challenging research problem.

