Posts

8)71st Republic Day 2020 highlights| Beating retreat ceremony with Attari-Wagah border on Republic Day

India Republic Day -- India celebrates its 71st Republic Day nowadays. On this day in 1950the Constitution of Of india came into force. The Republic Day paradewhich is considered the main attraction of the days celebrationwas held along Rajpath. It was a 90-minute celebration. Brazilian President Jair Bolsonaro was the chief guest within the parade. Before the parade beganPrime Minister Narendra Modi paid tribute at the Country wide War Memorial and Us president Ram Nath Kovind unfurled the national flag as well as General Manoj Mukund NaravaneChief of the Army TeamAdmiral Karambir SinghChief of the Naval StaffMarshal Rakesh Kumar Singh BhadauriaChief of the Air Team. 5 41 PM WIRD PM Narendra Modi arrives at Rashtrapati Bhawan for At home reception hosted simply by President Ram Nath Kovind. 5 12 pm WIRD Beating retreat ceremony at Attari-Wagah border on Republic Day. 4 36 pm hours IST Air India Distributes 30000 National Red flags To Passengers On Republic Day The national t

Data mining

Image
Data mining is a process of discovering patterns in large data sets involving methods at the intersection of machine learning, statistics, and database systems. Data mining is an interdisciplinary subfield of computer science and statistics with an overall goal to extract information (with intelligent methods) from a data set and transform the information into a comprehensible structure for further use. Data mining is the analysis step of the "knowledge discovery in databases" process, or KDD. Aside from the raw analysis step, it also involves database and data management aspects, data pre-processing, model and inference considerations, interestingness metrics, complexity considerations, post-processing of discovered structures, visualization, and online updating. The term "data mining" is a misnomer, because the goal is the extraction of patterns and knowledge from large amounts of data, not the extraction ( mining ) of data itself. It also is a buzzword and is fr

Etymology

Image
In the 1960s, statisticians and economists used terms like data fishing or data dredging to refer to what they considered the bad practice of analyzing data without an a-priori hypothesis. The term "data mining" was used in a similarly critical way by economist Michael Lovell in an article published in the Review of Economic Studies in 1983. Lovell indicates that the practice "masquerades under a variety of aliases, ranging from "experimentation" (positive) to "fishing" or "snooping" (negative). The term data mining appeared around 1990 in the database community, generally with positive connotations. For a short time in 1980s, a phrase "database mining"™, was used, but since it was trademarked by HNC, a San Diego-based company, to pitch their Database Mining Workstation; researchers consequently turned to data mining . Other terms used include data archaeology , information harvesting , information discovery , knowledge extract

Background

Image
The manual extraction of patterns from data has occurred for centuries. Early methods of identifying patterns in data include Bayes' theorem (1700s) and regression analysis (1800s). The proliferation, ubiquity and increasing power of computer technology have dramatically increased data collection, storage, and manipulation ability. As data sets have grown in size and complexity, direct "hands-on" data analysis has increasingly been augmented with indirect, automated data processing, aided by other discoveries in computer science, specially in the field of machine learning, such as neural networks, cluster analysis, genetic algorithms (1950s), decision trees and decision rules (1960s), and support vector machines (1990s). Data mining is the process of applying these methods with the intention of uncovering hidden patterns. in large data sets. It bridges the gap from applied statistics and artificial intelligence (which usually provide the mathematical background) to databa

Process

Image
The knowledge discovery in databases (KDD) process is commonly defined with the stages: Selection Pre-processing Transformation Data mining Interpretation/evaluation. It exists, however, in many variations on this theme, such as the Cross-industry standard process for data mining (CRISP-DM) which defines six phases: Business understanding Data understanding Data preparation Modeling Evaluation Deployment or a simplified process such as (1) Pre-processing, (2) Data Mining, and (3) Results Validation. Polls conducted in 2002, 2004, 2007 and 2014 show that the CRISP-DM methodology is the leading methodology used by data miners. The only other data mining standard named in these polls was SEMMA. However, 3–4 times as many people reported using CRISP-DM. Several teams of researchers have published reviews of data mining process models, and Azevedo and Santos conducted a comparison of CRISP-DM and SEMMA in 2008. Pre-processing edit Before data mining algorithms can be used, a targe

Research

The premier professional body in the field is the Association for Computing Machinery's (ACM) Special Interest Group (SIG) on Knowledge Discovery and Data Mining (SIGKDD). Since 1989, this ACM SIG has hosted an annual international conference and published its proceedings, and since 1999 it has published a biannual academic journal titled "SIGKDD Explorations". Computer science conferences on data mining include: CIKM Conference – ACM Conference on Information and Knowledge Management European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases KDD Conference – ACM SIGKDD Conference on Knowledge Discovery and Data Mining Data mining topics are also present on many data management/database conferences such as the ICDE Conference, SIGMOD Conference and International Conference on Very Large Data Bases

Standards

There have been some efforts to define standards for the data mining process, for example, the 1999 European Cross Industry Standard Process for Data Mining (CRISP-DM 1.0) and the 2004 Java Data Mining standard (JDM 1.0). Development on successors to these processes (CRISP-DM 2.0 and JDM 2.0) was active in 2006 but has stalled since. JDM 2.0 was withdrawn without reaching a final draft. For exchanging the extracted models—in particular for use in predictive analytics—the key standard is the Predictive Model Markup Language (PMML), which is an XML-based language developed by the Data Mining Group (DMG) and supported as exchange format by many data mining applications. As the name suggests, it only covers prediction models, a particular data mining task of high importance to business applications. However, extensions to cover (for example) subspace clustering have been proposed independently of the DMG.