
What is Data Mining, and how does it work?

Data mining is a term we have all heard and tried to understand, but never quite found the right explanation for, right? Read along to get to grips with data mining; you may be a little scared once you know what it is. By the end of this blog you'll know:

·        Definition

·        How Google and Facebook fetch your data

·        What are Cookies

·        What are Tracking Cookies and DeepFace

·        4 steps of Data mining

·        Data Mining Techniques

·        Software and Tools

·        Pros and Cons

Data is “raw facts and figures”, while mining is “extraction”, as in gold mining. So we can say that data mining is the process of finding anomalies, patterns and correlations within large data sets to predict outcomes, using a broad range of techniques.

Let's take a look at an example to understand it better (note: you can try it as well). First, open Facebook and scroll through without opening anything. Now open something; let's say you open an e-store and browse through mobile phones. Don't tap on any phone; just close Facebook and type the company name of the phone you just saw into Google. You'll be surprised to see the name of the phone you were scrolling past appear, as illustrated in the video below.



How does that happen?

Google uses IP addresses and cookies to fetch your data. Wondering what an IP address is and what cookies are?

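IP address: The Internet Protocol (IP) is the protocol your device uses to communicate over the internet, and an IP address is the number that identifies your connection. It usually looks like 192.168.10.0. The address does not store anything itself, but websites can look it up to estimate your: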

·        City

·         Zipcode/area code

·         And your ISP (Internet Service Provider) name

It does not reveal your exact location, but it does give an approximate one. Want to know your IP address:
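To see what kind of address the example above is, here is a minimal Python sketch using the standard `ipaddress` module. The `describe` helper is just an illustrative name of mine. It shows that 192.168.10.0 belongs to a private range used inside home and office networks, while websites see the public address your provider gives you (8.8.8.8 is used here only as a well-known public address).

```python
import ipaddress

def describe(address: str) -> str:
    """Say whether an address is private (local) or public."""
    ip = ipaddress.ip_address(address)
    if ip.is_private:
        kind = "private (local network only)"
    else:
        kind = "public (visible to websites)"
    return f"{ip} is an IPv{ip.version} address and is {kind}"

print(describe("192.168.10.0"))  # the example address from this blog
print(describe("8.8.8.8"))       # a well-known public address
```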



Cookies: Cookies are the thing you need to be most aware of. A cookie is a small file that a website stores in your browser to keep data about you, such as which site you visited and whether you are logged in. For example, when you log in to a site, the next thing you'll see is a panel saying “Save Password”, somewhat like this:


Your username and password are saved in your browser too, by its built-in password manager, while the cookie usually holds an identifier that tells the website you are already logged in. This way you don't have to type your details on every visit.
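Here is a rough sketch of what a cookie looks like when it travels between a website and your browser, using Python's standard `http.cookies` module. The cookie name `session_id` and the value `abc123` are made up for illustration; real sites choose their own names and normally use a long random identifier.

```python
from http.cookies import SimpleCookie

# Website side: build the header that asks the browser to store a cookie.
outgoing = SimpleCookie()
outgoing["session_id"] = "abc123"          # made-up identifier
outgoing["session_id"]["path"] = "/"       # valid for the whole site
outgoing["session_id"]["httponly"] = True  # hidden from page scripts
header = outgoing.output(header="Set-Cookie:")
print(header)

# Next visit: the browser sends the cookie back and the site reads it.
incoming = SimpleCookie()
incoming.load("session_id=abc123")
print(incoming["session_id"].value)
```

The website only needs that identifier to recognize your browser on the next visit.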

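Now we have Facebook's tracking cookies, and you may be wondering what the difference is between normal cookies and tracking cookies. A normal cookie is only read by the website that set it. A tracking cookie is set by a service that is embedded in many different websites, so that service can recognize your browser on each of them and build up a picture of what you looked at. Cookies themselves cannot see your mouse; recording where your mouse hovers is done by scripts running on the page, which can then tie that activity to your cookie.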
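The idea of one cookie following you across websites can be shown with a toy simulation. This is only a sketch under my own assumptions: the `Tracker` class, the site names `shop.example` and `news.example`, and the pages are all hypothetical, and no real tracker is this simple.

```python
import uuid
from collections import defaultdict

class Tracker:
    """Hypothetical tracking service embedded on many websites."""

    def __init__(self):
        self.history = defaultdict(list)  # cookie ID -> pages seen

    def on_page_load(self, cookie_id, site, page):
        # First visit: the browser has no cookie yet, so issue a new ID.
        if cookie_id is None:
            cookie_id = uuid.uuid4().hex
        self.history[cookie_id].append((site, page))
        return cookie_id  # the browser stores this and sends it next time

tracker = Tracker()
cookie = None  # a fresh browser with no cookie
cookie = tracker.on_page_load(cookie, "shop.example", "/phones")
cookie = tracker.on_page_load(cookie, "news.example", "/tech")

# Both visits end up under the same ID, even though the sites differ.
print(tracker.history[cookie])
```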

We have always seen notifications on Facebook, but recently we have started receiving notifications somewhat like this:

One of your friends has uploaded a photo of you. How does Facebook know?

For this purpose, Facebook uses DeepFace (accuracy: 97.35%), which can recognize you even if your face is turned a little to one side or the picture is upside down, like this:


Picture (a) is the original, with the face turned slightly to the right, but after passing it through DeepFace you can see in picture (g) that it has become a front view of his face.

Next, we have the 4 steps of data mining:

1.     Data Gathering

2.     Data Preparation

3.     Mining the data

4.     Data Analysis and interpretation

To understand it better, let's look at the picture below:


The Data Source in the image is the first stage, where the data is gathered. The data is then sent to ETL (Extract, Transform, Load), a phase that also includes removing errors and quite a lot more. Next, the data goes to the data warehouse. Inside the warehouse there are smaller data marts, which hold the most-used data for a particular purpose, much like cache memory in a computer keeps recently used data close at hand. Finally, this data is fetched through an OLAP (Online Analytical Processing) server and used for data mining, reporting tools and analysis tools.
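The four steps can be sketched in a few lines of Python. This is a toy illustration, not how a real warehouse works: the records in `raw_rows` are invented, and the function names `prepare`, `mine` and `interpret` are my own labels for steps 2 to 4.

```python
from collections import Counter

# 1. Data gathering: raw records as they might arrive (made-up data).
raw_rows = [
    {"user": "ali", "item": "Phone "},
    {"user": "sara", "item": "phone"},
    {"user": "ali", "item": None},  # a broken record
    {"user": "omar", "item": "Laptop"},
    {"user": "sara", "item": "laptop"},
    {"user": "ali", "item": "phone"},
]

def prepare(rows):
    """2. Data preparation: drop broken rows and tidy up the text."""
    return [
        {"user": r["user"], "item": r["item"].strip().lower()}
        for r in rows
        if r["item"]
    ]

def mine(rows):
    """3. Mining the data: count how often each item was viewed."""
    return Counter(r["item"] for r in rows)

def interpret(counts):
    """4. Analysis and interpretation: report the most viewed item."""
    item, views = counts.most_common(1)[0]
    return f"Most viewed item: {item} ({views} views)"

clean_rows = prepare(raw_rows)
counts = mine(clean_rows)
print(interpret(counts))
```

Notice that the preparation step matters: without it, "Phone " and "phone" would be counted as two different items.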

After the 4 steps of data mining, we have the different data mining techniques. Don't worry! We won't go into detail here (just the names; a separate blog on them will follow). These are:

·         Classification

·         Clustering

·         Regression

·         Neural networks

·         Association

·         Sequence
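Just to give a taste of one name from the list, here is a tiny sketch of association: finding items that tend to show up together. The shopping baskets are invented, and `support` and `confidence` are written by hand here rather than taken from any data mining library.

```python
# Made-up shopping baskets, one set of items per customer.
baskets = [
    {"phone", "case"},
    {"phone", "case", "charger"},
    {"phone", "charger"},
    {"laptop", "mouse"},
    {"phone", "case"},
]

def support(items):
    """Share of all baskets that contain every item in `items`."""
    return sum(items <= basket for basket in baskets) / len(baskets)

def confidence(if_items, then_items):
    """Of the baskets with `if_items`, the share that also have `then_items`."""
    return support(if_items | then_items) / support(if_items)

print(support({"phone", "case"}))       # how common is the pair?
print(confidence({"phone"}, {"case"}))  # do phone buyers also buy a case?
```

A high confidence for "phone, then case" is the kind of pattern a shop could use to suggest a case to someone looking at phones.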

Now, there is some software we use to do data mining (of course, we need something to mine data with; we can't do it in thin air). These tools are:

1.     Alteryx

2.     Amazon Web Services

3.     Databricks

4.     DataRobot

5.     and many others

Last but not least are its pros and cons:

Pros:

1.     Business Purpose

2.     Better Customer Service

Cons:

1.     Security

2.     Information Misuse

Business Purpose: Companies use data mining to learn their customers' preferences, so that they can work on them and show users relevant products or even ads, as Google AdSense does. Open any website in your browser and it will show you ads that are relevant to you and appropriate for your country, like here:


Google is showing an ad for the PSL because of my location.

Better Customer Service: When customers face a problem, such as struggling to find a product on a website, the company can spot it in the data and do what it can to solve it. For example, they added a “Recommended for you” button:


Now, on to the cons.

Security:

This is a YouTube channel's Analytics page. You can see it shows when your viewers are on YouTube, other channels your audience watches, and other videos your audience watched. How do they know that? Sometimes we don't want others to know what we watched or which YouTube channels we follow. So the first con is a security risk.

Information Misuse: Two years ago, the data of 115 million people in Pakistan was put up for sale on the dark web.


Now that we have gone through the basics of data mining, let's look at its history.

History: Data mining emerged in the late 1980s and early 1990s as a way to analyze the vast amounts of data that companies all over the world were gathering and producing. The term “data mining” was in use by 1995, when the first international conference on the subject was held in Montreal. The event was sponsored by the AAAI (Association for the Advancement of Artificial Intelligence). A journal called Data Mining and Knowledge Discovery published its first issue in 1997, and the field has kept advancing ever since.

