Diana’s Blog

Digital Marketing and Cyber-Security

Objectives of my posts and target audience.

  1. The objectives of my posts are:
  • To inform vulnerable people of possible undesirable consequences of Digital Marketing
  • To give advice on how to protect yourself from those consequences
  2. My target audience is:
  • Anyone and everyone in today’s day and age

 

Keywords.

I used the Google Keyword Tool to find the Demand for the following keywords:

  • Digital marketing
  • Cyber Security
  • Digital Marketing and Cyber Security
  • Digital Marketing Cyber Security


Figure 1. Print Screen from Google Keyword Tool.

Then I used the allintitle: function to determine the Supply for the above keywords.


Figure 2. Print Screens from Google allintitle: function.

 

The idea for good content is to find keywords that give you high demand and low supply.

 

I could not come up with such keywords, so I kept to my objective: discussing clients’ security on-line.

 

Competitive Analysis

There are many websites informing about Customer Data Protection from an organisation’s point of view.

What could I do differently or better?

I would like to discuss Customer Data Protection from a customer’s point of view.

I decided to break down my topic into 4 posts:

  1. Basic Terms you need to know to understand the risks.
  • Digital Marketing
  • Cyber Security
  • Big Data
  2. Caring before Sharing
  • Understand the Responsibility
  3. Free cheese is only in a mouse-trap
  • Two examples of extreme Cases
  4. Ways to be safe while surfing the web
  • 5 Rules

Basic Terms you need to know to understand the risks.

5 W’s of Digital Marketing.

What: The promotion of a product or service via a form of electronic media.

Who: Companies, Brands, Individuals.

Why: To benefit Marketing efforts – gain more customers.

To build and enrich relationships with customers.

To provide better Customer Support.

Where: Mainly the Internet, but also mobile phones, display ads, etc.

When: Since the 2010s, but it has roots in the 1980s.

Digital Marketing continues to grow and develop.

 

Marketing is all about meeting the needs and wants of customers. Customers do not buy what is being sold; they buy what has value to them. Marketing focuses on consumers’ interests, experiences and satisfaction, to create a personalised experience for each customer.

 

Cyber Security is the protection of Information Systems from theft of or damage to the hardware, the software, and the information on them, as well as from misdirection of the services they provide.

 

Big Data.

There is no single agreed definition of Big Data. I am going to use the definition from www.forbes.com:

“The ability of society to harness information in novel ways to produce useful insights or goods and services of significant value.”

I am sure you can find more information on Big Data and its uses yourselves; for the topic I am going to talk about, it is enough to see that Data and Information are a Valuable Resource, and that breaching Security is a serious offence.

Caring before Sharing.

Why is there Big Data? – Because Data makes money.

Why are there so many promotional ads? Do you really believe that someone is only interested in selling you stuff?

Why are there Cyber Attacks targeting information on computers?

  • Because Information has a Bigger Value than you think.

 

People have been living on Earth for thousands of years, yet statistics tell us that we have created 90% of the world’s data in less than the last 2 years. Does it make sense? – No. It means that earlier information was not captured in any form (capturing was limited).

 

The next question to answer is: Why do we need to store so many details? I will leave it to you to meditate on it.

Free cheese is only in a mouse-trap.

This is a very popular Russian saying. I like it because it’s true.

 

Let’s have a look at the following examples of people having too much trust.

 

Case 1. Munich shooting that happened on 22nd of July 2016.

Police said Sonboly appeared to have hacked a Facebook account and sent a message urging people to come to the McDonald’s in the Olympia shopping centre if they wanted free food.

https://www.theguardian.com/world/live/2016/jul/22/munich-shooting-police-evacuate-shopping-centre-live

 

 

Case 2. Conning Pensioners from 2005 to 2009.

The defendants identified victims, or “leads,” by purchasing from list brokers the names and contact information of U.S. residents who subscribed to sweepstakes lotteries.

http://libertyfight.com/2016/israelis-bilk-steal-from-elderly.html

From my personal observation I can tell that the more scams and scammers get detected, the more inventive they become.

 

It still does not tell us how to avoid them but maybe we can learn from them.

Ways to be safe while surfing the web.

Rule Nr 1: It is not possible to 100% protect yourself from being scammed on the web.

Rule Nr 2: Get to know more about Cyber Security and take steps to prevent information leaks from your computer.

Rule Nr 3: Share your personal information only when it is really needed and you are confident about the other party.

Rule Nr 4: Don’t subscribe to organisations if you do not have any reason to.

Rule Nr 5: Don’t expect too many favours: “free cheese is only in a mouse-trap”.

 

Is Data New Oil?

Data is New Oil.

 

It is a very vague but widespread phrase, and it is usually followed by:

Big Data=Big Oil=Big Profit.

Simple as that. But is it really?

Let us look at some differences between Data and Oil:

Information is the ultimate renewable resource.  Any kind of data reserve that exists has not been lying in wait beneath the surface; data are being created, in vast quantities, every day.

Finding value from data is much more a process of handling than it is one of extraction.

 

I would love to use this buzz phrase (data as oil) to demonstrate the risks and their consequences.

We have already seen “data spills” happen (when large amounts of personal data are inadvertently leaked). Will it be much longer until we see dangerous data drilling practices? Or until we start to see long term effects from “data pollution”?

One of the places where we have to tread most carefully — another place where our data/oil model can be useful — is in the realm of personal data. A great deal of the profit that is being made right now in the data world is being made through the use of human-generated information. Our browsing habits, our conversations with friends, our movements and location — all of these things are being monetized. This is deeply human data, though very often it is not treated as such.

 

I reckon, for safety, we all need to consider the following:

First, people need to understand and experience data ownership.

Second, we need to have a more open conversation about data and ethics.

Finally, we need to change the way that we collectively think about data, so that it is not a new oil, but instead a new kind of resource entirely.

 

What are your thoughts on “Data is New Oil”?

3 V’s Vice Versa

There are three key concepts that can help you understand Big Data: volume, velocity, and variety.

Volume: The sheer volume of the data is enormous, and a very large contributor to the ever-expanding digital universe is the Internet of Things, with sensors all over the world in all kinds of devices creating data every second.

Velocity: The speed at which the data is created, stored, analysed and visualised. In the Big Data era, data is created in real-time or near real-time. The challenge for organisations is to cope with the enormous speed at which the data is created and used in real-time.

Variety: Data today comes in many different formats: structured data, semi-structured data, unstructured data and even complex structured data. The wide variety of data requires a different approach as well as different techniques to store all raw data.

These are the 3 essential aspects Big Data is all about.

There are further important V’s applicable to Big Data:

Veracity: Having a lot of data in different volumes coming in at high speed is worthless if that data is incorrect. Incorrect data can cause a lot of problems for organisations as well as for consumers.

Variability: Big Data is extremely variable. Variability means that the meaning of the data is changing rapidly.

Visualisation: With the right analyses and visualisations, raw data can be put to use; otherwise raw data remains essentially useless.

Value: Data in itself is not valuable at all. The value is in the analyses done on that data and in how the data is turned into information and eventually into knowledge.

 

While I was trying to think of other V’s that could describe Big Data, I decided to think about the V’s Big Data should not be.

I came up with three V’s: Vague, Void, and Vulnerable.

Vague: Big Data must not be vague, it should be certain and specific.

Void: Data for Analysis must not be void, Big Data must have clear meaning.

Vulnerable: Data must be safe and protected.


Do you agree with my V’s?

Can you think of any other V’s Big Data should not be associated with?

 

Is Big Data a good thing?

What is Big Data?

Big Data is a collection of data from traditional and digital sources inside and outside the company that represents a source for ongoing discovery and analysis.

By collecting and parsing vast amounts of consumer information from disparate channels, Big Data presents organisations with major profit possibilities.

BIG DATA = BIG OPPORTUNITIES

Interesting fact about Data.

The number of Bits of Information Stored in the Digital Universe is thought to have exceeded the number of Stars in the Physical Universe in 2007.

Pros and cons of Big Data.

Pros:

  • There are almost unlimited storage possibilities for huge data volumes.
  • Big Data are now accessible from any place and via various devices as they are normally stored in Clouds.
  • The speed of Big Data transmission and processing is very high owing to cutting-edge technologies.
  • Modern analytical methods, technologies and tools allow analysts to gain very deep insights into Big Data, which was impossible in the past with limited data volumes and weaker processing tools.

Cons:

  • Big Data often have big noise, i.e. there may be many meaningless data points. The analyst has to work hard to separate the wheat from the chaff.
  • Big Data often implies privacy problems, which can be seen, for instance, from the analysis of social networks. Big Data can also mean a lower security level, as Clouds are not always as secure as on-site data warehouses.

 

How does Big Data affect you?

What do you think of when you think of “Big Data”? Perhaps you are thinking of receiving some kind of personalised advertisement from a retailer. Please spare a couple of minutes to watch the trailer from the movie “They Live” (1988) by John Carpenter:

But Big Data is so much deeper and broader than that. On top of helping companies achieve their strategies, Big Data can be used to improve our lives, for example in health and security.

Everyone needs to fully understand Big Data:

  • what it is to them,
  • what it does for them,
  • what it means to them,
  • how to use it beneficially.

Statistical Analysis.

Q.1. Lift Analysis. Chips & Burgers.

            Chips   ^Chips   Total
Burgers       600      400    1000
^Burgers      200      200     400
Total         800      600    1400

Lift(Burgers, Chips) = (600/1400)/ ((800/1400)*(1000/1400)) = 1.05

Lift(Burgers, Chips)>1 → Positive Correlation.

Lift(Burgers, ^Chips) = (400/1400)/ ((600/1400)*(1000/1400)) = 0.93

Lift(Burgers, ^Chips) <1 → Negative Correlation.

Lift(^Burgers, Chips) = (200/1400)/ ((800/1400)*(400/1400))= 0.875

Lift(^Burgers, Chips) <1→ Negative Correlation.

Lift(^Burgers, ^Chips) = (200/1400)/ ((600/1400)*(400/1400)) = 1.17

Lift(^Burgers, ^Chips) >1 → Positive Correlation.
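The four lift values can be cross-checked with a few lines of code. This is only a minimal sketch in Python (not the tool used for the exercise): the function name `lift` and the dictionary layout are my own choices, and the counts are read from the Q.1 table with Chips as the column item, as in the formulas above.

```python
from fractions import Fraction

def lift(both, row_total, col_total, grand_total):
    """Lift = P(A and B) / (P(A) * P(B)), computed from raw counts."""
    p_both = Fraction(both, grand_total)
    p_a = Fraction(row_total, grand_total)
    p_b = Fraction(col_total, grand_total)
    return p_both / (p_a * p_b)

# Counts from the Q.1 table: rows are Burgers / ^Burgers, columns are Chips / ^Chips.
counts = {
    ("Burgers", "Chips"): 600,
    ("Burgers", "^Chips"): 400,
    ("^Burgers", "Chips"): 200,
    ("^Burgers", "^Chips"): 200,
}
row_totals = {"Burgers": 1000, "^Burgers": 400}
col_totals = {"Chips": 800, "^Chips": 600}
grand_total = 1400

# One lift value per cell of the table.
lifts = {
    (row, col): lift(n, row_totals[row], col_totals[col], grand_total)
    for (row, col), n in counts.items()
}

for (row, col), value in lifts.items():
    print(f"Lift({row}, {col}) = {float(value):.3f}")
```

Exact fractions are used so that rounding does not blur whether a value is above or below 1.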

 

Q.2. Lift Analysis. Ketchup & Shampoo.

            Shampoo   ^Shampoo   Total
Ketchup         100        200     300
^Ketchup        200        400     600
Total           300        600     900

Lift(Ketchup, Shampoo) = (100/900)/ ((300/900)*(300/900)) = 1 → Independent (no correlation).

Lift(Ketchup, ^Shampoo) = (200/900)/ ((600/900)*(300/900)) = 1 → Independent (no correlation).

Lift(^Ketchup, Shampoo) = (200/900)/ ((300/900)*(600/900)) = 1 → Independent (no correlation).

Lift(^Ketchup, ^Shampoo) = (400/900)/ ((600/900)*(600/900)) = 1 → Independent (no correlation).

Q.3. Chi Squared Analysis. Burgers & Chips.

Observed values, with expected values in brackets:

            Chips         ^Chips        Total
Burgers     900 (800)     100 (200)     1000
^Burgers    300 (400)     200 (100)      500
Total       1200          300           1500

χ^2 = ((900-800)^2)/800 + ((100-200)^2)/200 + ((300-400)^2)/400 + ((200-100)^2)/100 = 187.5

χ^2 = 187.5 is far above 3.84 (the critical value for 1 degree of freedom at the 5% level) → There is correlation.

Burgers and Chips: 900 sold, 800 expected → positive correlation.

Burgers and Not Chips: 100 sold, 200 expected → negative correlation.

Chips and Not Burgers: 300 sold, 400 expected → negative correlation.

Not Chips and Not Burgers: 200 sold, 100 expected→ positive correlation.
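The expected values and the χ^2 statistic can be reproduced from the observed counts alone. The sketch below is in Python and is only an illustration: the function name `chi_squared` is my own, and each expected value is computed as (row total × column total) / grand total.

```python
def chi_squared(observed):
    """Pearson chi-squared statistic for a contingency table (list of rows).

    Returns the statistic together with the table of expected counts.
    """
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    grand_total = sum(row_totals)

    statistic = 0.0
    expected = []
    for i, row in enumerate(observed):
        expected_row = []
        for j, obs in enumerate(row):
            # Expected count under independence of rows and columns.
            exp = row_totals[i] * col_totals[j] / grand_total
            expected_row.append(exp)
            statistic += (obs - exp) ** 2 / exp
        expected.append(expected_row)
    return statistic, expected

# Observed counts from Q.3: rows Burgers / ^Burgers, columns Chips / ^Chips.
observed_q3 = [[900, 100], [300, 200]]
chi2_q3, expected_q3 = chi_squared(observed_q3)

print("expected:", expected_q3)
print("chi-squared:", chi2_q3)
```

The same function can be called on any other 2x2 table of observed counts.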

 

Q.4. Chi Squared Analysis. Burgers & Sausages.

Observed values, with expected values in brackets:

            Sausages      ^Sausages     Total
Burgers     800 (800)     200 (200)     1000
^Burgers    400 (400)     100 (100)      500
Total       1200          300           1500

 

χ^2 = ((800-800)^2)/800 + ((200-200)^2)/200 + ((400-400)^2)/400 + ((100-100)^2)/100 = 0

χ^2 = 0 → There is no correlation.

In each cell the observed value equals the expected value.

 

Q.5. When are Lift and Chi Squared poor measures?

When the data contains a very large number of null transactions (transactions that contain neither of the two items), there is going to be a huge swing in the Lift and Chi Squared values, which is a great limitation. Null-invariant measures were introduced for this purpose, for example the Jaccard Coefficient and the Kulczynski measure.
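To illustrate the point, here is a small Python sketch comparing Lift with the two null-invariant measures. It reuses the Burgers and Chips counts from Q.1; the second data set, with 100,200 transactions containing neither item, is a made-up assumption used only to show the effect. The function name `measures` is my own.

```python
def measures(both, only_a, only_b, neither):
    """Lift, Jaccard and Kulczynski for two items A and B, from raw counts."""
    total = both + only_a + only_b + neither
    sup_a = both + only_a  # transactions containing A
    sup_b = both + only_b  # transactions containing B

    lift = (both / total) / ((sup_a / total) * (sup_b / total))
    # Jaccard: transactions with both items / transactions with at least one.
    jaccard = both / (sup_a + sup_b - both)
    # Kulczynski: average of the two conditional probabilities P(B|A) and P(A|B).
    kulczynski = 0.5 * (both / sup_a + both / sup_b)
    return {"lift": lift, "jaccard": jaccard, "kulczynski": kulczynski}

# Q.1 counts: 600 both, 400 Burgers only, 200 Chips only, 200 neither.
small = measures(600, 400, 200, neither=200)
# Hypothetical: same purchases, but many more transactions with neither item.
large = measures(600, 400, 200, neither=100_200)

print("few null transactions: ", small)
print("many null transactions:", large)
```

Only the number of transactions containing neither item changes between the two calls, so any difference in the output comes from the null transactions alone: Lift moves a great deal, while Jaccard and Kulczynski do not depend on that count at all.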

Try R – Troubling Doubling at School.

Riddle: 

The number of girls who do wear a watch
is double the number who don’t.
But the number of boys who do not wear a watch
is double the number who do.

If I tell you the number of girls in my class
is double the number of boys,
Can you tell me the number I teach? Here’s a clue:
More than 20; below 32!

*Solution to the riddle:

The number of boys must be a multiple of 3, so that it may be split in the ratio of (2:1).
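As an independent cross-check of this reasoning, the riddle can also be brute-forced. The sketch below is written in Python rather than R and is not the code used for the plots in this post; the function name `solve_riddle` and its parameters are my own.

```python
def solve_riddle(low=20, high=32):
    """Try every possible class and keep those that satisfy the riddle."""
    solutions = []
    for boys_with in range(1, high):
        boys_without = 2 * boys_with      # boys without a watch are double those with
        boys = boys_with + boys_without
        girls = 2 * boys                  # girls are double the boys
        if girls % 3 != 0:
            continue                      # girls must also split in the ratio 2:1
        girls_without = girls // 3
        girls_with = 2 * girls_without    # girls with a watch are double those without
        total = boys + girls
        if low < total < high:            # more than 20, below 32
            solutions.append({
                "boys_with": boys_with,
                "boys_without": boys_without,
                "girls_with": girls_with,
                "girls_without": girls_without,
                "total": total,
            })
    return solutions

solutions = solve_riddle()
for s in solutions:
    print(s)
```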

  1. In RStudio I defined variables for
    • the number of boys who wear a watch
    • the number of boys who do not
    • the number of girls wearing a watch
    • the number of girls who do not.
  2. Then I grouped the boys and the girls using the following commands:
    • Boys=c(boyswwatch, boysnowatch)
    • Girls=c(girlswatch, girlsnowatch)
  3. Then I grouped boys and girls as “students”.
  4. In order to get 2 graphs on one picture, I used:
    • par(mfrow=c(1,2))
  • Pieces of code were executed to get the results on the image. You can see the code below the picture.
  • The resulting image was saved as a .jpeg file.

  • On the R image, the breakdown of the number of boys and girls is represented graphically.

    I had a look at some libraries for plotting in RStudio, e.g. ggplot2. It took me a while to figure out what code I needed to use, as I had never worked with a statistics package before. The R programming language is designed for statistical computing and graphics, and is very different from Maple, MATLAB and LabVIEW.

    I have decided to use the general commands for this example.

    Regarding my Riddle, there are a few things that could be visualised:

    • ratio of the total number of girls to the total number of boys
    • probability of a child wearing a watch
    • probability that a child wearing a watch is a boy
    • total number of watches in the class.
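The quantities in this list are simple to compute once the four counts are known. The Python sketch below is only an illustration (it is not the R code behind the plots); it assumes the counts 3 and 6 for boys and 12 and 6 for girls, which satisfy every condition of the riddle, and the variable names are my own.

```python
from fractions import Fraction

# Assumed counts: they meet all the riddle's conditions
# (12 = 2 * 6, 6 = 2 * 3, 18 girls = 2 * 9 boys, 20 < 27 < 32).
boys_with, boys_without = 3, 6
girls_with, girls_without = 12, 6

boys = boys_with + boys_without
girls = girls_with + girls_without
total = boys + girls
watches = boys_with + girls_with

girls_to_boys = Fraction(girls, boys)             # ratio of girls to boys
p_watch = Fraction(watches, total)                # P(child wears a watch)
p_boy_given_watch = Fraction(boys_with, watches)  # P(boy | child wears a watch)

print("girls : boys =", girls_to_boys)
print("P(watch) =", p_watch)
print("P(boy | watch) =", p_boy_given_watch)
print("watches in class =", watches)
```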

Here is the print-screen from the Code School web-page showing the completed R course. Overall, I liked the R language.


Fusion Tables for Population of Ireland (2011).

This is the link to Google Fusion Tables:

https://www.google.com/fusiontables/DataSource?docid=1J1gS4dPoIgYAvhP5CN0gu-mMF2AblJqYNQ3vedU4

Screenshot (33)

Purpose of the project:
- To visually illustrate the population of Ireland based on 2011 Census data.

Methods:

1. Data for the Irish Population was taken from the Central Statistics Office web-site:
http://www.cso.ie/en/statistics/population/populationofeachprovincecountyandcity2011/

2. The data was transferred into an Excel worksheet. The Excel file was then cleaned to suit the needs of the project.

3. An Ireland Map .kml file was found and opened in the public data tables search.

4. The Excel file with the Population 2011 data was adjusted once more to match the Ireland Map .kml file (*County*).

5. The Excel file was saved with a .txt extension (Tab delimited) because it was the only format that worked well for this project.

6. The .txt file was uploaded to Google Fusion Tables (Separator Character – Tab, Character Encoding – auto-detect). The new fusion table was then merged with the .kml file (see point 3) using its URL.

7. Feature Styles were changed using Custom Buckets (Column = Total Persons). Auto-legend was chosen as well.

8. The resulting heatmap was ready to share/publish.
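Step 5 (saving the data as tab-delimited text) can also be done without Excel. The Python sketch below is only an illustration of the format, not the workflow used in this project: the column names are assumed from the fields shown on the map, and the two rows are placeholder values, not Census figures.

```python
import csv
import io

def to_tab_delimited(header, rows):
    """Return the table as tab-delimited text, one record per line."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, delimiter="\t", lineterminator="\n")
    writer.writerow(header)
    writer.writerows(rows)
    return buffer.getvalue()

# Assumed column names; the County column must match the names in the .kml table.
header = ["County", "Males", "Females", "Total Persons"]
rows = [
    ["Example County A", 100, 110, 210],  # placeholder values, not real data
    ["Example County B", 50, 45, 95],     # placeholder values, not real data
]

tsv_text = to_tab_delimited(header, rows)
print(tsv_text)
```

Writing `tsv_text` to a file with a .txt extension gives the kind of tab-separated file described in step 5.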

Comments: It took some time to figure out which .kml file to use for the visualisation and to decide what file extension would work with the Excel file. But in the end, everything worked OK.

Results: The map reflects population density, and one can easily see which counties are most or least populated. If you mouse-over an area of the map, it will give you info on:
  • Name of County
  • How many Males
  • How many Females
  • Total Persons

Recommendations: Using the given data and Google Fusion Tables’ features, it is possible to demonstrate which counties have more men or more women and which areas of the country are overpopulated, and then to research the reasons why and whether there is a need to balance it out.

Conclusions: Google Fusion Tables is a very handy tool for providing a visual representation of entered data. Depending on the data used, the resulting heatmap may show other statistical data, which is very useful for people working in the social sector.