Having introduced the four Vs of big data, let's look at why big data has become so popular. Why is it that in the past few years everyone has been saying, "We want to learn big data, we want to find out what the data really tells us, we want to use data to improve the country and improve our business"? There are several reasons. The first is the progress of technology. That may sound like a platitude, but it is true, and we will explain it in detail shortly. The second is the improvement of infrastructure. What is infrastructure? Don't worry, we will explain that later too. The third is that the amount of data has grown enormously. We can now collect a great deal of data, and that is what makes it big data. If every dataset were only 100 MB or 200 MB, what big data would there be to analyze? The data would not be large enough. So the growth in data volume is an important part of why big data has taken off. To summarize the three reasons: the progress of technology, the development of infrastructure, and the real increase in the amount of data. The progress of technology itself has three parts.
The first part is the improvement in computing power. The amount and speed of computation our computers can handle today is completely different from that of older machines. Your phone may well be faster than your old computer, and such a small phone can do all sorts of things. Faster computers mean that we are able to analyze such large data. When machines were slower, we might open a file and spend 10 or 20 days analyzing it. By the time the analysis finished it was too late to react; things had already happened. If we were trying to predict a disaster, it would already have struck. So faster computation is a very important part of why big data became popular. The second part is that hardware has become cheaper. If fast computing were something only a few could afford, it would be of little use. Because hardware has become cheaper, every company can upgrade its hardware for big data. That is also a very important part. The third part is cloud storage, which has become very cheap. I think all of you have used free cloud storage. Our school now has Office 365, and its OneDrive provides a very large space. In the past it was not possible to keep this much data on the Internet. Larger cloud storage and larger hardware capacity are also very important factors for big data analysis. Let's look at computing power first.
Computing power doubles roughly every three years. When your processor becomes twice as fast in three years, you can process the same data in half the time. This is actually quite startling. Why was nobody talking about big data ten years ago? Because about three of those three-year periods have passed since then, so computing speed has increased roughly eightfold. With faster computation we can react in time: we can calculate what we need and make the right decision while it still matters.
Hardware is also cheaper. Processors keep getting cheaper, with the price falling by about half each year. If you buy the same thing next year, $1,000 becomes $500, and the year after that it becomes $250. So small and medium-sized enterprises can afford it too. That is why it has become popular: it is not only large companies that can afford it, and we can even bring these ideas into our own lives. So lower prices are very important as well.
Then, of course, there is cloud storage. I think you are all very familiar with these cloud drives. From left to right, they are OneDrive, Dropbox, Google Drive and Box. Each of them costs about $300 per TB. A 1 TB hard drive currently costs about $2,000, while on the cloud it is about $300 per TB. $300 is not much: if you buy a few fewer drinks each month, you can keep 1 TB of content on the Internet. That is really very cheap. In data analysis and data management, however, the cloud storage we use is not these familiar consumer cloud drives. There are mainly three services that we use more often, and they are the common ones in Taiwan.
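As a rough check on the figures quoted above, the sketch below works through the arithmetic in Python. It assumes a doubling period of three years for computing speed and a yearly halving of price, as implied by the eightfold speed-up and the $1,000, $500, $250 sequence in the lecture. The function names `speedup` and `price_after` are illustrative only, and the 20-day job is borrowed from the earlier remark about analyses that used to take 10 or 20 days.

```python
def speedup(years, doubling_period=3):
    """Relative computing speed after `years`, assuming speed doubles
    once every `doubling_period` years (an assumption read off the lecture)."""
    return 2 ** (years / doubling_period)


def price_after(initial_price, years, halving_period=1):
    """Price after `years`, assuming it halves once every `halving_period`
    years (an assumption implied by the $1,000 -> $500 -> $250 example)."""
    return initial_price / 2 ** (years / halving_period)


# Nine years is three doubling periods, hence the "eight times" figure.
nine_year_speedup = speedup(9)

# A job that once took 20 days would take this long on the faster machine.
days_needed = 20 / nine_year_speedup

# Price of the same $1,000 item this year, next year, and the year after.
prices = [price_after(1000, year) for year in range(3)]

print(nine_year_speedup)  # 8.0
print(days_needed)        # 2.5
print(prices)             # [1000.0, 500.0, 250.0]
```

Changing `doubling_period` or `halving_period` shows how sensitive the conclusion is to these assumed rates.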
In fact, they are common abroad too. The first is Amazon Web Services, or AWS. Amazon got into the cloud very early. It is not only an online retailer; its cloud computing business actually started very early. Its cloud storage is called Simple Storage Service, abbreviated S3 because the name has three S's, just as we abbreviate the four Vs as 4V. The second, which you have probably heard of many times, is Microsoft Azure. Azure is also a cloud service: you can buy computing power, storage space and so on. The third, which has been promoted heavily in Taiwan, is Google Cloud Platform, or GCP. If you see GCP on the Internet, that is what it means. Google Cloud Platform works the same way: you can buy computing power, and you can also buy storage space. People usually buy the two together, because if you rent a machine online, that is, a server, you will probably need storage space as well, so they are usually sold as a set. You pay for as much as you use. If your service does not need to run for long, you can get it up and running very quickly without having to buy a server. It is very convenient.
Next is infrastructure. I am not asking you to read this chart vendor by vendor. I want you to get a feel for how many vendors there are in data analysis and data processing, whether their software is free, paid, or open source. You can use this software, and much of it is free: you can obtain it easily, download it, and start analyzing data. So data analysis has become very easy. You do not need to write everything yourself from scratch; you just download what others have written and start using it. When getting started is easy, more people get involved, the technology matures, and naturally the number of users grows. That is why we say infrastructure is a very important part of what made big data popular.
The third reason is connectivity. What is connectivity? I don't know whether you realize how closely Facebook keeps track of your life.
On Facebook, you might think that as long as you do not press Like, it does not know what you like. But whether you are on the website or in the app, all you have to do is linger on some part of the page. Say I see an ad and I would never press Like on it, but I quietly watch it anyway. Facebook actually records how long you stay on each part of the page. In other words, it is tracking your mouse and tracking how long you pause on the screen. All of this is data. Besides the Likes you press, the fan pages you follow, and the videos and photos you upload, the time you linger is also recorded. So this data is very, very scary. You all love Google, and you love to copy and paste from it. Every Google search you make and every result you select can be recorded. So why do we say there is more and more data? Because you are all helping to produce it. On the Internet you can find a very well-known kind of chart called an infographic.
This infographic shows how much data is produced in one minute. Take YouTube as an example: 48 hours of video are uploaded every minute. In one minute, 2 million Google searches are entered and sent out. In one minute, Facebook receives this many shares from all of you. So the time we spend on the Internet gives these companies a very important data source. With the data we provide, they can analyze what to send to whom, which friends to recommend to you, which fan pages you might like, and so on. These days you all prefer Instagram and hardly open Facebook any more; in fact, 3,600 new photos are posted there every minute. That is quite astonishing. What this infographic shows is that, because of network connectivity and because smartphones are everywhere, far more data is produced every minute than you might think. Apart from these, what other data can you think of? You may not go to the hospital often, but because my own research is on medical data analysis, let's consider medical data.
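To get a feel for what the per-minute figures from the infographic add up to, the following sketch scales them to a full day. The three rates are the ones quoted in the lecture; the dictionary keys and the helper name `per_day` are illustrative, and the calculation assumes the rate stays constant around the clock, which is a simplification.

```python
MINUTES_PER_DAY = 24 * 60

# Per-minute figures as quoted from the infographic in the lecture.
per_minute = {
    "hours of video uploaded to YouTube": 48,
    "Google searches": 2_000_000,
    "new Instagram photos": 3_600,
}


def per_day(rate_per_minute):
    """Scale a per-minute rate to one day, assuming a constant rate."""
    return rate_per_minute * MINUTES_PER_DAY


daily = {name: per_day(rate) for name, rate in per_minute.items()}

for name, total in daily.items():
    print(f"{name}: {total:,} per day")
```

Under that constant-rate assumption, 48 hours of video per minute already amounts to 69,120 hours of new video every day.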
Take MRI or CT, for example: the body is imaged slice by slice, producing many images. Each of these images may be anywhere from a few tens of MB to a few hundred MB, or even 1 GB. So this data is actually quite large. Every time you have a health check-up, a lot of data is produced as well. There is far more health data than you might think.
There is also a so-called "trial map" for Taipei City. The underlying data is really just text, and text is hard to make sense of. For example, "Middle East Road" No. 356: who knows where that is? So some kind person plotted it on a map like this. In fact there is a lot of open data. Through all these channels you can learn about the place where you live, for instance whether it floods, and what you need to do about it. Speaking of flooding: in this area, if you live near the double-deck station, you may have to think about flooding. If rainfall exceeds 78.8 mm/h, you may be flooded. This flood map comes from open data. So while there is data you may not be able to get, such as Facebook data, Twitter data or medical data, there are also resources like the flood map and the map of the Sky Museum that you can find on the Internet. You can imagine what kinds of applications you could build with them.
That is the first part: what is big data? Let me review it. There are four Vs in total. The first V is volume: big data means the volume is large enough. The second V is variety: as we saw, there are maps, images, videos, sounds and so on, so the diversity is very high. The third V is velocity. Imagine patients in a hospital producing new data every second, or a service receiving millions of requests every second: big data is generated very fast. The last V is veracity. Veracity is the accuracy of the data. We must distinguish which data is useful and which is not before we draw any further conclusions. Those are the four Vs of big data. Please keep these four Vs in mind. They are very important. [MUSIC]