Big Data Analytics with GIS
Everyone is talking about big data and GIS, but is anyone really doing it? In this course you’ll work with gigabytes of data to solve many different spatial and data-related questions. All the software is free, but don’t let that fool you: we’ll be using the most effective open source products like Postgres and QGIS, and we’ll even perform parallel processing with Manifold Viewer – I hope you have a multi-core computer to see how fast this stuff is!
At the end of the course, you’ll understand:
- the principles behind big data geo-analytics and the role of statistics, databases, parallel processing, and hardware and software in support of big data geo-analytics.
- how to use open source software and Manifold GIS to perform parallel processing and manage spatial data.
- how to conduct a big data geo-analytics project by interrogating multi-gigabyte real-world databases.
And best of all, the software we use in this class is FREE and easy to set up – you’ll do it all yourself!
The course is taught by Dr. Arthur Lembo, a professor at Salisbury University who has worked in the GIS field for over 30 years.
-
3. Getting started with the course
As we get started learning about big data analytics with GIS, this lecture gives you an overview of the class, its goals and objectives, and the data sets we will be using.
-
4. Our Data
In this course, we are going to use a variety of data so that we can see how big data analytics can work in different contexts. Our data will come from municipal government, emergency management, natural resources, and business. Our data will also come in many forms: databases, comma-separated files, ESRI geodatabases, etc.
-
5. The Three V's
Big data is discussed in many different ways. We are going to explore the concepts of Volume, Variety, and Veracity. At the end, we'll see how working with the three V's will enable us to add one more V: Value.
-
6. When more is less: a quick example
While you can use the software and follow along just about anywhere in this course, this lecture is our first official hands-on exercise: joining tens of thousands of soil boundaries with tens of thousands of building centroids.
-
7. Types of data
Just because a computer lets you push a button to do some work doesn't mean you should. Understanding the different types of data is very important if you are going to choose the right tools to analyze big data.
-
8. Classifying data
Sometimes big data is simply too big to comprehend. The difficulty of comprehending really large data sets requires us to classify them in order to show general trends or patterns. Classification also allows us to present our data to a broader audience in a more understandable way.
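To make the idea concrete, here is a minimal Python sketch (not taken from the course materials) that sorts 100 made-up, skewed values into four classes in two common ways: equal intervals and quantiles. The data and the `classify` helper are assumptions chosen purely for illustration.

```python
import bisect
import statistics
from collections import Counter

# Hypothetical skewed data: the squares of 1..100 (most values are small).
values = [i ** 2 for i in range(1, 101)]


def classify(data, breaks):
    """Return the class number (0, 1, 2, ...) of each value, given sorted break points."""
    return [bisect.bisect_right(breaks, v) for v in data]


# Equal interval: split the range of the data into four equally wide bins.
lo, hi = min(values), max(values)
equal_breaks = [lo + k * (hi - lo) / 4 for k in range(1, 4)]

# Quantile: pick break points so each class holds about the same number of values.
quantile_breaks = statistics.quantiles(values, n=4)

equal_counts = Counter(classify(values, equal_breaks))
quantile_counts = Counter(classify(values, quantile_breaks))

print("equal interval:", sorted(equal_counts.items()))
print("quantile:      ", sorted(quantile_counts.items()))
```

On skewed data like this, equal intervals put half of the values in the first class, while quantiles spread them evenly across all four, so the choice of scheme changes the pattern your audience sees.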
-
9. Introduction to databases
One of the best tools for working with big data is a robust enterprise-class database. In this lecture, you'll learn about different databases, both large and small. We'll also focus on Postgres, which is worth knowing because it is both powerful and free.
-
10. Hands-on SQL in Postgres
SQL is the language of databases. If you want to stick with using wizards that hold your hand through processes, you are never going to progress in your career. In this lecture, we'll write dozens of SQL statements to interrogate the attributes of traditional databases.
-
11. Spatial SQL queries
Adding to our SQL knowledge will be the use of SQL within a spatial context. Here you'll learn that Spatial is not Special, and that by knowing how to apply SQL techniques within a spatial context, you'll have unlimited power as a data analyst.
-
12. Understanding indexes
All the SQL in the world is not going to do you any good if it takes too long to analyze data. This lecture goes over the long-lost art of using indexes within databases. Most of us never think of using indexes because the data is so small and computers are so fast. But if you are going to work with big data, you had better get used to working with indexes - they will make your life so much easier.
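As a rough illustration of what an index changes, the sketch below uses Python's built-in `sqlite3` module as a stand-in for Postgres (an assumption made so the example runs without a database server). The `parcels` table, its columns, and the index name are hypothetical. It asks the database for its query plan before and after creating an index on the filtered column.

```python
import sqlite3


def query_plan(conn, sql, params=()):
    """Return SQLite's plan for a query as one readable string."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    return " | ".join(row[3] for row in rows)  # column 3 holds the plan detail


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE parcels (id INTEGER PRIMARY KEY, owner TEXT, value REAL)")

# Hypothetical data: 100,000 parcels spread over 5,000 owners.
conn.executemany(
    "INSERT INTO parcels (owner, value) VALUES (?, ?)",
    ((f"owner_{i % 5000}", float(i)) for i in range(100_000)),
)

sql = "SELECT id, value FROM parcels WHERE owner = ?"

# Without an index, the database must scan every row to find the matches.
plan_before = query_plan(conn, sql, ("owner_42",))

# With an index on owner, it can jump straight to the matching rows.
conn.execute("CREATE INDEX idx_parcels_owner ON parcels (owner)")
plan_after = query_plan(conn, sql, ("owner_42",))

matches = conn.execute(sql, ("owner_42",)).fetchall()

print("before:", plan_before)
print("after: ", plan_after)
print("rows found:", len(matches))
```

The exact wording of the plan varies by SQLite version, but the first plan reports a full scan and the second names the index. Postgres offers the same kind of check through its own `EXPLAIN` command.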
-
13. Descriptive Statistics
-
14. The Central Limit Theorem
The Central Limit Theorem is an extremely important statistical concept to consider when working on big data. And, now that you've learned about indexes, you'll see how we can leverage both of these concepts to provide excellent estimates of data results within seconds.
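Here is a small sketch of why sampling works, under stated assumptions: the "table" is a made-up list of 200,000 strongly skewed values, and a plain random sample stands in for whatever sampling approach the lecture uses. It estimates the mean from 1% of the rows, then repeats the sampling to show that sample means cluster around the true mean with a spread close to the population standard deviation divided by the square root of the sample size.

```python
import math
import random
import statistics

random.seed(42)  # fixed seed so the run is repeatable

# Hypothetical "big" table: 200,000 strongly skewed values (mean near 50).
population = [random.expovariate(1 / 50) for _ in range(200_000)]
pop_mean = statistics.fmean(population)
pop_sd = math.sqrt(statistics.fmean((x - pop_mean) ** 2 for x in population))

# Estimate the mean from a random sample of 2,000 rows (1% of the data).
n = 2_000
sample = random.sample(population, n)
sample_mean = statistics.fmean(sample)
std_error = statistics.stdev(sample) / math.sqrt(n)

# Repeat the sampling 200 times and measure how much the sample means vary.
sample_means = [statistics.fmean(random.sample(population, n)) for _ in range(200)]
spread = statistics.stdev(sample_means)
expected_spread = pop_sd / math.sqrt(n)

print(f"true mean        : {pop_mean:.2f}")
print(f"sample estimate  : {sample_mean:.2f} +/- {1.96 * std_error:.2f}")
print(f"spread of means  : {spread:.2f} (theory predicts about {expected_spread:.2f})")
```

Even though the underlying values are far from bell-shaped, the sample means are tightly grouped, which is what lets a small sample give a trustworthy estimate of a very large data set.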
-
15. Coordinate Systems
-
16. Multi-core processing
-
17. Hardware and Software
-
18. Spatial Indexing
-
19. When more is less: an example from the field
There are many clever ways to solve problems, and when you understand how indexes work (both spatial and attribute), you can tackle really big problems. This lecture shows an example from my own research where we had to do some creative data preparation to improve performance.