Saturday, February 4, 2017

Implementing Sampling Techniques

Introduction

  The first goal of this lab is to learn about different types of sampling, with an emphasis on spatial sampling. The second goal is to work within a group of three people to collect elevation data from a 114 cm by 114 cm sandbox using one of the three sampling techniques explained below.
  Sampling is a method used to gather data and represent an entire area or population without measuring every single observation. Spatial sampling is when all of the points in the sample are located within a certain geographical area. There are three main types of spatial sampling: random, systematic, and stratified.

Random Sampling
  Random sampling does the most to avoid bias, because the points collected are chosen completely at random. This has both positive and negative consequences. One of the main negatives is that a large part of the geographic region could go unsampled, making for a poor sample. However, with a large enough sample the likelihood that large areas will be missed decreases, so this doesn't happen too frequently. Figure 1.0 below shows a spatial random sample. Because the sample is random, no spot is any more likely to be chosen than any other spot. That is why there are some clusters of points and some areas where points are lacking.
Fig 1.0: Random Sample
Image Source: Statistical Consultants Ltd 2010 
Systematic Sampling
  Systematic sampling consists of collecting the sample points in a regular and methodical way. On a grid, sample points could come from the middle of each square, from the intersections, or from some other regular position. Benefits of systematic sampling include that the area will be covered uniformly and that the method is relatively simple to follow. Disadvantages include that the points are predetermined rather than selected randomly, and that a regular spacing can misrepresent a pattern in the area that happens to repeat at a similar interval. An example of a systematic sample is shown below in figure 1.1.
Fig 1.1: Systematic Sample
Image Source: Statistical Consultants Ltd 2010 
Stratified Sampling
  Stratified sampling is used to make sure all parts of the study area or population are represented. Subgroups or subareas are created from the whole grid or area, and then either a systematic or random sampling method is applied within each one. Doing this makes sure that every subgroup or subarea contains a proportional share of the sample. Therefore, each area is represented in proportion to its size. An example of a random stratified sample is shown below in figure 1.2. The grid is broken up into 16 sub-squares, each containing 25 little squares. Because each sub-square is the same size, each contains three sample points. If the sub-squares were not of equal area, then the number of sample points in each would be proportional to its area, making sure that no sub-area gets left out of the sample.
Fig 1.2: Stratified Random Sample
Image Source: Statistical Consultants Ltd 2010 
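  The three schemes described above can be sketched in a few lines of Python. This is only a rough illustration, not the code behind the figures: the function names, the 20 by 20 area, and the fixed random seed are all assumptions made for the example. The stratified case follows figure 1.2, with 16 equal sub-squares and three random points in each.

```python
import random


def random_sample(width, height, n, rng):
    """Pick n points uniformly at random over the whole area."""
    return [(rng.uniform(0, width), rng.uniform(0, height)) for _ in range(n)]


def systematic_sample(width, height, spacing):
    """Pick the center of every cell in a regular grid with the given spacing."""
    cols = int(width // spacing)
    rows = int(height // spacing)
    return [((c + 0.5) * spacing, (r + 0.5) * spacing)
            for r in range(rows) for c in range(cols)]


def stratified_random_sample(width, height, strata_per_side, per_stratum, rng):
    """Split the area into equal rectangular strata, then sample randomly in each."""
    stratum_w = width / strata_per_side
    stratum_h = height / strata_per_side
    points = []
    for r in range(strata_per_side):
        for c in range(strata_per_side):
            for _ in range(per_stratum):
                points.append((c * stratum_w + rng.uniform(0, stratum_w),
                               r * stratum_h + rng.uniform(0, stratum_h)))
    return points


# Hypothetical 20 by 20 area; the seed only makes the example repeatable.
rng = random.Random(0)
random_pts = random_sample(20, 20, 48, rng)
systematic_pts = systematic_sample(20, 20, 5)
stratified_pts = stratified_random_sample(20, 20, 4, 3, rng)
```

  Note that only the stratified version guarantees that every sub-square receives points. The purely random version can leave a sub-square empty, which is the weakness described under Random Sampling.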

Methods

  After reviewing the three sampling techniques, our group chose to use a systematic sampling method. This method was chosen because it would be the most accurate way to determine where the sample points are. It also made the sample points easily identifiable and simpler to enter into an Excel spreadsheet. Our group could have used the random or stratified sampling methods, but chose the systematic method because it gave the most precisely located data points and allowed the most data points to be collected. This would ensure that the data collected represented the whole area well. A photo of the study area is shown below in figure 1.3. In the sandbox there were a few depressions, a valley, a large ridge, a couple of hills, and a flat plain.
Fig 1.3: Sandbox Study Area
  The sandbox was located in Eau Claire, Wisconsin, on Eau Claire University's campus, about 100 yards east of Phillips Hall. The weather conditions during the lab were not ideal: 23°F with snow and a 10 mph SE wind. Materials used in the lab included string, tacks, sand, wood, a meter stick, a ruler, cell phones, and a computer.
  Our group set up the sampling scheme by first creating varying topography in the sandbox. Then, a grid was created using the string and tacks. The tacks were already placed into the wood when our group arrived, evenly spaced 6 cm apart. The string was strung around each of the tacks, making a 19 by 19 square grid. Each square was 6 cm by 6 cm. Figure 1.4 below shows what the sandbox looked like after the grid was created. It's difficult to see from the picture, but both a white string and an orange string were used to make the grid. The knot in the line and the orange piece of string running diagonally served no significant purpose.
Fig 1.4: Sandbox Grid
  One measurement was taken in each grid square. Instead of measuring the elevation in the exact center of the square, the measurement was taken at the midpoint of the square's northern edge. Because the string ran along that edge, this made measurements more accurate and easier to take.
Fig 1.5: Entering Data into Excel
  Each group member had a specific job while the data was being collected. Because of the weather conditions, our group decided that it would be best if the entries were typed directly into the computer. Data written in a notebook would likely have gotten wet and would have been difficult to record. To communicate which square was being recorded, the squares were referenced by rows and columns, with each row assigned a number and each column assigned a letter. One person took the measurements and told the communicator the value. The second person was on the phone with the third person, relaying these values and making sure they were communicated correctly. The third person's job was to enter the data directly into Excel in the geography lab as measurements were taken. Figure 1.5 shows where the data was entered in the lab with my two group members. This photo was taken while our group was still setting everything up. After every row was complete, the values were verified to make sure everyone was in line with each other.
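  The row and column referencing and the check after each row can be sketched as follows. This is a hypothetical illustration rather than what our group actually ran: the label format (a column letter followed by a row number, such as C7) and the function names are assumptions.

```python
import string

GRID_SIZE = 19
COLUMN_LETTERS = string.ascii_uppercase[:GRID_SIZE]  # 'A' through 'S'


def parse_cell(label):
    """Turn a label such as 'C7' into zero-based (row, column) indices."""
    letter, number = label[0].upper(), int(label[1:])
    if letter not in COLUMN_LETTERS or not 1 <= number <= GRID_SIZE:
        raise ValueError(f"cell {label!r} is outside the {GRID_SIZE} by {GRID_SIZE} grid")
    return number - 1, COLUMN_LETTERS.index(letter)


def row_is_complete(row_values):
    """Check made after each row: one reading for every column, none missing."""
    return len(row_values) == GRID_SIZE and all(v is not None for v in row_values)


print(parse_cell("C7"))  # (6, 2)
```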
  Instead of accounting for zero while the measurements were being taken, our group decided that it would be best to enter the raw data and then determine the zero point afterwards. Raw values equaled the distance from the string down to the sand, as this was the best way to record and communicate measurements. All measurements were taken in millimeters. Once the survey was complete, our group decided that the zero point should be the lowest point measured. This makes all values positive and makes analysis easier with less confusion. The adjustment was done by multiplying each raw value by -1 and then adding 128. The value 128 was added because it was the highest raw value in the data set, which means it was actually the lowest point in terms of elevation. Another interesting thing to point out is that the highest elevation point was flush with the grid string, so its raw value was 0. This made it simpler to adjust the raw data into meaningful values, because no additional subtraction step was needed in the process above.
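  The adjustment described above amounts to elevation = 128 - raw value. A minimal sketch in Python is shown below. The function name and the sample readings are made up for illustration; only the 0 mm and 128 mm raw values come from the survey.

```python
def adjust_to_elevation(raw_depths_mm):
    """Convert string-to-sand distances into elevations above the lowest point."""
    deepest = max(raw_depths_mm)  # largest distance below the string = lowest sand
    return [deepest - depth for depth in raw_depths_mm]


# Hypothetical raw readings in mm; 0 is flush with the string, 128 is the deepest.
raw = [0, 58, 61, 128, 95]
elevations = adjust_to_elevation(raw)
print(elevations)  # [128, 70, 67, 0, 33]
```

  Using the maximum of the raw values instead of a hard-coded 128 keeps the sketch usable for a different survey, but for this data set the two are the same.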

Results/Discussion

  One value was recorded for each grid square. With the grid being 19 by 19 squares, a total of 361 measurements were taken. The following statistics are based on the adjusted sample values, which were calculated using the process explained above. These adjusted values are the elevations, so they are the ones that matter for analysis. The range of the data was 128 mm, with the highest point at 128 mm and the lowest point at 0 mm. The median of the sample was 70 mm. This makes sense, because there was a fairly large plain area which had many values in the 67 mm to 70 mm range. The mean of the data was 68.6 mm, slightly below the median, which suggests that the data was slightly left skewed. In other words, the low elevation values near 0 mm pulled the mean down more than the high values near 128 mm pulled it up. The sample standard deviation was 25.3 mm. If the values were roughly normally distributed, about 68% of them would fall between 43.3 mm and 93.9 mm.
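  The statistics above can be reproduced with Python's standard statistics module. The sketch below uses a small made-up list of elevations, not the actual 361 values, and the function name is hypothetical. It also counts the share of values within one standard deviation of the mean directly, which avoids relying on the 68% rule of thumb.

```python
import statistics


def summarize(elevations_mm):
    """Return basic descriptive statistics for a list of elevations in mm."""
    mean = statistics.mean(elevations_mm)
    stdev = statistics.stdev(elevations_mm)  # sample standard deviation (n - 1)
    low, high = mean - stdev, mean + stdev
    inside = sum(1 for v in elevations_mm if low <= v <= high)
    return {
        "range": max(elevations_mm) - min(elevations_mm),
        "median": statistics.median(elevations_mm),
        "mean": mean,
        "stdev": stdev,
        "one_sd_interval": (low, high),
        "share_within_one_sd": inside / len(elevations_mm),
    }


# Hypothetical adjusted elevations in mm, for illustration only.
sample = [0, 33, 67, 68, 70, 70, 72, 90, 128]
summary = summarize(sample)
print(summary["median"], summary["mean"] < summary["median"])
```

  A mean below the median, as in both this toy sample and the real data, is the sign of left skew mentioned above.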
  The systematic sampling method did a good job of representing the entire sandbox area. Taking 361 points, one every 6 cm, was also a fairly simple way to collect the measurements. A stratified or random sampling method would not have been better suited for this lab. The benefits of the systematic approach were too great to justify changing the method. These benefits included taking the measurement at the same relative point in each square and having the whole area represented. Our group stuck with the same method throughout the lab. This made all of our measurements consistent with each other.
  Once the data was in Excel, our group used the cell styles feature to give each measurement a certain hue of blue. This was a good way to visualize the data, and it helps with basic spatial analysis. Interestingly enough, only four squares separated the highest value from the lowest. Figure 1.6 below shows the sample data in Excel. The darker hues of blue represent lower elevations, and the lighter hues of blue represent higher elevations. By looking at the image, one can clearly see where the ridge, plain, hills, and depressions are located.
Fig 1.6: Visual Representation of Measurements 
Fig 1.7: Snow Accumulation
  Although this is a nice way to present data in Excel, the grid layout is not suitable for importing into ArcMap. The values were therefore entered into a new Excel sheet with the column position as the "X" field, the row position as the "Y" field, and the elevation as the "Z" field. The data was not imported into ArcMap in this lab, which is why the visual representation of the data was made in Excel. However, the data will be imported into ArcMap in the next lab.
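  Reshaping the grid into X, Y, Z records can be sketched as below. This is an assumption about how such a conversion could be scripted, since our group did it in Excel. The function names and the 3 by 3 grid are hypothetical, and X and Y are simply the column and row numbers, not distances in centimeters.

```python
import csv
import io


def grid_to_xyz(grid):
    """Flatten a row-by-column grid into (x, y, z) records."""
    records = []
    for y, row in enumerate(grid, start=1):      # row number becomes Y
        for x, z in enumerate(row, start=1):     # column number becomes X
            records.append((x, y, z))
    return records


def write_xyz_csv(records, handle):
    """Write the records with an X, Y, Z header row."""
    writer = csv.writer(handle)
    writer.writerow(["X", "Y", "Z"])
    writer.writerows(records)


# Hypothetical 3 by 3 grid of elevations in mm, for illustration only.
grid = [[70, 68, 67],
        [90, 128, 72],
        [33, 0, 69]]
records = grid_to_xyz(grid)
buffer = io.StringIO()  # stands in for a real file
write_xyz_csv(records, buffer)
```

  Each grid square becomes one row in the output, so the full 19 by 19 grid would produce 361 records.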
Fig 1.8: Measuring Elevation in the Snow
  There were a few issues that arose. First, the string used for the grid was all tangled up when our group opened up the sandbox. This was solved by slowly untangling one of the strings and by grabbing a new string from inside the geography lab. The next issue was the meter stick. It was too tall and bendable and did not work well for measuring the elevation. After about 50 measurements, our group switched to using the ruler shown in figure 1.7. The ruler made it much easier to measure the topography because it was less bulky and was clear in color. The last issue was the snow, shown at right in figure 1.7. A total of 2 cm fell while our group was collecting data. This was a real nuisance, as it made it difficult to collect measurements. Before each measurement was taken, about a centimeter or two of snow had to be moved to reach the sand. This was done using the ruler, as there was no better way to do it. Luckily, the snow was very light and fluffy, so it could be moved very easily. Figure 1.8, displayed at right, shows the measurement locations, which became identifiable because of the snow.

Conclusion

  The sampling method chosen worked well for this lab. The samples collected using the systematic method represented the entire sandbox evenly. Because elevation changed so frequently across the sandbox, it was important to collect many sample points. That is why our group collected 361 of them. Other methods, such as random sampling, would likely have left out important areas of elevation change. A stratified sampling method would also have worked, but it was unnecessary given the size of the sandbox. Given this, if the lab were to be done again, the same method would be used.
  Sampling is used in spatial situations because it would often cost too much money and time to measure every point in an area. On the sandbox scale, taking a measurement every 6 cm was realistic. On a larger scale, this would be impossible using the method our group used. This applies not only to sampling elevation data, but to many other spatially distributed things, such as snow depth or the population of a certain species. At a larger scale, a stratified random sample would be more appropriate. This would allow all areas to be represented proportionally and the data points to be random.

Sources

Statistical Consultants Ltd. 2010. Sample Image Examples
Rogerson, P. A. (2015). Statistical methods for geography (pp. 151-153). London: Sage.
R. (Ed.). (n.d.). Sampling techniques. Retrieved February 04, 2017, from http://www.rgs.org/OurWork/Schools/Fieldwork+and+local+learning/Fieldwork+techniques/Sampling+techniques.htm

  

