Data types

Data are handled and stored in one of three ways (Census, Sample and Region), reflecting their origins and use. It is important to understand the differences between them. All data are spatially registered using the Ordnance Survey's grid co-ordinate system. Values represent the location of kilometre squares (i.e. each has a 3-digit number defining its position on the east-west axis and a 3-digit number for the north-south axis).

Regions are defined by groups of kilometre squares and are stored in Region files (stored on your computer with the extension *.rgn). A region is the area on which an analysis is performed. It may be a pre-defined area such as a country (England, Scotland, Northern Ireland or Wales) or be generated as the result of a data query (e.g. all squares with more than 25 ha of coniferous woodland). Regions are the mechanism by which data in sample and census files are extracted and reported.

Census datasets contain an independent value for a feature or element (e.g. built up and gardens) for every location. They are stored in Census files (with the extension *.ccf) and represent spatially comprehensive surveys. Census datafiles can contain different data forms and spatial resolutions. The three main types are continuously variable numeric data, such as the spatial extent of cover types; numeric data that represent a ranking (e.g. soil pH or critical load for nitrogen); and categorical data that can be labelled with numbers or names (e.g. environmental zones and counties). Land classifications used as stratifications for sampled data are stored as census files.

When only a limited number of representative observations are recorded to describe a group of squares, every square in the group is given the average (arithmetic mean) of those observations, and the data are stored as a Sample file (with the extension *.csf). To improve the overall estimate, the variation in the observations can be reduced by dividing the squares into sub-groups with similar characteristics. The sub-groups are called strata. Both the sample data points and the full population of squares need to be stratified using the same system, so two datafiles (one sample and one census) are needed to interpolate results. Assuming the samples are collected in an unbiased way, a statistical description of their variability (a standard error term) can be calculated and stored along with the average values. When estimates are calculated for regions of interest, the standard errors can be weighted for the different strata sizes and presented as a measure of confidence in the results.

The sample files supplied with CIS originate from the field survey components of the Countryside Surveys and should be used with care until you have a good appreciation of the limits of their use. The selection of sample sites was stratified by the ITE Land Classification, which identified 32 (later expanded to 40) environmental land classes [1]. Sample data are automatically interpolated across their land classes following any alteration in the size of the region. Alternative classifications of kilometre squares and sample files can be used within CIS, but statistical expertise is needed to make sure the results are valid.

[1] Bunce, RGH, Barr, CJ, Clarke, RT, Howard, DC and Lane, AMJ (1996). The ITE Merlewood land classification. Journal of Biogeography 23, 625-634.
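As an illustration of how a region, a census stratification and a sample file could combine into an estimate, the Python sketch below treats a region as a set of kilometre squares, counts the squares in each stratum, and weights the per-stratum means and standard errors by those counts. It is not taken from CIS itself: the data values, the names `land_class`, `sample_stats` and `estimate_region`, and the variance formula (independent strata, no finite population correction) are all assumptions made for the example.

```python
import math
from collections import Counter

# Hypothetical census-style stratification: every kilometre square,
# keyed by (easting, northing), is assigned a land class (stratum).
land_class = {
    (345, 512): "A", (346, 512): "A", (347, 512): "B",
    (345, 513): "B", (346, 513): "B", (347, 513): "A",
}

# Hypothetical sample-style data: for each stratum, the mean value per
# square (e.g. hectares of a cover type) and the standard error of that mean.
sample_stats = {"A": (10.0, 2.0), "B": (4.0, 1.0)}


def estimate_region(region, land_class, sample_stats):
    """Return (total, standard_error) for a region given as a set of squares.

    Assumes strata are independent and ignores any finite population
    correction; this is a textbook stratified estimate, not CIS's own code.
    """
    # Number of region squares falling in each stratum.
    counts = Counter(land_class[square] for square in region)
    # Each square contributes its stratum mean to the regional total.
    total = sum(n * sample_stats[s][0] for s, n in counts.items())
    # Standard errors are weighted by stratum size and combined in quadrature.
    variance = sum((n * sample_stats[s][1]) ** 2 for s, n in counts.items())
    return total, math.sqrt(variance)


# A region is simply a group of kilometre squares.
region = {(345, 512), (346, 512), (347, 512), (345, 513)}
total, se = estimate_region(region, land_class, sample_stats)
print(f"estimated total: {total:.1f}, standard error: {se:.2f}")
```

Changing the set of squares in `region` changes the stratum counts, so the estimate and its standard error are recomputed for the new region. This mirrors, in simplified form, the statement that sample data are interpolated across their land classes whenever the region changes.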
© 2005 content CIS | interface developed by Internet Solutions from ADAS