Metadata-Driven Analysis Tools

October 18, 2017 | Author: Consortium of Universities for the Advancement of Hydrologic Science, Inc. | Categories: Metadata, SAS (Software), SPSS, Matlab, Web Service



Description

One of the onerous but necessary tasks in quality control and data analysis is writing statistical programs that ingest data in a wide variety of formats. Depending on the number of columns in a data table and the complexity of their formats, this task can take anywhere from a few minutes to an hour or more per table. Across many different data tables, the cumulative effort can be substantial. Metadata-driven tools, however, can make creating basic statistical programs a matter of seconds, not hours.

Adequate metadata should include all the information needed to read and understand the associated data. If the metadata is structured, as in Ecological Metadata Language (EML), the elements needed to read data tables can be extracted automatically and used to write simple programs for statistical packages. These programs can input data, perform basic quality assurance checks, such as type and range checks, and provide basic statistical summaries.

With our colleagues, we have developed a variety of tools for transforming EML metadata into useful statistical programs for R, Matlab, SAS or SPSS. Each program downloads the data (if the data is available online), ingests all the data tables associated with a particular dataset, performs basic QA/QC and creates simple statistical summaries. The programs leave the investigator with a set of pre-ingested, analysis-ready data, and investigators can then modify them to add new analysis steps. We have created web services that, given the identity of a specific dataset in a repository or any web-accessible EML metadata, will create R, Matlab, SAS or SPSS programs. The services are described at http://www.vcrlter.virginia.edu/data/eml2/PASTAprogHelp.html, with detailed information at: http://www.vcrlter.virginia.edu/data/eml2/PASTAprogWebService.pdf. There is also a web portal for statistical code generation at: http://www.vcrlter.virginia.edu/data/eml2/eml2stat.html.

EML metadata can also drive web-based analysis environments that perform quality assurance analyses, map dataset locations and even create online graphics (see http://ngis.tfri.gov.tw/modules/modules_en/). Additionally, the Kepler and DataONE environments support EML-based tools for automating analyses.
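
The metadata-to-program idea can be illustrated with a short Python sketch. It is not the code produced by the tools or web services described here, and it is written in Python rather than R, Matlab, SAS or SPSS. The embedded fragment is a simplified, un-namespaced stand-in for real EML, and the function names (`read_table_spec`, `ingest`, `summarize`), the sample table and its bounds are all hypothetical. The sketch reads column names, numeric bounds and the delimiter from the metadata, then uses them to ingest a table, run type and range checks, and compute a basic summary.

```python
import csv
import io
import statistics
import xml.etree.ElementTree as ET

# Simplified, hypothetical EML-style fragment (assumption: real EML documents
# are namespaced and carry many more elements than shown here).
EML_SNIPPET = """
<dataTable>
  <entityName>air_temp</entityName>
  <physical><dataFormat><textFormat>
    <numHeaderLines>1</numHeaderLines>
    <simpleDelimited><fieldDelimiter>,</fieldDelimiter></simpleDelimited>
  </textFormat></dataFormat></physical>
  <attributeList>
    <attribute>
      <attributeName>station</attributeName>
      <measurementScale><nominal/></measurementScale>
    </attribute>
    <attribute>
      <attributeName>temp_c</attributeName>
      <measurementScale><ratio><numericDomain>
        <numberType>real</numberType>
        <bounds><minimum>-40</minimum><maximum>50</maximum></bounds>
      </numericDomain></ratio></measurementScale>
    </attribute>
  </attributeList>
</dataTable>
"""

# Made-up sample data with one non-numeric and one out-of-range value.
SAMPLE_CSV = "station,temp_c\nA,12.5\nB,abc\nC,75.0\n"


def read_table_spec(xml_text):
    """Extract the delimiter, header line count and column specs from metadata."""
    root = ET.fromstring(xml_text)
    spec = {
        "delimiter": root.findtext(".//fieldDelimiter", default=","),
        "header_lines": int(root.findtext(".//numHeaderLines", default="0")),
        "columns": [],
    }
    for attr in root.iter("attribute"):
        domain = attr.find(".//numericDomain")
        column = {"name": attr.findtext("attributeName"),
                  "numeric": domain is not None, "min": None, "max": None}
        if domain is not None:
            low = domain.findtext("bounds/minimum")
            high = domain.findtext("bounds/maximum")
            column["min"] = float(low) if low is not None else None
            column["max"] = float(high) if high is not None else None
        spec["columns"].append(column)
    return spec


def ingest(csv_text, spec):
    """Read the table as the metadata describes it, recording type and range problems."""
    reader = csv.reader(io.StringIO(csv_text), delimiter=spec["delimiter"])
    rows = list(reader)[spec["header_lines"]:]
    table = {col["name"]: [] for col in spec["columns"]}
    problems = []
    for row_number, row in enumerate(rows, start=1):
        for col, raw in zip(spec["columns"], row):
            value = raw
            if col["numeric"]:
                try:
                    value = float(raw)
                except ValueError:
                    # Type check failed: store a missing value and flag it.
                    problems.append((row_number, col["name"], "not numeric"))
                    value = None
                else:
                    below = col["min"] is not None and value < col["min"]
                    above = col["max"] is not None and value > col["max"]
                    if below or above:
                        # Range check failed: keep the value but flag it.
                        problems.append((row_number, col["name"], "out of range"))
            table[col["name"]].append(value)
    return table, problems


def summarize(table, spec):
    """Compute a basic summary for each numeric column, skipping missing values."""
    summary = {}
    for col in spec["columns"]:
        values = [v for v in table[col["name"]] if v is not None]
        if col["numeric"] and values:
            summary[col["name"]] = {"n": len(values), "min": min(values),
                                    "max": max(values),
                                    "mean": statistics.mean(values)}
    return summary


if __name__ == "__main__":
    table_spec = read_table_spec(EML_SNIPPET)
    data, issues = ingest(SAMPLE_CSV, table_spec)
    print(issues)
    print(summarize(data, table_spec))
```

Nothing about the table is hard-coded in `ingest` or `summarize`. Pointing the same functions at different metadata changes the columns, checks and summaries without any new ingestion code, which is the time saving the description refers to.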
