larc - Least Angle Regression Companion

This repository contains the data and code necessary to replicate the analysis described in the PLOS ONE article 'Ultrahigh Dimensional Variable Selection for Interpolation of Point Referenced Spatial Data: A Digital Soil Mapping Case Study' by Benjamin R. Fitzpatrick (BRF), David W. Lamb (DWL) and Kerrie Mengersen (KM).

Code and repository authorship was the sole responsibility of Benjamin R. Fitzpatrick.

This repository contains a set of functions written in the R language for statistical computing. The code file example_analysis.R illustrates how these functions may be used to replicate the analysis described in the article, which discusses the relevant theory and demonstrates the application of these methods to a geostatistical case study. The analysis makes heavy use of the Least Angle Regression (LAR) algorithm for finding Least Absolute Shrinkage and Selection Operator (LASSO) regularised solutions to multiple linear regression problems. An R package for conducting LASSO variable selection with the LAR algorithm already exists and is hosted on the Comprehensive R Archive Network (CRAN) under the name 'lars'; the functions in this repository build on functions from that package.
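For readers unfamiliar with the 'lars' package, the following is a minimal sketch of fitting a LASSO path with the LAR algorithm. It is not taken from example_analysis.R. The simulated data, the variable names and the use of Mallows' Cp to choose a point on the path are all assumptions made for illustration.

```r
library(lars)

# Simulated data (an assumption for illustration): 100 observations,
# 20 candidate covariates, only x1 and x5 truly related to the response.
set.seed(1)
n <- 100
p <- 20
x <- matrix(rnorm(n * p), n, p,
            dimnames = list(NULL, paste0("x", seq_len(p))))
y <- 3 * x[, 1] - 2 * x[, 5] + rnorm(n)

# Compute the full LASSO solution path with the LAR algorithm.
fit <- lars(x, y, type = "lasso")

# coef(fit) returns one row of coefficients per point on the path;
# fit$Cp holds Mallows' Cp for the same points. Here the row with the
# smallest Cp is used (an illustrative choice, not the article's criterion).
best_row <- which.min(fit$Cp)
beta <- coef(fit)[best_row, ]

# Covariates with non-zero coefficients are the selected ones.
selected <- names(beta)[beta != 0]
print(selected)
```

Other criteria, such as the error on a held-out validation set, can be used in place of Cp to choose the point on the path.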

This repository contains functions that:

  • randomly generate unique divisions of a sequence of numbers into two groups of user specified sizes (the intent being that these two groups of numbers are used as row indices to create training and validation sets from a full dataframe)
  • use the LAR algorithm within a cross validation scheme in a manner that permits greater control of the particulars than is provided by the cv.lars( ) function from the 'lars' package
  • use chord diagrams to visualise the covariate selection frequencies that result from conducting LAR within a cross validation scheme
  • model average the predictions from the models selected for each of the training sets in the cross validation scheme
  • interpolate a geostatistical response variable to a full cover predicted raster via such model averaged predictions.
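The first, second and fourth items in the list above can be sketched as follows. This is an illustrative outline under stated assumptions, not the repository's own code: the helper make_splits(), the simulated data, the use of sample.int() to draw the divisions and the choice of the path step with the lowest validation mean squared error are all hypothetical.

```r
library(lars)

# Simulated data (an assumption for illustration).
set.seed(42)
n <- 120
p <- 30
x <- matrix(rnorm(n * p), n, p,
            dimnames = list(NULL, paste0("x", seq_len(p))))
y <- 2 * x[, 1] - 1.5 * x[, 2] + rnorm(n)
x_new <- matrix(rnorm(5 * p), 5, p, dimnames = list(NULL, colnames(x)))

# Hypothetical helper: draw n_splits distinct sets of n_train row indices
# from 1..n. The remaining indices of each division form the validation set.
make_splits <- function(n, n_train, n_splits) {
  splits <- list()
  while (length(splits) < n_splits) {
    train <- sort(sample.int(n, n_train))
    key <- paste(train, collapse = ",")
    # Keep the division only if it has not been drawn before.
    if (is.null(splits[[key]])) splits[[key]] <- train
  }
  unname(splits)
}

splits <- make_splits(n, n_train = 90, n_splits = 20)

# For each division: fit the LASSO path on the training rows, pick the
# point on the path with the lowest validation MSE, and record which
# covariates were selected and the predictions at the new points.
results <- lapply(splits, function(train) {
  valid <- setdiff(seq_len(n), train)
  fit <- lars(x[train, ], y[train], type = "lasso")
  path_pred <- predict(fit, x[valid, ], type = "fit")$fit  # one column per path point
  best <- which.min(colMeans((path_pred - y[valid])^2))
  list(
    selected = coef(fit)[best, ] != 0,
    pred_new = predict(fit, x_new, s = best, mode = "step", type = "fit")$fit
  )
})

# Selection frequency of each covariate across the divisions.
sel_freq <- rowMeans(sapply(results, `[[`, "selected"))

# Model averaged prediction: the mean over the selected models.
avg_pred <- rowMeans(sapply(results, `[[`, "pred_new"))

print(head(sort(sel_freq, decreasing = TRUE)))
print(avg_pred)
```

An equal-weight average is used here for simplicity. The selection frequencies in sel_freq are the kind of summary that the chord diagrams mentioned above are intended to visualise.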

The functions provided here depend on the R packages lars, ggplot2, randtoolbox, raster and leaps.

Access rights

This program is free software: you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation, either version 3 of the License, or (at your option) any later version. This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details. You should have received a copy of the GNU General Public License along with this program, in the form of a text file titled 'LICENSE'. If not, see http://www.gnu.org/licenses/.

Contact Information

Mr Benjamin R. Fitzpatrick
b1.fitzpatrick@qut.edu.au

Geographical area of data collection

Point coordinates (longitude, latitude): 153.079103, -27.475784

Publications

Ultrahigh Dimensional Variable Selection for Interpolation of Point Referenced Spatial Data: A Digital Soil Mapping Case Study https://eprints.qut.edu.au/109005/

Research areas

Agricultural soil science
Polynomials
Linear regression analysis
Physical geography
Algorithms
Interpolation
Carbon Sequestration
Machine learning

Cite this collection

Fitzpatrick, Benjamin. (2017): larc - Least Angle Regression Companion. [Queensland University of Technology]. https://doi.org/10.4225/09/5964187f3b994

Data file types

R files which require R packages: lars, ggplot2, randtoolbox, raster and leaps.

Licence

GNU General Public License (GPL)
http://www.gnu.org/licenses/gpl.html

Copyright

© 2016 Benjamin R. Fitzpatrick, David W. Lamb and Kerrie Mengersen.

Dates of data collection

From 2016 to 2016

Connections

Has association with
Kerrie Mengersen (Researcher)
Has chief investigator
Benjamin Fitzpatrick (Researcher)

Contacts

Name: Mr Benjamin Fitzpatrick

Other

Date record created:
2017-07-19T16:13:17
Date record modified:
2019-08-29T14:08:29
Record status:
Published - Open Access