sbtools: A Package Connecting R to Cloud-based Data for Collaborative Online Research
Abstract
The adoption of high-quality tools for collaboration and reproducible research such as R
and GitHub is becoming more common in many research fields. While GitHub and other version
management systems are excellent resources, they were originally designed to handle code and scale
poorly to large text-based or binary datasets. A number of scientific data repositories are coming
online and are often focused on dataset archival and publication. To handle collaborative workflows
using large scientific datasets, there is an increasing need to connect cloud-based online data storage to
R. In this article, we describe how the new R package sbtools enables direct access to the advanced online data functionality provided by ScienceBase, the U.S. Geological Survey's online scientific data storage platform.