
Overview of 100+ Climate Data Platforms

This data visualization is part of Climate Watch. Reach out to Irene Berman-Vaporis for more information.


Understanding global climate goals and the action needed to meet them requires a wealth of information: country- and city-level emissions, adaptation and mitigation measures, and climate finance flows from the public and private sectors. Data exists for many aspects of climate change, but with hundreds of platforms and countless datasets, it can be difficult to distinguish the best information for a particular need, or to find where data gaps exist.

This interactive visual shows a matrix of more than 100 major climate data platforms, displayed by topic (x-axis) and geographic level/scale (y-axis). Users can also view this data as a filterable table or download it directly.

Potential use cases include:

  • Climate data analysts can find the most relevant platform for their specific topic and the level at which they work.
  • Funders and data creators can pinpoint data gaps, check if there are existing platforms on a subject to avoid redundancy of new work, or leverage synergies by building on existing work.
  • City planners can find data on peer cities’ emissions and climate actions, as well as how those actions might relate to national-level targets.
  • Decision makers looking for specific data on policy, finance or other topics can find relevant data for various scales, comparing policy or investment from various cities, countries or regions.
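The table view described above amounts to filtering platform records by topic and geographic scale. A minimal sketch of that kind of filtering, using invented example records rather than entries from the actual WRI dataset:

```python
# Hypothetical platform records; the real dataset has 100+ entries.
platforms = [
    {"name": "Platform A", "topic": "Mitigation", "scale": "Country"},
    {"name": "Platform B", "topic": "Finance", "scale": "City"},
    {"name": "Platform C", "topic": "Mitigation", "scale": "City"},
]

def filter_platforms(records, topic=None, scale=None):
    """Return records matching the given topic and/or geographic scale."""
    return [r for r in records
            if (topic is None or r["topic"] == topic)
            and (scale is None or r["scale"] == scale)]

# All mitigation-focused platforms, at any scale:
print([r["name"] for r in filter_platforms(platforms, topic="Mitigation")])
# ['Platform A', 'Platform C']
```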

Some of the notable patterns found in curating this dataset include: an overarching focus on mitigation, energy and country-level data; numerous platforms showcasing similar datasets; and a general lack of maintenance, even on carefully built datasets. These all underscore a clear need to improve the world’s climate-related data infrastructure.

Projects that include this Resource

Climate Watch

Climate Watch offers powerful insights and data on national climate plans, long-term strategies and greenhouse gas emissions to help countries achieve their climate and sustainable development goals.

Primary Contact

Irene Berman-Vaporis, Head of Communications, Systems Change Lab and Climate Watch


  • Motivation
  • Linking Commentary to Climate Data
  • Technical Architecture
  • Exploring Commentary in Space and Time
  • Future Plans for CHARMe

FOR FURTHER READING

Berrick, S. W., G. Leptoukh, J. D. Farley, and H. Rui, 2009: Giovanni: A web service workflow-based data visualization and analysis system. IEEE Trans. Geosci. Remote Sens., 47, 106–113, doi:10.1109/TGRS.2008.2003183.

Blower, J. D., and Coauthors, 2014: Understanding climate data through commentary metadata: The CHARMe project. Theory and Practice of Digital Libraries – TPDL 2013 Selected Workshops, L. Bolikowski et al., Eds., Springer, 28–39, doi:10.1007/978-3-319-08425-1_4.

Dowell, M., and Coauthors, 2013: Strategy towards an architecture for climate monitoring from space, 39 pp. [Available online at www.wmo.int/pages/prog/sat/documents/ARCH_strategy-climate-architecture-space.pdf.]

Kershaw, P., 2014: Data model for commentary metadata (D400.1). Tech. rep., The CHARMe Project.

Nagni, M., and P. Kershaw, 2014: Concrete encodings of commentary metadata (D400.3). Tech. rep., The CHARMe Project.

Rood, R., and P. Edwards, 2014: Climate informatics: Human experts and the end-to-end system. Earthzine. [Available online at www.earthzine.org/2014/05/22/climate-informatics-human-experts-and-the-end-to-end-system/.]

Fig. 1. An example of an annotation that describes a dataset found via the CMSAF website.

Fig. 2. The basic Open Annotation (OA) data model, showing how an annotation links one piece of information (held in the body) with another (the target) and is described using the Resource Description Framework (RDF).

Fig. 3. The CHARMe data model being used to link a comment about a conference paper to the data it cites (in this example, the “ATSR2lb product”). The links are encoded using standard ontologies for describing resources such as RDF and Dublin Core (dc); see the For Further Reading section for more details.

Fig. 4. Overview of the CHARMe client-server architecture, with the JavaScript plug-in shown as an example of a client program. More details on the technologies used can be found in the technical documentation listed in the For Further Reading section. The CHARMe node (blue box) provides a range of web interfaces (green box). SPARQL and RESTful (OpenSearch) query interfaces are provided for structured queries using RDF metadata. A REST API allows for submission, deletion, or modification of annotations and supports a JSON-LD serialization. The security layer provides access control via an OAuth 2.0 interface. Validation middleware checks the format of what has been submitted. A web admin interface allows users with the appropriate privileges to log in to the node directly (e.g., as a moderator) or to set up new data providers. The triple store and its interface (gray box) are based on Apache Jena and Fuseki, respectively. The triple store is augmented with a NoSQL plug-in to index information and improve search performance.

Fig. 5. A user viewing an annotation via (left) a browser or (right) the plug-in. If the user is the creator of the annotation, or has moderator or superuser privileges, the interface includes a “Delete” button, as highlighted, with the further ability to “Modify” an annotation via the plug-in.

Fig. 6. The Significant Events Viewer being used to explore time series of global ozone in two reanalysis datasets (ERA-40 and ERA-Interim), with a significant-event timeline plotted below. The right-hand panel, “Event Information,” describes the selected significant event (indicated by the yellow bubble). The clear CHARMe icon (top right) indicates that there are currently no user annotations recorded for the selected significant event.

Fig. 7. The CHARMe Maps tool, highlighting the “fine-grained commentary” capability. Here, the user is visualizing two variables from a dataset (sea surface temperature and its associated error field) along with comments that have been attached to specific points or regions within the dataset. Note that each variable is associated with a different set of commentary.

Fig. 8. The CHARMe Maps tool, highlighting the “intercomparison” capability. Two different albedo datasets are being visualized (left), with the two right-hand columns showing the commentary metadata attached to each dataset. The dataset described in the right-most column (corresponding to the lower one in the visualization panel) has a number of publications attached to it.

Fig. 9. Simplified representation of the data model for fine-grained commentary, illustrating the use of Open Annotation’s capability to annotate spatial and temporal subsets of a resource (in this case, a climate dataset). The full data model for fine-grained commentary includes more properties of the SubsetSelector.



Capturing and Sharing Our Collective Expertise on Climate Data: The CHARMe Project


For users of climate services, the ability to quickly determine the datasets that best fit one’s needs would be invaluable. The volume, variety, and complexity of climate data makes this judgment difficult. The ambition of CHARMe (Characterization of metadata to enable high-quality climate services) is to give a wider interdisciplinary community access to a range of supporting information, such as journal articles, technical reports, or feedback on previous applications of the data. The capture and discovery of this “commentary” information, often created by data users rather than data providers, and currently not linked to the data themselves, has not been significantly addressed previously. CHARMe applies the principles of Linked Data and open web standards to associate, record, search, and publish user-derived annotations in a way that can be read both by users and automated systems. Tools have been developed within the CHARMe project that enable annotation capability for data delivery systems already in wide use for discovering climate data. In addition, the project has developed advanced tools for exploring data and commentary in innovative ways, including an interactive data explorer and comparator (“CHARMe Maps”), and a tool for correlating climate time series with external “significant events” (e.g., instrument failures or large volcanic eruptions) that affect the data quality. Although the project focuses on climate science, the concepts are general and could be applied to other fields. All CHARMe system software is open-source and released under a liberal license, permitting future projects to reuse the source code as they wish.

Increasingly, people gather inputs and make decisions through websites that offer evolving commentary from others at the point of decision or purchase; think of Amazon, TripAdvisor, Yelp, and countless others. Introducing an analogous capability into Earth science data selection and acquisition has the potential to turn what is currently a solitary exploration into one where the user’s decisions are informed by a broad community.

Today, science users in search of relevant datasets for their investigations find these data in a variety of ways: filtering by climate parameter, crawling or browsing data servers, or through other, more sophisticated means. Regardless of how users arrive there, ultimately they are presented with a list of links to data sources (files) through some data system interface. These are what the science user accesses to compute, analyze, and visualize information, yet many times they lack up-to-date ancillary information: pointers to documentation on the dataset, contact information for the dataset engineer or the responsible science lead, information on how others have used the data, or whether there are known problems. In all of these cases the interested user must navigate away from the search results in order to discover the information they require, and it is not always obvious how to go about it, or whether the information they discover is valid for, or even relevant to, the particular dataset of interest.

The European collaborative project CHARMe (Characterization of metadata to enable high-quality climate services) has developed a system that avoids these navigation and presentation issues by providing crowd-sourced user commentary directly next to the data download link. The connection is immediate and obvious, and content continues to expand as the data get more use. Users can post notes about issues and questions to the data providers and other users, and the data engineers responsible for the dataset know exactly what data are involved when answering questions. The technology allows for this knowledge capture to stay connected to the dataset itself, no matter how the user arrives at the download link.

As an example, Fig. 1 shows CHARMe being used via the website of the Climate Monitoring Satellite Applications Facility (CMSAF), hosted by Deutscher Wetterdienst: www.cmsaf.eu/doi . In this case the user is browsing a list of datasets, and having clicked on the blue “C” icon is viewing an annotation linking the selected dataset to a validation report. The tags “describing” and “linking” (as well as others available to the user when submitting the annotation) help subsequent users to understand why the comment was made, and discover the comment using a facility known as a “faceted search.”

Citation: Bulletin of the American Meteorological Society 97, 4; 10.1175/BAMS-D-14-00189.1


This article describes the application of this system to climate data, the underlying technology and data model, and the tools that have been developed to demonstrate use of this commentary to explore climate data in new ways.

Climate data are diverse, encompassing in situ and remotely sensed observations, the results of numerical models, and the combination of models and observations in reanalysis programs. A particular feature of climate data is that their intrinsic value grows over time, but there is a risk that the expertise in the use of the data is lost as people move on and expert teams disperse. End products are often derived from a variety of sources, making it difficult to issue simple statements about a product’s quality, and impossible to label a particular dataset as “the best.” Instead, users need to weigh up a range of features to judge a dataset’s fitness for their specific purpose. The importance of this “knowledge around the data” is recognized by the international Strategy Towards an Architecture for Climate Monitoring from Space (Dowell et al. 2013): it is just as important to preserve this knowledge about the data as it is to preserve the measurements themselves.

There are many international collaborations and initiatives already gathering the information users need about climate data. For instance, the European project “Coordinating Earth Observation data for reanalysis for climate services: CORE-CLIMAX” ( www.coreclimax.eu/ ) is bringing together the data and information to support reanalyses of past climate. The international Obs4MIPS (Observations for Model Intercomparisons; www.earthsystemcog.org/projects/obs4mips/ ) activity is making observational products more accessible for climate model intercomparisons, partly through the generation of technical notes that describe the characteristics of the observational data in a way that is tailored to the needs of climate modelers.

The Climate Data Guide from the National Center for Atmospheric Research ( https://climatedataguide.ucar.edu/ ) allows users to compare the attributes, strengths, and limitations of multiple datasets. However, the website specifically states: “The Climate Data Guide generally does not distribute data sets. It is your responsibility to find and process the data that you need.” Again, the data and supporting information are in different locations, and it is up to the user to navigate between them.

In addition, every Earth science researcher and climate data user around the world will be generating commentary as they go about their work. It is surely the case that the same advances and dead-ends are being discovered time and again. Traditionally, such knowledge is captured in narrative form and shared through human-readable means such as papers, articles, and presentations; a great deal of extra value can be gained by sharing this information in a machine-readable, searchable way.

The term “Linked Data” refers to a set of best-practice techniques that describe how one can make data available on the web and interconnect it with other data, with the aim of increasing its value for applications and users. The CHARMe project applies the principles of Linked Data to climate data commentary: the representation of the commentary within a formal data model is a critical part of the CHARMe design. Open Annotation, a W3C effort to develop a common approach to annotating digital resources, provides the underlying concept ( www.openannotation.org/ ). It offers a simple and general data model for recording annotations about objects. An annotation associates a piece of information (the body) with a subject (the target), as shown in Fig. 2 . Although applied largely to arts- and humanities-related applications thus far, the model has shown itself to be versatile and readily adaptable to Earth observation and climate science use. For example, Open Annotation provides a means of specifying subsets of a given target, such as a character range to reference a given piece of text from a document. Building on these concepts in the framework, it is possible to define extensions to describe geographic subsets of datasets, which are described further in the section below, “Exploring Commentary in Space and Time.”

In a data portal, a target is typically a dataset (or subset of a dataset), while the annotation body holds the commentary. The overall design offers flexibility: a single comment body can be associated with many data targets, or the body of one annotation can be the target of another. This is illustrated in Fig. 3 , which shows how the CHARMe data model represents a comment on a conference paper, also capturing a link to the dataset cited by the paper. In this way the CHARMe system begins to link the user to a “web of knowledge” about the data they are interested in.
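The body/target structure described above can be sketched as a JSON-LD document of the kind the CHARMe REST API accepts. In this minimal sketch only the oa: field names come from the Open Annotation vocabulary; the report URL, dataset identifier, and motivation value are hypothetical placeholders:

```python
import json

# A single annotation linking a validation report (the body) to a
# dataset (the target). Identifiers below are invented for illustration.
annotation = {
    "@context": "http://www.w3.org/ns/oa.jsonld",
    "@type": "oa:Annotation",
    "oa:hasBody": {"@id": "http://example.org/reports/validation-report-42"},
    "oa:hasTarget": {"@id": "doi:10.0000/hypothetical-dataset"},
    "oa:motivatedBy": "oa:linking",  # cf. the "linking" tag in Fig. 1
}

print(json.dumps(annotation, indent=2))
```

Because the body and target are both just resource identifiers, the same body could be attached to many targets, or this annotation could itself become the target of another, as the text describes.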

The GeoViQua project ( www.geoviqua.org/ ) has developed a data model for user feedback on datasets in the Global Earth Observation System of Systems. This model shares some conceptual similarities with the CHARMe data model, with the main difference being that the CHARMe model is built on Linked Data principles and the RDF (Resource Description Framework; www.w3.org/RDF/ ) data model, whereas the GeoViQua model is based around a UML (Unified Modeling Language; www.uml.org/ ) model and a derived XML encoding. These two approaches are complementary. UML models describe information in a relatively fixed, rigid fashion; this allows data producers and consumers to interoperate closely because the consumer knows exactly what data structure to expect. The disadvantage of this is that data models, once fixed and agreed upon, can be hard to apply to situations that were not expected at design time. By contrast, RDF models enable the data producer to structure data more flexibly, enabling new requirements to be more smoothly integrated. However, it can be difficult to write data-consuming software that can handle all the possibilities afforded by this high degree of flexibility. Discussions are ongoing within the Open Geospatial Consortium to harmonize the CHARMe and GeoViQua models at the conceptual level, enabling implementers to apply the encoding they feel is most appropriate to the application.

Rather than create a new web portal to expose climate data commentary to users, CHARMe has developed a plug-in that is simple to include in existing data-access portals. The plug-in highlights to users the existence of commentary on their datasets of interest and allows them to make comments of their own. A third-party data provider is “CHARMe enabled” by integrating the JavaScript for the plug-in in their website. As shown in Fig. 1 , CHARMe also provides a convenient way to share information from the data provider (e.g., dataset provenance, updates, or corrections) at the point of access.

CHARMe has been implemented as a client-server architecture. On the server side, there is currently a single repository (a CHARMe “node”) that stores all the annotation information and has interfaces to support many clients. Figure 4 shows the architecture of a CHARMe node, with the plug-in as an example of a client application. In most cases, the node does not store the target information itself—for instance, the actual dataset and conference paper in Fig. 3—but instead stores a link to the target; it is an important principle of Linked Data that these targets have persistent identifiers, such as a Digital Object Identifier (DOI) or persistent Uniform Resource Locator (pURL). This is critical not only for the CHARMe system to work but also for the larger problem of wider use of data citation in general.

The philosophy underlying the technical implementation was to develop a generic piece of software that can be configured to work with different underlying “off-the-shelf” technologies, and for the information held in the CHARMe node to be accessible via a range of web service interfaces. For example, CHARMe provides an open, read-only, web service endpoint (using the SPARQL protocol: www.w3.org/TR/rdf-sparql-query/ ) allowing potentially complex data mining and analysis for data scientists, as well as a simpler (but more limited) OpenSearch interface ( www.opensearch.org/ ). The faceted search facility in the plug-in is effectively a graphical user interface for queries to this OpenSearch interface. As these interfaces are built on open standards, it is possible for third parties to build other applications that produce and consume CHARMe commentary, with the CHARMe information syndicated to multiple end-user applications.
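As a sketch of the kind of structured query the read-only SPARQL endpoint supports, the snippet below builds (but does not send) a request for all annotations targeting a given dataset. The node address and dataset identifier are hypothetical; only the oa: predicate names come from the Open Annotation vocabulary:

```python
from urllib.parse import urlencode

NODE = "https://charme-node.example.org/sparql"  # placeholder address
target = "doi:10.0000/hypothetical-dataset"      # placeholder identifier

# Ask for every annotation whose target is the given dataset,
# together with the body (the commentary resource) it carries.
query = f"""
PREFIX oa: <http://www.w3.org/ns/oa#>
SELECT ?annotation ?body WHERE {{
  ?annotation oa:hasTarget <{target}> ;
              oa:hasBody   ?body .
}}
"""

# Per the SPARQL protocol, the query travels as a URL parameter:
request_url = NODE + "?" + urlencode({"query": query})
print(request_url[:60])
```

A data-mining client would issue this as an HTTP GET and parse the result set; the plug-in's faceted search wraps the simpler OpenSearch interface in the same way.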

CHARMe uses the popular OAuth 2.0 framework ( http://oauth.net/2/ ) to secure interactions between client programs and the node. The user authenticates and delegates permission to the client program to execute any secured operations on their behalf. All annotations submitted are publicly accessible in read-only mode; add, delete, and modify functions are secured and require login. Users have authority to modify or delete only the annotations that they themselves have submitted. In addition, however, there are two other elevated sets of privileges, for “moderators” and “superusers,” which require registration with the node. Moderators are able to modify and delete any annotation originating from their client program(s), while superuser privileges can be granted to an overall administrator or administrators for the node. The client program is identified by an ID assigned by the node to the instance of the program (such as the plug-in) when it is deployed, and the moderator has oversight of content entered from their deployment of CHARMe. Superusers, in contrast, can modify or delete any annotation, from any source. This is provided as a second line of support for the moderation function used by individual clients. Example interfaces from a browser or the plug-in are pictured in Fig. 5 .
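The three-tier permission model described above can be summarized in a few lines. The role names mirror the article; the check itself is our own illustrative logic, not code from the CHARMe node:

```python
def can_delete(user, annotation):
    """A user may delete an annotation if they hold superuser rights on
    the node, moderate the client program it was submitted from, or
    created it themselves."""
    if user["role"] == "superuser":
        return True  # superusers may delete any annotation, from any source
    if user["role"] == "moderator" and annotation["client_id"] in user["clients"]:
        return True  # moderators oversee their own deployment's content
    return annotation["creator"] == user["id"]

# Hypothetical users and annotation:
alice = {"id": "alice", "role": "user", "clients": []}
bob = {"id": "bob", "role": "moderator", "clients": ["plugin-1"]}
note = {"creator": "alice", "client_id": "plugin-1"}

print(can_delete(alice, note), can_delete(bob, note))  # True True
```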

As part of the CHARMe project, the European Centre for Medium-Range Weather Forecasts (ECMWF) has developed a web-based graphical tool for associating features in climate time-series data with commentary that indicates “significant events” that may have affected the data, such as volcanic eruptions or satellite instrument failures. This “Significant Events Viewer” has been developed to work with ECMWF’s reanalysis datasets and internal observation and events databases, but is designed to be general enough to be extended to other datasets and user needs. The viewer provides users with an opportunity to become more familiar with the variety of observations that feed into the reanalysis, and to determine whether the variability and features seen in the dataset are likely to be artifacts of the measurements or processing steps, or real changes in the environment. A registered CHARMe user can also add commentary to the significant event. Figure 6 shows an example of the Significant Events Viewer being used to explore ECMWF’s reanalysis datasets ERA-40 and ERA-Interim alongside the significant events timeline. The tool is publicly available, following a simple and free registration, at http://apps.ecmwf.int/significant-events/ .

To explore commentary in geographic space, CHARMe has developed a further prototype tool, “CHARMe Maps,” that demonstrates the use of commentary metadata in an interactive mapping interface. The tool has two main capabilities, which are illustrated in Figs. 7 and 8:

  • the ability to attach annotations to a specific subset of a dataset, for example a particular geographic region (we refer to this as “fine-grained commentary”); and
  • the ability to compare two datasets both visually and by the commentary that has been attached to them (we refer to this as “intercomparison”).

The linking of annotations to a subset of a dataset requires a modification to the general data model in Fig. 3 . Fortunately, this kind of capability was anticipated by the designers of the Open Annotation specification. The properties of the subset include a geographic extent (defined as a 2D geometry), a temporal extent (defined by start and end times), and a vertical extent (not used in this prototype). Additionally, the definition of a subset allows the user to specify exactly which variables within a dataset are considered to be part of the subset; in this way, the user can attach a comment to a very specific part of a multivariate, multidimensional climate dataset. Figure 9 illustrates this data model for fine-grained commentary.
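A minimal sketch of such a subset selector follows. The class and attribute names are our own; the real data model (Kershaw 2014; see Fig. 9) defines more properties than shown here:

```python
from dataclasses import dataclass
from datetime import datetime
from typing import List, Tuple

@dataclass
class SubsetSelector:
    """Illustrative fine-grained subset: which variables a comment
    applies to, over what region, during what period."""
    variables: List[str]
    bbox: Tuple[float, float, float, float]  # (lon_min, lat_min, lon_max, lat_max)
    t_start: datetime
    t_end: datetime

    def covers(self, lon: float, lat: float, t: datetime) -> bool:
        """True if a given point and time fall inside this subset."""
        lon0, lat0, lon1, lat1 = self.bbox
        return (lon0 <= lon <= lon1 and lat0 <= lat <= lat1
                and self.t_start <= t <= self.t_end)

# A comment attached only to North Atlantic SST during 1991 (hypothetical):
sel = SubsetSelector(
    variables=["sea_surface_temperature"],
    bbox=(-60.0, 30.0, 0.0, 65.0),
    t_start=datetime(1991, 1, 1),
    t_end=datetime(1991, 12, 31),
)
print(sel.covers(-30.0, 45.0, datetime(1991, 6, 15)))  # True
```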

The CHARMe project has built a system to support the creation, modification, and archiving of comments linked to climate datasets and other targets. The success of the CHARMe tools in achieving the vision set out at the start of this article will depend not only on further technology development, but also on the cultivation of a community of users who will build up the web of annotations and links over time. Although the project focuses on climate science, the technologies and concepts are very general and could be applied to other fields.

Data providers can enable CHARMe functionality in their websites by installing the JavaScript plug-in. It is also possible for institutions to host their own CHARMe node to store annotation information, but there is as yet no capability to federate searches across multiple nodes, so this is more appropriate if an institution wishes to keep annotations internal rather than public.

The node provides a standard API from which it is hoped many different client applications could be developed, of which the Maps tool is just one example that demonstrates the possibilities. One potential reuse of CHARMe that is under consideration is integration into NASA’s Giovanni tool, a web-based tool designed to enable visual data exploration and comparison of data offered by the Earth Observing System Data and Information System ( http://giovanni.gsfc.nasa.gov/giovanni/ ). In the most recent Giovanni architecture, service workflows are specified as URLs that encode the service request and a specification of the data subset to be visualized, comprising the data variables, spatial region, and temporal range. That is, the URL includes the same information to specify a data subset as the specification in the CHARMe Maps data model. As such, the open machine-accessible architecture would allow a fairly straightforward incorporation of the CHARMe Maps ability to support commentary on data subsets.
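The correspondence described above can be sketched with the standard URL machinery: a service request URL carries the variable, bounding box, and time range, and a CHARMe client could recover the same subset description by parsing it back. The service address and parameter names below are illustrative only, not Giovanni's actual API:

```python
from urllib.parse import urlencode, parse_qs, urlparse

# Hypothetical service-request parameters encoding a data subset:
params = {
    "variable": "hypothetical_temperature_product",
    "bbox": "-60,30,0,65",                 # lon/lat bounding box
    "starttime": "1991-01-01T00:00:00Z",
    "endtime": "1991-12-31T23:59:59Z",
}
url = "https://giovanni.example.org/service?" + urlencode(params)

# Parsing the URL back recovers the subset specification unchanged:
subset = {k: v[0] for k, v in parse_qs(urlparse(url).query).items()}
print(subset["bbox"])  # -60,30,0,65
```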

Another potential NASA system to explore CHARMe integration is the related Regional Climate Model Evaluation System (RCMES; https://rcmes.jpl.nasa.gov/ ). RCMES is both a database for observations and an analytical toolkit allowing regridding, metrics computation, and visualization for comparing observations and climate model outputs. We envision an augmented RCMES that lets users leverage CHARMe to explain the results of their model evaluation and to capture science workbench notes that the RCMES tool does not currently record. There is also strong interest in using CHARMe in the Earth System CoG collaboration environment ( www.earthsystemcog.org/ ). CoG provides a search interface to the Earth System Grid Federation climate data archive, which houses climate model output and other widely used datasets, along with wikis, forums, and other tools for distributed discussion and analysis. Here, CHARMe could be integrated into the data search as a way to build an online knowledge base available at the point of download.

A further need for development is around moderation tools and policy. It does not appear that the social media world has solved the issue of controversial annotations. The main risk to a CHARMe annotation is not so much the controversy it sparks, as the possibility that the debate becomes irrelevant to the initial annotation and overwhelms substantive commenting, with the value of the commentary lost in the noise. This risk is likely relatively small in the beginning while the community is also small, but can become problematic as the community grows. We are currently surveying social media implementations and literature for promising approaches—such as upvoting/downvoting, reputation scoring, and sort/group mechanisms—to mitigate this risk.

The CHARMe code and user manuals are available at https://github.com/charme-project . The CHARMe system software is open-source, released under a BSD license, permitting future projects to reuse the source code as they wish. The Maps prototype is not currently accessible as an operational tool, but we would be happy to collaborate with anyone wishing to develop this capability in their own system.

ACKNOWLEDGMENTS.

The CHARMe project was coordinated by the University of Reading, and project partners were Airbus Defence and Space, CGI, Deutscher Wetterdienst (DWD), the European Centre for Medium-Range Weather Forecasts, the Royal Netherlands Meteorological Institute (KNMI), the Science and Technology Facilities Council, Terraspatium, and the UK Met Office. The project received funding from the European Union’s Seventh Framework Programme for research, technological development, and demonstration under Grant Agreement 312541.



CMIP6 climate data extraction and treatment by multi-polygon shapefile using ESGF NetCDF and WORLDCLIM datasets

Hydroenvironment/CMIP6-WORLDCLIM-HANDLING


Project Status: Active – The project has reached a stable, usable state and is being actively developed.

CMIP6-WORLDCLIM-HANDLING

🌏 DAILY AND MONTHLY DATA TREATMENT FROM GENERAL CIRCULATION MODELS (GCMs) BY CMIP6 PHASE

✅The 2013 IPCC fifth assessment report (AR5) featured climate models from CMIP5, while the upcoming 2021 IPCC sixth assessment report (AR6) will feature new state-of-the-art CMIP6 models.

✅CMIP6 will consist of the “runs” from around 100 distinct climate models produced across 49 different modelling groups. While the results from only around 40 CMIP6 models have been published so far, it is already evident that a number of them have a notably higher climate sensitivity than the models in CMIP5.

✅The overview paper on the CMIP6 experimental design and organization has now been published in GMD (Eyring et al., 2016). This CMIP6 overview paper presents the background and rationale for the new structure of CMIP, provides a detailed description of the CMIP Diagnostic, Evaluation and Characterization of Klima (DECK) experiments and CMIP6 historical simulations, and includes a brief introduction to the 23 CMIP6-Endorsed MIPs.

✅The IPCC AR5 featured four Representative Concentration Pathways (RCPs) that examined different possible future greenhouse gas emissions. These scenarios – RCP2.6, RCP4.5, RCP6.0, and RCP8.5 – have updated versions in CMIP6, called SSP1-2.6, SSP2-4.5, SSP4-6.0, and SSP5-8.5, each of which results in a similar 2100 radiative forcing level to its predecessor in AR5.

A brief summary can be found in the following overview presentation: https://www.wcrp-climate.org/images/modelling/WGCM/CMIP/CMIP6FinalDesign_GMD_180329.pdf
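The repository's core operation, extracting the grid cells of a gridded climate dataset that fall inside a shapefile polygon, can be illustrated with a minimal sketch. This is not code from the repository: the polygon and grid are hard-coded here (a real workflow would read them with NetCDF and GIS libraries), and only the point-in-polygon masking step is shown.

```python
# Illustrative sketch of polygon-based grid extraction. A real CMIP6 workflow
# would load lons/lats from an ESGF NetCDF file and the polygon from a
# multi-polygon shapefile; both are hypothetical hard-coded values here.

def point_in_polygon(lon, lat, polygon):
    """Ray-casting test: is (lon, lat) inside the polygon (list of vertices)?"""
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Toggle whenever the edge crosses the horizontal ray to the east.
        if (y1 > lat) != (y2 > lat):
            x_cross = x1 + (lat - y1) * (x2 - x1) / (y2 - y1)
            if lon < x_cross:
                inside = not inside
    return inside

def mask_grid(lons, lats, polygon):
    """Return the (lon, lat) grid points that fall inside the polygon."""
    return [(lon, lat) for lat in lats for lon in lons
            if point_in_polygon(lon, lat, polygon)]

# A 1-degree grid and a simple rectangular "basin" polygon (made-up region).
lons = range(-75, -69)
lats = range(-5, 1)
basin = [(-74.5, -4.5), (-70.5, -4.5), (-70.5, -0.5), (-74.5, -0.5)]
cells = mask_grid(lons, lats, basin)
```

The masked cell list would then drive the per-cell extraction of daily or monthly GCM values.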



Big Data Challenges in Climate Science

John L. Schnase

NASA Goddard Space Flight Center, Greenbelt, MD 20771 USA

Tsengdar J. Lee

NASA Headquarters, Washington, DC 20546 USA

Chris A. Mattmann

NASA Jet Propulsion Laboratory, Pasadena, CA 91109 USA

Christopher S. Lynnes

Luca Cinquini, Paul M. Ramirez, Andre F. Hart, Dean N. Williams

Lawrence Livermore National Laboratory, Livermore, CA 94550 USA

Duane Waliser

Pamela Rinsland

NASA Langley Research Center, Hampton, VA 23681 USA

W. Philip Webster

Daniel Q. Duffy, Mark A. McInerney, Glenn S. Tamkin, Gerald L. Potter, Laura Carrier

The knowledge we gain from research in climate science depends on the generation, dissemination, and analysis of high-quality data. This work comprises technical practice as well as social practice, both of which are distinguished by their massive scale and global reach. As a result, the amount of data involved in climate research is growing at an unprecedented rate. Climate model intercomparison (CMIP) experiments, the integration of observational data and climate reanalysis data with climate model outputs, as seen in the Obs4MIPs, Ana4MIPs, and CREATE-IP activities, and the collaborative work of the Intergovernmental Panel on Climate Change (IPCC) provide examples of the types of activities that increasingly require an improved cyberinfrastructure for dealing with large amounts of critical scientific data. This paper provides an overview of some of climate science’s big data problems and the technical solutions being developed to advance data publication, climate analytics as a service, and interoperability within the Earth System Grid Federation (ESGF), the primary cyberinfrastructure currently supporting global climate research activities.

I. Introduction

The term “big data” is used to describe data sets that are too large or complex to be worked with using commonly-available tools [ 1 ]. Climate science represents a big data domain that is experiencing unprecedented growth [ 2 ]. Some of the major big data technical challenges facing climate science are easy to understand: large repositories mean that the data sets themselves cannot easily be moved—instead, analytical operations must migrate to where the data reside; complex analyses over large repositories require high-performance computing; large amounts of information increase the importance of metadata, provenance management, and discovery; migrating codes and analytic products within a growing network of storage and computational resources creates a need for fast networks, intermediation, and resource balancing; and, importantly, the ability to respond quickly to customer demands for new and often unanticipated uses for climate data requires greater agility in building and deploying applications [ 3 ].

In addressing these challenges, it is important to recognize that the work of climate science comprises social practice as well as technical practice [ 4 , 5 ]. There are established human processes for creating, sharing, and analyzing scientific data sets, often in a highly collaborative mode. The work is both valued by society and subject to intense critical scrutiny. It informs national and international policy decisions. Collectively, these social factors add urgency and complexity to our efforts to build an effective cyberinfrastructure to support climate science.

This paper provides an overview of some of climate science’s big data problems and the technical solutions being developed to improve data publication, analysis, and accessibility. This material combines the contributions of those who participated in the 2014 Big Data From Space Conference (BiDS ‘14) session titled “Big Data Challenges in Climate Science” [ 6 – 8 ]. We use the work being done by the Intergovernmental Panel on Climate Change as the context for our presentation, with particular focus on the global climate research community’s Earth System Grid Federation collaborative infrastructure and the community’s Climate Model Intercomparison efforts.

II. Background

Our understanding of the Earth’s processes is based on a combination of observational data records and mathematical models. The size of our space-based observational data sets is growing dramatically as new missions come online. However, a potentially bigger data challenge is posed by the work of climate scientists, whose models are producing data sets of hundreds of terabytes or more [ 9 ].

There are two major challenges posed by the data-intensive nature of climate science. The first is the need to provide effective means for publishing large-scale scientific data collections. This capability is the foundation upon which a variety of data services can be provided, from supporting active research to large-scale data federation, data distribution, and archival storage.

The second challenge has to do with how these large datasets are used: data analytics—the capacity to perform useful scientific analyses over large quantities of data in reasonable amounts of time. In many respects this is the bigger challenge, for without effective means of transforming large scientific data collections into meaningful scientific knowledge, our climate science mission fails.

In order to gain a perspective on the big data challenges in climate science and the efforts that are underway to address those challenges, it is helpful to examine four elements operating at the core of global-scale climate research: (1) the Intergovernmental Panel on Climate Change, which has responsibility for integrating scientific results and presenting them in meaningful ways to policy makers throughout the world; (2) climate model intercomparison experiments that coordinate research on general circulation models, arguably the most important tools available to scientists who study the climate; (3) the Earth System Grid Federation, which provides the distributed infrastructure for publishing climate model outputs, sharing scientific knowledge, and supporting global-scale collaboration; and (4) a new wave of data publication activities aimed at integrating observational data and reanalysis data into the Earth System Grid Federation. In this section, we take a closer look at each of these elements.

A. Intergovernmental Panel on Climate Change

The Intergovernmental Panel on Climate Change (IPCC) is the leading international body for the assessment of climate change [ 10 ]. It was established by the United Nations Environment Program (UNEP) and the World Meteorological Organization (WMO) in 1988 to provide the world with a clear scientific view on the current state of scientific knowledge about climate change and its potential environmental and socio-economic impacts.

The IPCC is open to all member countries of the UN and WMO. Currently 195 countries are members of the IPCC. Governments participate in the review process and the plenary sessions, where main decisions about the IPCC work program are taken and reports are accepted, adopted, and approved. Thousands of scientists from all over the world contribute to the work of the IPCC on a voluntary basis. Review is an essential part of the IPCC process, to ensure an objective and complete assessment of current information. IPCC aims to reflect a range of views and expertise. The IPCC Secretariat coordinates all the IPCC work and liaises with Governments.

Because of its scientific and intergovernmental nature, the IPCC embodies a unique opportunity to provide rigorous and balanced scientific information to decision makers. By endorsing the IPCC reports, governments acknowledge the authority of their scientific content. The work of the organization is therefore policy-relevant and yet policy-neutral, never policy-prescriptive.

B. Climate Model Intercomparison

Climate model intercomparison is one of the most important lines of research contributing to our understanding of the climate, and it contributes significantly to the work of the IPCC [ 11 , 12 ]. The World Climate Research Programme’s (WCRP) Working Group on Coupled Modelling (WGCM) established the Coupled Model Intercomparison Project (CMIP) as a standard experimental protocol for studying the output of coupled atmosphere-ocean general circulation models (GCMs). CMIP provides a community-based infrastructure in support of climate model diagnosis, validation, intercomparison, documentation, and data access. This framework enables a diverse community of scientists to analyze GCMs in a systematic fashion, a process that serves to facilitate model improvement. Virtually the entire international climate modeling community has participated in this project since its inception in 1995. The Program for Climate Model Diagnosis and Intercomparison (PCMDI) archives much of the CMIP data and is one of a number of international climate data repositories that provide support for CMIP. PCMDI’s CMIP effort is funded by the Regional and Global Climate Modeling (RGCM) Program of the Climate and Environmental Sciences Division of the US Department of Energy’s Office of Science, Biological, and Environmental Research (BER) program.

Coupled atmosphere-ocean general circulation models allow the simulated climate to adjust to changes in climate forcing, such as increasing atmospheric carbon dioxide. CMIP began in 1995 by collecting output from model “control runs” in which climate forcing is held constant. Later versions of CMIP collected output from an idealized scenario of global warming, with atmospheric CO2 increasing at the rate of 1% per year until it doubles at about Year 70. CMIP output is available for study by diagnostic sub-projects, academic users, and the public.

Climate model intercomparison has proven to be an effective method to both improve climate models in general and to provide the basis for preparing ensembles to improve climate prediction. In the past, preparation of the data for such activities was the responsibility of the individual researcher. Recently, however, large international collaborative projects such as the CMIP3 and CMIP5 projects have agreed to share model output through the Earth System Grid Federation.

C. Earth System Grid Federation

The climate research community uses the Earth System Grid Federation (ESGF) as the primary mechanism for publishing and sharing IPCC data as well as the ancillary observational and reanalysis products described below [ 13 , 14 ]. ESGF is an international collaboration with a focus on serving the coupled model intercomparison projects and supporting climate and environmental science in general. The ESGF grew out of the larger Global Organization for Earth System Science Portals (GO-ESSP) community and reflects a broad array of contributions from its collaborating partners.

ESGF combines features found in a variety of grid computing approaches. ESGF is a peer-to-peer content distribution network in which geographically distributed collections can be accessed by the climate research community through a certificate authority mechanism. Published ESGF data, regardless of source, conforms to the community-defined CMIP5 Data Reference Syntax and Controlled Vocabularies standard. The trust relationship set up by the authority mechanism essentially creates a virtual organization of producers and consumers of ESGF products.

Reformatting the model output to a common standard and distributing the data through a common portal has proven to be an innovative approach, allowing thousands of additional researchers access to data previously limited to a much more sophisticated technical audience [ 6 , 7 ]. For example, IPCC Working Group Two, which focused on climate change impacts, adaptations, and vulnerabilities, and Working Group Three, which dealt with the mitigation of climate change, made extensive use of the CMIP3 and CMIP5 archives in the preparation of recent IPCC Assessment Reports. This approach to data distribution has proven so successful that other climate-related projects have emerged to provide CMIP-relevant observations and reanalyses. More than 1300 scientific papers have been written using these data. Distributing satellite observations and reanalysis products for use by the climate research community is the next step.

D. Obs4MIPs, Ana4MIPs, and CREATE-IP

Observations tailored for use by the climate science community have long been a dream of many climate modeling scientists and their graduate students [ 15 ]. When science teams associated with Earth observational missions produced new Level 3 products in the 1980s—the Earth Radiation Budget Experiment (ERBE), for example—it was a challenge for climate researchers to customize the data so that they could be used to validate a model's top-of-atmosphere (TOA) energy balance and cloud radiative properties. Once they mastered the format, each scientist obtained their own copy of the data and used it for model evaluation. This process has been repeated over and over by individual scientists, even today. As the processing of satellite data became more sophisticated, accessing the data became more onerous because of the proliferation of versions, processing levels, and other features. As a result, the IPCC's Third Assessment Report, released in 2001, devoted only minimal discussion to model validation using observations.

By 2013, IPCC’s Fifth Assessment Report included more extensive use of observational data, facilitated in part by the efforts to make satellite data more accessible in the intervening years. This was accompanied by a growing interest in the use of reanalysis data, another application of observational data of particular value to climate monitoring and research. Reanalyses assimilate historical observational data spanning an extended period of time using a single, constant assimilation scheme. They ingest all available observational data every 6–12 hours over the period being analyzed and produce a dynamically consistent estimate of the climate state at each time interval. Reanalysis data sets can span decades, going as far back as the beginning of the satellite era [ 2 ].

Because of this growing need to use observations in the IPCC process, the Observations for Model Intercomparison Projects (Obs4MIPs), Analysis for Model Intercomparison Projects (Ana4MIPs), and the Collaborative REAnalysis Technical Environment-Intercomparison Project (CREATE-IP) [ 7 ] have been created to provide a new way to distribute observational data and reanalyses for use by climate scientists. The objective of these projects is to prepare observational data (currently mostly satellite data) and selected reanalysis products in the same way the CMIP model data have been reformatted and tagged for inclusion into ESGF. The preparation involves ensuring that the data files are in NetCDF ( https://www.unidata.ucar.edu/software/netcdf/docs/ ) format and that the data adhere to the Climate and Forecast (CF) metadata conventions, in addition to other formatting procedures agreed upon by the World Climate Research Programme (WCRP) Working Group on Coupled Modelling (WGCM). To aid with these formatting procedures, a software utility, the Climate Model Output Rewriter (CMOR), is available that ensures adherence to the standard formatting. Software is also available to display and analyze the data in 2D and 3D.
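CMOR itself enforces the full CMIP controlled vocabularies and file-structure rules; as a loose, hypothetical illustration of the kind of check involved (not CMOR's actual logic or API), verifying that a variable carries a few CF-style attributes might look like:

```python
# Toy stand-in for metadata validation of the kind CMOR performs. Real CMOR
# also checks controlled vocabularies, coordinate bounds, units strings, and
# more; this only tests that a few CF-style attributes are present.

REQUIRED_CF_ATTRS = ["standard_name", "units", "long_name"]  # illustrative subset

def validate_cf_attrs(var_attrs):
    """Return the required attributes that are missing or empty ([] = pass)."""
    return [a for a in REQUIRED_CF_ATTRS if not var_attrs.get(a)]

tas_ok = {"standard_name": "air_temperature", "units": "K",
          "long_name": "Near-Surface Air Temperature"}
tas_bad = {"units": "K"}  # missing standard_name and long_name

errors_ok = validate_cf_attrs(tas_ok)
errors_bad = validate_cf_attrs(tas_bad)
```

A file failing such a check would be rejected before publication into ESGF.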

Data entered into the projects must have a history of peer-reviewed publications, be version controlled, and reside in a long-term archive. A WCRP Data Advisory Council (WDAC) Obs4MIPs task team has been established to govern the data inclusion process: for inclusion into the Obs4MIPs archive, a data producer submits a proposal to the WDAC task team containing the detailed information required above. The first step in preparing the data is generally done in consultation with the individual science teams, who identify specifics about the data, including the appropriate processing version, citations, and other details. Documentation and error estimates are also required.

Table 1 shows a current list of the observational data products available through ESGF. Because of the strict NetCDF file format and CF-compliance requirements, one limitation still being resolved is the desire of some climate modeling researchers to include data that have no corresponding variable in the CMIP archive but significant value to the climate research community. For instance, the Moderate Resolution Imaging Spectroradiometer (MODIS) produces several dozen products, yet only a few variables have a corresponding CMIP variable and are thus eligible for publication under the present guidelines. Another limitation is the limited support for including uncertainty information in the Obs4MIPs-formatted files.

Obs4MIPs and CREATE-IP Variables Available in ESGF

Air Temperature Standard Error | Northward Wind
Ambient Aerosol Optical Thickness at 550 nm | Number of CloudSat Profiles Contributing to the Calculation
Ambient Aerosol Optical Thickness at 550 nm Observations | Number of MISR Samples
Ambient Aerosol Optical Thickness at 550 nm Standard Deviation | PARASOL Reflectance
CALIPSO 3D Clear fraction | Precipitation - monthly and 3h
CALIPSO 3D Undefined fraction | Precipitation Standard Error
CALIPSO Clear Cloud Fraction | Sea Surface Height Above Geoid
CALIPSO Cloud Fraction | Sea Surface Height Above Geoid Observations
CALIPSO High Level Cloud Fraction | Sea Surface Height Above Geoid Standard Error
CALIPSO Low Level Cloud Fraction | Sea Surface Temperature
CALIPSO Mid Level Cloud Fraction | Sea Surface Temperature Number of Observations
CALIPSO Scattering Ratio | Sea Surface Temperature Standard Error
CALIPSO Total Cloud Fraction | Specific Humidity
Cloud Fraction retrieved by MISR | Specific Humidity Number of Observations
CloudSat 94 GHz radar Total Cloud Fraction | Specific Humidity Standard Error
CloudSat Radar Reflectivity CFAD | Surface Downwelling Clear-Sky Longwave Radiation
Eastward Near-Surface Wind | Surface Downwelling Clear-Sky Shortwave Radiation
Eastward Near-Surface Wind Number of Observations | Surface Downwelling Longwave Radiation
Eastward Near-Surface Wind Standard Error | Surface Downwelling Shortwave Radiation
Eastward Wind | Surface Upwelling Clear-Sky Shortwave Radiation
Fraction of Absorbed Photosynthetically Active Radiation | Surface Upwelling Longwave Radiation
ISCCP Cloud Area Fraction | Surface Upwelling Shortwave Radiation
ISCCP Mean Cloud Albedo (day) | TOA Incident Shortwave Radiation
ISCCP Mean Cloud Top Pressure (day) | TOA Outgoing Clear-Sky Longwave Radiation
ISCCP Mean Cloud Top Temperature (day) | TOA Outgoing Clear-Sky Shortwave Radiation
ISCCP Total Cloud Fraction (daytime only) | TOA Outgoing Longwave Radiation
Leaf Area Index | TOA Outgoing Shortwave Radiation
Mole Fraction of O3 | Total Cloud Fraction
Mole Fraction of O3 Number of Observations | Total Cloud Fraction Number of Observations
Mole Fraction of O3 Standard Error | Total Cloud Fraction Standard Deviation
Near-Surface Wind Speed | Water Vapor Path
Near-Surface Wind Speed Number of Observations | Sea Surface Temperature
Near-Surface Wind Speed Standard Error | Solar Zenith Angle
Air Temperature | Specific Humidity
Condensed Water Path | Surface Air Pressure
Convective Precipitation | Surface Downward Eastward Wind Stress
Eastward Near-Surface Wind | Surface Downward Northward Wind Stress
Eastward Wind | Surface Downwelling Longwave Radiation
Evaporation | Surface Downwelling Shortwave Radiation
Geopotential Height | Surface Temperature
Ice Water Path | Surface Upward Latent Heat Flux
Near-Surface Air Temperature | Surface Upward Sensible Heat Flux
Near-Surface Wind Speed | Surface Upwelling Longwave Radiation
Northward Near-Surface Wind | Surface Upwelling Shortwave Radiation
Northward Wind | TOA Incident Shortwave Radiation
Precipitation | TOA Outgoing Clear-Sky Longwave Radiation
Relative Humidity | TOA Outgoing Longwave Radiation
Sea Level Pressure | Total Cloud Fraction
Snowfall Flux | Water Vapor Path
Omega (=dp/dt)

Reanalysis is extremely useful for many issues relating to climate models [ 16 , 17 ]. The Ana4MIPs effort focuses on providing a select set of reanalysis variables to climate model intercomparison efforts. This project provides only variables that match the CMIP5 protocol and are of particular use to researchers who need reanalysis data as a baseline for model and model-ensemble evaluation. It has become apparent, however, that there is strong interest in making a more expansive set of atmospheric reanalysis data available to the community via the ESGF. In response, NASA has initiated the CREATE-IP project. CREATE-IP includes reanalysis products from the European Centre for Medium-Range Weather Forecasts (ECMWF), the National Oceanic and Atmospheric Administration (NOAA)/National Centers for Environmental Prediction (NCEP), NOAA/Earth System Research Laboratory (ESRL), NASA, and the Japan Meteorological Agency (JMA). Each reanalysis has been repackaged in a form similar to the CMIP and Obs4MIPs products. Table 1 shows the current CREATE-IP variables.

III. Next Generation Cyberinfrastructure for Climate Data Publication

Because of the fundamental importance of high-quality, readily-accessible data, an effective cyberinfrastructure for climate science requires improved ways to generate and disseminate data. Institutions that host ESGF servers are responsible for correctly formatting and registering their data contributions. IPCC data are produced in forms directly compatible with the ESGF CMIP5 standard. As described above, data products from other sources, such as Obs4MIPs, generally require reformatting. This alignment—moving from the frame of reference defined by the observational community to that used by the climate community—is often a mixed process of automatic and manual conversion and contributes significantly to the data preparation overhead of the Obs4MIPs activities. It is at the heart of the Obs4MIPs, Ana4MIPs, and CREATE-IP data challenge [ 18 ].

Efforts are underway to develop a cyberinfrastructure that overcomes these challenges [ 6 ]. The new capabilities will provide automatic conversion of NASA HDF-EOS/HDF datasets into NetCDF/CF datasets compatible with the ESGF, the ability to perform model checking on those converted datasets using the Climate Model Output Rewriter, and the ability to automatically publish remote sensing data into the ESGF.

We are working with three NASA Distributed Active Archive Centers (DAACs) to identify requirements for the various ad-hoc data publication pipelines used in the Obs4MIPs projects and then standardize them into a toolkit. The publication infrastructure is now part of a core project called Open Climate Workbench (OCW) [ 19 ], stewarded at the open-source Apache Software Foundation (ASF), the world's largest open source organization and home to some of the Web's most widely used software systems. For example, its flagship HTTPD web server serves 53% of the Web requests on the Internet.

A. Architecture

A notional architecture for a next generation publishing cyberinfrastructure is shown in Fig. 1 . As originally conceived, remote sensing data would enter the system from the bottom left of the figure. Remote sensing data used for comparison with climate models are generally gridded, though the system could handle swath information through its transformation process as described below.

Fig. 1. The NASA ESGF cyberinfrastructure (upper left) is responsible for publishing remote sensing datasets to the ESGF portal (upper right). Automated data generation and dissemination workflows substantially improve the efficiency and accuracy of the data publication process.

In an initial step ( Fig. 1 , Step 1), the architecture would leverage a technology such as OPeNDAP ( http://www.opendap.org/ ) to access and subset the data, which provides input to the next step ( Fig. 1 , Step 2), where data wrappers encapsulate mission-specific transformations needed to yield a variable (e.g., sea ice), along with its latitude and longitude in WGS84 format, time in ISO 8601 format, and an optional height value. This five-tuple of (variable value, latitude, longitude, time, height) would then be passed to a regridding step ( Fig. 1 , Step 3), where the data would be spatially and temporally aligned with the desired climate model output and written to a NetCDF/CF-compliant file with the necessary metadata ( Fig. 1 , Step 4). Finally, the data would be validated using the Climate Model Output Rewriter ( Fig. 1 , Step 5) and published to the ESGF ( Fig. 1 , Step 6).
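The hand-off between Steps 2 and 3 can be sketched in a few lines. The five-tuple representation comes from the text; the nearest-neighbour binning below is an illustrative assumption, since the actual regridding scheme is implementation-specific.

```python
# Sketch of Steps 2-3: observations expressed as (value, lat, lon, time,
# height) five-tuples are binned onto a target model grid. Nearest-neighbour
# averaging is an assumed, illustrative choice of interpolation.

def nearest(coord, axis):
    """Index of the axis point closest to coord."""
    return min(range(len(axis)), key=lambda i: abs(axis[i] - coord))

def regrid(obs, target_lats, target_lons):
    """Average all five-tuples that fall into each target grid cell."""
    sums, counts = {}, {}
    for value, lat, lon, time, height in obs:
        cell = (nearest(lat, target_lats), nearest(lon, target_lons))
        sums[cell] = sums.get(cell, 0.0) + value
        counts[cell] = counts.get(cell, 0) + 1
    return {cell: sums[cell] / counts[cell] for cell in sums}

# Three hypothetical observations on a 5-degree target grid: two fall near
# (10N, 20E) and one near (10N, 25E).
obs = [(271.0, 9.8, 20.1, "2014-01-01T00:00:00Z", None),
       (273.0, 10.2, 19.9, "2014-01-01T06:00:00Z", None),
       (280.0, 10.1, 24.9, "2014-01-01T00:00:00Z", None)]
grid = regrid(obs, target_lats=[0, 5, 10, 15], target_lons=[15, 20, 25, 30])
```

The gridded result would then be written to a NetCDF/CF file and passed to the CMOR validation step.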

The right side of Fig. 1 shows what a user would do once the remote sensing data are available in the ESGF. Here again OPeNDAP provides user and application access to published ESGF data ( Fig. 1 , Step 7). In this case, the architecture creates leveraged opportunities to combine OPeNDAP with other community-oriented tools, such as the Regional Climate Model Evaluation System (RCMES; https://rcmes.jpl.nasa.gov/ ), a Web-accessible database of remote sensing observations and analytical toolkit for computing climate metrics ( Fig. 1 , Steps 8–9).

B. Technologies and Implementation

Fig. 2 shows how we have implemented the notional architecture described above. We standardized on the use of a few technologies to implement the architecture, and we simplified the process by collapsing Steps 1–4 into Data Extraction and Data Conversion steps. The extraction steps are provided by OPeNDAP and Apache’s Object Oriented Data Technology (OODT) framework via the framework’s core services and three of its client tools, the Crawler, Workflow Manager, and File Manager.

Fig. 2. The as-implemented architecture of the NASA ESGF cyberinfrastructure comprises a series of workflow stages that combine Apache OODT software, NetCDF operators, OPeNDAP, Apache Solr, and the ESGF publishing toolkit.

The Workflow Manager encapsulates control and data flow and allows a user to model a series of steps in the scientific process as well as the input and output passed between steps. The File Manager tracks a file’s key information, including its metadata, provenance, location, Multi-Purpose Internet Mail Extensions (MIME) type, etc., and it provides data movement capabilities. The Crawler provides automated methods for ingesting, locating, selecting, and interactively extracting files and metadata managed by the File Manager, while simultaneously notifying the Workflow Manager that pipelines need to be executed.

The Crawler is seeded with an initial data staging area or a non-local OPeNDAP directory of remote sensing data. The Crawler extracts file and HDF metadata, which it subsequently presents to the File Manager for ingestion. At the same time, the Crawler notifies the Workflow Manager that the conversion pipeline should be initiated for the variable of interest. Data Extraction is kicked off, and the required five-tuple of information is extracted. Any necessary conversion is performed in the Data Conversion step using the NetCDF Operators package, which then writes a new NetCDF file based on the extracted five-tuple. The resulting output is sent to the Data Validation step that in turn calls a Python Web service that applies the CMOR checker. If the validation is successful, Metadata Harvesting collects the NetCDF information into a Thematic Real-Time Environmental Distributed Data Services (THREDDS) data server, publishes it to Apache Solr, and, ultimately, delivers it to the Earth System Grid Federation in the Publishing to ESGF step.
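The control flow just described (crawl, then extract, convert, validate, and publish) can be modeled as a simple chain of stages. Every function name below is hypothetical shorthand for the corresponding OODT-managed step, not the actual toolkit API.

```python
# Illustrative only: the staged pipeline the Workflow Manager encapsulates,
# modeled as functions where each stage's output feeds the next. All names
# and record fields here are invented for the sketch.

def extract(path):
    # Data Extraction: pull the five-tuples of interest from the granule.
    return {"source": path, "five_tuples": 3}

def convert(record):
    # Data Conversion: write a new NetCDF/CF file from the extracted tuples.
    return {**record, "format": "NetCDF/CF"}

def validate(record):
    # Data Validation: stand-in for the CMOR checker call.
    record = {**record, "valid": record["format"] == "NetCDF/CF"}
    if not record["valid"]:
        raise ValueError("CMOR validation failed")
    return record

def publish(record):
    # Publishing to ESGF: harvest metadata and deliver to the federation.
    return {**record, "published_to": "ESGF"}

PIPELINE = [extract, convert, validate, publish]

def run_pipeline(path):
    """Drive one granule through every stage in order."""
    result = path
    for step in PIPELINE:
        result = step(result)
    return result

record = run_pipeline("staging/example_granule.hdf")  # hypothetical input path
```

In the real system these stages run asynchronously, with the Crawler triggering the Workflow Manager and the File Manager tracking each product's metadata and provenance.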

Publishing remote sensing data alongside climate model output enables better comparisons and understanding that, in turn, more completely inform those who study the climate and those who make crucial policy decisions affecting the climate. Our expectation is that using automated workflows to streamline the publication of high-quality data will significantly improve this crucial activity.

IV. Next Generation Cyberinfrastructure for Climate Data Analytics

Climate model input and output data provide the basis for intellectual work in climate science. As these data sets grow in size, new approaches to data analysis are needed. In efforts to address the big data challenges of climate science, some researchers are moving toward a notion of Climate Analytics-as-a-Service (CAaaS). CAaaS combines high-performance computing and server-side analytics with scalable data management, cloud computing, a notion of adaptive analytics, and domain-specific APIs to improve the accessibility and usability of large collections of climate data [ 3 , 8 ]. In this section we take a closer look at these concepts and at a specific implementation of CAaaS in NASA's MERRA Analytic Services project.

A. High-performance server-side analytics

At its core, CAaaS must bring together data storage and high-performance computing in order to perform analyses over data where the data reside. MapReduce has been of particular interest because it provides an approach to high-performance analytics that is proving useful in many data-intensive domains [ 3 ]. MapReduce enables distributed computing over large data sets using high-end computer clusters. It is an analysis paradigm that combines distributed storage and retrieval with distributed, parallel computation, allocating to the data repository analytical operations that yield reduced outputs to applications and interfaces that may reside elsewhere. Since MapReduce implements repositories as storage clusters, data set size and system scalability are limited only by the number of nodes in the clusters.

MapReduce distributes computations across large data sets using a large number of computers (nodes). In a “map” operation a head node takes the input, partitions it into smaller sub-problems, and distributes them to data nodes. A data node may do this again in turn, leading to a multi-level tree structure. The data node processes the smaller problem, and passes the answer back to a reducer node to perform the reduction operation. In a “reduce” step, the reducer node then collects the answers to all the sub-problems and combines them in some way to form the output—an answer to the problem it was originally trying to solve.
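The map and reduce steps above can be sketched in miniature. The following single-process Python illustration computes a global average; the partition values and helper names are invented for the sketch, and a real MapReduce deployment would distribute the partitions across cluster nodes:

```python
from functools import reduce

def map_partition(values):
    # Map step: each data node reduces its partition to a partial
    # result, here a (sum, count) pair for its share of the data.
    return (sum(values), len(values))

def reduce_pairs(a, b):
    # Reduce step: a reducer node combines partial answers pairwise.
    return (a[0] + b[0], a[1] + b[1])

# Three partitions standing in for data spread across three nodes
# (arbitrary values, e.g., surface temperatures in kelvin).
partitions = [[280.1, 281.4], [279.8, 282.0, 280.5], [281.1]]

partials = [map_partition(p) for p in partitions]  # would run in parallel
total, count = reduce(reduce_pairs, partials)
average = total / count  # the reduced output returned to the client
```

Because the partial results combine associatively, the reduction can proceed in any order, which is what permits the multi-level tree structure described above.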

While MapReduce has proven effective for large repositories of textual data, its use in data-intensive science applications has been limited, because many scientific data sets are inherently complex, have high dimensionality, and use binary formats. Adapting MapReduce to complex, binary data types has been a major advancement in these efforts. Due to the importance of MapReduce in large-scale analytics, and its widespread use, there have been significant private-sector investments in recent years aimed at improving the performance and applicability of the technology—improvements that benefit and leverage the efforts of science communities that are becoming more involved in analytics.

B. Workflow-stratified adaptive analytics

The relationship between data and workflows contributes to the way we think about data analytics. Data-intensive analysis workflows, in general, bridge between a largely unstructured mass of archived scientific data and the highly structured, tailored, reduced, and refined analytic products that are used by individual scientists and form the basis of intellectual work in the domain. In general, the initial steps of an analysis, those operations that first interact with a data repository, tend to be the most general, while data manipulations closer to the client tend to be the most tailored—specialized to the individual, to the domain, or to the science question under study. The amount of data being operated on also tends to be larger on the repository-side of the workflow, smaller toward the client-side end-products.

This stratification can be used to optimize data-intensive workflows. We believe that the first job of an analytics system is to implement a set of near-archive, early-stage operations that are a common starting point in many of these analysis workflows. For example, it is important that a system be able to compute maximum, minimum, sum, count, average, variance, and difference operations that return, say, the average value of a variable when given its name, a temporal extent, and a spatial extent. Because of their widespread use, these simple operations—microservices, if you will—function as “canonical operations” with which more complex operations can be built. This is an active area of research with many analytic frameworks in development [ 20 – 22 ]. However, our work, with its current focus on workflow stratification, microservices, and the client-side construction of complex operations using server-side microservices, is distinctive [ 23 ]. And, by implementing basic descriptive statistics and other primitive operations over data in a high-performance compute-storage environment using powerful analytical software, the system is able to support more complex analyses, such as the predictive modeling, machine learning, and neural networking approaches often associated with advanced analytics.
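As a minimal sketch, such a canonical average operation might look like the following. The signature, the record layout, and the variable name are hypothetical, not the actual MERRA/AS interface:

```python
def avg(variable, time_extent, spatial_extent, records):
    """Average a variable over a temporal and spatial extent.

    time_extent: (t_start, t_end); spatial_extent: (lat0, lon0, lat1, lon1).
    `records` is a list of (time, lat, lon, value) tuples standing in for
    the data repository; `variable` is carried for illustration only.
    """
    t0, t1 = time_extent
    lat0, lon0, lat1, lon1 = spatial_extent
    selected = [v for (t, lat, lon, v) in records
                if t0 <= t <= t1 and lat0 <= lat <= lat1 and lon0 <= lon <= lon1]
    return sum(selected) / len(selected)

records = [(1, 10.0, 20.0, 5.0), (2, 10.0, 20.0, 7.0), (3, 50.0, 60.0, 9.0)]
mean_value = avg("tas", (1, 2), (0.0, 0.0, 30.0, 30.0), records)  # → 6.0
```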

C. Domain-specific application programming interfaces

CAaaS capabilities are exposed to clients through a RESTful Web services interface. To make these capabilities easier to use, we are building a client-side Climate Data Services (CDS) application programming interface (API) that essentially wraps the REST interface’s Web service endpoints and presents them to client applications through a library of Python-based methods. With this arrangement, application developers have the option of coding against the REST interface directly or using the CDS API’s Python library, with its more familiar method syntax.
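A minimal sketch of such a wrapper follows. The base URL, endpoint path, and parameter names are hypothetical; the point is that the library hides the HTTP plumbing behind ordinary method calls:

```python
from urllib.parse import urlencode

class ClimateDataService:
    """Toy client-side wrapper over a RESTful analytics service."""

    def __init__(self, base_url):
        self.base_url = base_url.rstrip("/")

    def _endpoint(self, operation, **params):
        # Build the Web service URL for one operation (sorted for stability).
        return f"{self.base_url}/{operation}?{urlencode(sorted(params.items()))}"

    def average(self, variable, start, end):
        # A real client would issue the HTTP request here; we return the
        # constructed URL to keep the sketch self-contained.
        return self._endpoint("avg", var=variable, start=start, end=end)

cds = ClimateDataService("https://example.org/cds")
url = cds.average("T2M", "201001", "201012")
# url == "https://example.org/cds/avg?end=201012&start=201001&var=T2M"
```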

APIs can take many forms, but the goal for all APIs is to make it easier to implement the abstract capabilities of a system. In building the CDS API, we are trying to provide for climate science a uniform semantic treatment of the combined functionalities of large-scale data management and server-side analytics. We do this by combining concepts from the Open Archive Information Systems (OAIS) reference model, highly dynamic object-oriented programming APIs, and Web 2.0 resource-oriented APIs.

The OAIS reference model, defined by the Consultative Committee for Space Data Systems, addresses a full range of archival information preservation functions, including ingest, archival storage, data management, access, and dissemination—full information lifecycle management. OAIS provides examples and some “best practice” recommendations and defines a minimal set of responsibilities for an archive to be called an OAIS [ 25 ]. These high-level services provide a vocabulary that we have adopted for the CDS Reference Model and associated Library and API.

The CDS Reference Model is a logical specification that presents a single abstract data and analytic services model to calling applications. The Reference Model can be implemented using various technologies; in all cases, however, actions are based on the following six primitives:

Submit data to a service
Retrieve data from a service (synchronous)
Request data from a service (asynchronous)
Retrieve data from a service
Track progress of service activity
Initiate a service-definable extension.
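The six primitives can be sketched as an abstract interface. The method names below are our paraphrases, not the actual CDS Library API, and the second "retrieve" primitive is interpreted here as fetching the result of an earlier asynchronous request:

```python
class CDSPrimitives:
    """Abstract sketch of the six CDS Reference Model primitives."""

    def submit(self, data, service): ...          # submit data to a service
    def retrieve(self, query): ...                # synchronous retrieval
    def request(self, query): ...                 # asynchronous request; returns a token
    def download(self, token): ...                # fetch the result of a prior request
    def status(self, token): ...                  # track progress of service activity
    def execute(self, extension, **kwargs): ...   # service-definable extension
```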

Within this OAIS-inspired framework, the Python-based CDS Library sits atop a RESTful Web services client that encapsulates inbound and outbound interactions with various climate data services ( Fig. 3 ). These provide the foundation upon which we have also built a CDS command line interpreter (CLI) that supports interactive sessions. In addition, Python scripts and full Python applications can use methods imported from the API. The resulting client stack can be distributed as a software package or used to build a cloud-based service (SaaS) or distributable cloud image (PaaS).

Fig. 3. Notional architecture of a CAaaS system. Applications have the option of reaching services directly through the system’s Web service REST interface or through the CDS API’s Python libraries.

Unlike other APIs, our approach focuses on the specific analytic requirements of climate science and unites the language and abstractions of collections management with those of high-performance analytics. Doing so reflects at the application level the confluence of storage and computation that is driving big data architectures of the future.

D. MERRA Analytic Services

The MERRA Analytic Services (MERRA/AS) project brings these elements together in an end-to-end demonstration of CAaaS ( Fig. 4 ). MERRA/AS enables MapReduce analytics over NASA’s Modern-Era Retrospective Analysis for Research and Applications (MERRA) data collection. The MERRA reanalysis integrates observational data with numerical models to produce a global temporally and spatially consistent synthesis of key climate variables [ 25 ]. The effectiveness of MERRA/AS has been demonstrated in several applications, and the work is contributing new ideas about how a next generation cyberinfrastructure for climate data analytics might be designed.

Fig. 4. The MERRA Analytic Service provides an end-to-end demonstration of the principles underlying Climate Analytics-as-a-Service: important data embedded in a high-performance storage-compute environment where analytic services are exposed via Web services to client-side applications through an easy-to-use client-side API tailored to the climate research community.

In simple terms, our vision for MERRA/AS is that MERRA data are stored in a Hadoop Distributed File System (HDFS) on a MERRA/AS cluster. Functionality is exposed through the CDS API, which provides a basic set of operations that can be assembled into more complex operations and arbitrarily complex workflows (these, in turn, can be folded back into the API and the MERRA/AS service as further extensions). The complexities of the underlying mapper and reducer codes for the basic operations are encapsulated and abstracted away from the user, making these common operations easier to use.

The Apache Hadoop software library is the classic framework for MapReduce distributed analytics. We are using Cloudera, a 100% open source, enterprise-ready distribution of Apache Hadoop that is integrated with configuration and administration tools and related open source packages. The total size of the MERRA/AS HDFS repository is approximately 480 TB. Currently, MERRA/AS is running on a 36-node Dell cluster that has 576 Intel 2.6 GHz Sandy Bridge cores, 1300 TB of raw storage, 1250 GB of RAM, and an 11.7 TF theoretical peak compute capacity. Nodes communicate through a Fourteen Data Rate (FDR) InfiniBand network with peak TCP/IP speeds in excess of 20 Gbps.

The canonical operations that implement MERRA/AS’s maximum, minimum, count, sum, difference, average, and variance calculations are Java MapReduce programs that are ultimately exposed as simple references to CDS Library methods or as Web services endpoints. There is a substantial code ecosystem behind these apparently simple operations, nearly 6000 lines of Java code being offloaded from the user to the MERRA/AS service.
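The reducer side of these canonical operations can be illustrated with sufficient statistics: per-partition (count, sum, sum of squares) triples combine associatively into a global count, sum, average, and population variance. The following is a hedged Python miniature of that idea, not the actual Java implementation:

```python
def partial(values):
    # Per-partition sufficient statistics: (count, sum, sum of squares).
    return (len(values), sum(values), sum(v * v for v in values))

def combine(a, b):
    # Partial results combine element-wise, in any order.
    return (a[0] + b[0], a[1] + b[1], a[2] + b[2])

def finalize(agg):
    # Derive the canonical statistics from the combined triple.
    n, s, ss = agg
    mean = s / n
    variance = ss / n - mean * mean  # population variance
    return mean, variance

stats = finalize(combine(partial([1.0, 2.0]), partial([3.0, 4.0, 5.0])))
# stats == (3.0, 2.0)
```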

E. MERRA/AS in use

MERRA/AS is currently in beta testing with about two dozen partners across a wide range of organizations and topic areas. It operates at NASA Technology Readiness Level seven (TRL 7): a prototype deployed in an operational environment at or near the scale of the production system, with most functions available for demonstration and test. While the system is not open to public beta testing, arrangements can be made to test it through NASA’s Climate Model Data Services [ 27 ].

In one beta application, MERRA/AS’s Web service is providing data to the RECOVER wildfire decision support system, which is being used for post-fire rehabilitation planning by Burned Area Emergency Response (BAER) teams within the US Department of the Interior and the US Forest Service [ 28 ]. This capability has led to the development of new data products based on climate reanalysis data that until now were not available to the wildfire community.

In our largest deployment exercise to date, the CDS Client Distribution Package and the CDS API have been used by the iPlant Collaborative to integrate MERRA data and MERRA/AS functionality into the iPlant Discovery Environment. iPlant is a virtual organization created by a cooperative agreement funded by the US National Science Foundation (NSF) to create cyberinfrastructure for the plant sciences. The project develops computing systems and software that combine computing resources, such as those of TeraGrid, with bioinformatics and computational biology software. Its goal is to ease collaboration among researchers through improved data access and processing efficiency. Primarily centered in the US, it collaborates internationally and includes a wide range of governmental and private-sector partners [ 29 ].

Initial results have shown that analytic engine optimizations can yield near real-time performance of MERRA/AS’s canonical operations and that the total time required to assemble relevant data for many applications can be significantly reduced, often by as much as two to three orders of magnitude [ 24 ].

V. Next Generation Cyberinfrastructure for Enhanced Interoperability

Big data challenges are sometimes viewed as problems of large-scale data management where solutions are offered through an array of traditional storage and archive theories and technologies. These approaches tend to view big data as an issue of storing and managing large amounts of structured data for the purpose of finding subsets of interest. Alternatively, big data challenges can be viewed as knowledge management problems where solutions are offered through an array of analytic techniques and technologies. These approaches tend to view big data as an issue of extracting meaningful patterns from large amounts of unstructured data for the purpose of finding insights of interest.

As the ESGF community grapples with its scaling challenges, it seeks to find a balance between these competing views. This is evident in the charge that the ESGF Compute Working Team (ESGF-CWT)—the international team of collaborators responsible for designing ESGF’s “next generation” architecture—has laid out for itself. The Team’s overarching goal is to increase the analytical capabilities of the enterprise, primarily by exposing high-performance computing resources and analysis tools to the community through Web services [ 30 ]. Ideally, ESGF data from the Federation’s distributed collections would be united with the Web-accessible tools and compute resources needed to perform advanced analytics at the scale needed for IPCC’s increasingly complex work.

However, integrating high-performance computing and high-performance analytics—finding an optimal storage-compute balance in ESGF’s ecosystem of distributed resources—is not a trivial exercise. ESGF’s technical heritage is that of a large-scale distributed archive. Its nodes basically store and distribute data. They typically support compute resources sufficient to stream data out of storage onto the network for client consumption, and the behaviors implemented and exposed by ESGF’s Web service interface are the basic discovery and download operations of an archive.

Currently, the ESGF is looking to the geospatial community for ideas on how to strike a balance between data analytics and data storage. Improved access to distributed compute and storage resources has been achieved in the geographic information systems (GIS) community through a series of standards-making activities aimed at enhancing machine-to-machine interoperability, one of the most notable being the work of the Open Geospatial Consortium (OGC). OGC is an international industry consortium of over five hundred companies, government agencies, and universities participating in a consensus process to develop publicly available interface standards. OGC’s abstract specifications and implementation standards are designed to support interoperable solutions that “geo-enable” a wide range of hardware platforms and software applications [ 31 ]. To see how improved machine-to-machine interoperability can lead to increased analytic capabilities across distributed storage systems, it is helpful to understand Web services and the role that Web APIs play in the discussion.

A. Web services and domain-specific API enhancements

As described above, in the world of Web services, there are two types of interfaces. On the service side, a system interface maps the methods, functions, and programs that implement the service’s capabilities to Hypertext Transfer Protocol (HTTP) messages that expose the service’s capabilities to the outside world. Client applications can consume these Web service endpoints to access services. The World Wide Web Consortium (W3C) views Web services as a way to ensure machine-to-machine interoperability [ 32 ]. The precise messaging format can vary from community to community, often reflecting the specialized functions or audiences they serve. Significant standards activities have grown up around the design and implementation of such Web services.

There also are the classic client-side APIs familiar to application developers. Generally, these comprise local libraries that reside on the developer’s host computer and can be statically or dynamically referenced by client applications. They speed development, reduce error, and often implement abstractions that are specialized to the needs of the audiences they serve. They can be used to build applications, workflows, and domain-specific toolkits, workbenches, and integrated development environments (IDEs). Building on the concepts underlying CAaaS, the ESGF-CWT is working at both levels.

B. Implementation approach

The ESGF-CWT is adopting OGC’s Web Processing Service (WPS) interface standard for its next generation architecture [ 33 ]. WPS is essentially an XML-based remote procedure call (RPC) protocol for invoking processing capabilities as Web services. It has been used in the geospatial community for delivering low-level geospatial processing services. However, WPS can be generalized to other types of applications and data because of its simplicity: WPS uses a single operation (Execute) to invoke remote services; its two other operations (GetCapabilities and DescribeProcess) are used for discovery and to query services for the information necessary to build the signatures needed by Execute operations.
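The three operations can be illustrated as key-value-pair (KVP) GET requests. The host, process identifier, and data inputs below are hypothetical; the service/version/request parameters follow the WPS 1.0.0 KVP encoding:

```python
from urllib.parse import urlencode

BASE = "https://example.org/wps"  # hypothetical WPS endpoint

def wps_url(request, **extra):
    # Assemble a WPS KVP request URL from the mandatory parameters
    # plus any operation-specific extras.
    params = {"service": "WPS", "version": "1.0.0", "request": request}
    params.update(extra)
    return f"{BASE}?{urlencode(params)}"

get_capabilities = wps_url("GetCapabilities")
describe = wps_url("DescribeProcess", identifier="ensembleAverage")
execute = wps_url("Execute", identifier="ensembleAverage",
                  datainputs="variable=tas;model=ALL")
```

A client would first call GetCapabilities to discover available processes, then DescribeProcess to learn a process's inputs and outputs, and finally Execute to invoke it.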

ESGF can improve interoperability and accessibility by defining ESGF community standards at one or more places in its Web services architecture. First, ESGF can define an ESGF Compute Node Service Specification—an agreed-upon capability and naming convention for each conformant compute node. Regardless of how the services are accessed, each node would have known capabilities implemented in known ways. Second, ESGF can define an ESGF WPS Extension Specification—a specialization of the WPS standard wherein the syntax and semantics of required and optional fields of WPS response documents are tailored to the needs of the ESGF. With this approach, regardless of how services are implemented or named, their means of access is commonly understood within the Federation. Finally, ESGF can define an ESGF API—a client-side API that consumes the Web service endpoints exposed by a WPS-compliant ESGF service and presents them to client applications as a library of easy-to-use function calls tailored to the needs of the ESGF community. Here, regardless of implementation and communication details, programmers could access node capabilities using a familiar programming library.

The ESGF-CWT is developing options two and three: an ESGF WPS Extension Specification and an accompanying client-side ESGF API along the lines of the CDS API (Fig. 3). A reference implementation of an ESGF Multi-Model Averaging Service will be released soon. These enhancements will be of value to the ESGF community because they will improve interoperability at two levels within ESGF’s overall architecture. Greater system-to-system interoperability improves connectivity and, in the case of WPS, allows the ESGF community to avail itself of WPS-compliant capabilities that exist within the geospatial community; having an API makes it easier to create toolkits, workbenches, and plug-ins tailored to the ESGF that can improve efficiencies and communication within the community.

VI. Conclusion

The climate research activities that provided the basis for IPCC’s 2013 Fifth Assessment Report worked with about two petabytes of data. It is estimated that the research community’s collective work on the Sixth Assessment Report, which will probably be released around 2020, will generate as much as 100 petabytes of data [ 7 ]. The ESGF provides the primary cyberinfrastructure to support this global scientific collaboration. Clearly, IPCC’s success depends on our ability to scale ESGF capabilities to accommodate the big data challenges posed by this effort. The technology enhancements described here will not provide a comprehensive solution to the challenges facing the climate science community. But they do represent important threads of development that we believe are on the path to a significantly improved next generation cyberinfrastructure for climate science.

Acknowledgments

This work has been funded by the NASA Computational Modeling Algorithms and Cyberinfrastructure (CMAC) program through grants to the authors’ collaborating institutions.

John L. Schnase, along with Tsengdar Lee, co-chaired the 2014 Big Data in Space Conference (BiDS ’14) special session on “Big Data Challenges in Climate Science,” upon which this article is based. Dr. Schnase attended Angelo State University, the University of Texas at Austin, Baylor College of Medicine, and Texas A&M University, where he received the Ph.D. degree in computer science in 1992.

Before joining NASA in 1999, his work on the natural history of Cassin’s Sparrow ( Aimophila cassinii ) resulted in an early application of computers in avian energetics modeling. Currently, he is the climate informatics focus area lead in NASA Goddard Space Flight Center’s Office of Computational and Information Sciences and Technology, where his work focuses on the development of advanced information systems to support the Earth sciences. He also holds adjunct faculty appointments at George Mason University and the University of Maryland, College Park.

Dr. Schnase is former Director of the Center for Botanical Informatics at the Missouri Botanical Garden and former Director of the Advanced Technology Group at Washington University School of Medicine. He is a Fellow of the American Association for the Advancement of Science (AAAS), a member of the Executive Committee of the Computing Accreditation Commission (CAC) of ABET, a former member of the President’s Council of Advisors on Science and Technology (PCAST) Panel on Biodiversity and Ecosystems, and currently Co-Chairs the Ecosystems Societal Benefit Area of the Office of Science and Technology Policy (OSTP) National Observation Assessment.

Tsengdar J. Lee received the M.S. degree in civil engineering in 1988 and the Ph.D. degree in atmospheric science in 1992 from Colorado State University, Fort Collins, CO. Trained as a short-term weather modeler, his work focused on the integration of weather and ancillary geographical information data into weather models to produce reliable forecasts. His research pioneered the modeling of land surface hydrology’s impact on weather forecasting.

Prior to joining NASA in 2001, Dr. Lee held positions as Senior Technical Advisor with Northrop Grumman Information Technology and Senior Staff Engineer with Litton PRC. He worked on the Advanced Weather Information Processing System (AWIPS) for the National Weather Service, where he was responsible for the rapid development, integration, and commercialization of the AWIPS client-server system. Lee also was a principal engineer on the effort to develop the AWIPS network monitoring and control system.

Dr. Lee currently manages the High-End Computing Program at NASA Headquarters, where he is responsible for maintaining the high-end computing capability to support the agency’s aeronautics research, human exploration, scientific discovery, and space operations missions. He is the NASA Weather focus area lead. In this role, he is responsible for advanced planning for weather research and development priorities. Between 2011 and 2012, Dr. Lee served as the acting CTO for IT at NASA, at which time he funded the agency’s computing service initiatives, including OpenStack.

Contributor Information

John L. Schnase, NASA Goddard Space Flight Center, Greenbelt, MD 20771 USA.

Tsengdar J. Lee, NASA Headquarters, Washington, DC 20546 USA.

Chris A. Mattmann, NASA Jet Propulsion Laboratory, Pasadena, CA 91109 USA.

Christopher S. Lynnes, NASA Goddard Space Flight Center, Greenbelt, MD 20771 USA.

Luca Cinquini, NASA Jet Propulsion Laboratory, Pasadena, CA 91109 USA.

Paul M. Ramirez, NASA Jet Propulsion Laboratory, Pasadena, CA 91109 USA.

Andre F. Hart, NASA Jet Propulsion Laboratory, Pasadena, CA 91109 USA.

Dean N. Williams, Lawrence Livermore National Laboratory, Livermore, CA 94550 USA.

Duane Waliser, NASA Jet Propulsion Laboratory, Pasadena, CA 91109 USA.

Pamela Rinsland, NASA Langley Research Center, Hampton, VA 23681 USA.

W. Philip Webster, NASA Goddard Space Flight Center, Greenbelt, MD 20771 USA.

Daniel Q. Duffy, NASA Goddard Space Flight Center, Greenbelt, MD 20771 USA.

Mark A. McInerney, NASA Goddard Space Flight Center, Greenbelt, MD 20771 USA.

Glenn S. Tamkin, NASA Goddard Space Flight Center, Greenbelt, MD 20771 USA.

Gerald L. Potter, NASA Goddard Space Flight Center, Greenbelt, MD 20771 USA.

Laura Carrier, NASA Goddard Space Flight Center, Greenbelt, MD 20771 USA.

SYSTEMATIC REVIEW article

Climate data sonification and visualization: an analysis of topics, aesthetics, and characteristics in 32 recent projects.

PerMagnus Lindborg

  • 1 SoundLab, School of Creative Media, City University of Hong Kong, Kowloon, Hong Kong SAR, China
  • 2 Critical Alarms Laboratory, Faculty of Industrial Design Engineering, Delft University of Technology, Delft, Netherlands

Introduction: It has proven a hard challenge to stimulate climate action with climate data. While scientists communicate through words, numbers, and diagrams, artists use movement, images, and sound. Sonification, the translation of data into sound, and visualization offer techniques for representing climate data, often with innovative and exciting results. The concept of sonification was initially defined in terms of engineering, and while this view remains dominant, researchers increasingly draw on knowledge from electroacoustic music (EAM) to make sonifications more convincing.

Methods: The Aesthetic Perspective Space (APS) is a two-dimensional model that bridges utilitarian-oriented sonification and music. We started with a review of 395 sonification projects, from which a corpus of 32 that target climate change was chosen; a subset of 18 also integrate visualization of the data. To clarify relationships with climate data sources, we determined topics and subtopics in a hierarchical classification. Media duration and lexical diversity in descriptions were determined. We developed a protocol to span the APS dimensions, Intentionality and Indexicality, and evaluated its circumplexity.

Results: We constructed 25 scales to cover a range of qualitative characteristics applicable to sonification and sonification-visualization projects, and through exploratory factor analysis, identified five essential aspects of the project descriptions, labeled Action, Technical, Context, Perspective, and Visualization. Through linear regression modeling, we investigated the prediction of aesthetic perspective from essential aspects, media duration, and lexical diversity. Significant regressions across the corpus were identified for Perspective (β = 0.41***) and lexical diversity (β = −0.23*) on Intentionality, and for Perspective (β = 0.36***) and Duration (logarithmic; β = −0.25*) on Indexicality.

Discussion: We discuss how these relationships play out in specific projects, also within the corpus subset that integrated data visualization, as well as broader implications of aesthetics on design techniques for multimodal representations aimed at conveying scientific data. Our approach is informed by the ongoing discussion in sound design and auditory perception research communities on the relationship between sonification and EAM. Through its analysis of topics, qualitative characteristics, and aesthetics across a range of projects, our study contributes to the development of empirically founded design techniques, applicable to climate science communication and other fields.

1. Introduction

Increasingly, researchers are asked to be more than knowledge-creators within their field of expertise, and “envisage the optimal processes and techniques for translating data into understandable, consumable modes of representation for audiences to digest” ( Chandler et al., 2015 ). It has proven a hard challenge to present climate science to convince not only decision-makers but also the general public. Another challenge is the “information deficit” fallacy, which arises when abundant information is assumed to lead to better understanding simply by existing (e.g., in scientific publications) but fails in its purpose because the meaning is “inaccessible and misaligned with the needs of different audiences” ( Jacobs et al., 2017 ). This is a conundrum that plays into the hands of climate denialists. While scientists mainly communicate through words, numbers, and diagrams, artists and designers may use movement, images, sound, and sculpture. Artistically inclined researchers have proposed to “move away from…static visualizations, and visual narratives with simplistic messages” ( Jacobs et al., 2017 ) and instead communicate through “data displays that embed,” and embody, knowledge about climate science “in more sensory, tangible and visceral representations” which will make the scientific data “come alive” ( Polli, 2011 ). This is possible by creating “dynamic and performative experiences of scientific data…that support engagement with issues of complexity, uncertainty and risk” ( Jacobs et al., 2017 ). Interdisciplinary projects are often based on the notion that both art and science are “founded essentially on curiosity, but the challenge and the difficulty reside in the reality of bringing together contrasting methodologies that frequently use very different written and visual languages” ( Ruddock et al., 2012 ).

Sonification, the translation of data into sound, and visualization, the translation of data into light, offer techniques for designing sonic and visual representations of scientific data in ways that can often be highly innovative and exciting. Data visualization as a discipline has been systematized over the past century with the definition of standardized methods for the translation of data into visuals such as static images, animated images, and, more recently, interactive web applications ( Bertin, 1967 ; Munzner, 2014 ). Meanwhile, sonification is still an emerging discipline that struggles to define its boundaries, its impacts, and, more importantly, shared methods, processes, and tools for the mapping of data to sound ( Lenzi, 2021 ). The relationship between standardized data visualization strategies such as Bertin's “visual variables” and sonification strategies is a growing Research Topic within the sonification community ( Enge et al., 2021 ; Caiola et al., 2022 ).

It is a major design challenge to create sonifications that are “not only effective at communicating information but which are sufficiently engaging to engender sustained attention. Sonification may be ineffective if the rendered sound appears arbitrary to the listener in relation to the underlying data. The design task then becomes about finding a suitable fit between communicational efficacy and appropriate aesthetic character” ( Vickers et al., 2017 , p. 2). When sonification emerged as a field of study around three decades ago ( Kramer, 1994 ; Hermann et al., 2011 ), it was defined in terms of engineering and utilitarian purposes. Since then, researchers in the field of auditory display have argued, sometimes vividly, whether to make use of design principles developed for sound art and electroacoustic music composition ( Barrass and Vickers, 2011 ), placing a stronger focus on aesthetics and systematic evaluation ( Bonet, 2021 ), or stay true to Kramer's original concept. The first author of the present study has proposed to extend the definition of sonification as “any technique that translates data into non-speech sound, with a systematic, describable, and reproducible method, in order to reveal or facilitate communication, interpretation, or discovery of meaning that is latent in the data, having a practical, artistic, or scientific purpose” ( Liew and Lindborg, 2020 ). A parallel definition was proposed for visualization (see also Lankow et al., 2012 , p. 20). However, note that while sonification defined this way embraces art, it still excludes speech [as Kramer (1994) did in his seminal paper]. Speech contains acoustic symbols (utterances) that convey semantic meaning within a given context (language). In contrast, it is hard to uphold that “visualization” should exclude visual symbols such as text, numbers, emojis etc., at least in practice. The exclusion of “speech sound” in sonification has been questioned ( Boehringer, 2022 , in review).

Recent research shows that sonification can overcome barriers in science communication, because “the translation from data into audio reveals changing variables to the listener through changes in sonic dimensions, such as frequency, pitch, amplitude, and location in the stereo field. In musical contexts, data can map to these sonic dimensions, as well as higher-order musical dimensions, such as tempo, form, and timbre” ( Sawe et al., 2020 ). A recent review of sonification strategies in astronomical research ( Zanella et al., 2022 ) highlights the added value of using sound to increase the accessibility of scientific knowledge, especially for engaging a visually impaired non-expert audience. In a meta-study, Dubus and Bresin (2013) charted strategies for the sonification of real-world physical processes. The central design strategy is referred to as “parameter mapping.” The parameters in question describe the input and output spaces: the input might be time-stamped and geo-tagged measurements of the environment, such as ocean temperature, atmospheric CO 2 levels, polar ice coverage, biodiversity loss, and so on, while the output comprises the parameters of sound generation ( Walker and Nees, 2011 ), the two connected by a “mapping” network that describes the relationships between them. This perspective, however, does not speak to the qualitative characteristics of either input or output, or to the influence of the sometimes nebulous information accompanying the sonification itself: like the remora fish that accompany large sharks. While the media itself is certainly the focus of our attention (in the present study: audio recordings of sonifications, or movie recordings of concurrent sonification-visualizations), the output as a whole can be a very complex entity. In many cases it is a sprawling set of informative documents, technical descriptions, photography, design drawings, iterative process output, contextual descriptions, published articles, and more.
Moreover, there is no standardized way of documenting such projects, and each new project will define the form, shape, and size of presentation materials necessary to bring its communicative purposes across to audiences. How can we evaluate such complexity across several projects?

In the present study, the question of how to understand the relationship between aesthetic perspective and qualitative characteristics of projects is central. Aesthetic judgement is considered to be a perceptual-cognitive mechanism of higher order (consider the BRECVEMA framework, Juslin, 2013 ). Kramer (1994) was keenly aware that certain perceptual qualities are essential for a data sonification to be received and understood as meaningful. He argued for scientific evaluation criteria—systematicity, objectivity, and replicability—so that sonification might “be used as a scientific method” ( Hermann, 2008 ). As a result, sonification, as a discipline or technique, came to be closely associated with information engineering. In reaction, other researchers argued that data sonification could benefit from knowledge about auditory perception gained in artistic fields, notably sound art and electroacoustic composition ( Vickers and Hogg, 2006 ; Barrass and Vickers, 2011 ; Vickers et al., 2017 ). Research in aesthetic appreciation in the arts has a very long history. The attention to everyday experiences and environments is more recent ( Berlyne, 1974 set the tone), as is the systematic study of aesthetic emotions ( Schindler et al., 2017 ; Menninghaus et al., 2019 ), the measurement of qualia such as beauty, awe, and interestingness ( Silvia, 2005 ; Ramakrishnan and Greenwood, 2009 ), and, in particular, aesthetics within a framework of music emotions ( Juslin, 2019 ). As a tool to chart the perceptual relationships between sonification and other sonic design techniques, Vickers and Hogg (2006) proposed the Aesthetic Perspective Space (APS; see also Vickers et al., 2017 ): a theoretical construct, a two-dimensional circumplex model aimed at bridging auditory display and electroacoustic music, with two axes labeled Intentionality and Indexicality.
The first dimension describes purpose, and is anchored by the concepts of Ars Informatica (utilitarian) and Ars Musica (artistic). The second dimension describes the nature of the sonic materials: whether they point toward Concrete or Abstract ontologies. Intentionality is a gradient of the designer's intention in taking “deliberate decisions to address specific needs, in a given context and with a purpose, when transforming data into sound” ( Lenzi and Ciuccarelli, 2020 ), while Indexicality indicates the facility of causal inference: how much a sound “sounds like the thing that made it” ( Vickers and Hogg, 2006 , p. 213). Section 4.1.3 below discusses APS in light of Simon Emmerson's theoretical work ( Emmerson, 1986 , 2013-14 ).

Venturing deeper into the aesthetics of sonifications (and sonification-visualizations), we developed a model of perceptual rating scales spanning the APS, and tested it. Following that, we constructed scales to cover qualitative characteristics of interest that are applicable across a range of complex projects that deal with climate data. Through exploratory factor analysis, we identified essential aspects of the projects being studied. Finally, with predictive modeling, we investigated salient relationships between aesthetic dimensions and essential aspects in the dataset.

2. Materials and methods

2.1. Corpus

Recent years have witnessed an increase in sonification projects that aim to communicate complex, socially relevant phenomena to a larger public ( Lenzi, 2021 ), a transition that the community of data sonification and auditory display has advocated for on several occasions. For the present review of projects dedicated to climate change, we largely followed the PRISMA framework (Preferred Reporting Items for Systematic Reviews and Meta-Analyses, https://www.prisma-statement.org/ ) to review all the currently listed projects in what is arguably the most complete repository of sonification projects, namely the Data Sonification Archive (DSA; https://sonification.design ), a curated, community-sourced online collection launched in 2021, together with works presented at the Conference on Data Art for Climate Action (DACA; http://dataclimate.org ) held in February 2022. Following the PRISMA flowchart, we initially identified 395 projects: 374 from DSA and 19 from DACA. As this was already a large number for the kind of analysis-intensive study we were preparing, we chose to focus the present study on the complete set of projects listed in these two databases; future work might start with a broad keyword-matching search on the World Wide Web. As inclusion criteria, we considered projects published within the past 20 years that evidenced a significant component of data sonification relating (in some way or form) to climate action (as in climate change, climate crisis, climate mitigation, and so forth). Projects could additionally employ visualization to represent the data. The first author screened the records and excluded 337 projects, all from DSA, for example when the title or archival topic indicated that the data came from a source not relevant to climate, such as finance, mobility, astronomy, or war. Another 22 projects were not eligible because complete information about them could not be obtained at the time.
The selection process yielded a corpus of 32 projects. Of these, 23 are from DSA and 13 from DACA, while 4 appear in both contexts. All focus on sonification of climate data, and 18 of the 32 projects also include a data visualization component. There are 26 different first authors, of whom 4 are female and one is of unknown gender. Including co-authors, there are at least 39 different contributors, of whom 8 are female and one is of unknown gender. This imbalance in gender distribution might be looked at more carefully in future studies. The projects were created between 2007 and 2022, with a median year of 2018, which justifies referring to the corpus as consisting of recent work. See Table 1 for an overview of the 32 projects, and Supplementary Data Sheet 1 for details about the corpus, including web links to media and other information.


Table 1 . Overview of the 32 climate data projects included in the study.

2.1.1. Duration

The duration of project media display (i.e., sonification, visualization) was identified. In several cases, the media display was part of a longer movie (e.g., on YouTube or Vimeo) and only the actual play time of the sonification or visualization was noted. Moreover, the supporting descriptions of projects in many cases included additional audiovisual material, such as spoken presentations or interviews, or sonic output emanating at stages in the design process; these were not counted as part of the media duration.

2.1.2. Lexical diversity

Lexical diversity is one aspect of “lexical richness” and refers to the range of different words used in a text, with a greater range indicating higher diversity ( McCarthy and Jarvis, 2010 ). Different indices exist, and many are elaborations on the basic type-token ratio (TTR; the ratio of unique stems to the total number of words), first developed some 50 years ago. We chose to employ the Measure of Textual Lexical Diversity with moving-average window (MTLD-MA; McCarthy and Jarvis, 2010 ), as implemented in koRpus ( Michalke et al., 2021 ) running in R ( R Core Team, 2022 ).
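To illustrate the underlying mechanics, the sketch below computes the plain TTR and a simplified one-pass MTLD (without the moving-average window of MTLD-MA, and without stemming), using the default factor-size threshold of 0.72; the example sentence is hypothetical.

```python
def ttr(tokens):
    """Type-token ratio: unique tokens / total tokens."""
    return len(set(tokens)) / len(tokens)

def mtld_forward(tokens, threshold=0.72):
    """Simplified one-pass MTLD: count 'factors', i.e. stretches of
    text over which the running TTR stays above the threshold."""
    factors = 0.0
    start = 0
    for i in range(1, len(tokens) + 1):
        if ttr(tokens[start:i]) <= threshold:
            factors += 1      # factor complete; restart the window
            start = i
    # Partial factor: credit the remainder proportionally
    if start < len(tokens):
        remainder = tokens[start:]
        factors += (1 - ttr(remainder)) / (1 - threshold)
    return len(tokens) / factors if factors else float("inf")

text = ("the cat sat on the mat and the dog sat on the log "
        "while the rat ran past the flat").split()
diversity = mtld_forward(text)
```

Higher values indicate that longer stretches of text sustain a high type-token ratio, i.e. richer vocabulary.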

2.2. Content analysis

The corpus was subjected to systematic content analysis. The first author prepared information about each project that included (1) a web link to media (audio-only or movie, i.e., both audio and video); (2) an unformatted text (such as program notes or an abstract); and (3) a web link to other descriptions (such as websites, podcasts, newspapers, and journal papers). All web links were verified at the point of conducting the evaluation (June 2022), and are listed in Supplementary material 1 . To estimate aesthetic perspective and a range of project characteristics, the 32 projects were evaluated by six researchers, each of whom has skills and knowledge in sonification, visualization, and sonic design. Three are the authors of the present manuscript, two are PhD students with the first and second author, respectively, and one is a research assistant at the first author's lab. They individually evaluated each project according to 33 rating scales (described further below), presented in randomized order via an online survey platform ( https://www.questionpro.com/ ). The sonic media itself was considered the most important source for the evaluation, and was complemented by text and other descriptions.

2.2.1. Topics

The project descriptions were of various kinds, containing different amounts of text, images, movie clips, speech, references, and other information, provided by the original authors or by others. Some descriptions were long, such as a published paper of several pages or a substantial weblog, while others were short, such as a program note or artist statement. In 28 out of 32 projects, they were collected from a different site than the media itself. To collate reasonably homogeneous textual presentations of the projects, while staying true to the authors' idiosyncratic way of presenting their work, we selected a text-only portion that could be pared down to a simple format (ASCII, i.e., no images or HTML). The projects were classified according to topics and subtopics, specifically, the provenance of data (i.e., atmosphere, biosphere, hydrosphere) and the data type (such as CO 2 , Plant biodata, Polar cap, etc.). An illustration of the classification is given in Figure 1 .


Figure 1 . Hierarchical overview of the 32 projects, classified under 3 topics (data provenance) and 15 subtopics (data type).

2.2.2. Aesthetic perspective

The Aesthetic Perspective Space (APS; Vickers and Hogg, 2006 ) has two axes, labeled Intentionality and Indexicality . We constructed eight unipolar rating scales to span this space, as shown in Figure 2 (compare with Figure 1 in Vickers' article). The circumplexity of this model was evaluated (see Section 3.5 for details). Ratings were made on a seven-step Likert scale anchored by “Strongly disagree” and “Strongly agree,” with the middle marked “Neutral.” Scales were presented in individually randomized order and with left-right direction randomly flipped for each project and rater. The instruction headline was: “Study the text, sounds, and moving images about the project, then globally evaluate how much you agree or disagree, globally, with each of the following broad characteristics.” The word inside brackets is the convenience variable name, used in Figure 3 and in the formulae below (Section 3.5). See the Discussion (Section 4.1.3) for further details.


Figure 2 . Eight statements of perceived characteristics spanning the Aesthetic Perspective Space.


Figure 3 . Factor analysis plot to test the circumplexity of eight scales underpinning the Aesthetic Perspective Space.

• This sonification is music-like, to be experienced via listening [Ars Musica].

• I hear acoustic instruments/vocalizations/objects and they form a body-engaging music [Concrete-Musical].

• This sonification consists of concrete sounds that are recognizable as recordings of things and actions in the physical world [Concrete].

• I hear acoustic instruments/vocalizations/objects that give meaningful information [Concrete-Informatic].

• This sonification is information-like, to be understood via listening [Ars Informatica].

• I hear beeps/synthesizers/drum-machines that give meaningful information [Abstract-Informatic].

• This sonification consists of abstract sounds that are recognizable as generated by digital computer synthesis [Abstract].

• I hear beeps/synthesizers/drum-machines, and they form a body-engaging music [Abstract-Musical].

2.2.3. Qualitative characteristics

To evaluate a range of characteristics across the projects, 25 rating scales were developed to probe salient aspects of content, methods, and context that would be generally relevant to data sonification and visualization projects. As in the previous part, the researchers who rated the corpus were required to understand the project as a whole before they made their judgements. Ratings were made on seven-step Likert scales labeled “Extremely little”—“Very little”—“Somewhat little”—“Average”—“Somewhat much”—“Very much”—“Extremely much.” Raters were instructed to employ, as far as possible, the full range of each scale across all the projects. The first 20 questions were applicable to all 32 projects, while the last five questions were only applicable to the subset of 18 that integrated data visualization. As before, questions were presented to the raters in randomized order and with left-right direction randomly flipped for each project and rater. The word inside brackets is the convenience variable name which is used in Table 2 and elsewhere in the article.


Table 2 . Median scores for the 32 surveyed projects.

• How much of the text/description is about the author(s) themselves (as opposed to the work itself)? [Author]

• How much is the text/description about the author's general motivation? [Motivation]

• How much background detail does text/description give about the specific project? [Background]

• How specific is the information about the source data? [SourceData]

• How detailed is the explanation of creative context (such as commissioning body or location of presentation)? [Context]

• How detailed is the recount of impact (such as associated publications, audience testimonies, and visitor numbers)? [Impact]

• How subjective (personal) is the content of the project? [Subjective]

• How objective (distanced) is the content of the project? [Objective]

• How detailed is the information on the original context of fruition (live performance, multimedia product, installation, website…)? [Fruition]

• How detailed is the technical information about the methods of data translation? [Methods]

• What degree of active engagement with the media is called for? [EngageDegree]

• How specific are the instructions for how to engage with the media? [EngageHow]

• How extensive/complete is the legend for understanding how data are represented? [Legend]

• How closely does the media representation match the original phenomenon described by the data? [MatchOrig]

• How convincing is the project in terms of climate science communication? [Convincing]

• How overtly does the project address the climate crisis? [Crisis]

• To what degree is it the author's stated intention for the project to contribute to climate science communication? [SciCom]

• How much does the project raise awareness of the climate crisis? [Awareness]

• How much does the project push for concerted action and adaptation of individual behaviors (e.g., travel, lifestyle choices)? [Behaviors]

• How successful is the project in arousing climate action? [Action]

[If the project includes both sonification and visualization:]

• How important is visualization to the project as a whole? [VisImpo]

• How important is sonification to the project as a whole? [SonImpo]

• In the development of the project, how much did sonification methods drive (initiate) visualization methods? [Son2Vis]

• In the development of the project, how much did visualization methods drive (initiate) sonification methods? [Vis2Son]

• To what degree do visualization and sonification represent the same content? [SonVisConcur]

3. Results

3.1. Missing values

The individual ratings by the six researchers on the 33 scales for the 32 projects can be found in Supplementary Data Sheet 3 , which is in “long format” and contains 6,336 data points (6 × 33 × 32). There were 118 missing values, which is 1.86% of the total. We can identify two possible causes for missing values. Firstly, the rating process was laborious and took on average 6 h of effective work to complete. The researchers had to take one or more breaks, and it appears that the QuestionPro software did not always register the last few ratings before the responses were saved in their system. Secondly, the web server for one project (p31) was unavailable to two raters, who thereby had to skip 66 ratings. All missing values were imputed with the median of within-project ratings by the other raters. The post-processed data are given in Supplementary Data Sheet 4 (together with computed variables; see below). A conveniently compact layout of median ratings is offered in Table 2 .
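In pandas, this within-project median imputation can be sketched as follows (the column names and scores here are hypothetical; the actual data sheet layout may differ):

```python
import numpy as np
import pandas as pd

# Long-format ratings: one row per (project, rater, scale) triple
ratings = pd.DataFrame({
    "project": ["p01", "p01", "p01", "p02", "p02", "p02"],
    "rater":   ["r1",  "r2",  "r3",  "r1",  "r2",  "r3"],
    "scale":   ["Awareness"] * 6,
    "score":   [5.0, np.nan, 6.0, 3.0, 4.0, np.nan],
})

# Impute each missing value with the median of the other raters'
# scores for the same project and scale
ratings["score"] = (
    ratings.groupby(["project", "scale"])["score"]
           .transform(lambda s: s.fillna(s.median()))
)
```

`Series.median()` skips NaN by default, so each gap is filled from the remaining raters of that project.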

3.2. Duration

The duration of project media (i.e., sonification, visualization) was approximately exponentially distributed. The median duration was just under 4 min, in a range from 14 s to about 1 h. Two installation-type works had indefinite duration (p01 and p13); for the purposes of the analyses, their durations were set to the longest specified media duration in the corpus. We defined a variable logDuration as the natural logarithm of the media duration in seconds; this variable was normally distributed (Shapiro's W = 0.95, p = 0.14).
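A minimal sketch of this preprocessing step (the duration values are hypothetical; the paper then tests logDuration for normality with a Shapiro-Wilk test):

```python
import math

# Hypothetical media durations in seconds; None marks
# installation-type works with indefinite duration
durations = [14, 95, 180, 230, 400, 950, 3600, None, None]

# Indefinite durations are set to the longest specified duration
max_dur = max(d for d in durations if d is not None)
clipped = [max_dur if d is None else d for d in durations]

# logDuration: natural logarithm of duration in seconds
log_duration = [math.log(d) for d in clipped]
median_duration = sorted(clipped)[len(clipped) // 2]
```

The log transform compresses the long right tail of an approximately exponential distribution toward normality.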

3.3. Lexical diversity

The descriptive texts contained on average 465 words each, in a range from 52 to 1,533. While this is a large range in text volume, note that a study by McCarthy and Jarvis (2010) showed that MTLD-MA was the only index not varying as a function of text length. In our analysis we adopted the default setting for the TTR factor size of 0.72. In the present data, lexical diversity (MTLD-MA) had a positive skew ( W = 0.88, p < 0.001); this is partly due to a very high value for one project (p16, for which the descriptive text had been extracted from a peer-reviewed journal article). The scores for logDuration and lexical diversity are listed in Table 2 further below.

3.4. Inter-rater agreement

The six researchers individually evaluated the 32 projects, taking from 5.5 to almost 7 h to complete the ratings over several sessions; that is, they spent around 12 min per project. The inter-rater agreement was good, as indicated by Cronbach's alpha = 0.80 across the 33 scales.
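Cronbach's alpha can be computed from an observations-by-items score matrix with the standard formula, alpha = k/(k-1) * (1 - sum of item variances / variance of totals); the scores below are hypothetical, for illustration only.

```python
import numpy as np

def cronbach_alpha(item_scores):
    """Cronbach's alpha for an (observations x items) score matrix."""
    X = np.asarray(item_scores, dtype=float)
    k = X.shape[1]
    item_vars = X.var(axis=0, ddof=1)          # variance per item
    total_var = X.sum(axis=1).var(ddof=1)      # variance of row totals
    return k / (k - 1) * (1 - item_vars.sum() / total_var)

# Hypothetical ratings: 6 observations (rows) x 4 scales (columns)
scores = [[5, 4, 5, 6],
          [4, 4, 5, 5],
          [6, 5, 6, 6],
          [3, 3, 4, 4],
          [5, 5, 6, 6],
          [4, 3, 4, 5]]
alpha = cronbach_alpha(scores)
```

Values approaching 1 indicate that the items (here, raters' scales) vary together consistently.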

3.5. Circumplexity of APS

As explained above (Section 2.2.2 Aesthetic perspective), the first eight question-scales were aimed at capturing the two main dimensions, labeled Intentionality and Indexicality, in the Aesthetic Perspective Space [APS]. To test whether the APS circumplex (specifically, a circulant) would be an accurate representation of the current data, we followed the procedure outlined by Tracey (2000) , also considering Acton and Revelle (2004) . A circulant is defined by equal spacing of variables around a circle. Testing proceeded in three steps: (1) “eyeballing” factor analysis plots; (2) analyzing the residual matrix ( Hartmann et al., 2018 ); and (3) conducting tests of the circulant hypothesis, i.e., equal spacing of variables along the circle and equal radii (the loading strengths of the eight variables onto the two latent factors corresponding to the two main dimensions of the APS). We used a bootstrap method to estimate the probability of the observed data appearing spuriously.
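The logic of step (3) can be sketched as follows: variable angles are derived from two-factor loadings with arctan2, the test statistic is the variance of the angular gaps (zero for a perfect circulant), and a bootstrap of the statistic under random loading directions provides a reference distribution. This is a simplified illustration, not the exact procedure of Tracey (2000).

```python
import numpy as np

rng = np.random.default_rng(1)

def angular_gaps(loadings):
    """Angles (deg) of each variable's loading vector on two factors,
    then gaps between successive variables around the circle."""
    ang = np.degrees(np.arctan2(loadings[:, 1], loadings[:, 0]))
    ang = np.sort(np.mod(ang, 360))
    return np.diff(np.append(ang, ang[0] + 360))

def circulant_stat(loadings):
    """Variance of angular gaps: 0 for perfectly equal spacing."""
    return angular_gaps(loadings).var()

# Eight variables placed exactly 45 degrees apart: an ideal circulant
theta = np.radians(np.arange(8) * 45.0)
ideal = np.column_stack([np.cos(theta), np.sin(theta)])
observed_stat = circulant_stat(ideal)

# Bootstrap null: random loading directions as a reference distribution
boot = np.array([circulant_stat(rng.normal(size=(8, 2)))
                 for _ in range(2000)])
p_value = np.mean(boot <= observed_stat)
```

A small p-value means gaps as even as the observed ones are unlikely to arise by chance; an analogous statistic can be built for the radii.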

Firstly, inspection of the biplot in Figure 3 supports an intuitive understanding of the eight variables as forming a roughly circular shape. Note that rotation of factors does not change the evaluation of circumplexity. To make the “comparison-by-eyeballing” straightforward, we have rotated the plot by swapping the two factors between the x and y axes, and then flipping the y-axis. This makes the plot of our observed data more similar to Figure 2 , which illustrates the theoretical model. The most important distortions are for AbstractInformatic, which loads too close to Abstract (i.e., is highly correlated with it), and ConcreteInformatic, which loads close to Concrete. The reasons for this might be found in somewhat differing understandings of the rating scales among the six researchers. Despite good inter-rater agreement (Cronbach's alpha = 0.83 for the eight APS scales), the raters might have reacted in subtly different ways that are not captured by the alpha statistic. Inspecting individual plots (such as Figure 3 ), we observed that one researcher produced an almost perfect circle, two had shapes very similar to the average, one had a shape that was slightly more distorted, and the shapes of the last two were more non-circular. Nevertheless, we decided to keep all six raters.

Secondly, analyzing the residual matrix for all the data, we found that the two indicators given by Hartmann et al. (2018) supported the assumption that our factor model was a good representation of the underlying concept, i.e., correspondence to the two main dimensions of APS. After fitting our data to the theoretical model, the off-axis values in the residual matrix were “close to zero” for each of the eight factors (mean = 0.026), and the maximum (0.15) was well within the range indicated by Hartmann.

Thirdly, we followed Tracey (2000) to evaluate the equal distribution of factors along the rim of the circle. In the data, the gaps (or Distance-to-Next, as in Acton and Revelle, 2004 ) between observed factors and their theoretical positions were {−1, 2, −15, −53, 4, 43, −2, −10} degrees, counting from ArsMusica counter-clockwise to AbstractMusical. Note that the highest absolute values are for ConcreteInformatic (−53) and AbstractInformatic (43), confirming what we eyeballed previously. The radii were calculated as the Euclidean distance of loadings from the center. To create bootstrap distributions for angles and radii, 10,000 uniformly randomized sets of 192 pseudo-ratings were generated. The variance of angular gaps and the variance of distance-from-center scaled by mean distance were calculated ( Tracey, 2000 ). Comparing the lower end of the sorted distributions with the values for our current data yielded probabilities for the observed data to occur spuriously. They were: p < 0.001 *** for even distribution of factors along the rim of the circle, and p = 0.0102 ** for the radii being similar in length. The two formal criteria for the assumption of circumplexity (circulant) were thus met in the current data. We proceeded by calculating, from the ratings on the eight scales, the position of each project and rater in the theoretical Aesthetic Perspective Space, projecting the ratings onto the Intentionality and Indexicality axes.
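Since the exact formulae depend on the model geometry, here is one plausible projection, assuming the eight scales sit at equal 45° intervals around the circle (starting from Ars Musica at 0°) and that each rating, centered at the neutral midpoint, is projected onto the two axes; the authors' exact weighting may differ.

```python
import numpy as np

# Scale names in APS order with assumed angular positions: the
# Intentionality axis runs from Ars Informatica (180 deg) to
# Ars Musica (0 deg), Indexicality from Abstract (270 deg) to
# Concrete (90 deg). These angles are an assumption of this sketch.
scales = ["ArsMusica", "ConcreteMusical", "Concrete",
          "ConcreteInformatic", "ArsInformatica",
          "AbstractInformatic", "Abstract", "AbstractMusical"]
angles = np.radians(np.arange(8) * 45.0)

def aps_position(ratings):
    """Project eight Likert ratings (1-7) onto the two APS axes."""
    r = np.asarray(ratings, dtype=float) - 4.0   # center at 'Neutral'
    intentionality = np.sum(r * np.cos(angles))
    indexicality = np.sum(r * np.sin(angles))
    return intentionality, indexicality

# A project rated strongly music-like and concrete:
x, y = aps_position([7, 6, 6, 4, 2, 3, 3, 4])
```

With these hypothetical ratings, the project lands in the Ars Musica/Concrete quadrant (both coordinates positive).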

While the distribution of Intentionality was normal (Shapiro's W = 0.97, p = 0.14), that of Indexicality did not pass the test ( W = 0.91, p = 0.014). Pearson's measure of kurtosis was −1.51, as calculated with the psych package ( Revelle, 2020 ) running in R ( R Core Team, 2022 ), indicating a thin-tailed distribution. If true, the presence of a broad or even bimodal distribution (a positive and a negative node) might indicate that the raters dichotomized amongst the projects along the Indexicality dimension, and tended to make a categorical distinction between abstract and concrete sonic materials. The matter might be addressed in a future study that considers details of causal listening that pertain to action-sound couplings ( Tuuri and Eerola, 2012 ; see Lindborg, 2019 for a proposition). For the present analysis, it is important to note that one assumption for the validity of linear regression results is that the residuals of the fitted dependent variable are normally distributed; it is not required that the variable itself be normal. Nevertheless, we proceed with some caution in interpreting results that involve Indexicality. The scores (medians across raters) are listed in Table 2 further down, and illustrated in Figure 4 . Note that the variance of Intentionality was about half that of Indexicality, so that the distribution visually appears somewhat “squashed.” The aesthetic perspectives of several individual projects are discussed below, in Section 4.2.


Figure 4 . Mapping of 32 projects in the Aesthetic Perspective Space: median position with lines indicating the 1st and 3rd quantiles. Circles, sonification projects; Squares, sonification-visualization projects. Colors denote topics (data provenance) as follows: orange, atmosphere, green, biosphere, blue, hydrosphere.

3.6. Qualitative characteristics

The 25 rating scales for qualitative characteristics were developed with the assumption that they would cover (to some degree) essential aspects across the 32 projects. These essential aspects can be understood as latent factors in the data, and exploratory factor analysis provides an estimate of the correlations of the rating scales with the latent factor(s) representing the data ( Revelle, in press ). In the present analysis, the number of factors was determined with the nfactors function in the psych package ( Revelle, 2020 ) running in R ( R Core Team, 2022 ). We evaluated the function's output for VSS (Very Simple Structure) complexity and empirical BIC (Bayesian Information Criterion) to determine the optimal number of interpretable factors ( Revelle and Rocklin, 1979 ). We then used the fa function from the same library, with settings for ordinary least squares regression and promax rotation, to find a minimum residual solution with factors that lent themselves to a straightforward interpretation in terms of essential aspects of the projects.
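The extraction step can be illustrated with synthetic data. The sketch below uses a principal-axis style eigendecomposition of the correlation matrix as a stand-in for psych::fa's minres/promax solution, so the loadings are unrotated and only indicative; the data dimensions mirror the study (32 projects, 20 scales) but the values are simulated.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic ratings: 32 projects x 20 scales, built from two latent
# factors plus noise, so that a factor structure is recoverable
latent = rng.normal(size=(32, 2))
true_loadings = rng.normal(size=(2, 20))
X = latent @ true_loadings + 0.5 * rng.normal(size=(32, 20))

# Principal-axis style extraction from the correlation matrix
R = np.corrcoef(X, rowvar=False)
eigvals, eigvecs = np.linalg.eigh(R)
order = np.argsort(eigvals)[::-1]                # descending
eigvals, eigvecs = eigvals[order], eigvecs[:, order]

n_factors = 2
loadings = eigvecs[:, :n_factors] * np.sqrt(eigvals[:n_factors])
explained = eigvals[:n_factors].sum() / eigvals.sum()
```

The `loadings` matrix (scales x factors) is the analogue of Table 3: each column is read off for the scales that load most strongly on that factor.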

In the first analysis we included all the 32 projects and the first 20 rating scales. A parsimonious solution was found with four factors, which together explain 56% of the variance in the data (32 projects × 20 scales). They were labeled Action, Technical, Perspective, and Context. We proceeded by analyzing the 18 projects that integrated both sonification and visualization but this time only including the 5 rating scales applicable to them. A solution was found with one factor, labeled Visualization, that explains 39% of the variance in this subset of the data (18 projects × 5 scales). The latent factors and their loadings on rating scales are given in Table 3 , and discussed further below.


Table 3 . Exploratory factor analysis of ratings of qualitative characteristics in 32 projects.

The loadings of the individual rating scale variables onto the latent factors can be read from Table 3 . We may note that the first and relatively strongest factor, labeled Action, is influenced foremost by ratings on the scales named Awareness, Action, Convincing, and Behaviors. Similarly, the second factor, Technical, is determined by Legend, SourceData, Methods, and Background, while the third, Context, is determined by the rating variables named Context, Fruition, and Impact. Finally, the fourth, Perspective, is positively influenced by Subjective and negatively by Objective. The last of the latent factors, Visualization, determined by a separate analysis of the subset of 18 projects that integrated visualization, was positively influenced by VisImpo, Vis2Son, and SonVisConcur. To review the constructs of the 25 rating scales, see the exact wording of the questions posed to the raters, listed in Section 2.2.3 above (Qualitative characteristics).

3.7. Multivariate analysis

We then investigated the relationship between Intentionality and Indexicality (the APS dimensions) and the essential characteristics (latent factors), namely Action, Technical, Context, Perspective, and Visualization, together with logDuration and MTLD-MA. As in the exploratory factor analysis, we conducted two separate analyses: one on the whole dataset of 32 projects (20 rating scales yielding 4 latent factors), and the other on the subset of 18 projects that included visualization (5 rating scales yielding one latent factor). In this analysis, the ratings were z-scaled (“standardized”) within each rater. Firstly, we tested for multivariate relationships with MANOVA, taking Intentionality and Indexicality as jointly dependent variables, and including as independent variables the factors Action, Technical, Context, and Perspective (in the first case) or Visualization (in the second case), as well as logDuration and MTLD-MA. In both cases, the multivariate analysis of variance revealed the presence of significant differences. We therefore proceeded with modeling the univariate relationships with linear regressions, taking in turn Intentionality and Indexicality as the dependent variable (predictand) and the same variables listed above as predictors. The results are listed in Table 4 . We tested the validity of these results by analyzing the residuals of the dependent variable in each of the four models. In the first case, the residuals for Intentionality after model fitting were near normal (Shapiro-Wilk's W = 0.98, p = 0.01) and passed the test for heteroscedasticity (Breusch-Pagan's BP = 4.5, p = 0.35). Meanwhile, Indexicality residuals were normal ( W = 0.99, p = 0.09) and heteroscedasticity was not present (BP = 12.9, p = 0.11). In the second case, for projects involving data visualization, residuals passed the two tests both for Intentionality ( W = 0.98, p = 0.04, BP = 6.8, p = 0.08) and for Indexicality ( W = 0.98, p = 0.11, BP = 13.4, p = 0.04).
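A minimal sketch of one such univariate model and its residual check, on synthetic standardized data (variable names and coefficients are hypothetical; the Breusch-Pagan test, available in statsmodels, is omitted here):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(7)
n = 32

# Hypothetical standardized predictors and outcome
perspective = rng.normal(size=n)
action = rng.normal(size=n)
log_duration = rng.normal(size=n)
intentionality = (0.5 * perspective + 0.2 * action
                  + rng.normal(scale=0.5, size=n))

# OLS fit via least squares, with an intercept column
Xmat = np.column_stack([np.ones(n), perspective, action, log_duration])
beta, *_ = np.linalg.lstsq(Xmat, intentionality, rcond=None)
residuals = intentionality - Xmat @ beta

# Validity check, as in the text: residuals should be near normal
W, p = stats.shapiro(residuals)
```

The fitted coefficients play the role of the standardized betas in Table 4, and a non-significant Shapiro-Wilk p-value supports the normality assumption for the residuals.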


Table 4 . Statistics for regression models predicting Intentionality and Indexicality from rated characteristics, duration, and lexical diversity, in all 32 projects and a subset of 18 projects integrating visualization.

To explore the models further, we applied stepwise reduction, but this did not yield additional insight. Since there are relatively few predictors involved, we believe it is more useful for comparisons to study the regression results while keeping the same set of predictors. With this in mind, we offer an interpretation of the results listed in Table 4 . For a visual illustration of the seven significant relationships in the data, see Figures 5A – G .


Figure 5. Scatterplots of significant relationships in regression modeling. In all panels, the units for the outcome variables (predictands) are the centered ratings of Intentionality (left column) and Indexicality (right column), derived from the eight scales spanning the Aesthetic Perspective Space. In Panels (A, B), the predictor is the latent factor Perspective, and in Panel (G), the latent factor Action, in the original units from the rating scales, centered. In Panels (C, D), the predictor is MTLD-MA (lexical diversity), in original units (see section 2.1.2 for details). In Panels (E, F), the predictor is duration (logarithmic; see section 2.1.1 for details). Shape and color of symbols are the same as in Figure 4.

For Intentionality, the significant predictors were the latent variables Perspective (β = 0.48; Figure 5A) and Action (β = 0.16; Figure 5G), together with the lexical diversity measure MTLD-MA (β = −0.19; Figure 5C); within the subset of projects involving visualization, logDuration was also a significant predictor (β = 0.42; Figure 5E). In other words, the perceived "Ars Informatica vs. Ars Musica" dimension in sonifications was associated with the "objective vs. subjective" perspective gleaned from the written descriptions. Projects that tended toward "Ars Musica" were described in simpler language yet scored higher on characteristics emphasizing awareness, action, and change of behavior. When visualization was an integral part, "Ars Musica" projects were longer in duration.

For Indexicality, the significant predictors were Perspective (β = 0.39; Figure 5B) and logDuration (β = −0.29; Figure 5F); within the subset of projects involving visualization, lexical diversity was also significant (β = 0.28; Figure 5D). In other words, the "Concrete vs. Abstract" dimension, which refers to the perception of sonic materials, was again associated with the "objective vs. subjective" perspective in the written descriptions. Sonifications with more concrete sonic materials (such as recognizable acoustic instruments or samples of natural sound sources) were shorter in duration, and, within the subset of projects that involved visualization, were described using richer language.
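Because the ratings were standardized, the reported β values are standardized regression coefficients. For a single predictor, the standardized slope reduces to the Pearson correlation; the following pure-Python sketch illustrates that reduced case only (the paper's models fit several predictors jointly, which this does not reproduce; the data are made up):

```python
from statistics import mean, pstdev

def standardized_beta(x, y):
    """Slope of y on x after z-scaling both variables.
    With one predictor this equals the Pearson correlation coefficient."""
    zx = [(v - mean(x)) / pstdev(x) for v in x]
    zy = [(v - mean(y)) / pstdev(y) for v in y]
    return sum(a * b for a, b in zip(zx, zy)) / len(x)
```

A standardized β thus expresses how many standard deviations the predictand shifts per standard deviation of the predictor, which is what makes coefficients such as β = 0.48 and β = 0.16 directly comparable across scales.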

4. Discussion

4.1. Results

4.1.1. Regression models

From the tables and scatter plots, we can identify the relationships that the statistical analysis revealed. Perspective was a strong positive predictor of both Intentionality and Indexicality, even though the two dimensions were not significantly correlated (Spearman's rho = 0.16, p = 0.38, calculated on medians across raters). Recall that a high value on the latent factor labeled Perspective is mainly due to high ratings on the scale named Subjective and low ratings on the scale named Objective. Projects with more author-oriented descriptions thus tended toward sonification output in the Concrete-Musical quadrant (in our corpus, p28 "Oceans Eat Cities" by Rolnick is the clearest example; see the Discussion below for details). Across the corpus, duration (logarithmic) was a negative predictor of Indexicality, so that longer sonifications were generally more abstract in their sonic materials (as exemplified by p12 by Hamm and p16 by Lindborg, though not by p01 by Aegypti). Higher lexical diversity, measured by MTLD-MA, predicted a lower value for Intentionality, i.e., sonifications perceived as Ars Informatica (p05 by Chafe and p26 by Renard are clearly science-oriented projects).
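The rank correlation reported above (Spearman's rho = 0.16) is the Pearson correlation of the rank-transformed median ratings. A self-contained sketch of that computation, including tie-aware average ranks (illustrative; a real analysis would use a statistics library):

```python
from statistics import mean, pstdev

def rank(values):
    """Average ranks (1-based); tied values get the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # mean of rank positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

def spearman_rho(x, y):
    """Spearman correlation = Pearson correlation of the ranks."""
    rx, ry = rank(x), rank(y)
    mx, my = mean(rx), mean(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry)) / len(rx)
    return cov / (pstdev(rx) * pstdev(ry))
```

Working on ranks makes the statistic robust to the monotone but nonlinear scale use that is typical of subjective ratings.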

At the same time, within the subset of 18 projects integrating visualization, higher lexical diversity predicted higher values of Indexicality, indicating that richer textual descriptions were associated with concrete rather than abstract sonic materials (e.g., p09 by Foo, which features marching band music). Similarly, logDuration was strongly positively associated with Intentionality, which is to say that visualization-sonification projects perceived as Ars Informatica were shorter (e.g., p26 by Renard). Note that these two effects were not significant when all 32 projects were analyzed together, which might indicate that, across the whole corpus, sonification-only projects countered them: shorter projects might well be Ars Musica (e.g., p22 by Perera), and richer descriptions might indicate abstract sounds (e.g., p01 by Aegypti).

4.1.2. Topics and characteristics

As we read the project descriptions very closely to determine topics and subtopics, many other types of questions came to mind. For example: How rich or multidimensional is the source data? How large are the datasets referred to? Are they available in the public domain? From the information given, to what extent is the project replicable? Such questions eventually boiled down to the 25 qualitative characteristics employed in the scale ratings, which generated the five essential aspects used in the analysis. In future work that attempts to further explore the notion of characteristics of complex projects, we would recommend a "minimalist approach": a compact protocol for aesthetics (probably eight scales) and characteristics (for example, 10 scales, i.e., the five essential aspects from our present findings, each paired with a "reverse-coded" question). The number of potential qualitative characteristics (i.e., qualia) of complex projects is quasi-infinite. The set of 25 represented a compromise between the wish to cover as much terrain as possible and the need to keep the time demanded of the raters reasonable. It still takes many hours to complete a full set of ratings. In future work, we will look into ways of speeding up the process.
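The MTLD measure underlying MTLD-MA (McCarthy and Jarvis, 2010) counts how many segments ("factors") it takes for the running type-token ratio to fall to a threshold, conventionally 0.72. The sketch below implements a simplified, one-direction pass to illustrate the core idea; the full measure averages forward and backward passes, and the MTLD-MA variant used in this study additionally averages over moving windows, neither of which is reproduced here.

```python
def mtld_forward(tokens, threshold=0.72):
    """Simplified one-direction MTLD: divide the text into 'factors'
    (segments whose type-token ratio falls to the threshold) and return
    tokens-per-factor. Illustrative sketch only, not full MTLD/MTLD-MA."""
    factors = 0.0
    types = set()
    count = 0
    for tok in tokens:
        count += 1
        types.add(tok)
        ttr = len(types) / count
        if ttr <= threshold:
            factors += 1      # a full factor is complete; start a new segment
            types.clear()
            count = 0
    if count > 0:             # partial factor, proportional to TTR decline
        ttr = len(types) / count
        if ttr < 1.0:
            factors += (1 - ttr) / (1 - threshold)
    return len(tokens) / factors if factors else float(len(tokens))
```

Higher values indicate that more running text is needed before lexical repetition sets in, i.e., greater lexical diversity.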

A closer look at the relationship between the topics that emerged from the analysis (see Figure 6; Atmosphere, Biosphere, and Hydrosphere, in orange, green, and blue, respectively) and the other properties used to classify the cases, borrowed from the metadata classification protocol of the DSA, shows that only one case focusing on Hydrosphere data was created for educational purposes, and that the same topic, along with biospheric data, was the focus of research projects. Cases that used atmospheric data form the largest group with art and public engagement as their main goals.


Figure 6 . Correlation between the three data foci (atmosphere, biosphere, and hydrosphere) and the goals of the selected projects.

Figure 7 illustrates correspondences between the three emerging topics and the media mix. Media mix is a metadata field used in the DSA to classify projects by the medium used alongside sound. Examples are "Sound only" (the project uses only sonification), "Data viz" (sonification and data visualization are combined), "Video" (sonification is combined with visual content in the form of moving images), and "Artifact" (the sonification is created by interacting with a tangible object). In general, the three emerging topics (i.e., "Data Focus") use a good mixture of media to represent data. Phenomena related to the Biosphere and Atmosphere are mainly represented using sound only, while Hydrosphere datasets are evenly split between sound alone and sound combined with data visualization. Video content that does not replicate the sonified data, but rather supports engagement, is used across all topics. The only project that uses a physical, automatic artifact to generate the sonification (p30; specifically, a music box, documented in a movie clip) is found in the Biosphere group.
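The correspondence shown in Figure 7 is essentially a contingency table over the two metadata fields. A minimal sketch of such a cross-tabulation; the (data focus, media mix) pairs below are invented for illustration, and the actual corpus assignments are in the Supplementary material:

```python
from collections import Counter

# Invented (data focus, media mix) pairs, for illustration only.
projects = [
    ("Atmosphere", "Sound only"),
    ("Atmosphere", "Video"),
    ("Biosphere", "Sound only"),
    ("Biosphere", "Artifact"),
    ("Hydrosphere", "Sound only"),
    ("Hydrosphere", "Data viz"),
]

# A Counter over the pairs gives the cell counts of the contingency table.
crosstab = Counter(projects)

def cell(focus, mix):
    """Number of projects with the given data focus and media mix."""
    return crosstab[(focus, mix)]
```

Each cell count corresponds to one bar segment in a figure like Figure 7.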


Figure 7 . Correlation between the three data foci (atmosphere, biosphere, and hydrosphere) and the mix of media used in the selected projects.

4.1.3. Aesthetic perspective space

The primary (horizontal) axis in the Aesthetic Perspective Space, Intentionality, is based on the theoretical proposition that there exists a continuous, bipolar, conceptual, psychological mechanism, accessible through an intentional mode of listening, that someone might apply when presented with an auditory object. The listener perceives the designed intention behind the object and interprets its communicative focus on a scale between information-extraction and artfulness-experience. The secondary (vertical) dimension maps the perceived qualities and ontology of the sonic materials that constitute the object, that is, whether the listener perceives the object as abstract-imaginary or as concrete-physical. This characterization of the sonic source material leans on Simon Emmerson's Language Grid, a framework that affords an analysis of electroacoustic music along two continuous dimensions: one describing the composer's perceptual attitude to the musical material, from Aural to Mimetic, and the other describing the composer's action on the material, from Phonographic to Constructed ( Emmerson, 1986 ; Fischman, 2007 ; also and importantly, Emmerson, 2013-14 ). Emmerson defines "mimetic" as "the imitation not only of nature but also of aspects of human culture not usually associated directly with musical material" ( Emmerson, 1986 , p. 17). Within the context of auditory display, Vickers identifies sonification as a form of "mimetic discourse," where "indexical" appears to be synonymous with "mimetic." One might therefore speak of "listening to concrete mimesis" in a situation where a sound object unequivocally denotes a physical source that is present in the environment, and of "listening to abstract mimesis" when a sound object associates with a non-present source or concept through metaphor. The association might be more or less graspable, hypothetical, or private.
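Given centered scores on the two APS dimensions, placing a project in one of the four circumplex quadrants reduces to a sign test. A sketch follows; the sign conventions (positive Intentionality leaning "Ars Musica", positive Indexicality leaning concrete) are our assumption for illustration, not a definition from the APS literature:

```python
def aps_quadrant(intentionality, indexicality):
    """Name the APS quadrant for a project from its centered dimension
    scores. Sign conventions are assumed for illustration: positive
    intentionality = 'Ars Musica', positive indexicality = concrete."""
    horizontal = "Musical" if intentionality >= 0 else "Informatical"
    vertical = "Concrete" if indexicality >= 0 else "Abstract"
    return f"{vertical}-{horizontal}"
```

Under these conventions, a project with positive scores on both dimensions would land in the Concrete-Musical quadrant discussed above for p28.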

We believe that the APS, as a conceptual tool, is very useful for research in audio design and in the auditory perception of both music and sonification: it merits thorough testing and further development. In their article presenting the Aesthetic Perspective Space, Vickers and Hogg (2006) populated the circumplex with two handfuls of examples: some were specific pieces, others were generic, such as a genre or a composer. Subsequent papers (Vickers, 2016; see also Vickers et al., 2017) included weblinks for most of the examples. It is not clear whether the positioning of these example works was empirical or hypothetical. Our present study is a step in a larger project of testing the underpinnings of the APS, as a theoretical proposition, against empirical observations. We have used methods (e.g., corpora, experimental procedures, rating scales) that are replicable and extendable by other researchers. For example, it should be feasible to conduct a listening test to explore how people interpret a larger corpus of shorter clips containing different types of sonic artifacts: composed EAM pieces, sound art, soundscape recordings, and data sonifications. A qualitative analysis of interviews with the test subjects might provide insights into their evaluation strategies (as in Lindborg and Friberg, 2015 ).

4.2. Observations of projects

Our analytical approach has been informed by the ongoing discussion in the sound design and auditory perception research communities on the relationship between sonification and electroacoustic music composition. The authors are part of these fields in various ways. The first author is the main organizer of the DACA festival and a research-driven composer with a focus on multimedia experiences. The second author is active in the area of sound-driven design research and data sonification. She is the co-curator of the DSA and an advocate for the potential of data sonification within the information design and data visualization communities. We will deepen this notion of "relationship" by giving examples of qualitative observations of a few of the corpus projects.

Looking at the projects with the highest and lowest scores for Intentionality and Indexicality provides insights into the characteristics that shape aesthetic perception, such as the most "musical" or "informatic" sonification, or the one with the most "concrete" or "abstract" sonic materials.

The project that scored highest on Indexicality (concrete sound materials) was “Shifting Apple Blossom in Bremen” (p30) by Striedelmeyer. It stands out because data are sonified (as well as visualized) through a physical music box which the user (e.g., at an exhibition) would manually activate in order to hear the data.

The two projects that scored highest on Intentionality (“Ars Musica”) were both by Jamie Perera. A personal communication between the second author and Perera clarified that the composer's interest in sonification lies mainly in the potential of this translation method to increase public engagement on critical topics (such as climate change, but also the COVID-19 pandemic), support activism and overall take responsibility as artists toward society at large.

The 18 projects that integrate data visualization with sonification are diverse. With "Anthropocene in C Major" (p23), Perera composed a 45-min orchestral piece, accompanied by a visualization of the dataset that serves as a performance guide for the public. Perera released six sonification projects between 2017 and 2020, three of which are included in the corpus (p22, p23, p24). Each project highlights a different consequence of climate change.

Some of the reviewed projects materialize as pieces with multiple movements or sequential parts. When the movements use different techniques and aesthetic styles, it becomes hard to evaluate the project as a whole. For example, Blazsek's project "Mongkut" (p04) was the sonification most clearly identified as Ars Informatica (see Figure 4 and Table 2 ). The work has three movements, where the first and last use similar techniques and style (e.g., sinewave modulation) while the middle movement displays a very different sonic character (e.g., concrete soundscape recordings). The durations of the movements differ widely (short, very long, very short). This diversity of compositional approach makes it challenging for raters to judge the characteristics and thus to pinpoint the project as a whole. The fact that the project draws on a single event/phenomenon does hold it together as a scientific demonstration and, as the Intentionality score indicates, lends it to appreciation as utilitarian rather than experiential.

Ekeberg's sonification installation “Ingenmansland” (p08) shows some similarities with p04. Here, there are two movements where the musical textures are similar though the sonic materials are distinctly opposite (first concrete, then abstract). As in Blazsek's tripartite piece, Ekeberg presents a diptych whose parts are held together since they refer to the same subject, deforestation in western Norway: first as a pseudo-documentary field recording, then as an electroacoustic, dystopian metaphor.

By contrast, in “Oceans Eat Cities” (p28), Rolnick lets the two movements have differing musical style (e.g., tempo, density, character) yet the characteristics of the sonic material remain the same throughout. Apparently, the way the data was used to determine musical materials is consistent across the movements of the piece.

Some project descriptions didactically present stages in a process, as in "Polarseeds" (p31) by Tedesco and collaborators. In this case, the evaluation focused on the last published version of the project, setting aside the many examples of intermediate stages; the last version is clearly the most complex and accomplished in the series. The stages leading up to it are assumed to be presented as demonstrations of the method, rather than as movements in a finalized output.

Geere and Quick have specialized in making “sonification podcasts” for their series Loudnumbers ( https://www.loudnumbers.net/ ). Each program typically starts with a pedagogic explanation of the design strategy, which effortlessly translates into listening tips (i.e., providing a legend for how the listener can extract meaning from the sounds). In “The End of the Road” (p19), they built on Pape Møller's laboriously collected time series data on insect population density on a rural road in Denmark, and more recently (p10), they turned data from traditional ice measurements in a village in Alaska into techno music.

Crawford and George released projects in 2013 and 2015 that translated the global rise in temperature into music whose sonic style leans on classical Western music. The corpus includes a cello solo piece (p06). Interestingly, the projects were published as videos in which, after an introduction by the authors, the sonification is performed while a visualization of the same data appears on screen, as a sort of subtitle or visual support that helps the public associate what they hear with what is perhaps a more familiar sensory modality.

The heritage of acoustic orchestral Western music is also the choice of Guda (p11), Twet (p32), and Sawe & Oakes (p29). A similar strategy is also used by Foo in “Too Blue—Mapping Coastal Louisiana's Land Loss with Music” (p09) though with creole-style marching band music.

Working with meteorological records and predictions covering large geographical areas between 2013 and 2016, Lindborg released several instances of "Locust Wrath," adapting the original multi-channel immersive installation to different contexts, such as a dance performance ( Lindborg, 2018 ), a sculptural auditory display ( Lindborg, 2015 ), and a participatory installation ( Lindborg and Liu, 2015 ). Two are included in the corpus (p15, p16), together with a recent installation piece, "Stairway to Helheim" ( Lindborg, 2022 ), that fuses abstract and concrete sonic materials with cross-synthesis in a sonification of 138 years of historical weather records of Hong Kong (p17).

Many projects employ long historical time series. In p05, the most Abstract-Informatical in the corpus, Chafe used synthesized sounds to convey the correlation between rising CO2 levels and the increase in temperature, using data records covering 1666 to 2016. In p25, Quinn represented climate data from the last 110,000 years in a music composition with MIDI instruments.

4.3. Aesthetics of data art and scientific communication

Having a unified analysis method for a range of sonic artifacts from the fields of electroacoustic music (e.g., concert and multimedia compositions) and sonification (e.g., software earcons and system monitoring designs) facilitates interrogation of aesthetics and effectiveness. Vickers underlines that the principles of the former are applicable to the latter; the primary concern lies with the design of auditory displays and the effectiveness of sonification for the discovery of meaning in data, and more generally for communication. He urges practitioners in the field of sonification to carefully study the principles of electroacoustic music composition, arguing that music and auditory display share important attributes: "it is at these intersections that dialogue and interrogation may take place." However, he does not equate one with the other, noting that there are "artifacts present in each of music and sonification that are not present in the other… one such is the intellectual content of compositions" (quotes from Vickers and Hogg, 2006 , discussed in Lindborg, 2019 , p. 44).

Sonification–visualization techniques must not be aestheticized to the point that scientific criteria are neglected. In the context of science communication, researchers have pointed out that "data sonification need not necessarily be musical in nature, and many scientifically-useful auditory graphs are not particularly musical, or even pleasant to listen to. There are some rationales for abstracting the sonification… abstraction can bring some interesting choices to the communicator" ( Sawe et al., 2020 ). In this context, "interestingness" should be understood as a precise psychological concept ( Silvia, 2005 ). As pointed out by Bonet (2021) , the term aesthetics does not necessarily denote something "beautiful" or "pleasing," sonifications are not necessarily "pleasant," and the "aesthetics of a sonification must be linked to its meaning and purpose" (cit. p. 270). Thus, sonification involves several techniques and purposes that, while complying with Kramer's original definition, might also aim to satisfy aesthetic appreciation and to reify the attractiveness of discovery. Aesthetic sonification is not arbitrary. While a scientific approach that emphasizes systematicity and reproducibility is in our opinion fundamental for all data art, successful designs build on ecological perception, i.e., the principle that organisms learn patterns meaningful for survival through exposure to and interaction with the environment ( Gaver, 1993 ; Clarke, 2005 ; Lindborg, 2018 ). Roddy and Furlong (2014) proposed an "embodied aesthetic framework" to rethink the "relationship between aesthetics and meaning-making in order to tackle the mapping problem" in sonification (p. 70). The problem they raised becomes apparent in the public's understanding, for example, if they feel that the relationship between data and sound is arbitrary (cf. Vickers and Hogg, 2006 ). Ultimately, design guidelines are needed to achieve more engaging and effective sonifications.

In this article, we have presented a systematic analysis of topics, perceptual characteristics, and aesthetics in a range of sonification and visualization projects. The study aims to contribute to the development of empirically founded design techniques, applicable to climate data communication and other fields. For scientific knowledge to reach people impervious to traditional dissemination methods, researchers in multimodal communication need to explore the relationships between intention strategies, meaning, and aesthetics enabled by extended communication techniques, such as sonification and visualization. This poses challenges for designers of data art aiming to stir audiences into action when faced with the hugely varied and complicated expressions that make up the climate crisis. Public engagement with techno-scientific knowledge and its potential societal impact is still an open issue. In this situation, we believe that design informed by research in auditory perception and aesthetics plays a central role in creating multisensory experiences that make scientific climate data both meaningful and exciting.

Data availability statement

The original contributions presented in the study are included in the article/Supplementary material; further inquiries can be directed to the corresponding author.

Author contributions

PL and SL conceived the study and wrote the Introduction and Discussion Sections. PL conducted the statistical analysis and wrote the Methods and Results Sections. All authors contributed to the analysis, reviewed all parts of the manuscript, and approved the submitted version.

Funding

The research for this paper by PL was supported by a grant from the Research Grants Council of the Hong Kong Special Administrative Region, China (Project No. CityU 11605622). The research for this paper by SL was supported by a Design Grant from Politecnico di Milano, Italy.

Acknowledgments

The authors thank all the project creators and their collaborators for making media and texts about their creative work freely available, without which this review would not have been possible. We acknowledge the three researchers who contributed evaluations of the corpus, and finally, the peer reviewers for feedback that challenged us to go deeper.

Conflict of interest

The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Publisher's note

All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article, or claim that may be made by its manufacturer, is not guaranteed or endorsed by the publisher.

Supplementary material

The Supplementary Material for this article can be found online at: https://www.frontiersin.org/articles/10.3389/fpsyg.2022.1020102/full#supplementary-material

Supplementary Data Sheet 1. Spreadsheet with details about the 32 projects, including web links to media and other information.

Supplementary Data Sheet 2. Spreadsheet in “long format” with individually rated scores on 25 rating scales for the 32 projects.

Supplementary Data Sheet 3. Spreadsheet with median ratings on 25 scales of the 32 projects.

Supplementary Data Sheet 4. Spreadsheet containing all post-processed ratings (33 scales × 6 raters × 32 projects), computed variables (7 variables × 6 raters × 32 projects), project descriptors (2 descriptors × 32 projects), and factors.

References

Acton, G. S., and Revelle, W. (2004). "Evaluation of ten psychometric criteria for circumplex structure," in Methods of Psychological Research (Pabst Science Publishing), 9.


Barrass, S., and Vickers, P. (2011). "Sonification design and aesthetics," in The Sonification Handbook, eds T. Hermann, A. Hunt, and J. G. Neuhoff (Logos Verlag), 145–164.

Berlyne, D. E. (1974). Studies in the New Experimental Aesthetics: Steps Toward an Objective Psychology Of Aesthetic Appreciation . Hemisphere.

Bertin, J. (1967). Sémiologie graphique. Les diagrammes Les réseaux Les cartes. Paris: Gauthier-Villars.

Boehringer, J. (2022). “Listening to design in an expanded field: problematising key aesthetic issues in sonification,” in Paper Presented at the International Conference on Auditory Display (ICAD).

Bonet, N. (2021). “Creating and evaluating aesthetics in sonification,” in Doing Research in Sound Design (Focal Press), 269–282. doi: 10.4324/9780429356360-16


Caiola, V., Lenzi, S., and Riccò, D. (2022). Audiovisual sonifications. A design map for multisensory integration in data representation. Digit. Res. Soc . doi: 10.21606/drs.2022.380

Chandler, R., Anstey, E., and Ross, H. (2015). Listening to voices and visualizing data in qualitative research: hypermodal dissemination possibilities. Sage Open 5, 2158244015592166. doi: 10.1177/2158244015592166

Clarke, E. F. (2005). Ways of Listening: An Ecological Approach to the Perception of Musical Meaning . Oxford University Press.

Dubus, G., and Bresin, R. (2013). A systematic review of mapping strategies for the sonification of physical quantities. PLoS ONE 8, e82491. doi: 10.1371/journal.pone.0082491


Emmerson, S. (1986). “The relation of language to materials,” in The Language of Electroacoustic Music (Springer), 17–39. doi: 10.1007/978-1-349-18492-7_3

Emmerson, S. (2013-14). “Wandering uneasily in a familiar landscape,” in Orema . doi: 10.3943/001.2013.04.0102

Enge, K., Rind, A., Iber, M., Höldrich, R., and Aigner, W. (2021). “It's about time: adopting theoretical constructs from visualization for sonification,” in Audio Mostly 2021, 64–71. doi: 10.1145/3478384.3478415

Fischman, R. (2007). "Mimetic space: a conceptual framework for the discussion, analysis and creation of mimetic discourse and structure," in Proceedings of the EMS07 Conference, De Montfort University (Leicester: Electroacoustic Music Studies Network). Available online at: http://www.ems-network.org/spip.php (accessed January 10, 2023).

Gaver, W. W. (1993). What in the world do we hear?: an ecological approach to auditory event perception. Ecol. Psychol. 5, 1–29. doi: 10.1207/s15326969eco0501_1

Hartmann, K., Krois, J., and Waske, B. (2018). E-Learning Project SOGA: Statistics and Geospatial Data Analysis. Department of Earth Sciences, Freie Universitaet Berlin. Available online at: https://www.geo.fu-berlin.de/en/v/soga/Geodata-analysis/factor-analysis/A-simple-example-of-FA/index.html (accessed January 10, 2023).

Hermann, T. (2008). Sonification-A Definition . Available online at: https://sonification.de/son/definition/ (accessed January 11, 2023).

Hermann, T., Hunt, A., and Neuhoff, J. G. (2011). The Sonification Handbook . Logos Verlag Berlin.

Jacobs, R., Howarth, C., and Coulton, P. (2017). Artist-scientist collaborations: maximising impact of climate research and increasing public engagement. Int. J. Clim. Change Impacts Resp. 9, 1–9. doi: 10.18848/1835-7156/CGP/v09i03/1-9

Juslin, P. N. (2013). From everyday emotions to aesthetic emotions: towards a unified theory of musical emotions. Phys. Life Rev. 10, 235–266. doi: 10.1016/j.plrev.2013.05.008

Juslin, P. N. (2019). Musical Emotions Explained: Unlocking the Secrets of Musical Affect . Oxford University Press, USA. doi: 10.1093/oso/9780198753421.001.0001

Kramer, G. (1994). "An introduction to auditory display," in Auditory Display: Sonification, Audification and Auditory Interfaces, 1–77.

Lankow, J., Ritchie, J., and Crooks, R. (2012). Infographics: The Power of Visual Storytelling . John Wiley and Sons.

Lenzi, S. (2021). The design of data sonification: design processes, protocols and tools grounded in anomaly detection (Ph.D. thesis). Politecnico di Milano, Italy.

Lenzi, S., and Ciuccarelli, P. (2020). Intentionality and design in the data sonification of social issues. Big Data Soc. 7, 2053951720944603. doi: 10.1177/2053951720944603

Liew, K., and Lindborg, P. (2020). A sonification of cross-cultural differences in happiness-related tweets. J. Audio Eng. Soc. 68, 25–33. doi: 10.17743/jaes.2019.0056

Lindborg, P. (2018). Interactive sonification of weather data for the locust wrath, a multimedia dance performance. Leonardo 51, 466–474. doi: 10.1162/leon_a_01339

Lindborg, P. (2019). How do we listen? Emille J. the Korean Electro-Acoustic Soc. 16, 43–49.

Lindborg, P. (2022). “Stairway to helheim,” in Proceedings|Catalogue of DACA (Data Art for Climate Action) (Hong Kong: City University of Hong Kong).

Lindborg, P., and Friberg, A. K. (2015). Colour association with music is mediated by emotion: Evidence from an experiment using a CIE Lab interface and interviews. PLoS ONE , 10, e0144013. doi: 10.1371/journal.pone.0144013

Lindborg, P., and Liu, D. Y. (2015). “Locust wrath: an iOS audience participatory auditory display,” in 21st International Conference on Auditory Display (Austria) 125–132.

Lindborg, P. M. (2015). LW24 [Sculptural Auditory Display]. Singapore: National Gallery.

McCarthy, P. M., and Jarvis, S. (2010). MTLD, vocd-D, and HD-D: a validation study of sophisticated approaches to lexical diversity assessment. Behav. Res. Methods 42, 381–392. doi: 10.3758/BRM.42.2.381

Menninghaus, W., Wagner, V., Wassiliwizky, E., Schindler, I., Hanich, J., Jacobsen, T., et al. (2019). What are aesthetic emotions? Psychol. Rev. 126, 171. doi: 10.1037/rev0000135

Michalke, M., Brown, E., Mirisola, A., Brulet, A., and Hauser, L. (2021). koRpus v0.13-8 (package for R) .

Munzner, T. (2014). Visualization Analysis and Design. New York, NY: CRC Press. doi: 10.1201/b17511

Polli, A. (2011). Communicating air: Alternative pathways to environmental knowing through computational ecomedia (Ph. D. thesis). University of Plymouth, Plymouth, United Kingdom.

R Core Team (2022). R: A Language and Environment for Statistical Computing. Version 4.05 . R Foundation for Statistical Computing.

Ramakrishnan, C., and Greenwood, S. (2009). “Entropy sonification,” in Proceedings of the 15th International Conference on Auditory Display (Copenhagen: Georgia Institute of Technology).

Revelle, W. (2020). psych, Software Package for R. Version 1.9.12.31 .

Revelle, W. (In Press). An Introduction to Psychometric Theory With Applications in R. Available online at: https://www.personality-project.org/r/book/ (accessed January 10, 2023).

Revelle, W., and Rocklin, T. (1979). Very simple structure: an alternative procedure for estimating the optimal number of interpretable factors. Multivariate Behav. Res. 14, 403–414. doi: 10.1207/s15327906mbr1404_2

Roddy, S., and Furlong, D. (2014). Embodied aesthetics in auditory display. Organised Sound 19, 70–77. doi: 10.1017/S1355771813000423

Ruddock, J., Macklin, M., and Harvey, J. (2012). SciArt The Confluence of Art and Science in Conveying the Uncertainties of Climate Change .

Sawe, N., Chafe, C., and Treviño, J. (2020). Using Data sonification to overcome science literacy, numeracy, and visualization barriers in science communication. Front. Commun. 5, 46. doi: 10.3389/fcomm.2020.00046

Schindler, I., Hosoya, G., Menninghaus, W., Beermann, U., Wagner, V., Eid, M., et al. (2017). Measuring aesthetic emotions: a review of the literature and a new assessment tool. PLoS ONE 12, e0178899. doi: 10.1371/journal.pone.0178899

Silvia, P. J. (2005). What Is Interesting? Exploring the Appraisal Structure of Interest. Emotion. 5, 89–102. doi: 10.1037/1528-3542.5.1.89

Tracey, T. J. (2000). “Analysis of circumplex models,” in Handbook of Applied Multivariate Statistics and Mathematical Modeling , 641–664. doi: 10.1016/B978-012691360-6/50023-9

Tuuri, K., and Eerola, T. (2012). “Formulating a Revised Taxonomy for Modes of Listening”, J. New Music Res. 41, 137–152, doi: 10.1080/09298215.2011.614951

Vickers, P. (2016). “Sonification and music, music and sonification,” in The Routledge Companion to Sounding Art (Taylor and Francis), 135–144.

Vickers, P., and Hogg, B. (2006). “Sonification Abstraite/Sonification Concrete: An'aesthetic persepctive space'for classifying auditory displays in the ars musica domain,” in Proceedings of the 12th International Conference on Auditory Display (London: Georgia Institute of Technology).

Vickers, P., Hogg, B., Worrall, D., and Wöllner, C. (2017). “The aesthetics of sonification,” in Body, Sound and Space in Music and Beyond: Multimodal Explorations (Routledge), 89–109. doi: 10.4324/9781315569628-6

Walker, B. N., and Nees, M. A. (2011). “Theory of sonification,” in The Sonification Handbook , eds T. Hermann, A. Hunt, and J. G. Neuhoof (Logos Verlag). 9–39.

Zanella, A., Harrison, C., Lenzi, S., Cooke, J., Damsma, P., and Fleming, S. (2022). Sonification and sound design for astronomy research, education and public engagement. Nat. Astron. 6, 1241–1248. doi: 10.1038/s41550-022-01721-z

Keywords: climate data, science communication, sonification, visualization, aesthetic perspective, circumplexity, exploratory factor analysis, lexical diversity

Citation: Lindborg P, Lenzi S and Chen M (2023) Climate data sonification and visualization: An analysis of topics, aesthetics, and characteristics in 32 recent projects. Front. Psychol. 13:1020102. doi: 10.3389/fpsyg.2022.1020102

Received: 15 August 2022; Accepted: 28 December 2022; Published: 25 January 2023.

Copyright © 2023 Lindborg, Lenzi and Chen. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY) . The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.


TerraClimate: Global, high-resolution gridded temperature, precipitation, and other water balance variables

TerraClimate is a global gridded dataset of meteorological and water balance variables for 1958-present, available on a monthly timestep. Its relatively fine spatial resolution, global extent, and long record length are a unique combination that fills a void in climate data. TerraClimate combines the spatial climatology of WorldClim with time-varying information from the coarser-resolution CRU TS4.0.

In addition to maximum and minimum temperatures and precipitation, TerraClimate provides derived variables including reference evapotranspiration, vapor pressure deficit, and PDSI. Water balance metrics including runoff, snow water equivalent, soil moisture, and climatic water deficit are calculated using a Thornthwaite-Mather climatic water-balance model (e.g., Willmott et al., 1985; Dobrowski et al., 2013) and extractable soil water storage capacity data (Wang-Erlandsson et al., 2016).

These data can be used in species distribution modeling, to approximate local variability and changes where station-based data are lacking or derived variables are preferred, and for climate-impact analyses in ecological, agricultural, and hydrological systems in cases where the spatial attributes of climate may be preferred over coarser-resolution data. TerraClimate inherits uncertainties from its input datasets, and likewise does not improve the spatial scale of climate anomalies over the coarser-resolution parent dataset.

Key Strengths

Combines fine spatial resolution climatology with temporal information for 1958-present

In addition to standard monthly climate summaries, TerraClimate provides variables of more immediate use for surface hydroclimate in ecology and hydrology, including runoff, actual evapotranspiration, soil moisture, and climatic water deficit

Data readily available for download or online visualization

Key Limitations

Climatically aided interpolation transfers anomalies from the coarser-resolution parent product (e.g., CRU TS at 0.5-degree resolution) onto the higher-resolution climatology. As a result, sharp gradients in climate anomalies in montane or near-coastal environments will not be realized: TerraClimate cannot capture temporal variability in climate measures at finer scales than its parent product.

TerraClimate should not be viewed as providing an independent estimate of trends in variables such as temperature.

Underlying uncertainties in the core input datasets, including negative precipitation biases in the mountains of the western US inherited from WorldClim and inhomogeneities in CRU TS or reanalysis, are entrained into TerraClimate. It also uses a very simple water balance model.
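The first limitation follows directly from how climatically aided interpolation works. A minimal numeric sketch (illustrative only; the function and grids here are hypothetical, not TerraClimate's code) shows that each fine-grid value is the fine-scale climatology plus a coarse anomaly expanded onto the fine grid, so no anomaly gradient can be sharper than the parent product's cells:

```python
import numpy as np

def climatically_aided_interpolation(coarse_anomaly, fine_climatology, factor):
    """Toy version of climatically aided interpolation: expand a coarse
    anomaly field onto a fine grid (nearest-neighbour, via np.kron) and
    add it to the fine-resolution climatology. Real implementations use
    smoother interpolation, but the key property is the same: fine-scale
    spatial detail comes only from the climatology."""
    fine_anomaly = np.kron(coarse_anomaly, np.ones((factor, factor)))
    return fine_climatology + fine_anomaly

# Hypothetical 2x2 coarse anomaly field (deg C departure for one month)
coarse = np.array([[1.0, -0.5],
                   [0.0,  2.0]])
# Hypothetical 4x4 fine climatology (deg C), deliberately uniform
clim = np.full((4, 4), 10.0)

result = climatically_aided_interpolation(coarse, clim, factor=2)
# Every 2x2 block of the fine grid shares one anomaly value, so a sharp
# coastal or montane gradient inside a single coarse cell is flattened.
```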

Data Access

Abatzoglou, J. T., S. Z. Dobrowski, S. A. Parks, and K. C. Hegewisch, 2018: TerraClimate, a high-resolution global dataset of monthly climate and climatic water balance from 1958–2015. Scientific Data, 5, https://doi.org/10.1038/sdata.2017.191.

  • TerraClimate 1958-2015 direct download of entire dataset
  • TerraClimate homepage with links to THREDDS server to download individual years, aggregated years, or climatologies (1981-2010 or 1961-1990)
  • TerraClimate direct download, selectable by variable and year
  • TerraClimate visualization tool
  • TerraClimate point data download tool
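For programmatic access, per-variable annual NetCDF files can be fetched from the direct-download server. The URL pattern below is an assumption inferred from the download links above; verify it against the TerraClimate homepage before relying on it. The xarray call is shown commented out to avoid a network dependency:

```python
# Sketch of programmatic access to TerraClimate (URL pattern is an
# assumption, not confirmed by this page).
BASE = "https://climate.northwestknowledge.net/TERRACLIMATE-DATA"

def terraclimate_url(variable: str, year: int) -> str:
    """Build the direct-download URL for one variable-year NetCDF file.
    Variables use TerraClimate's short names, e.g. 'ppt', 'tmax', 'def'."""
    return f"{BASE}/TerraClimate_{variable}_{year}.nc"

url = terraclimate_url("ppt", 2015)
# With xarray installed, the file could then be opened (network access
# required), e.g.:
#   import xarray as xr
#   ds = xr.open_dataset(url)
```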

Expert Developer Guidance

The following was contributed by Dr. John Abatzoglou, July 2019:

The TerraClimate dataset was developed to fill a void in the availability of freely available, moderate-to-high spatial resolution global climate data spanning multiple decades. While datasets with similar attributes are available for individual countries or even continents, the paucity of comparable data across global terrestrial surfaces can be a barrier for end-users who need place-based climate estimates, such as those often required in ecology and hydrology. TerraClimate provides a six-decade record of monthly climate data for global terrestrial surfaces on a ~4-km (1/24th degree) grid by blending desirable spatial attributes from WorldClim V2 with desirable temporal attributes from CRU TS4.0 and reanalysis products through climatically aided interpolation.

In addition to first-order climate measures such as temperature and precipitation, TerraClimate provides derived climate variables that are often of value for climate impact assessments, including reference evapotranspiration (Penman-Monteith ASCE), vapor pressure deficit, and PDSI. A modified Thornthwaite-Mather climatic water-balance model (e.g., Willmott et al., 1985; Dobrowski et al., 2013) and extractable soil water storage capacity data (Wang-Erlandsson et al., 2016) are used to compute the water balance metrics runoff, snow water equivalent, soil moisture, actual evapotranspiration, and climatic water deficit.

These data can be used in species distribution modeling, to approximate local variability and changes where station-based data are lacking or derived variables are preferred, and for climate-impact analyses in ecological, agricultural, and hydrological systems in cases where the spatial attributes of climate may be preferred over coarser-resolution data. Data are stored in Network Common Data Form (NetCDF) format. Both the files and THREDDS-based access to the data are available at the University of Idaho's Northwest Knowledge Network: http://www.climatologylab.org/terraclimate.html . A point-based data extraction tool is also available for users who only need time series for a few locations: https://climatetoolbox.org/tool/Point-Data-Download . Data are also available through Google Earth Engine. Data are updated annually as the parent datasets become available. Metadata files are also produced which report the number of stations (from CRU TS4.0) that contribute to the temporal variability of TerraClimate for temperature, precipitation, and vapor pressure. Estimates from reanalysis (JRA-55) are used for anomalies in cases where zero stations contribute. Temporal variability in solar radiation and wind speed comes entirely from reanalysis.

  • Moderate-to-high spatial resolution monthly climate layers from 1958-present.
  • In addition to standard monthly climate summaries, TerraClimate provides variables of more immediate use for surface hydroclimate in ecology and hydrology, including runoff, actual evapotranspiration, soil moisture, and climatic water deficit.

First, climatically aided interpolation transfers anomalies from the coarser-resolution parent product (e.g., CRU TS at 0.5-degree resolution) onto the higher-resolution climatology. As such, sharp gradients in climate anomalies in montane or near-coastal environments will not be realized, as TerraClimate does not capture temporal variability in climate measures at finer scales than its parent product. Likewise, due to the reliance on other datasets, TerraClimate should not be viewed as providing an independent estimate of trends in variables such as temperature. Underlying uncertainties in the core input datasets, including negative precipitation biases in the mountains of the western US inherited from WorldClim and inhomogeneities in CRU TS or reanalysis, are entrained into TerraClimate (e.g., Abatzoglou et al., 2018).

Therefore, while TerraClimate does not explicitly use station observations, it relies on products that make heavy use of station records and on model assimilation in reanalyses, including those spanning the satellite era. Non-climatic features present in parent datasets may be incorporated into TerraClimate. Efforts have been made to correct for spurious trends in solar radiation and wind speeds in the JRA by incorporating two different reanalysis products to span the 1958-2017 period: JRA-55C for 1958-1978 and ERA-Interim for 1979-2017. We bias-corrected monthly mean data using an overlapping 10-year period (1979-1988) to produce a single continuous time series of anomalies in downward shortwave flux at the surface and 10-m wind speed. Future updates will likely incorporate the single long-term record from ERA5.
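The splice between the two reanalysis products can be sketched as a simple mean-matching ("delta") adjustment over the overlap period. This is a hypothetical minimal version for illustration, not the correction actually applied to TerraClimate:

```python
import numpy as np

def delta_correct(series_a, series_b, overlap):
    """Shift series_a so its mean over the overlap indices matches that
    of series_b, allowing the two records to be concatenated into one
    continuous anomaly series. A minimal mean-matching sketch; the
    operational correction may be more elaborate."""
    a = np.asarray(series_a, dtype=float)
    b = np.asarray(series_b, dtype=float)
    offset = b[overlap].mean() - a[overlap].mean()
    return a + offset

# Hypothetical monthly-mean wind speeds (m/s) from two products that
# cover the same four months but differ by a systematic bias
old_product = np.array([3.0, 3.2, 3.1, 3.3])
new_product = np.array([3.5, 3.7, 3.6, 3.8])
corrected = delta_correct(old_product, new_product, slice(0, 4))
# After correction, the overlap-period means of the two records agree,
# while the old product's month-to-month variability is preserved.
```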

TerraClimate uses a very simple water-balance model to estimate surface water fluxes. This approach does not account for variability in vegetation type or abundance, or intricacies of water storage and fluxes found in more involved hydrologic models. Instead, TerraClimate uses a static reference vegetation layer, a one-dimensional soil water holding capacity derived from remote sensing, and a basic temperature-based snow model.
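The one-bucket approach described above can be illustrated with a minimal monthly water balance (hypothetical numbers and a deliberately simplified scheme, not TerraClimate's actual model):

```python
def bucket_water_balance(precip, pet, capacity):
    """Minimal monthly soil-water bucket in the spirit of a
    Thornthwaite-Mather water balance (an illustrative sketch, not
    TerraClimate's actual model). All quantities in mm. Returns, per
    month, actual evapotranspiration (AET), runoff, and climatic
    water deficit (PET - AET)."""
    soil = capacity  # start with a full bucket
    out = []
    for p, e in zip(precip, pet):
        supply = p + soil                 # water available this month
        aet = min(e, supply)              # limited by demand and supply
        runoff = max(supply - aet - capacity, 0.0)  # overflow past capacity
        soil = min(supply - aet, capacity)          # refill the bucket
        deficit = e - aet                 # unmet atmospheric demand
        out.append((aet, runoff, deficit))
    return out

# Hypothetical wet, moderate, and dry months with a 100-mm bucket
months = bucket_water_balance(precip=[120, 40, 5],
                              pet=[30, 60, 90],
                              capacity=100)
```

In the dry third month the bucket empties, evapotranspiration becomes supply-limited, and the shortfall shows up as climatic water deficit.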

Cite this page

Acknowledgement of any material taken from or knowledge gained from this page is appreciated:

Abatzoglou, John & National Center for Atmospheric Research Staff (Eds). Last modified 2023-08-08. "The Climate Data Guide: TerraClimate: Global, high-resolution gridded temperature, precipitation, and other water balance variables." Retrieved from https://climatedataguide.ucar.edu/climate-data/terraclimate-global-high-resolution-gridded-temperature-precipitation-and-other-water on 2024-08-30.

Citation of datasets is separate and should be done according to the data providers' instructions. If known to us, data citation instructions are given in the Data Access section, above.

Acknowledgement of the Climate Data Guide project is also appreciated:

Schneider, D. P., C. Deser, J. Fasullo, and K. E. Trenberth, 2013: Climate Data Guide Spurs Discovery and Understanding. Eos Trans. AGU, 94, 121–122, https://doi.org/10.1002/2013eo130001

Key Figures

1981-2010 average Dec-February climatic water deficit for Tasmania as represented in TerraClimate (contributed by J. Abatzoglou)

As estimated by TerraClimate: (top) 1981-2010 average Dec-February climatic water deficit for Tasmania (units: mm). (bottom) Time series of DJF climatic water deficit near Mt. Field National Park (42.646S, 146.604E). (contributed by J. Abatzoglou)

average annual temperatures in 2017 according to TerraClimate

TerraClimate: average annual temperature in 2017. Source: https://climate.northwestknowledge.net/TERRACLIMATE/index_animations.php (accessed 14 Dec 2019)

Other Information

Main variables & data classification

  • Input datasets: WorldClim, CRU TS4.0
  • Spatial resolution: ~4 km (1/24th degree)

  • Abatzoglou, J. T., S. Z. Dobrowski, S. A. Parks, and K. C. Hegewisch, 2018: TerraClimate, a high-resolution global dataset of monthly climate and climatic water balance from 1958–2015. Scientific Data, 5.
  • Willmott, C. J., C. M. Rowe, and Y. Mintz, 1985: Climatology of the terrestrial seasonal water cycle. Journal of Climatology, 5, 589–606.
  • Dobrowski, S. Z., J. Abatzoglou, A. K. Swanson, J. A. Greenberg, A. R. Mynsberge, Z. A. Holden, and M. K. Schwartz, 2013: The climate velocity of the contiguous United States during the 20th century. Global Change Biology, 19, 241–251.
  • Wang-Erlandsson, L., and Coauthors, 2016: Global root zone storage capacity from satellite-based evaporation. Hydrology and Earth System Sciences, 20, 1459–1481.
