

Source: projects.iq.harvard.edu



NoisePy: a new high-performance python tool for ambient noise seismology

Chengxin Jiang¹∗ and Marine A. Denolle¹
¹Department of Earth and Planetary Sciences, Harvard University, MA, USA
∗Corresponding author: Chengxin Jiang (chengxinjiang@gmail.com)

Abstract
The fast-growing interest in high spatial resolution of seismic imaging and high temporal resolution of seismic monitoring poses great challenges for fast, efficient, and stable data processing in ambient noise seismology. This coincides with the explosion of available seismic data in the last few years. However, the current computational landscape of ambient seismic field seismology remains highly heterogeneous, with individual researchers building their own homegrown codes. Here, we present NoisePy, a new high-performance Python tool designed specifically for large-scale ambient noise seismology. NoisePy provides most of the processing techniques for the ambient field data and the correlations found in the literature, along with parallel download routines, dispersion analysis, and monitoring subroutines. NoisePy takes advantage of ASDF, a parallel I/O enabled HDF5 data format designed for seismology, for a structured organization of the cross-correlation data. NoisePy exploits the embarrassingly parallel nature of computing the noise correlations over time windows using MPI. Thus, NoisePy exhibits strong scaling with the number of cores, a small memory overhead, and stable memory usage. Benchmark comparisons with the latest version of MSNoise demonstrate an improvement of about a factor of four in the compute time of the cross correlations, which is the slowest step of ambient noise seismology. NoisePy is suitable for ambient noise seismology of various data sizes, and it has been tested successfully on data sets ranging from a few GB to several tens of TB.

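The parallelism over time windows mentioned above can be illustrated with a minimal sketch. This is not NoisePy's code: the function name `windows_for_rank` and the round-robin split are assumptions chosen for illustration. In an MPI program, `rank` and `size` would typically come from the communicator (for example `comm.Get_rank()` and `comm.Get_size()` in mpi4py); here they are plain integers so that the sketch runs without MPI.

```python
def windows_for_rank(n_windows, rank, size):
    """Return the indices of the time windows handled by one rank.

    Hypothetical helper: a round-robin split, so that no communication
    between ranks is needed while the windows are being processed.
    """
    if size < 1 or not 0 <= rank < size:
        raise ValueError("need size >= 1 and 0 <= rank < size")
    return list(range(rank, n_windows, size))


# Made-up job: 10 time windows (e.g. days) shared by 4 ranks.
n_windows, size = 10, 4
assignment = {rank: windows_for_rank(n_windows, rank, size) for rank in range(size)}
for rank, windows in assignment.items():
    print(f"rank {rank}: windows {windows}")
```

Because every window is assigned to exactly one rank and the windows are independent, each rank can correlate its own windows without waiting for the others.
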
1 Introduction
With more than two decades of flourishing developments both in methodologies and in scientific discoveries, the use of the ambient seismic field in seismology is now well established. It has been a prime tool for structure imaging at a broad range of length scales, from reservoir scales (de Ridder and Dellinger, 2011; Lin et al., 2013; Mordret et al., 2013; Nakata et al., 2015; Chmiel et al., 2019), to regional scales (Campillo and Paul, 2003; Shapiro et al., 2005; Sabra et al., 2005; Yao et al., 2006; Brenguier et al., 2007; Lin et al., 2008; Gao et al., 2011; Porritt et al., 2011; Yang et al., 2012; Ward et al., 2013; Chen et al., 2014; Jiang et al., 2014; Xie et al., 2015; Bao et al., 2015; Obermann et al., 2016; Lynner and Porritt, 2017; Bowden et al., 2017; Li et al., 2017; Delph et al., 2018; Jiang et al., 2018; Berg et al., 2018; Wang et al., 2018; Deng et al., 2019; Liu et al., 2019), and continental scales (Yang et al., 2007; Ekström et al., 2009; Saygin and Kennett, 2010, 2012; Zhao et al., 2016; Shen and Ritzwoller, 2016; Shen et al., 2016). Ambient-noise seismology is also used to monitor tectonic, volcanic, and environmental processes (e.g., Brenguier et al., 2008a,b; Ermert et al., 2015; Mordret et al., 2016; Viens et al., 2017; Wang et al., 2017a; Clements and Denolle, 2018; Taira et al., 2018; Yates et al., 2019; Mao et al., 2019a). Finally, other applications include the prediction of long-period ground motions (Prieto and Beroza, 2008; Viens et al., 2015; Denolle et al., 2014a,b, 2018). These standard approaches rely mainly on the processing of continuous time series. To enable new scientific discovery, seismologists have to enhance spatial and temporal resolution, and thus they have to process larger amounts of seismic data.

The key challenges in data processing are the computation and the storage of the inter-station cross correlations, both of which scale quadratically with the number of stations (or channels) and linearly with time. Large-N arrays (Lin et al., 2013; Nakata et al., 2015; Wang et al., 2017b; Karplus and Schmandt, 2018; Ranasinghe et al., 2018; Mordret et al., 2018; Keifer et al., 2019; Meng et al., 2019), Distributed Acoustic Sensing with optical fibers (e.g., Dou et al., 2017; Zeng et al., 2017; Martin et al., 2018; Yu et al., 2019; Williams et al., 2019), and whole-network analysis (e.g., Shen and Ritzwoller, 2016; Zhao et al., 2016; Bowden et al., 2017) are becoming the standards for data collection and analysis. These studies often involve hundreds to thousands of sensors/channels. The computational requirements increase further when studies track the temporal evolution of the cross-correlation functions (Wang et al., 2017a). In general, seismic studies involving hundreds of channels or more at moderate-to-high sampling rates (10-100 Hz) require more elaborate code designs and data management, such as I/O strategies and choices in parallelization on clusters.

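To make the quadratic-in-stations, linear-in-time scaling concrete, the following sketch counts the cross-correlation functions for a few hypothetical array sizes. The function names, the choice to count auto-correlations, the single-component assumption, and the 365 daily windows are illustrative assumptions, not values taken from NoisePy.

```python
def n_station_pairs(n_stations, include_auto=True):
    """Number of unique station pairs for single-component data."""
    pairs = n_stations * (n_stations - 1) // 2
    # Auto-correlations add one extra "pair" per station.
    return pairs + n_stations if include_auto else pairs


def n_correlations(n_stations, n_windows, include_auto=True):
    """Total correlation functions: pairs (quadratic) times windows (linear)."""
    return n_station_pairs(n_stations, include_auto) * n_windows


# Assumed example: one year of daily windows for three array sizes.
for n in (10, 100, 1000):
    print(n, n_station_pairs(n), n_correlations(n, n_windows=365))
```

Under these assumptions, a tenfold increase in the number of stations raises the number of pairs roughly a hundredfold, while doubling the recording duration only doubles the total.
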
One companion, yet basic, problem with such large data sets is the organization of the database, which usually becomes an individual choice based on specific research projects. When studying the temporal evolution of a local structure, seismologists may tend to organize the data per time period. When imaging the spatial variations in the elastic structure of a wide area, which requires information buried in the inter-station correlations regardless of the time period, seismologists may tend to organize the data per station/channel. Another challenge in data storage is the number of files: a single cross-correlation function is usually small in size, and individual studies may generate millions of such small files. High-performance computing (HPC) and high-throughput computing (HTC) centers use filesystems that cannot handle a large number of small files efficiently. Therefore, the data ought to be stored and organized in large, parallel I/O enabled data containers such as HDF5 and NetCDF. The Adaptable Seismic Data Format (ASDF; Krischer et al., 2016) is one such file format that uses the HDF5 container to store large time series and metadata. ASDF is easily read in C++, Fortran, Julia, and Python, making it a portable data format between high-level languages (like Python) and high-performance computing languages (like C++ and Fortran).

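The two organization schemes described above (per time period versus per station pair) can be sketched with nested dictionaries standing in for the groups of an HDF5-style container. This is only an illustration of the layout choice: the station names, dates, sample values, and the helper `group_records` are made up, and no ASDF or HDF5 library is used.

```python
from collections import defaultdict

# Made-up records: (station pair, day) -> placeholder correlation samples.
records = {
    ("STA1_STA2", "2019-01-01"): [0.1, 0.2],
    ("STA1_STA2", "2019-01-02"): [0.1, 0.3],
    ("STA1_STA3", "2019-01-01"): [0.0, 0.4],
    ("STA1_STA3", "2019-01-02"): [0.2, 0.1],
}


def group_records(records, by):
    """Group records into containers keyed by 'time' or by 'pair'."""
    if by not in ("time", "pair"):
        raise ValueError("by must be 'time' or 'pair'")
    containers = defaultdict(dict)
    for (pair, day), data in records.items():
        outer, inner = (day, pair) if by == "time" else (pair, day)
        containers[outer][inner] = data
    return dict(containers)


by_time = group_records(records, "time")  # one container per day (monitoring)
by_pair = group_records(records, "pair")  # one container per pair (imaging)

print(sorted(by_time))
print(sorted(by_pair["STA1_STA2"]))
```

Either layout holds the same four records in two containers instead of four separate small files; the difference is which access pattern (all pairs for one day, or all days for one pair) needs to open only a single container.
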
Python has become the new standard, open-source, high-level language for seismic processing. For example, ObsPy is now the most popular toolbox for seismology. Python is also the most popular language for machine learning libraries (e.g., TensorFlow, PyTorch, and Keras), and machine learning is also becoming standard practice in the seismology community (e.g., Kong et al., 2018; Bergen et al., 2019).

Several generic software packages provide the functionality to compute the cross correlation of time series, such as Seismic Analysis Code (SAC; Goldstein et al., 2003), Computer Programs in Seismology (CPS; Herrmann, 2013), and ObsPy (Beyreuther et al., 2010; Megies et al., 2011). MSNoise