.. _example13: Example 13 - Using Unstructured Grids ===================================== This workbook provides an example workflow for running particles on an unstructured model output (i.e. flow variables are from a non-Cartesian grid). We make use of several of the geospatial functions in ``particle_track.py`` and others in ``routines.py``, in order to show how to grid hydrodynamic input files, convert UTM coordinates into (and out of) the array coordinates used in routing the particles, as well as how to compute exposure times to a region of interest. Hopefully this example in conjunction with other examples can provide information on how users can adapt these codes to their use-case. Full example workbook available :download:`here <../../../examples/unstructured_grid_anuga.ipynb>`. To demonstrate this functionality, we make use of outputs from the hydrodynamic model `ANUGA `_, which solves the 2D shallow-water equations on a triangular mesh. We’ve extracted outputs from a previous example model run and included these as text files in the repository, so as to avoid importing any dependencies not required by this distribution. If the user is also using ANUGA flow-fields, there is a a commented-out block of code below demonstrating how we extracted the input files for use in this workbook. Import necessary dependencies ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ .. code:: ipython import numpy as np import scipy import matplotlib %matplotlib inline from matplotlib import pyplot as plt import json import dorado import dorado.particle_track as pt Load in model outputs ~~~~~~~~~~~~~~~~~~~~~ If we were starting directly from an ANUGA output file, this is where we would import the outputs from the model run. We have included these files directly with the distribution, but for anyone interested in repeating the steps we used to generate these files, uncomment the block of code below. Here, ``path2file`` should point to the ANUGA output file (e.g. ``./model_output.sww``). This output is a NetCDF file with flow variables (e.g. ``depth``, ``xmom``, ``stage``) listed by triangle index, along with the centroid coordinates (``x``, ``y``) of that triangle. In our case, those coordinates are in meters UTM, which will be relevant later. .. code:: ipython # # Import anuga to access functions (NOTE: Anuga requires Python 2.7!) # import anuga # # Folder name of run to analyze: # path2file = 'examples/example_model.sww' # # Extract the files from NetCDF using Anuga's `get_centroids` function: # swwvals = anuga.utilities.plot_utils.get_centroids(path2file, timeSlices = 'last') # # Query values: time, x, y, stage, elev, height, xmom, ymom, xvel, yvel, friction, vel, etc # # Here, since we are only interested in saving variables, we migrate variables to a dictionary: # # Make sure to filter out NaN's before converting to lists, if there are any # unstructured = dict() # unstructured['x'] = swwvals.x.tolist() # unstructured['y'] = swwvals.y.tolist() # unstructured['depth'] = swwvals.height[0].tolist() # unstructured['stage'] = swwvals.stage[0].tolist() # unstructured['qx'] = swwvals.xmom[0].tolist() # unstructured['qy'] = swwvals.ymom[0].tolist() # # And then we save this dictionary into a json (text) file for later import # json.dump(unstructured, open('unstructured_model.txt', 'w')) # # This generates the file imported in this workbook Here, we will skip the above step and just import the ``unstructured_model.txt`` dictionary. **Note**: We have chosen to save/import the variables ``depth``, ``stage``, ``qx``, and ``qy`` in this application. However, we could have chosen to save and use the fields ``topography``, ``u``, and ``v`` (or in ANUGA’s terminology, ``elev``, ``xvel``, and ``yvel``). The particle tracking code accepts any of these inputs, as long as you provide enough information to calculate the water surface slope, depth of the water column, and the two components of inertia. .. code:: ipython unstructured = json.load(open('unstructured_model.txt')) Convert data and coordinates for particle routing ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ Now that we have the data we need, we can convert it into the format needed by ``dorado``. This will include gridding the hydrodynamic outputs and transforming our geospatial coordinates into “array index” coordinates. First, let’s combine our :math:`(x,y)` coordinates into a list of tuples. This is the expected format for coordinates in the following functions. .. code:: ipython # Use list comprehension to convert into tuples coordinates = [(unstructured['x'][i], unstructured['y'][i]) for i in list(range(len(unstructured['x'])))] # Let's see the extent of our domain print(min(unstructured['x']), max(unstructured['x']), min(unstructured['y']), max(unstructured['y'])) # As well as our number of data points print(len(unstructured['x'])) .. parsed-literal:: (624422.25, 625031.9375, 3346870.0, 3347107.75) 103558 Now, let’s grid our unstructured data into a uniform grid. For this, we make use of the function ``particle_track.unstruct2grid()``, which uses inverse-distance-weighted interpolation to create a Cartesian grid the same size as our model’s extent. To use this function, we need to provide: - Our list of ``coordinates`` (as tuples). - The unstructured data we want to be gridded (here we start with ``depth``). - The desired grid size of the resulting rasters (here we’re using :math:`1 m`, because the test model was on high-resolution lidar data). - The number of :math:`k` nearest neighbors to use in the interpolation. If :math:`k=1`, we use only the nearest datapoint, whereas higher values (default is :math:`k=3`) interpolate the data into a smoother result. The underlying code relies on ``scipy`` to build a ``cKDTree`` of our unstructured data, which maps the datapoints onto a uniform array. ``cKDTree`` is much faster than other gridding functions (e.g. ``scipy.interpolate.griddata``), but building the tree can still be very slow if the dataset is very large or if the desired grid size is very small. The outputs of ``unstruct2grid`` are: - The resulting interpolation function ``myInterp`` (after building the nearest-distance tree), which will be considerably faster than calling ``unstruct2grid`` again if we are gridding additional datasets. This function assumes data have the same coordinates, grid size, and :math:`k`. - A gridded array of our data. .. code:: ipython # Use IDW interpolation interpolate unstructured data into uniform grid myInterp, depth = pt.unstruct2grid(coordinates, unstructured['depth'], 1.0, 3) # Let's plot the resulting grid to see what the output looks like: plt.figure(figsize=(5,5), dpi=200) plt.imshow(depth, cmap='jet') plt.colorbar(fraction=0.018) plt.title('Gridded Depth Array') .. image:: images/example13/unstructured_grid_anuga_10_1.png Now, let’s use the new function ``myInterp`` to grid our additional datasets. If ``unstruct2grid`` took a while to grid the first dataset, this function will be considerably faster than re-running that process, because it re-uses most of the results of that first function call. This function only requires as input the new unstructured data to be gridded. All of these variables will have the same grid size as the first dataset, and we assume that they have all the same coordinates. .. code:: ipython # Grid other data products with new interpolation function stage = myInterp(np.array(unstructured['stage'])) qx = myInterp(np.array(unstructured['qx'])) qy = myInterp(np.array(unstructured['qy'])) # Should be very fast compared to the first dataset! # Let's plot one of these variables to see the new grid plt.figure(figsize=(5,5), dpi=200) plt.imshow(qy, vmin=-2, vmax=2, cmap='seismic') plt.colorbar(fraction=0.018) plt.title('Gridded Y-Discharge Array') .. image:: images/example13/unstructured_grid_anuga_12_1.png **Note:** In all these cases, if your unstructured data does not fill the full rectangular domain, IDW interpolation may still populate those exterior regions with data. If this has potential to cause problems when routing particles, make sure to do some pre-processing on these rasters to correct those exterior regions or crop the domain. Now, let’s figure out where we want to seed our particles. If you’re modeling a real domain, it may be easier to figure out a good release location by opening some GIS software and finding the coordinates of that location. Here, we will use the function ``particle_track.coord2ind()`` to convert your coordinates into array indices. This function requires: - Coordinates to be converted, as a list [] of :math:`(x,y)` tuples - The location of the lower left corner of your rasters (i.e. the origin). If you used ``unstruct2grid`` to generate rasters, this location will be ``[(min(x), min(y))]``. Otherwise, if you’re loading data from e.g. a GeoTIFF, the lower left corner will be stored in the .tif metadata and can be accessed by GIS software or gdalinfo (if the user has GDAL) - The dimensions of the raster, accessible via ``np.shape(raster)`` - The grid size of the raster (here :math:`1m`). **Note:** this coordinate transform flips the orientation of the unit vectors (i.e. :math:`y_{index} = x` and :math:`x_{index} = -y`) and returns raster indices. This is convenient for the internal functions of ``particle_tools.py``, but may cause confusion with plotting or interpreting later if locations are not translated back into spatial coordinates. (Don’t worry, we will convert back later!) We assume in all of these functions that the coordinates you’re using are (at least locally) flat. We do not account for the curvature of the Earth in very large domains. Hopefully you are using a projected coordinate system (here we are using meters UTM), or at least willing to accept a little distortion. Note that this ``coord2ind`` requires units of either meters or decimal degrees. .. code:: ipython # I have found a nice release location in GIS. Let's convert it to index notation: seedloc = [(624464, 3347078)] # Coordinates are in meters UTM # Call the coordinate transform function seedind = pt.coord2ind(seedloc, (min(unstructured['x']), min(unstructured['y'])), np.shape(depth), 1.0) print(seedind) # Visualize the location on our array plt.figure(figsize=(5,5), dpi=200) plt.scatter(seedind[0][1], seedind[0][0], c='r') plt.imshow(depth) plt.colorbar(fraction=0.03) plt.title('Gridded Depth Array') plt.xlim([seedind[0][1]-40, seedind[0][1]+100]) plt.ylim([seedind[0][0]+70, seedind[0][0]-30]) .. parsed-literal:: [(31, 42)] .. image:: images/example13/unstructured_grid_anuga_15_2.png Set up particle routing parameters ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ Now that we have pre-converted the input data we need, let’s set up the particle routing to be run. We do this using the ``particle_track.modelParams`` class, in which we populate the attributes to suit our application. This includes the gridded hydrodynamic outputs from above, the grid size ``dx``, and tuning parameters which influence our random walk. .. code:: ipython # Create the parameters object and then assign the values params = pt.modelParams() # Populate the params attributes params.stage = stage params.depth = depth params.qx = qx params.qy = qy # Other choices/parameters params.dx = 1. # Grid size params.dry_depth = 0.01 # 1 cm considered dry # You can also tell it which model you're using, but this only matters if the answer is DeltaRCM: params.model = 'Anuga' In this application, we are using the default values for the parameters of the random walk (``gamma``, ``theta``, ``diff_coeff``). I encourage you to play with these weights and see how your solution is affected. Generate particles ~~~~~~~~~~~~~~~~~~ Now we instantiate the ``particle_track.Particles`` class, and generate some particles to be routed. Here we are using the ``'random'`` method to generate particles, which seeds them randomly within a specified region. If we knew exactly where we wanted particles, we could call the ``'exact'`` method instead. .. code:: ipython # Now we seed in the region +/- 1 cell of the seed location we computed earlier # Note that "xloc" and "yloc" are x and y in the particle coordinate system! params.seed_xloc = [seedind[0][0]-1, seedind[0][0]+1] params.seed_yloc = [seedind[0][1]-1, seedind[0][1]+1] # For this example, we model 50 particles: params.Np_tracer = 50 # Initialize particles and generate particles particles = pt.Particles(params) particles.generate_particles(Np_tracer, seed_xloc, seed_yloc) .. parsed-literal:: Theta parameter not specified - using 1.0 Gamma parameter not specified - using 0.05 Diffusion coefficient not specified - using 0.2 Cell Types not specified - Estimating from depth Using weighted random walk Run the particle routing ~~~~~~~~~~~~~~~~~~~~~~~~ Now we call on one of the routines, ``routines.steady_plots()``, to run the model. The core of the particle routing occurs in the ``particle_track.run_iteration()`` function, but for ease of use, we have provided several high-level wrappers for the underlying code in the ``routines.py`` script. These routines take common settings, run the particle routing, and save a variety of plots and data for visualization. Because our model is a steady case (i.e. flow-field is not varying with time), ``steady_plots`` will run the particles for an equal number of iterations and return the travel history to us in the ``walk_data`` dict. This dict is organized into ``['xinds']``, ``['yinds']``, and ``['travel_times']``, which are then indexed by particle ID, and then finally iteration number. (e.g. ``walk_data['xinds'][5][10]`` will return the xindex for the 6th particle’s 11th iteration). Note that, while this function returns a ``walk_data`` dictionary, this information is also stored as an attribute of the particles class, accessible via ``particles.walk_data``. .. code:: ipython # Using steady (time-invariant) plotting routine for 200 iterations walk_data = dorado.routines.steady_plots(particles, 200, 'unstructured_grid_anuga') # Outputs will be saved in the folder 'unstructured_grid_anuga' .. parsed-literal:: 100%|################################################################################| 200/200 [02:44<00:00, 1.21it/s] Because the particles take different travel paths, at any given iteration they are *not guaranteed* to be synced up in time. We can check this using the ``routines.get_state()`` function, which allows us to slice the ``walk_data`` dictionary along a given iteration number. This function logically indexes the dict like ``walk_data[:][:][iteration]``, except not quite as simple given the indexing rules of a nested list. By default, this function will return the most recent step (iteration number ``-1``), but we could ask it to slice along any given iteration number. .. code:: ipython xi, yi, ti = dorado.routines.get_state(walk_data) print([round(t, 1) for t in ti]) .. parsed-literal:: [284.1, 292.6, 301.7, 293.1, 294.2, 319.2, 275.1, 332.2, 302.7, 273.2, 303.0, 303.4, 297.2, 304.5, 281.1, 297.7, 270.8, 300.1, 299.0, 318.7, 274.5, 293.8, 288.5, 397.7, 332.9, 274.3, 271.1, 302.7, 298.6, 314.4, 317.9, 284.3, 332.5, 294.8, 313.7, 302.1, 291.5, 311.6, 326.2, 302.1, 274.0, 307.0, 275.0, 273.7, 317.2, 367.2, 272.9, 307.9, 294.5, 280.6] **Note:** There exists an equivalent function, ``get_time_state()``, which slices ``walk_data`` along a given travel time, in case there is interest in viewing the particles in sync. As a brief aside, the particle routing can also be run in an *unsteady* way, in which each particle continues taking steps until each has reached a specified ``target_time``. This can be useful if you want to visualize particle travel times in “real time”, or if you want to sync up their propagation with an unsteady flow field that updates every so often (e.g. every 30 minutes). This can be done either with the ``unsteady_plots()`` routine, or by interacting with ``run_iteration()`` directly. The commented-out block of code below shows an example of what an unsteady case might look like, had we used more timesteps from the model output. .. code:: ipython # # Specify folder to save figures: # path2folder = 'unstructured_grid_anuga' # # Let's say our model outputs update minute: # model_timestep = 60. # Units in seconds # # Number of steps to take in total: # num_steps = 20 # # Create vector of target times # target_times = np.arange(model_timestep, # model_timestep*(num_steps+1), # model_timestep) # # Iterate through model timesteps # for i in list(range(num_steps)): # # The main functional difference with an unsteady model is re-instantiating the # # particle class with updated params *inside* the particle routing loop # # Update the flow field by gridding new time-step # # We don't have additional timesteps, but if we did, we update params here: # params.depth = myInterp(unstructured['depth']) # params.stage = myInterp(unstructured['stage']) # params.qx = myInterp(unstructured['qx']) # params.qy = myInterp(unstructured['qy']) # # Define the particle class and continue # particle = pt.Particles(params) # # Generate some particles # if i == 0: # particle.generate_particles(Np_tracer, seed_xloc, seed_yloc) # else: # particle.generate_particles(0, [], [], 'random', walk_data) # # Run the random walk for this "model timestep" # walk_data = particle.run_iteration(target_times[i]) # # Use get_state() to return original and most recent locations # x0, y0, t0 = dorado.routines.get_state(walk_data, 0) # Starting locations # xi, yi, ti = dorado.routines.get_state(walk_data) # Most recent locations # # Make and save plots and data # fig = plt.figure(dpi=200) # ax = fig.add_subplot(111) # ax.scatter(y0, x0, c='b', s=0.75) # ax.scatter(yi, xi, c='r', s=0.75) # ax = plt.gca() # im = ax.imshow(particle.depth) # plt.title('Depth at Time ' + str(target_times[i])) # cax = fig.add_axes([ax.get_position().x1+0.01, # ax.get_position().y0, # 0.02, # ax.get_position().height]) # cbar = plt.colorbar(im, cax=cax) # cbar.set_label('Water Depth [m]') # plt.savefig(path2folder + '/output_by_dt'+str(i)+'.png') # plt.close() Analyze the outputs ~~~~~~~~~~~~~~~~~~~ Now that we have the walk history stored in ``walk_data``, we can query this dictionary for features of interest. For starters, we can convert the location indices back into geospatial coordinates using the function ``particle_track.ind2coord()``. This will append the existing dictionary with ``['xcoord']`` and ``['ycoord']`` fields in the units we started with (meters or decimal degrees). **Note:** Particle locations are only known to within the specified grid size (i.e. +/- dx/2) .. code:: ipython # Convert particle location indices back into UTM coordinates walk_data = pt.ind2coord(walk_data, (min(unstructured['x']), min(unstructured['y'])), np.shape(depth), 1.0) # To check that things have worked, print starting location of first particle. # Should be within +/- dx from seedloc = (624464, 3347078) print(walk_data['xcoord'][0][0], walk_data['ycoord'][0][0]) .. parsed-literal:: (624465.25, 3347079.0) For something a little more interesting, let’s measure the amount of time particles spent “exposed” to a specific sub-region within our domain. For this we make use of the functions ``particle_track.exposure_time()`` and ``routines.plot_exposure_time()``. If we input a binary array (same size as input arrays) delineating our region of interest (ROI) with 1’s, these functions will compute and plot the exposure time distribution (ETD) of particles in this sub-region. For those familiar with the metric, the ETD is equivalent to the residence time distribution (RTD) for steady flows, with the only difference being that if particles make multiple excursions into our ROI, all those times are counted. **Note:** For a representative ETD, it is important to run a *lot* of particles. A large sample size is needed to obtain a realistic distribution (and smooth plots). Depending on the domain, we recommend at least :math:`O(10^3)`. First, let’s generate and visualize the ROI: .. code:: ipython # Create the array regions = np.zeros_like(depth, dtype='int') regions[:,100:200] = 1 # Include anywhere above sea level # Visualize the region plt.figure(figsize=(5,5), dpi=200) plt.imshow(depth) plt.imshow(regions, cmap='bone', alpha=0.3) .. image:: images/example13/unstructured_grid_anuga_29_1.png Then compute. ``exposure_time()`` outputs a list of exposure times by particle index, and ``plot_exposure_time()`` will use those values to generate plots of the cumulative and differential forms of the ETD (i.e. the CDF and PDF, respectively). .. code:: ipython # Measure exposure times exposure_times = pt.exposure_time(walk_data, regions) # Then generate plots and save data exposure_times = dorado.routines.plot_exposure_time(walk_data, exposure_times, 'unstructured_grid_anuga/figs', timedelta = 60, nbins=20) # Changing 'timedelta' will change the units of the time-axis. # Units are seconds, so 60 will plot by minute. # Because we are using fewer particles than ideal, smooth the plots with small 'nbins' .. parsed-literal:: 100%|#################################################################################| 50/50 [00:00<00:00, 1190.48it/s] .. image:: images/example13/unstructured_grid_anuga_31_2.png .. image:: images/example13/unstructured_grid_anuga_31_3.png **Note:** If any particles are still in the ROI at the end of their travel history, they are excluded from plots. These particles are not done being “exposed,” so we need to run more iterations in order to capture the tail of the distribution.