{ "cells": [ { "cell_type": "markdown", "id": "0f4d78e0-119f-4875-8780-6752003875ce", "metadata": {}, "source": [ "# BioSCape Data Access" ] }, { "cell_type": "markdown", "id": "3f87d843-edc9-43fc-96fc-b068f5e29896", "metadata": {}, "source": [ "**The BioSCape data has been reprocessed, and Version 2 is now available.** The data is stored in an S3 bucket associated with the SMCE environment and can be accessed in several ways:\n" ] }, { "cell_type": "markdown", "id": "2e43378a-9407-434a-b261-a0d993f7572c", "metadata": {}, "source": [ "## 1. Intake Catalog\n", "\n", "The simplest and fastest method of access is through the BioSCape intake catalog. Entire scenes typically load in around 20 to 40 seconds. The catalog is backed by reference files built with the `virtualizarr` library, which significantly improves read performance. You can access **reflectance**, **radiance**, and **observation (obs)** data through this method.\n", "\n", "**Support for additional datasets, such as PRISM or LVIS data, is under development.**\n", "\n", "### Dependencies\n", "\n", "Intake is currently undergoing significant changes. To ensure compatibility, please pin the following versions in your conda environment:\n", "\n", "- `intake=2.0.7`\n", "- `intake-xarray=2.0.0`\n", "- `xarray=2024.11.0`\n", "- `zarr=2.18.4`\n", "- `fsspec=2024.12.0`\n", "- `dask=2024.12.1`\n", "- `s3fs=2024.12.0`\n", "\n" ] }, { "cell_type": "code", "execution_count": 1, "id": "6ca7cc94-9fc9-466e-b78e-1ddd768a05d1", "metadata": {}, "outputs": [ { "data": { "text/html": [ "
<xarray.Dataset> Size: 3GB\n",
       "Dimensions:                    (y: 1795, x: 906, wavelength: 425)\n",
       "Coordinates:\n",
       "    transverse_mercator        object 8B ...\n",
       "  * wavelength                 (wavelength) float32 2kB 377.2 ... 2.501e+03\n",
       "  * x                          (x) float64 7kB 2.992e+05 2.992e+05 ... 3.049e+05\n",
       "  * y                          (y) float64 14kB 6.297e+06 ... 6.286e+06\n",
       "Data variables:\n",
       "    aerosol_optical_thickness  (y, x) float32 7MB dask.array<chunksize=(256, 256), meta=np.ndarray>\n",
       "    fwhm                       (wavelength) float32 2kB dask.array<chunksize=(425,), meta=np.ndarray>\n",
       "    reflectance                (wavelength, y, x) float32 3GB dask.array<chunksize=(10, 256, 256), meta=np.ndarray>\n",
       "    water_vapor                (y, x) float32 7MB dask.array<chunksize=(256, 256), meta=np.ndarray>\n",
       "Attributes: (12/23)\n",
       "    Conventions:                       CF-1.6\n",
       "    creator_name:                      Jet Propulsion Laboratory/California I...\n",
       "    creator_url:                       aviris.jpl.nasa.gov\n",
       "    date_created:                      2024-11-25T19:57:23Z\n",
       "    flight_line:                       ang20231022t092801\n",
       "    identifier_product_doi_authority:  https://doi.org\n",
       "    ...                                ...\n",
       "    sensor:                            Airborne Visible / Infrared Imaging Sp...\n",
       "    software_build_version:            002\n",
       "    summary:                           The Airborne Visible / Infrared Imagin...\n",
       "    time_coverage_end:                 2023-10-22T09:33:34Z\n",
       "    time_coverage_start:               2023-10-22T09:33:34Z\n",
       "    title:                             AVIRIS-NG L2A Surface reflectance (fli...
" ], "text/plain": [ " Size: 3GB\n", "Dimensions: (y: 1795, x: 906, wavelength: 425)\n", "Coordinates:\n", " transverse_mercator object 8B ...\n", " * wavelength (wavelength) float32 2kB 377.2 ... 2.501e+03\n", " * x (x) float64 7kB 2.992e+05 2.992e+05 ... 3.049e+05\n", " * y (y) float64 14kB 6.297e+06 ... 6.286e+06\n", "Data variables:\n", " aerosol_optical_thickness (y, x) float32 7MB dask.array\n", " fwhm (wavelength) float32 2kB dask.array\n", " reflectance (wavelength, y, x) float32 3GB dask.array\n", " water_vapor (y, x) float32 7MB dask.array\n", "Attributes: (12/23)\n", " Conventions: CF-1.6\n", " creator_name: Jet Propulsion Laboratory/California I...\n", " creator_url: aviris.jpl.nasa.gov\n", " date_created: 2024-11-25T19:57:23Z\n", " flight_line: ang20231022t092801\n", " identifier_product_doi_authority: https://doi.org\n", " ... ...\n", " sensor: Airborne Visible / Infrared Imaging Sp...\n", " software_build_version: 002\n", " summary: The Airborne Visible / Infrared Imagin...\n", " time_coverage_end: 2023-10-22T09:33:34Z\n", " time_coverage_start: 2023-10-22T09:33:34Z\n", " title: AVIRIS-NG L2A Surface reflectance (fli..." ] }, "execution_count": 1, "metadata": {}, "output_type": "execute_result" } ], "source": [ "import intake\n", "import xarray as xr\n", "import numpy as np\n", "\n", "# Load the catalog\n", "catalog = intake.open_catalog('s3://bioscape-data/bioscape_avng_v2.yaml')\n", "\n", "# Each flightline is divided into smaller scenes. Each scene has a reflectance, radiance and observation file associated with it\n", "data = [catalog.ang20231022t092801.ang20231022t092801_005_RFL, catalog.ang20231022t092801.ang20231022t092801_005_RDN, catalog.ang20231022t092801.ang20231022t092801_005_OBS]\n", "\n", "# Use read_chunked() or to_dask() to access the data via xarray. 
Other methods might load the entire scene into memory\n", "# The crs should already be encoded and can be accessed via ds.rio.crs\n", "ds = data[0].read_chunked()\n", "ds" ] }, { "cell_type": "code", "execution_count": 2, "id": "c42711e9-bf62-4902-9d2a-4e6a2a83c0c7", "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "CPU times: user 20.8 s, sys: 3.04 s, total: 23.8 s\n", "Wall time: 26.8 s\n" ] }, { "data": { "text/html": [ "
<xarray.DataArray 'reflectance' (wavelength: 425, y: 1795, x: 906)> Size: 3GB\n",
       "array([[[nan, nan, nan, ..., nan, nan, nan],\n",
       "        [nan, nan, nan, ..., nan, nan, nan],\n",
       "        [nan, nan, nan, ..., nan, nan, nan],\n",
       "        ...,\n",
       "        [nan, nan, nan, ..., nan, nan, nan],\n",
       "        [nan, nan, nan, ..., nan, nan, nan],\n",
       "        [nan, nan, nan, ..., nan, nan, nan]],\n",
       "\n",
       "       [[nan, nan, nan, ..., nan, nan, nan],\n",
       "        [nan, nan, nan, ..., nan, nan, nan],\n",
       "        [nan, nan, nan, ..., nan, nan, nan],\n",
       "        ...,\n",
       "        [nan, nan, nan, ..., nan, nan, nan],\n",
       "        [nan, nan, nan, ..., nan, nan, nan],\n",
       "        [nan, nan, nan, ..., nan, nan, nan]],\n",
       "\n",
       "       [[nan, nan, nan, ..., nan, nan, nan],\n",
       "        [nan, nan, nan, ..., nan, nan, nan],\n",
       "        [nan, nan, nan, ..., nan, nan, nan],\n",
       "        ...,\n",
       "...\n",
       "        [nan, nan, nan, ..., nan, nan, nan],\n",
       "        [nan, nan, nan, ..., nan, nan, nan],\n",
       "        [nan, nan, nan, ..., nan, nan, nan]],\n",
       "\n",
       "       [[nan, nan, nan, ..., nan, nan, nan],\n",
       "        [nan, nan, nan, ..., nan, nan, nan],\n",
       "        [nan, nan, nan, ..., nan, nan, nan],\n",
       "        ...,\n",
       "        [nan, nan, nan, ..., nan, nan, nan],\n",
       "        [nan, nan, nan, ..., nan, nan, nan],\n",
       "        [nan, nan, nan, ..., nan, nan, nan]],\n",
       "\n",
       "       [[nan, nan, nan, ..., nan, nan, nan],\n",
       "        [nan, nan, nan, ..., nan, nan, nan],\n",
       "        [nan, nan, nan, ..., nan, nan, nan],\n",
       "        ...,\n",
       "        [nan, nan, nan, ..., nan, nan, nan],\n",
       "        [nan, nan, nan, ..., nan, nan, nan],\n",
       "        [nan, nan, nan, ..., nan, nan, nan]]],\n",
       "      shape=(425, 1795, 906), dtype=float32)\n",
       "Coordinates:\n",
       "    transverse_mercator  object 8B '0.0'\n",
       "  * wavelength           (wavelength) float32 2kB 377.2 382.2 ... 2.501e+03\n",
       "  * x                    (x) float64 7kB 2.992e+05 2.992e+05 ... 3.049e+05\n",
       "  * y                    (y) float64 14kB 6.297e+06 6.297e+06 ... 6.286e+06\n",
       "Attributes:\n",
       "    _QuantizeBitGroomNumberOfSignificantDigits:  5\n",
       "    long_name:                                   Surface hemispherical direct...\n",
       "    orthorectified:                              True
" ], "text/plain": [ " Size: 3GB\n", "array([[[nan, nan, nan, ..., nan, nan, nan],\n", " [nan, nan, nan, ..., nan, nan, nan],\n", " [nan, nan, nan, ..., nan, nan, nan],\n", " ...,\n", " [nan, nan, nan, ..., nan, nan, nan],\n", " [nan, nan, nan, ..., nan, nan, nan],\n", " [nan, nan, nan, ..., nan, nan, nan]],\n", "\n", " [[nan, nan, nan, ..., nan, nan, nan],\n", " [nan, nan, nan, ..., nan, nan, nan],\n", " [nan, nan, nan, ..., nan, nan, nan],\n", " ...,\n", " [nan, nan, nan, ..., nan, nan, nan],\n", " [nan, nan, nan, ..., nan, nan, nan],\n", " [nan, nan, nan, ..., nan, nan, nan]],\n", "\n", " [[nan, nan, nan, ..., nan, nan, nan],\n", " [nan, nan, nan, ..., nan, nan, nan],\n", " [nan, nan, nan, ..., nan, nan, nan],\n", " ...,\n", "...\n", " [nan, nan, nan, ..., nan, nan, nan],\n", " [nan, nan, nan, ..., nan, nan, nan],\n", " [nan, nan, nan, ..., nan, nan, nan]],\n", "\n", " [[nan, nan, nan, ..., nan, nan, nan],\n", " [nan, nan, nan, ..., nan, nan, nan],\n", " [nan, nan, nan, ..., nan, nan, nan],\n", " ...,\n", " [nan, nan, nan, ..., nan, nan, nan],\n", " [nan, nan, nan, ..., nan, nan, nan],\n", " [nan, nan, nan, ..., nan, nan, nan]],\n", "\n", " [[nan, nan, nan, ..., nan, nan, nan],\n", " [nan, nan, nan, ..., nan, nan, nan],\n", " [nan, nan, nan, ..., nan, nan, nan],\n", " ...,\n", " [nan, nan, nan, ..., nan, nan, nan],\n", " [nan, nan, nan, ..., nan, nan, nan],\n", " [nan, nan, nan, ..., nan, nan, nan]]],\n", " shape=(425, 1795, 906), dtype=float32)\n", "Coordinates:\n", " transverse_mercator object 8B '0.0'\n", " * wavelength (wavelength) float32 2kB 377.2 382.2 ... 2.501e+03\n", " * x (x) float64 7kB 2.992e+05 2.992e+05 ... 3.049e+05\n", " * y (y) float64 14kB 6.297e+06 6.297e+06 ... 
6.286e+06\n", "Attributes:\n", " _QuantizeBitGroomNumberOfSignificantDigits: 5\n", " long_name: Surface hemispherical direct...\n", " orthorectified: True" ] }, "execution_count": 2, "metadata": {}, "output_type": "execute_result" } ], "source": [ "%%time\n", "# Read the reflectance data into memory\n", "ds.reflectance.compute()" ] }, { "cell_type": "markdown", "id": "b1909fcd-6113-497b-8b57-e3c9984bf58f", "metadata": {}, "source": [ "The **observation** and **radiance** data are not orthorectified, but geometry lookup tables (GLTs) are provided with each scene as the `line` and `sample` variables. The following code can be used to generate an orthorectified data array, if required.\n" ] }, { "cell_type": "code", "execution_count": 5, "id": "9956e066-989c-4d33-8b64-c58180aae1cd", "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "CPU times: user 1.37 s, sys: 151 ms, total: 1.52 s\n", "Wall time: 3.04 s\n" ] }, { "data": { "text/html": [ "
<xarray.Dataset> Size: 7MB\n",
       "Dimensions:              (northing: 1795, easting: 906)\n",
       "Coordinates:\n",
       "  * northing             (northing) float64 14kB 6.297e+06 ... 6.286e+06\n",
       "    transverse_mercator  object 8B ...\n",
       "  * easting              (easting) float64 7kB 2.992e+05 2.992e+05 ... 3.049e+05\n",
       "    spatial_ref          int64 8B 0\n",
       "Data variables:\n",
       "    elev                 (northing, easting) float32 7MB nan nan nan ... nan nan\n",
       "Attributes: (12/23)\n",
       "    Conventions:                       CF-1.6\n",
       "    creator_name:                      Jet Propulsion Laboratory/California I...\n",
       "    creator_url:                       aviris.jpl.nasa.gov\n",
       "    date_created:                      2024-11-25T19:33:58Z\n",
       "    flight_line:                       ang20231022t092801\n",
       "    identifier_product_doi_authority:  https://doi.org\n",
       "    ...                                ...\n",
       "    sensor:                            Airborne Visible / Infrared Imaging Sp...\n",
       "    software_build_version:            002\n",
       "    summary:                           The Airborne Visible / Infrared Imagin...\n",
       "    time_coverage_end:                 2023-10-22T09:34:38Z\n",
       "    time_coverage_start:               2023-10-22T09:33:34Z\n",
       "    title:                             AVIRIS-NG L1B Observation Parameters (...
" ], "text/plain": [ " Size: 7MB\n", "Dimensions: (northing: 1795, easting: 906)\n", "Coordinates:\n", " * northing (northing) float64 14kB 6.297e+06 ... 6.286e+06\n", " transverse_mercator object 8B ...\n", " * easting (easting) float64 7kB 2.992e+05 2.992e+05 ... 3.049e+05\n", " spatial_ref int64 8B 0\n", "Data variables:\n", " elev (northing, easting) float32 7MB nan nan nan ... nan nan\n", "Attributes: (12/23)\n", " Conventions: CF-1.6\n", " creator_name: Jet Propulsion Laboratory/California I...\n", " creator_url: aviris.jpl.nasa.gov\n", " date_created: 2024-11-25T19:33:58Z\n", " flight_line: ang20231022t092801\n", " identifier_product_doi_authority: https://doi.org\n", " ... ...\n", " sensor: Airborne Visible / Infrared Imaging Sp...\n", " software_build_version: 002\n", " summary: The Airborne Visible / Infrared Imagin...\n", " time_coverage_end: 2023-10-22T09:34:38Z\n", " time_coverage_start: 2023-10-22T09:33:34Z\n", " title: AVIRIS-NG L1B Observation Parameters (..." ] }, "execution_count": 5, "metadata": {}, "output_type": "execute_result" } ], "source": [ "%%time\n", "# 2D example\n", "\n", "# OBS data\n", "ds = data[2].read_chunked()\n", "\n", "# This assumes nan is the nodata value\n", "line = ds.line.where(~ds.line.isnull(), -9999).astype(int)\n", "sample = ds.sample.where(~ds.sample.isnull(), -9999).astype(int)\n", "\n", "# Generate a nodata mask\n", "mask = (line != -9999) & (sample != -9999)\n", "\n", "# The tables have negative values where a nearest neighbor value was inserted, We need to switch these to posivitive in order to perform the broadcasting operation\n", "line = xr.where((line < 0) & (line != -9999), -line, line) \n", "sample = xr.where((sample < 0) & (sample != -9999), -sample, sample)\n", "\n", "valid_glt = mask.values\n", "\n", "# Create an output dataset, since this is a 2D array the shape of line or sample will be the shape of our output\n", "out_ds = np.zeros((line.shape[0], line.shape[1]), dtype=np.float32) + np.nan\n", "\n", "# 
Load the data we want to orthorectify into memory\n", "ds_array = ds.elev.values\n", "\n", "# Perform the broadcasting operation. A larger chunk_size makes the operation faster but requires more memory.\n", "chunk_size = 500\n", "for start in range(0, valid_glt.shape[0], chunk_size):\n", "    rows = slice(start, min(start + chunk_size, valid_glt.shape[0]))\n", "    valid = valid_glt[rows, :]\n", "    # GLT indices are 1-based, hence the -1\n", "    out_ds[rows][valid] = ds_array[line.values[rows][valid] - 1, sample.values[rows][valid] - 1]\n", "\n", "\n", "# Use the outputs to create an Xarray dataset\n", "coords = {\n", "    'northing': ds.northing,\n", "    'easting': ds.easting,\n", "    'transverse_mercator': ds.transverse_mercator\n", "}\n", "\n", "data_vars = {\n", "    'elev': (['northing', 'easting'], out_ds),\n", "}\n", "\n", "ds_out = xr.Dataset(data_vars, coords=coords, attrs=ds.attrs)\n", "# Importing rioxarray registers the .rio accessor (harmless if it is already imported)\n", "import rioxarray\n", "ds_out.rio.write_crs(ds_out.transverse_mercator.attrs['crs_wkt'], inplace=True)\n", "ds_out" ] }, { "cell_type": "code", "execution_count": 30, "id": "f9e699d0-b781-489a-8f52-0448f9cc5742", "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "CPU times: user 9.92 s, sys: 2.58 s, total: 12.5 s\n", "Wall time: 16.7 s\n" ] }, { "data": { "text/html": [ "
<xarray.Dataset> Size: 3GB\n",
       "Dimensions:              (wavelength: 425, northing: 1795, easting: 906)\n",
       "Coordinates:\n",
       "    transverse_mercator  object 8B ...\n",
       "  * wavelength           (wavelength) float32 2kB 377.2 382.2 ... 2.501e+03\n",
       "  * northing             (northing) float64 14kB 6.297e+06 ... 6.286e+06\n",
       "  * easting              (easting) float64 7kB 2.992e+05 2.992e+05 ... 3.049e+05\n",
       "    spatial_ref          int64 8B 0\n",
       "Data variables:\n",
       "    radiance             (wavelength, northing, easting) float32 3GB nan ... nan\n",
       "    fwhm                 (wavelength) float32 2kB dask.array<chunksize=(425,), meta=np.ndarray>\n",
       "Attributes: (12/23)\n",
       "    Conventions:                       CF-1.6\n",
       "    creator_name:                      Jet Propulsion Laboratory/California I...\n",
       "    creator_url:                       aviris.jpl.nasa.gov\n",
       "    date_created:                      2024-11-25T19:40:38Z\n",
       "    flight_line:                       ang20231022t092801\n",
       "    identifier_product_doi_authority:  https://doi.org\n",
       "    ...                                ...\n",
       "    sensor:                            Airborne Visible / Infrared Imaging Sp...\n",
       "    software_build_version:            002\n",
       "    summary:                           The Airborne Visible / Infrared Imagin...\n",
       "    time_coverage_end:                 2023-10-22T09:34:38Z\n",
       "    time_coverage_start:               2023-10-22T09:33:34Z\n",
       "    title:                             AVIRIS-NG L1B Calibrated Radiance (fli...
" ], "text/plain": [ " Size: 3GB\n", "Dimensions: (wavelength: 425, northing: 1795, easting: 906)\n", "Coordinates:\n", " transverse_mercator object 8B ...\n", " * wavelength (wavelength) float32 2kB 377.2 382.2 ... 2.501e+03\n", " * northing (northing) float64 14kB 6.297e+06 ... 6.286e+06\n", " * easting (easting) float64 7kB 2.992e+05 2.992e+05 ... 3.049e+05\n", " spatial_ref int64 8B 0\n", "Data variables:\n", " radiance (wavelength, northing, easting) float32 3GB nan ... nan\n", " fwhm (wavelength) float32 2kB dask.array\n", "Attributes: (12/23)\n", " Conventions: CF-1.6\n", " creator_name: Jet Propulsion Laboratory/California I...\n", " creator_url: aviris.jpl.nasa.gov\n", " date_created: 2024-11-25T19:40:38Z\n", " flight_line: ang20231022t092801\n", " identifier_product_doi_authority: https://doi.org\n", " ... ...\n", " sensor: Airborne Visible / Infrared Imaging Sp...\n", " software_build_version: 002\n", " summary: The Airborne Visible / Infrared Imagin...\n", " time_coverage_end: 2023-10-22T09:34:38Z\n", " time_coverage_start: 2023-10-22T09:33:34Z\n", " title: AVIRIS-NG L1B Calibrated Radiance (fli..." 
] }, "execution_count": 30, "metadata": {}, "output_type": "execute_result" } ], "source": [ "%%time\n", "# 3D example\n", "import rioxarray  # registers the .rio accessor used by write_crs below\n", "\n", "# Radiance data\n", "ds = data[1].read_chunked()\n", "\n", "# This assumes nan is the nodata value\n", "line = ds.line.where(~ds.line.isnull(), -9999).astype(int)\n", "sample = ds.sample.where(~ds.sample.isnull(), -9999).astype(int)\n", "\n", "# Generate a nodata mask\n", "mask = (line != -9999) & (sample != -9999)\n", "\n", "# The tables hold negative values where a nearest-neighbor value was inserted. Flip these to positive so they can be used as indices in the broadcasting operation\n", "line = xr.where((line < 0) & (line != -9999), -line, line)\n", "sample = xr.where((sample < 0) & (sample != -9999), -sample, sample)\n", "\n", "valid_glt = mask.values\n", "\n", "# Create the output array: one band per wavelength (425 for this scene), with the spatial shape of line/sample\n", "out_ds = np.full((ds.sizes['wavelength'], line.shape[0], line.shape[1]), np.nan, dtype=np.float32)\n", "\n", "# Load the data we want to orthorectify into memory\n", "ds_array = ds.radiance.values\n", "\n", "# Perform the broadcasting operation. 
A larger chunk_size makes the operation faster but requires more memory.\n", "chunk_size = 500\n", "for start in range(0, valid_glt.shape[0], chunk_size):\n", "    rows = slice(start, min(start + chunk_size, valid_glt.shape[0]))\n", "    valid = valid_glt[rows, :]\n", "    # GLT indices are 1-based, hence the -1\n", "    out_ds[:, rows][:, valid] = ds_array[:, line.values[rows][valid] - 1, sample.values[rows][valid] - 1]\n", "\n", "\n", "# Use the outputs to create an Xarray dataset\n", "coords = {\n", "    'wavelength': ds.wavelength,\n", "    'northing': ds.northing,\n", "    'easting': ds.easting,\n", "    'transverse_mercator': ds.transverse_mercator\n", "}\n", "\n", "data_vars = {\n", "    'radiance': (['wavelength', 'northing', 'easting'], out_ds),\n", "    'fwhm': ds.fwhm\n", "}\n", "\n", "ds_out = xr.Dataset(data_vars, coords=coords, attrs=ds.attrs)\n", "ds_out.rio.write_crs(ds_out.transverse_mercator.attrs['crs_wkt'], inplace=True)\n", "ds_out" ] }, { "cell_type": "markdown", "id": "280cb6f9-a18c-4af4-821e-3b934d2723dc", "metadata": {}, "source": [ "## 2. BioSCape Cropping Web Application\n", "\n", "**For reflectance data only**\n", "\n", "Users can perform the following actions with **BioSCape or EMIT data**:\n", "\n", "1. **Submit a GeoJSON**: This request returns the overlapping flightlines.\n", "2. **Retrieve Cropped Data**: This request returns cropped data in NetCDF format. Provide a flightline, subsection number, a GeoJSON, and an output file name.\n", "\n", "Check it out at [crop.bioscape.io](https://crop.bioscape.io).\n", "\n", "**Note**: A BioSCape SMCE username and password are required.\n", "\n", "**This application is in its beta phase. The current user interface is basic and will be improved. Please report any issues via GitHub or Slack.**\n", "\n", "For more detailed information, visit the [User Guide](pages/cropping_app).\n" ] }, { "cell_type": "markdown", "id": "1eee5e61-4e4c-416b-b71c-c34bac518841", "metadata": {}, "source": [ "## 3. 
BioSCape Tools Python Library\n", "\n", "**For reflectance data only**\n", "\n", "The BioSCape Tools library allows users to perform the following actions with **BioSCape or EMIT data**:\n", "\n", "1. **Submit a GeoJSON**: This request returns the overlapping flightlines.\n", "2. **Retrieve Cropped Data**: This request returns cropped data in NetCDF format. Provide a flightline, subsection number, a GeoJSON, and an output file name.\n", "\n", "The BioSCape Tools library can be used outside of the SMCE. **A BioSCape SMCE username and password are required.**\n", "\n", "### Installation\n", "\n", "The library can be installed via pip:\n", "\n", "```bash\n", "pip install bioscape_tools\n", "```\n", "\n", "It can also be installed via the Conda Store:\n", "\n", "1. Select and edit your desired environment.\n", "2. Choose YAML editing mode.\n", "3. Add the following lines to the dependencies list:\n", "\n", "```yaml\n", "- pip\n", "- pip:\n", "    - bioscape-tools\n", "```\n", "\n", "4. Build the environment. **Note: The package will not show up in the Conda Store UI, but it will still be installed.**\n", "\n", "Please report any bugs via GitHub issues or Slack." ] }, { "cell_type": "code", "execution_count": 27, "id": "a7ff9dfe-a8ac-4b5a-bf7f-174832106e85", "metadata": {}, "outputs": [], "source": [ "from bioscape_tools import Bioscape, Emit\n", "\n", "OUTPATH = 'test.nc'\n", "GEOJSON_FILE = \"path_to_your_geojson\"" ] }, { "cell_type": "markdown", "id": "05d39d06-c37a-4998-a375-8bbfc00019a1", "metadata": {}, "source": [ "Use your BioSCape SMCE username and password to get credentials." ] }, { "cell_type": "code", "execution_count": 7, "id": "4d3e7364-91e4-43f1-8099-bee96a6a166f", "metadata": {}, "outputs": [], "source": [ "b = Bioscape(persist=True)" ] }, { "cell_type": "markdown", "id": "8d0bcd38-2bd5-4c52-af7d-d9e1769d6cdf", "metadata": {}, "source": [ "Find overlapping data." ] }, { "cell_type": "code", "execution_count": 8, "id": "db9f619c-e507-444d-9ad3-c6c09fbab549", "metadata": {}, "outputs": [ { "data": { "text/html": [ "
<table>\n",
       "  <thead>\n",
       "    <tr><th></th><th>geometry</th><th>flightline</th><th>subsection</th></tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr><th>0</th><td>POLYGON ((18.75585 -32.97929, 18.75674 -32.944...</td><td>ang20231022t092801</td><td>000</td></tr>\n",
       "    <tr><th>1</th><td>POLYGON ((18.78096 -33.00205, 18.78218 -32.953...</td><td>ang20231022t094938</td><td>035</td></tr>\n",
       "    <tr><th>2</th><td>POLYGON ((18.77505 -32.96264, 18.77627 -32.913...</td><td>ang20231022t094938</td><td>036</td></tr>\n",
       "    <tr><th>3</th><td>POLYGON ((18.71476 -32.98757, 18.71623 -32.930...</td><td>ang20231029t120919</td><td>045</td></tr>\n",
       "    <tr><th>4</th><td>POLYGON ((18.73772 -32.9587, 18.73861 -32.9237...</td><td>ang20231029t123011</td><td>001</td></tr>\n",
       "    <tr><th>5</th><td>POLYGON ((18.74498 -32.98879, 18.74588 -32.953...</td><td>ang20231029t123011</td><td>002</td></tr>\n",
       "  </tbody>\n",
       "</table>
" ], "text/plain": [ " geometry flightline \\\n", "0 POLYGON ((18.75585 -32.97929, 18.75674 -32.944... ang20231022t092801 \n", "1 POLYGON ((18.78096 -33.00205, 18.78218 -32.953... ang20231022t094938 \n", "2 POLYGON ((18.77505 -32.96264, 18.77627 -32.913... ang20231022t094938 \n", "3 POLYGON ((18.71476 -32.98757, 18.71623 -32.930... ang20231029t120919 \n", "4 POLYGON ((18.73772 -32.9587, 18.73861 -32.9237... ang20231029t123011 \n", "5 POLYGON ((18.74498 -32.98879, 18.74588 -32.953... ang20231029t123011 \n", "\n", " subsection \n", "0 000 \n", "1 035 \n", "2 036 \n", "3 045 \n", "4 001 \n", "5 002 " ] }, "execution_count": 8, "metadata": {}, "output_type": "execute_result" } ], "source": [ "flightlines = b.get_overlap(GEOJSON_FILE)\n", "flightlines" ] }, { "cell_type": "markdown", "id": "81261b0a-fe8e-4716-afb3-3a40d7111ff0", "metadata": {}, "source": [ "Crop and retrieve the data." ] }, { "cell_type": "code", "execution_count": 5, "id": "6c9c0134-bb0e-452b-8656-bec96c8cae66", "metadata": {}, "outputs": [ { "data": { "text/html": [ "
<xarray.Dataset> Size: 500MB\n",
       "Dimensions:      (wavelength: 425, x: 479, y: 307)\n",
       "Coordinates:\n",
       "  * wavelength   (wavelength) float32 2kB 377.2 382.2 ... 2.496e+03 2.501e+03\n",
       "  * x            (x) float64 4kB 2.909e+05 2.909e+05 ... 2.939e+05 2.939e+05\n",
       "  * y            (y) float64 2kB 6.352e+06 6.352e+06 ... 6.35e+06 6.35e+06\n",
       "    spatial_ref  int64 8B ...\n",
       "Data variables:\n",
       "    reflectance  (y, wavelength, x) float32 250MB ...\n",
       "    uncertainty  (y, wavelength, x) float32 250MB ...\n",
       "Attributes: (12/19)\n",
       "    description:          L2A Analytyical per-pixel surface retrieval\n",
       "    samples:              719\n",
       "    lines:                615\n",
       "    bands:                425\n",
       "    header offset:        0\n",
       "    file type:            ENVI Standard\n",
       "    ...                   ...\n",
       "    band names:           ['channel_0', 'channel_1', 'channel_2', 'channel_3'...\n",
       "    masked pixel noise:   2.753511428833008\n",
       "    ang pge input files:  bad_element_file=/scratch/achlus/airborne_sds/ang_l...\n",
       "    ang pge run command:  /scratch/achlus/airborne_sds/ang_l1b_radiance/emit-...\n",
       "    bbl:                  ['0', '1', '1', '1', '1', '1', '1', '1', '1', '1', ...\n",
       "    data ignore value:    -9999
" ], "text/plain": [ " Size: 500MB\n", "Dimensions: (wavelength: 425, x: 479, y: 307)\n", "Coordinates:\n", " * wavelength (wavelength) float32 2kB 377.2 382.2 ... 2.496e+03 2.501e+03\n", " * x (x) float64 4kB 2.909e+05 2.909e+05 ... 2.939e+05 2.939e+05\n", " * y (y) float64 2kB 6.352e+06 6.352e+06 ... 6.35e+06 6.35e+06\n", " spatial_ref int64 8B ...\n", "Data variables:\n", " reflectance (y, wavelength, x) float32 250MB ...\n", " uncertainty (y, wavelength, x) float32 250MB ...\n", "Attributes: (12/19)\n", " description: L2A Analytyical per-pixel surface retrieval\n", " samples: 719\n", " lines: 615\n", " bands: 425\n", " header offset: 0\n", " file type: ENVI Standard\n", " ... ...\n", " band names: ['channel_0', 'channel_1', 'channel_2', 'channel_3'...\n", " masked pixel noise: 2.753511428833008\n", " ang pge input files: bad_element_file=/scratch/achlus/airborne_sds/ang_l...\n", " ang pge run command: /scratch/achlus/airborne_sds/ang_l1b_radiance/emit-...\n", " bbl: ['0', '1', '1', '1', '1', '1', '1', '1', '1', '1', ...\n", " data ignore value: -9999" ] }, "execution_count": 5, "metadata": {}, "output_type": "execute_result" } ], "source": [ "bioscape_data = b.crop_flightline(flightline=\"ang20231022t092801\", subsection=000, geojson=GEOJSON_FILE, output_path=None, mask_and_scale=True)\n", "bioscape_data" ] }, { "cell_type": "markdown", "id": "ccaf953f-6c79-4b29-ac9d-e32c8ec84aa2", "metadata": {}, "source": [ "Optionally, you can download the data by providing an output path." ] }, { "cell_type": "code", "execution_count": 9, "id": "4fa10e0e-be33-4169-a856-895cfb6498d1", "metadata": {}, "outputs": [], "source": [ "b.crop_flightline(flightline=\"ang20231022t092801\", subsection=000, geojson=GEOJSON_FILE, output_path=OUTPATH, mask_and_scale=True)" ] }, { "cell_type": "markdown", "id": "ebb50eb4-a0eb-435b-9650-557c8b4cf38a", "metadata": {}, "source": [ "The same operations can be preformed on EMIT data." 
] }, { "cell_type": "code", "execution_count": null, "id": "bae8c361-9e9d-44dd-9030-e33ff74abfc2", "metadata": {}, "outputs": [], "source": [ "e = Emit(persist=True)" ] }, { "cell_type": "code", "execution_count": 19, "id": "a9ef3c6c-f6dd-47f9-8906-4a665055e9d9", "metadata": {}, "outputs": [ { "data": { "text/html": [ "\n", "
\n", " " ], "text/plain": [ "Collection: {'ShortName': 'EMITL2ARFL', 'Version': '001'}\n", "Spatial coverage: {'HorizontalSpatialDomain': {'Geometry': {'GPolygons': [{'Boundary': {'Points': [{'Longitude': 18.697702407836914, 'Latitude': -32.20246887207031}, {'Longitude': 18.177148818969727, 'Latitude': -32.81927490234375}, {'Longitude': 18.829687118530273, 'Latitude': -33.36997985839844}, {'Longitude': 19.35024070739746, 'Latitude': -32.753173828125}, {'Longitude': 18.697702407836914, 'Latitude': -32.20246887207031}]}}]}}}\n", "Temporal coverage: {'RangeDateTime': {'BeginningDateTime': '2024-02-23T13:28:51Z', 'EndingDateTime': '2024-02-23T13:29:03Z'}}\n", "Size(MB): 3580.370819091797\n", "Data: ['https://data.lpdaac.earthdatacloud.nasa.gov/lp-prod-protected/EMITL2ARFL.001/EMIT_L2A_RFL_001_20240223T132851_2405409_006/EMIT_L2A_RFL_001_20240223T132851_2405409_006.nc', 'https://data.lpdaac.earthdatacloud.nasa.gov/lp-prod-protected/EMITL2ARFL.001/EMIT_L2A_RFL_001_20240223T132851_2405409_006/EMIT_L2A_RFLUNCERT_001_20240223T132851_2405409_006.nc', 'https://data.lpdaac.earthdatacloud.nasa.gov/lp-prod-protected/EMITL2ARFL.001/EMIT_L2A_RFL_001_20240223T132851_2405409_006/EMIT_L2A_MASK_001_20240223T132851_2405409_006.nc']" ] }, "execution_count": 19, "metadata": {}, "output_type": "execute_result" } ], "source": [ "emit_data = e.get_overlap(GEOJSON_FILE, temporal_range=(\"2024-01-01\", \"2024-10-01\"), cloud_cover=(0,10))\n", "emit_data[0]" ] }, { "cell_type": "code", "execution_count": 20, "id": "8af72157-c846-4f99-acca-ca70323109d2", "metadata": {}, "outputs": [ { "data": { "text/html": [ "
<xarray.Dataset> Size: 2MB\n",
       "Dimensions:           (latitude: 33, longitude: 61, wavelengths: 285)\n",
       "Coordinates:\n",
       "  * wavelengths       (wavelengths) float32 1kB 381.0 388.4 ... 2.493e+03\n",
       "    fwhm              (wavelengths) float32 1kB ...\n",
       "    good_wavelengths  (wavelengths) float64 2kB ...\n",
       "  * latitude          (latitude) float64 264B -32.95 -32.95 ... -32.97 -32.97\n",
       "  * longitude         (longitude) float64 488B 18.76 18.76 18.76 ... 18.79 18.8\n",
       "    elev              (latitude, longitude) float32 8kB ...\n",
       "    spatial_ref       int64 8B ...\n",
       "Data variables:\n",
       "    reflectance       (latitude, longitude, wavelengths) float32 2MB ...\n",
       "Attributes: (12/41)\n",
       "    ncei_template_version:             NCEI_NetCDF_Swath_Template_v2.0\n",
       "    summary:                           The Earth Surface Mineral Dust Source ...\n",
       "    keywords:                          Imaging Spectroscopy, minerals, EMIT, ...\n",
       "    Conventions:                       CF-1.63\n",
       "    sensor:                            EMIT (Earth Surface Mineral Dust Sourc...\n",
       "    instrument:                        EMIT\n",
       "    ...                                ...\n",
       "    geotransform:                      [ 1.87625957e+01  5.42232520e-04  0.00...\n",
       "    day_night_flag:                    Day\n",
       "    title:                             EMIT L2A Estimated Surface Reflectance...\n",
       "    granule_id:                        EMIT_L2A_RFL_001_20240223T132851_24054...\n",
       "    subset_downtrack_range:            [840 877]\n",
       "    subset_crosstrack_range:           [414 450]
" ], "text/plain": [ "<xarray.Dataset> Size: 2MB\n", "Dimensions: (latitude: 33, longitude: 61, wavelengths: 285)\n", "Coordinates:\n", " * wavelengths (wavelengths) float32 1kB 381.0 388.4 ... 2.493e+03\n", " fwhm (wavelengths) float32 1kB ...\n", " good_wavelengths (wavelengths) float64 2kB ...\n", " * latitude (latitude) float64 264B -32.95 -32.95 ... -32.97 -32.97\n", " * longitude (longitude) float64 488B 18.76 18.76 18.76 ... 18.79 18.8\n", " elev (latitude, longitude) float32 8kB ...\n", " spatial_ref int64 8B ...\n", "Data variables:\n", " reflectance (latitude, longitude, wavelengths) float32 2MB ...\n", "Attributes: (12/41)\n", " ncei_template_version: NCEI_NetCDF_Swath_Template_v2.0\n", " summary: The Earth Surface Mineral Dust Source ...\n", " keywords: Imaging Spectroscopy, minerals, EMIT, ...\n", " Conventions: CF-1.63\n", " sensor: EMIT (Earth Surface Mineral Dust Sourc...\n", " instrument: EMIT\n", " ... ...\n", " geotransform: [ 1.87625957e+01 5.42232520e-04 0.00...\n", " day_night_flag: Day\n", " title: EMIT L2A Estimated Surface Reflectance...\n", " granule_id: EMIT_L2A_RFL_001_20240223T132851_24054...\n", " subset_downtrack_range: [840 877]\n", " subset_crosstrack_range: [414 450]" ] }, "execution_count": 20, "metadata": {}, "output_type": "execute_result" } ], "source": [ "e.crop_scene(geojson=GEOJSON_FILE, granule_ur=emit_data[0].granule_ur, out_path=None, mask_and_scale=True)" ] }, { "cell_type": "code", "execution_count": 23, "id": "622cdbc8-4caa-49e2-9bd4-1aeb1b1b0cce", "metadata": {}, "outputs": [], "source": [ "OUTPATH = 'test.nc'\n", "e.crop_scene(geojson=GEOJSON_FILE, granule_ur=emit_data[0].granule_ur, output_path=OUTPATH, mask_and_scale=True)" ] }, { "cell_type": "markdown", "id": "b8dffc91-d1d3-4fa3-88c4-922007854b4f", "metadata": {}, "source": [ "## 4. Direct S3 Access\n", "\n", "**All data can be accessed directly from S3, but read times may be significantly slower than with the intake catalog, especially for larger files. 
Therefore, this method is not recommended for working with large datasets.**\n" ] }, { "cell_type": "code", "execution_count": 1, "id": "51f0d61d-a367-4447-9680-772e1c692e7f", "metadata": {}, "outputs": [], "source": [ "import rioxarray as rxr\n", "import os\n", "import s3fs" ] }, { "cell_type": "code", "execution_count": 2, "id": "d0e4d420-ab27-4bdd-ad90-1c39bc07dbff", "metadata": {}, "outputs": [ { "data": { "text/plain": [ "['bioscape-data/AVNG',\n", " 'bioscape-data/AVNG_V2',\n", " 'bioscape-data/BioSCapeVegPolys2023_10_18',\n", " 'bioscape-data/BioSCapeVegPolys2023_10_18.geoparquet',\n", " 'bioscape-data/LVIS',\n", " 'bioscape-data/PRISM',\n", " 'bioscape-data/bioscape_avng.yaml',\n", " 'bioscape-data/bioscape_avng_v2.yaml']" ] }, "execution_count": 2, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# anon=False authenticates with the AWS credentials configured in your environment\n", "s3 = s3fs.S3FileSystem(anon=False)\n", "# List the top-level contents of the BioSCape bucket\n", "files = s3.ls('bioscape-data/')\n", "files" ] }, { "cell_type": "code", "execution_count": 15, "id": "375b30ec-d052-47c7-98e6-798f19620279", "metadata": {}, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "
<xarray.DataArray (band: 246, y: 6449, x: 918)> Size: 6GB\n",
       "[1456364772 values with dtype=float32]\n",
       "Coordinates:\n",
       "    wavelength   (band) float64 2kB 350.6 353.4 356.2 ... 1.043e+03 1.046e+03\n",
       "    fwhm         (band) float64 2kB 3.332 3.332 3.332 ... 3.327 3.314 3.326\n",
       "  * band         (band) int64 2kB 1 2 3 4 5 6 7 ... 240 241 242 243 244 245 246\n",
       "    xc           (y, x) float64 47MB 3.306e+05 3.306e+05 ... 3.261e+05 3.261e+05\n",
       "    yc           (y, x) float64 47MB 6.24e+06 6.24e+06 ... 6.208e+06 6.208e+06\n",
       "    spatial_ref  int64 8B 0\n",
       "Dimensions without coordinates: y, x\n",
       "Attributes: (12/263)\n",
       "    wavelength_units:   Nanometers\n",
       "    Band_1:             350.5548293 Nanometers\n",
       "    Band_2:             353.3850859 Nanometers\n",
       "    Band_3:             356.21539889999997 Nanometers\n",
       "    Band_4:             359.045768 Nanometers\n",
       "    Band_5:             361.8761934 Nanometers\n",
       "    ...                 ...\n",
       "    file_type:          ENVI\n",
       "    data_type:          4\n",
       "    interleave:         bil\n",
       "    byte_order:         0\n",
       "    smoothing_factors:   1.0 , 1.0 , 1.0 , 1.0 , 1.0 , 1.0 , 1.0 , 1.0 , 1.0 ...\n",
       "    data_ignore_value:  -9999
" ], "text/plain": [ "<xarray.DataArray (band: 246, y: 6449, x: 918)> Size: 6GB\n", "[1456364772 values with dtype=float32]\n", "Coordinates:\n", " wavelength (band) float64 2kB 350.6 353.4 356.2 ... 1.043e+03 1.046e+03\n", " fwhm (band) float64 2kB 3.332 3.332 3.332 ... 3.327 3.314 3.326\n", " * band (band) int64 2kB 1 2 3 4 5 6 7 ... 240 241 242 243 244 245 246\n", " xc (y, x) float64 47MB 3.306e+05 3.306e+05 ... 3.261e+05 3.261e+05\n", " yc (y, x) float64 47MB 6.24e+06 6.24e+06 ... 6.208e+06 6.208e+06\n", " spatial_ref int64 8B 0\n", "Dimensions without coordinates: y, x\n", "Attributes: (12/263)\n", " wavelength_units: Nanometers\n", " Band_1: 350.5548293 Nanometers\n", " Band_2: 353.3850859 Nanometers\n", " Band_3: 356.21539889999997 Nanometers\n", " Band_4: 359.045768 Nanometers\n", " Band_5: 361.8761934 Nanometers\n", " ... ...\n", " file_type: ENVI\n", " data_type: 4\n", " interleave: bil\n", " byte_order: 0\n", " smoothing_factors: 1.0 , 1.0 , 1.0 , 1.0 , 1.0 , 1.0 , 1.0 , 1.0 , 1.0 ...\n", " data_ignore_value: -9999" ] }, "execution_count": 15, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# Open a PRISM L2 scene (ENVI format) directly from its S3 URI; pixel values are read lazily\n", "rxr.open_rasterio('s3://bioscape-data/PRISM/L2/prm20231022t141344_rfl_ort')" ] }, { "cell_type": "code", "execution_count": null, "id": "c8209458-1f2c-4207-9c74-c70544abe21a", "metadata": {}, "outputs": [], "source": [] } ], "metadata": { "kernelspec": { "display_name": "edlang-edlang-read_test", "language": "python", "name": "conda-env-edlang-edlang-read_test-py" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.13.1" } }, "nbformat": 4, "nbformat_minor": 5 }