Aligning Data with Different Resolutions

This example demonstrates how to align data from different sources (EMIT, HLS, SHIFT) for analysis. Using rioxarray we can reproject the different data sources to the same CRS and shape or resolution.

This example uses code from the EMIT-Data-Resources repository.

[2]:
import requests
import s3fs
import sys
import rasterio as rio
import xarray as xr
sys.path.append('/efs/edlang1/EMIT-Data-Resources/modules/')
from emit_tools import emit_xarray
sys.path.append('/efs/SHIFT-Python-Utilities/')
from shift_python_utilities.intake_shift import shift_catalog
import geopandas as gpd
from shapely.geometry import Polygon


import warnings
warnings.filterwarnings('ignore')

Generate temporary credentials for Earth Data Search S3 access

[3]:
s3_cred_endpoint = {
'podaac':'https://archive.podaac.earthdata.nasa.gov/s3credentials',
'gesdisc': 'https://data.gesdisc.earthdata.nasa.gov/s3credentials',
'lpdaac':'https://data.lpdaac.earthdatacloud.nasa.gov/s3credentials',
'ornldaac': 'https://data.ornldaac.earthdata.nasa.gov/s3credentials',
'ghrcdaac': 'https://data.ghrc.earthdata.nasa.gov/s3credentials'
}


def get_temp_creds(provider):
    return requests.get(s3_cred_endpoint[provider]).json()

temp_creds_req = get_temp_creds('lpdaac')

fs_s3 = s3fs.S3FileSystem(anon=False,
                          key=temp_creds_req['accessKeyId'],
                          secret=temp_creds_req['secretAccessKey'],
                          token=temp_creds_req['sessionToken'])

Create a Geodataframe with a polygon

[4]:
shp = Polygon([(-120.44686132950059, 34.44238828271541),
               (-120.44686132950059, 34.48046721892582),
               (-120.41425043549059, 34.48046721892582),
               (-120.41425043549059, 34.44238828271541)])

geodf = gpd.GeoDataFrame(geometry=[shp], crs=4326)
geodf
[4]:
geometry
0 POLYGON ((-120.44686 34.44239, -120.44686 34.4...
  • Using Earth Data Search find the S3 link associated with the imagery of interest

  • Retrieve orthorectified the data using fsspec and the emit_xarray module and reorder the dimensions to (band, y, x)

    • Note: If you are getting an error related to not being able to identify the file type you may need to update the version of xarray you are using. The EMIT documentation recommends xarray 2022.12.0 or newer. In python running

      • xr.__version__ will give you your xarray version

      • running pip install xarray==2022.12.0 will install the minimum required version

      • Another work around is to rerun the cell and see if it works the second time. This has been reported to work.

  • Clip the orthorectified data using the Geodataframe

[6]:
s3_url = "s3://lp-prod-protected/EMITL2ARFL.001/EMIT_L2A_RFL_001_20230422T195924_2311213_002/EMIT_L2A_RFL_001_20230422T195924_2311213_002.nc"
# Open s3 url
fp = fs_s3.open(s3_url, mode='rb')
# Open dataset with xarray
ds_emit = emit_xarray(fp, ortho=True).swap_dims({"bands":"wavelengths"}).transpose("wavelengths","latitude","longitude")
ds_emit = ds_emit.rio.clip(geodf.to_crs(ds_emit.rio.crs).geometry.values, all_touched=True)
ds_emit
[6]:
<xarray.Dataset>
Dimensions:           (latitude: 71, longitude: 61, wavelengths: 285)
Coordinates:
  * latitude          (latitude) float64 34.48 34.48 34.48 ... 34.44 34.44 34.44
  * longitude         (longitude) float64 -120.4 -120.4 -120.4 ... -120.4 -120.4
    fwhm              (wavelengths) float32 8.415 8.415 8.415 ... 8.807 8.809
    good_wavelengths  (wavelengths) float32 1.0 1.0 1.0 1.0 ... 1.0 1.0 1.0 1.0
  * wavelengths       (wavelengths) float32 381.0 388.4 ... 2.486e+03 2.493e+03
    spatial_ref       int64 0
Data variables:
    reflectance       (wavelengths, latitude, longitude) float32 0.0259 ... 0...
Attributes: (12/38)
    ncei_template_version:             NCEI_NetCDF_Swath_Template_v2.0
    summary:                           The Earth Surface Mineral Dust Source ...
    keywords:                          Imaging Spectroscopy, minerals, EMIT, ...
    Conventions:                       CF-1.63
    sensor:                            EMIT (Earth Surface Mineral Dust Sourc...
    instrument:                        EMIT
    ...                                ...
    southernmost_latitude:             33.98409945295017
    spatialResolution:                 0.000542232520256367
    spatial_ref:                       GEOGCS["WGS 84",DATUM["WGS_1984",SPHER...
    geotransform:                      [-1.20995667e+02  5.42232520e-04 -0.00...
    day_night_flag:                    Day
    title:                             EMIT L2A Estimated Surface Reflectance...
  • Retrieve the HLS bands of interest and concatenate them into one Xaray Dataset

  • Clip the data with the Geodataframe

[7]:
s3_urls =  [
    "s3://lp-prod-protected/HLSS30.020/HLS.S30.T10SGD.2023121T183919.v2.0/HLS.S30.T10SGD.2023121T183919.v2.0.B04.tif",
    "s3://lp-prod-protected/HLSS30.020/HLS.S30.T10SGD.2023121T183919.v2.0/HLS.S30.T10SGD.2023121T183919.v2.0.B03.tif",
    "s3://lp-prod-protected/HLSS30.020/HLS.S30.T10SGD.2023121T183919.v2.0/HLS.S30.T10SGD.2023121T183919.v2.0.B02.tif",
]

ds_hls = []
for url in s3_urls:
    fp = fs_s3.open(url, mode='rb')
    ds_hls += [xr.open_dataset(fp, engine='rasterio')]

ds_hls = xr.concat(ds_hls, 'band')
ds_hls = ds_hls.rio.clip(geodf.to_crs(ds_hls.rio.crs).geometry.values, all_touched=True)
ds_hls
[7]:
<xarray.Dataset>
Dimensions:      (y: 145, x: 104, band: 3)
Coordinates:
  * y            (y) float64 3.818e+06 3.818e+06 ... 3.814e+06 3.814e+06
  * x            (x) float64 7.345e+05 7.345e+05 ... 7.375e+05 7.376e+05
  * band         (band) int64 1 1 1
    spatial_ref  int64 0
Data variables:
    band_data    (band, y, x) float32 nan nan nan nan nan ... nan nan nan nan
  • Using the SHIFT Python Utilites library retireve the gridded data, select a date and reorder the dimensions (band, y, x)

  • Assign the CRS from the metadata to the dataset

  • Clip the data with the Geodataframe

[8]:
cat = shift_catalog()
ds_shift = cat.aviris_v1_gridded.read_chunked().sel(time='2022-04-29').transpose('wavelength', 'y', 'x')
ds_shift.rio.write_crs(rio.CRS.from_wkt(",".join(ds_shift.attrs['coordinate system string'])), inplace=True)
ds_shift = ds_shift.rio.clip(geodf.to_crs(ds_shift.rio.crs).geometry.values, all_touched=True)
ds_shift
[8]:
<xarray.Dataset>
Dimensions:      (y: 861, x: 622, wavelength: 425)
Coordinates:
  * y            (y) float64 3.818e+06 3.818e+06 ... 3.814e+06 3.814e+06
  * x            (x) float64 7.345e+05 7.345e+05 ... 7.376e+05 7.376e+05
    time         datetime64[ns] 2022-04-29
  * wavelength   (wavelength) float32 377.2 382.2 387.2 ... 2.496e+03 2.501e+03
    spatial_ref  int64 0
Data variables:
    reflectance  (wavelength, y, x) float32 dask.array<chunksize=(425, 1, 622), meta=np.ndarray>
Attributes: (12/13)
    description:               flight_products/20220224/box_mosaics/box_rfl_p...
    samples:                   13739
    lines:                     12023
    bands:                     425
    header offset:             0
    file type:                 ENVI Standard
    ...                        ...
    interleave:                bil
    byte order:                0
    map info:                  ['UTM', '1', '1', '717720', '3865865', '5', '5...
    coordinate system string:  ['PROJCS["WGS_1984_UTM_Zone_10N"', 'GEOGCS["GC...
    wavelength:                ['377.1956495', '382.20564950000005', '387.215...
    fwhm:                      ['5.57', '5.58', '5.58', '5.58', '5.5900000000...

Reproject the HLS dataset and the SHIFT data to the same CRS and resolution as the EMIT data

[9]:
ds_hls = ds_hls.rio.reproject(dst_crs=ds_emit.rio.crs, resolution=ds_emit.rio.transform()._scaling)
ds_shift = ds_shift.rio.reproject(dst_crs=ds_emit.rio.crs, resolution=ds_emit.rio.transform()._scaling)
ds_shift
[9]:
<xarray.Dataset>
Dimensions:      (x: 65, y: 73, wavelength: 425)
Coordinates:
  * x            (x) float64 -120.4 -120.4 -120.4 ... -120.4 -120.4 -120.4
  * y            (y) float64 34.48 34.48 34.48 34.48 ... 34.44 34.44 34.44 34.44
  * wavelength   (wavelength) float32 377.2 382.2 387.2 ... 2.496e+03 2.501e+03
    time         datetime64[ns] 2022-04-29
    spatial_ref  int64 0
Data variables:
    reflectance  (wavelength, y, x) float32 3.403e+38 3.403e+38 ... 3.403e+38
Attributes: (12/13)
    description:               flight_products/20220224/box_mosaics/box_rfl_p...
    samples:                   13739
    lines:                     12023
    bands:                     425
    header offset:             0
    file type:                 ENVI Standard
    ...                        ...
    interleave:                bil
    byte order:                0
    map info:                  ['UTM', '1', '1', '717720', '3865865', '5', '5...
    coordinate system string:  ['PROJCS["WGS_1984_UTM_Zone_10N"', 'GEOGCS["GC...
    wavelength:                ['377.1956495', '382.20564950000005', '387.215...
    fwhm:                      ['5.57', '5.58', '5.58', '5.58', '5.5900000000...

Reproject the HLS dataset and the SHIFT data to the same CRS and shape as the EMIT data

[10]:
shape = ds_emit.dims[ds_emit.rio.x_dim], ds_emit.dims[ds_emit.rio.y_dim]
ds_hls = ds_hls.rio.reproject(dst_crs=ds_emit.rio.crs, shape=shape)
ds_shift = ds_shift.rio.reproject(dst_crs=ds_emit.rio.crs, shape=shape)
ds_shift
[10]:
<xarray.Dataset>
Dimensions:      (x: 71, y: 61, wavelength: 425)
Coordinates:
  * x            (x) float64 -120.4 -120.4 -120.4 ... -120.4 -120.4 -120.4
  * y            (y) float64 34.48 34.48 34.48 34.48 ... 34.44 34.44 34.44 34.44
  * wavelength   (wavelength) float32 377.2 382.2 387.2 ... 2.496e+03 2.501e+03
    time         datetime64[ns] 2022-04-29
    spatial_ref  int64 0
Data variables:
    reflectance  (wavelength, y, x) float32 3.403e+38 3.403e+38 ... 3.403e+38
Attributes: (12/13)
    description:               flight_products/20220224/box_mosaics/box_rfl_p...
    samples:                   13739
    lines:                     12023
    bands:                     425
    header offset:             0
    file type:                 ENVI Standard
    ...                        ...
    interleave:                bil
    byte order:                0
    map info:                  ['UTM', '1', '1', '717720', '3865865', '5', '5...
    coordinate system string:  ['PROJCS["WGS_1984_UTM_Zone_10N"', 'GEOGCS["GC...
    wavelength:                ['377.1956495', '382.20564950000005', '387.215...
    fwhm:                      ['5.57', '5.58', '5.58', '5.58', '5.5900000000...