select catalog sources should check

Currently sdetect.catalog.select checks for objects that lie within a WCS region.

However, in some datasets the MUSE cubes contain large numbers of NaN (e.g. because the field was not observed at [0,90,180...] degree rotations.

I propose changing sdetect.catalog.select to accept a mask (i.e. white light image or exposure map) instead. Given a source coordinate, and a cutout region size, check the fraction of valid (non-zero / non-nan) values.

I propose the following code (which is relatively efficient, as long the cutout size is not huge > 50)

from scipy.spatial import cKDTree

    def select(self, mask, ra='RA', dec='DEC', box_size=1, cov_frac=1):
        """Select all sources from catalog within observed field
        and return a new catalog.

        Parameters
        ----------
        mask : `~mpdaf.obj.Image`
            Mask image with WCS (Non-zero values are valid)
            e.g. white light image, exposure map
        ra : str
            Name of the column that contains RA values in degrees.
        dec : str
            Name of the column that contains DEC values in degrees.
        box_size : int
            Width/height of cutout region box (pixels).
        cov_frac : float (0,1] (optional)
            Minimum covering fraction required

        Returns
        -------
        `mpdaf.sdetect.Catalog`
            The catalog with selected rows.

        """
        
        #find non-zero non-NaN values in mask 
        valid = (mask.data.data != 0) & ~np.isnan(mask.data.data)
    
        #generate KDTree of valid pixel coords
        pix_valid_y, pix_valid_x = np.where(valid)
        pix_valid = np.column_stack([pix_valid_y, pix_valid_x])
        tree_valid = cKDTree(pix_valid)

        #generate KDTree of test pixel coords
        sky_test = np.column_stack([self[dec].data, self[ra].data])
        pix_test = mask.wcs.sky2pix(sky_test, unit=u.deg)
        tree_test = cKDTree(pix_test)

        # inf Minkowski distance (i.e. square box)
        pairs = tree_test.query_ball_tree(tree_valid, box_size/2., np.inf)

        pairs = np.array(pairs)
        if pairs.ndim == 2:
            #every test point has the same number of pairs
            n_pairs = np.full(pairs.shape[0], pairs.shape[1], dtype=float)
        else:
            #pairs is a 1d array of lists
            len_ = np.vectorize(len)
            n_pairs = len_(pairs).astype(float)

        sel = (n_pairs / box_size**2.) >= cov_frac

        return self[sel]
Edited by David Carton