Message ID | cover.1558030802.git.matvore@google.com (mailing list archive) |
---|---|
Headers | show |
Series | implement composite filters | expand |
> Here is a first stab at composite filters. It does not actually support omits, > but the biggest difficulties of the implementation are already addressed. So I > decided to send out an early version to give interested people an idea of just > what is needed to implement it, and then give them a chance to change my mind > (Jonathan T. especially was concerned about code complexity). Thanks - seeing these patches reduces my concerns significantly. A composite filter has LHS and RHS filters, either of which can be composite themselves, so we support compositing arbitrary numbers of filters. I see that the approach is to run LHS and RHS in lockstep, with our own handling of the result flags: - LOFR_MARK_SEEN is tracked for LHS and RHS separately. To support an arbitrary number of filters, we don't use object flags to track this, so we use oidsets instead. I don't think that the extra memory usage will be a problem (we already allocate more for all the struct object). If this is an issue in the future, we can switch to using object flags for the first N filters, and oidsets thereafter. - LOFR_SKIP_TREE is simulated if only one filter wants to skip the tree. - I haven't fully figured out LOFR_DO_SHOW yet. It seems to me that if an object appears twice in the walk, and the LHS says LOFR_DO_SHOW on the first occurrence, if the RHS says LOFR_DO_SHOW on the second occurrence, the object will be shown twice. But perhaps this isn't a problem - as it is, I think that a filter can call LOFR_DO_SHOW multiple times on the same object anyway. In any case, if this turns out to be a problem, it seems surmountable to me.
> On 2019/05/16, at 15:41, Jonathan Tan <jonathantanmy@google.com> wrote: > > Thanks - seeing these patches reduces my concerns significantly. A Thank you for taking a look :) > > - LOFR_MARK_SEEN is tracked for LHS and RHS separately. To support an > arbitrary number of filters, we don't use object flags to track this, > so we use oidsets instead. I don't think that the extra memory usage > will be a problem (we already allocate more for all the struct > object). If this is an issue in the future, we can switch to using > object flags for the first N filters, and oidsets thereafter. Yup. Another possibility that comes to mind is that when both the lhs and rhs seen sets contain the same object, we promote it to a combined set, and remove it from the individual ones at that time. > > - LOFR_SKIP_TREE is simulated if only one filter wants to skip the tree. > > - I haven't fully figured out LOFR_DO_SHOW yet. It seems to me that if > an object appears twice in the walk, and the LHS says LOFR_DO_SHOW on > the first occurrence, if the RHS says LOFR_DO_SHOW on the second > occurrence, the object will be shown twice. But perhaps this isn't a LOFR_DO_SHOW is only propagated upward from the combine: filter if both children indicate LOFR_DO_SHOW for the same object at the same point in the traversal (see the line that has "combined_result |= LOFR_DO_SHOW"). In the scenario you draw out, the object won’t be shown at all, since the first occurrence is filtered out for one reason, and the second is filtered out for a separate reason. This may happen when a sparse: filter includes a "deep" blob but excludes the same blob at a shallower point in the tree, and a tree: filter does the opposite with the same blob. Just thinking about it now for a moment, this seems intuitive and reasonable.